├── .gitignore
├── LICENSE
├── README.md
├── collision.jpeg
├── doge.jpeg
├── find_collision.py
├── requirements.txt
└── titanic.jpeg


/.gitignore:
--------------------------------------------------------------------------------
  1 | .DS_Store
  2 | # Byte-compiled / optimized / DLL files
  3 | __pycache__/
  4 | *.py[cod]
  5 | *$py.class
  6 | 
  7 | # C extensions
  8 | *.so
  9 | 
 10 | # Distribution / packaging
 11 | .Python
 12 | build/
 13 | develop-eggs/
 14 | dist/
 15 | downloads/
 16 | eggs/
 17 | .eggs/
 18 | lib/
 19 | lib64/
 20 | parts/
 21 | sdist/
 22 | var/
 23 | wheels/
 24 | pip-wheel-metadata/
 25 | share/python-wheels/
 26 | *.egg-info/
 27 | .installed.cfg
 28 | *.egg
 29 | MANIFEST
 30 | 
 31 | # PyInstaller
 32 | #  Usually these files are written by a python script from a template
 33 | #  before PyInstaller builds the exe, so as to inject date/other infos into it.
 34 | *.manifest
 35 | *.spec
 36 | 
 37 | # Installer logs
 38 | pip-log.txt
 39 | pip-delete-this-directory.txt
 40 | 
 41 | # Unit test / coverage reports
 42 | htmlcov/
 43 | .tox/
 44 | .nox/
 45 | .coverage
 46 | .coverage.*
 47 | .cache
 48 | nosetests.xml
 49 | coverage.xml
 50 | *.cover
 51 | *.py,cover
 52 | .hypothesis/
 53 | .pytest_cache/
 54 | 
 55 | # Translations
 56 | *.mo
 57 | *.pot
 58 | 
 59 | # Django stuff:
 60 | *.log
 61 | local_settings.py
 62 | db.sqlite3
 63 | db.sqlite3-journal
 64 | 
 65 | # Flask stuff:
 66 | instance/
 67 | .webassets-cache
 68 | 
 69 | # Scrapy stuff:
 70 | .scrapy
 71 | 
 72 | # Sphinx documentation
 73 | docs/_build/
 74 | 
 75 | # PyBuilder
 76 | target/
 77 | 
 78 | # Jupyter Notebook
 79 | .ipynb_checkpoints
 80 | 
 81 | # IPython
 82 | profile_default/
 83 | ipython_config.py
 84 | 
 85 | # pyenv
 86 | .python-version
 87 | 
 88 | # pipenv
 89 | #   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
 90 | #   However, in case of collaboration, if having platform-specific dependencies or dependencies
 91 | #   having no cross-platform support, pipenv may install dependencies that don't work, or not
 92 | #   install all needed dependencies.
 93 | #Pipfile.lock
 94 | 
 95 | # PEP 582; used by e.g. github.com/David-OConnor/pyflow
 96 | __pypackages__/
 97 | 
 98 | # Celery stuff
 99 | celerybeat-schedule
100 | celerybeat.pid
101 | 
102 | # SageMath parsed files
103 | *.sage.py
104 | 
105 | # Environments
106 | .env
107 | .venv
108 | env/
109 | venv/
110 | ENV/
111 | env.bak/
112 | venv.bak/
113 | 
114 | # Spyder project settings
115 | .spyderproject
116 | .spyproject
117 | 
118 | # Rope project settings
119 | .ropeproject
120 | 
121 | # mkdocs documentation
122 | /site
123 | 
124 | # mypy
125 | .mypy_cache/
126 | .dmypy.json
127 | dmypy.json
128 | 
129 | # Pyre type checker
130 | .pyre/
131 | 


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2021 Yannic Kilcher
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # Neural Hash Collision Creator
 2 | 
 3 | 1. Use https://github.com/AsuharietYgvar/AppleNeuralHash2ONNX to obtain the model and convert it to ONNX.
 4 | 2. Use `onnx_tf` to convert the ONNX model to TensorFlow and save it as `model.pb`.
 5 | 3. Run the script (`find_collision.py`). It makes rapid progress at first and then slowly approaches a loss of -1. If it does not find a collision, try raising the `eps` value in the script and adjust `eps_step` to control the noisiness.
 6 | 4. Confirm the result using `nnhash.py` from https://github.com/AsuharietYgvar/AppleNeuralHash2ONNX.
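
Step 2 could look roughly like the sketch below. It assumes the `onnx` and `onnx-tf` packages are installed (they are not listed in `requirements.txt`) and that step 1 produced a file named `model.onnx`. The helper name `convert_onnx_to_tf` and both default paths are illustrative and not part of this repo.

```python
def convert_onnx_to_tf(onnx_path="model.onnx", out_path="model.pb"):
    """Convert an ONNX model to TensorFlow. Both paths are assumptions."""
    # Imported lazily so that defining the helper does not require the packages.
    import onnx
    from onnx_tf.backend import prepare

    onnx_model = onnx.load(onnx_path)  # parse the ONNX file
    tf_rep = prepare(onnx_model)       # build a TensorFlow representation of the graph
    tf_rep.export_graph(out_path)      # write it to the path find_collision.py loads
    return out_path
```

Calling `convert_onnx_to_tf()` from the directory that contains `model.onnx` should leave the converted model at `./model.pb`, which is the path the script passes to `tf.saved_model.load`. The exact on-disk format may depend on the installed `onnx-tf` version.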
 7 | 
 8 | Note: The final output might have a slightly different hash due to quantization and clipping. Just try again until you find an exact match.
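
The effect described in the note can be illustrated with a toy sketch. It mimics the 8-bit conversion done when the image is saved and counts how many sign bits of a linear projection change afterwards. The random `projection` matrix is only a stand-in for the real network plus seed matrix, and all function names here are hypothetical; JPEG compression adds further error that this sketch does not model.

```python
import numpy as np


def quantize_roundtrip(img):
    """Mimic saving to 8 bits and reloading: [-1, 1] floats -> uint8 -> [-1, 1] floats."""
    as_uint8 = np.uint8(((img + 1.0) / 2.0).clip(0.0, 1.0) * 255)  # truncates toward zero
    return as_uint8.astype(np.float32) / 255.0 * 2.0 - 1.0


def hash_bits(projection, features):
    """Sign bits of a linear projection, as a string of '0' and '1'."""
    return ''.join('1' if v >= 0 else '0' for v in projection.dot(features))


def hamming(a, b):
    """Number of positions where two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


rng = np.random.default_rng(0)
img = rng.uniform(-1.0, 1.0, size=(3, 8, 8)).astype(np.float32)
restored = quantize_roundtrip(img)

# Truncation to 8 bits moves each value by at most about one step of 2/255.
max_err = float(np.abs(restored - img).max())

# Toy stand-in for "network + seed matrix": a random projection of the raw pixels.
projection = rng.standard_normal((96, img.size)).astype(np.float32)
before = hash_bits(projection, img.flatten())
after = hash_bits(projection, restored.flatten())

print('max pixel error:', max_err)
print('bits flipped by quantization:', hamming(before, after))
```

Bits whose projection value sits close to zero are the ones most likely to flip, which is why a collision found in floating point may need another attempt after saving.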
 9 | 
10 | Images are from Wikipedia.
11 | 
12 | Concurrent work: https://github.com/anishathalye/neural-hash-collider
13 | More concurrent work: https://github.com/greentfrapp/apple-neuralhash-attack
14 | 
15 | Pull requests to improve this repo are welcome.
16 | 


--------------------------------------------------------------------------------
/collision.jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yk/neural_hash_collision/b18f0de65f28d38cce38c736a4c33a5cc84df2ff/collision.jpeg


--------------------------------------------------------------------------------
/doge.jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yk/neural_hash_collision/b18f0de65f28d38cce38c736a4c33a5cc84df2ff/doge.jpeg


--------------------------------------------------------------------------------
/find_collision.py:
--------------------------------------------------------------------------------
 1 | #!/usr/bin/env python3
 2 | 
 3 | # Partially based on https://github.com/AsuharietYgvar/AppleNeuralHash2ONNX/blob/master/nnhash.py
 4 | 
 5 | import tensorflow as tf
 6 | from art.estimators.classification import TensorFlowV2Classifier
 7 | from art.attacks import evasion
 8 | import numpy as np
 9 | from PIL import Image
10 | 
11 | seed_fn = './neuralhash_128x96_seed1.dat'
12 | 
13 | # Skip the first 128 bytes, then read the remainder as float32 values.
14 | seed1 = np.fromfile(seed_fn, dtype=np.float32, offset=128)
15 | seed1 = seed1.reshape([96, 128])
16 | 
17 | def _neural_hash(out):
18 |     hash_output = seed1.dot(out.numpy().flatten())
19 |     hash_bits = ''.join(['1' if it >= 0 else '0' for it in hash_output])
20 |     hash_hex = '{:0{}x}'.format(int(hash_bits, 2), len(hash_bits) // 4)
21 |     return hash_hex
22 | 
23 | def _load_img(fname):
24 |     img = Image.open(fname)
25 |     img = img.convert('RGB')
26 |     img = img.resize([360, 360])
27 |     img = np.array(img).astype(np.float32) / 255.0
28 |     img = img * 2.0 - 1.0
29 |     img = img.transpose(2, 0, 1).reshape([1, 3, 360, 360])
30 |     return img
31 | 
32 | model = tf.saved_model.load('./model.pb')
33 | last_input = None
34 | def _model(inp, **kwargs):
35 |     global last_input
36 |     last_input = np.copy(inp)
37 |     output = model(image=inp)[0][0, :, 0, 0]
38 |     return output
39 | 
40 | source_img = _load_img('./doge.jpeg')
41 | target_img = _load_img('./titanic.jpeg')
42 | loss_object = tf.keras.losses.CosineSimilarity()
43 | 
44 | target_hash = _model(target_img)
45 | target_neural_hash = _neural_hash(target_hash)
46 | 
47 | print('TARGET HASH: ', target_neural_hash)
48 | 
49 | def _save_img(img, fname):
50 |     img = img.transpose(1, 2, 0)
51 |     img = img + 1.
52 |     img = img / 2.
53 |     img = img.clip(0., 1.)
54 |     img = np.uint8(img * 255) # final image might not have the exact same hash due to clipping & quantization
55 |     img = Image.fromarray(img, 'RGB')
56 |     img.save(fname, quality=100)
57 | 
58 | def _loss(dummy, img_hash, *args, **kwargs):
59 |     loss = loss_object(img_hash, target_hash)
60 |     print('Loss: ', loss.numpy().item())
61 |     neural_hash = _neural_hash(img_hash)
62 |     print('HASH: ', neural_hash)
63 |     if neural_hash == target_neural_hash:
64 |         print('-------------')
65 |         print('Collision found!!!')
66 |         print('-------------')
67 |         _save_img(last_input[0], 'collision.jpeg')
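68 |         raise SystemExit(0)  # we could also continue here & decrease the step size to lower the loss even more
69 |     return -loss  # Keras CosineSimilarity is the negative cosine similarity, so untargeted PGD maximizes the similarity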
70 | 
71 | classifier = TensorFlowV2Classifier(
72 |     model=_model,
73 |     loss_object=_loss,
74 |     nb_classes=128,
75 |     input_shape=(3, 360, 360),
76 |     clip_values=(-1, 1),
77 | )
78 | 
79 | attack = evasion.ProjectedGradientDescentTensorFlowV2(
80 |     estimator=classifier,
81 |     norm=2,  # the other common option is np.inf, but then eps and eps_step need to be changed
82 |     eps=60.,  # if no success, raise this; for fewer artifacts, lower this
83 |     eps_step=.5,  # raising this makes it go faster, but noisier
84 |     targeted=False,
85 |     max_iter=5000,
86 | )
87 | 
88 | result = attack.generate(x=source_img, y=(0,))  # y is a dummy label; _loss ignores it
89 | 
90 | 
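
The roles of `eps`, `eps_step`, and `clip_values` in the script can be illustrated with a minimal NumPy sketch of one untargeted L2 projected-gradient step. This is a simplified illustration under stated assumptions, not the ART implementation, which may differ in details. The function `pgd_l2_step` and the toy cosine objective are hypothetical and stand in for the model and `_loss`.

```python
import numpy as np


def pgd_l2_step(x, x_orig, grad, eps, eps_step, clip_min=-1.0, clip_max=1.0):
    """One untargeted L2 PGD step: ascend the objective, then project back."""
    # Step of length eps_step along the normalized gradient.
    grad_norm = np.linalg.norm(grad)
    x_new = x + eps_step * grad / (grad_norm + 1e-12)
    # Project the total perturbation onto the L2 ball of radius eps around x_orig.
    delta = x_new - x_orig
    delta_norm = np.linalg.norm(delta)
    if delta_norm > eps:
        delta = delta * (eps / delta_norm)
    # Keep values inside the valid input range.
    return np.clip(x_orig + delta, clip_min, clip_max)


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a.dot(b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def cosine_grad(a, b):
    """Analytic gradient of cosine(a, b) with respect to a."""
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    return b / (na * nb) - a.dot(b) * a / (na ** 3 * nb)


# Toy objective: make x point in the same direction as a fixed target vector.
rng = np.random.default_rng(0)
target = rng.standard_normal(16)
x_orig = np.clip(rng.standard_normal(16) * 0.3, -1.0, 1.0)

x = x_orig.copy()
for _ in range(50):
    x = pgd_l2_step(x, x_orig, cosine_grad(x, target), eps=0.5, eps_step=0.05)

print('similarity before:', cosine(x_orig, target))
print('similarity after: ', cosine(x, target))
```

A larger `eps` allows the result to drift further from the source image, and a larger `eps_step` covers that distance in fewer iterations at the cost of a coarser search, which matches the comments on those parameters in the script.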


--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | tensorflow
2 | Pillow
3 | numpy
4 | adversarial-robustness-toolbox
5 | 


--------------------------------------------------------------------------------
/titanic.jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yk/neural_hash_collision/b18f0de65f28d38cce38c736a4c33a5cc84df2ff/titanic.jpeg


--------------------------------------------------------------------------------