├── requirements.txt
├── .gitignore
├── LICENSE.md
├── util.py
├── README.md
└── collide.py


/requirements.txt:
--------------------------------------------------------------------------------
1 | Pillow
2 | numpy
3 | onnx
4 | onnx_tf
5 | scipy
6 | tensorflow
7 | 


--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | NeuralHashv3b-current.espresso.*
2 | model.onnx
3 | neuralhash_128x96_seed1.dat
4 | *.png
5 | *.jpg
6 | 


--------------------------------------------------------------------------------
/LICENSE.md:
--------------------------------------------------------------------------------
 1 | The MIT License (MIT)
 2 | =====================
 3 | 
 4 | **Copyright (c) Anish Athalye (me@anishathalye.com)**
 5 | 
 6 | Permission is hereby granted, free of charge, to any person obtaining a copy of
 7 | this software and associated documentation files (the "Software"), to deal in
 8 | the Software without restriction, including without limitation the rights to
 9 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
10 | of the Software, and to permit persons to whom the Software is furnished to do
11 | so, subject to the following conditions:
12 | 
13 | The above copyright notice and this permission notice shall be included in all
14 | copies or substantial portions of the Software.
15 | 
16 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
17 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
18 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
19 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
20 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
21 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
22 | SOFTWARE.
23 | 


--------------------------------------------------------------------------------
/util.py:
--------------------------------------------------------------------------------
 1 | # Copyright (c) Anish Athalye. Released under the MIT license.
 2 | 
 3 | import numpy as np
 4 | import onnx
 5 | from onnx_tf.backend import prepare
 6 | from PIL import Image
 7 | 
 8 | 
 9 | def load_model(path):
10 |     onnx_model = onnx.load(path)
11 |     model = prepare(onnx_model, training_mode=True)
12 |     return model
13 | 
14 | 
15 | def load_seed(path):
16 |     # skip the first 128 bytes; np.fromfile opens and closes the file itself
17 |     seed = np.fromfile(path, dtype=np.float32, offset=128)
18 |     seed = seed.reshape([96, 128])
19 |     return seed
20 | 
21 | 
22 | def load_image(path):
23 |     im = Image.open(path).convert('RGB')
24 |     im = im.resize([360, 360])
25 |     arr = np.array(im).astype(np.float32) / 255.0
26 |     arr = arr * 2.0 - 1.0
27 |     arr = arr.transpose(2, 0, 1).reshape([1, 3, 360, 360])
28 |     return arr
29 | 
30 | 
31 | def save_image(arr, path):
32 |     arr = arr.reshape([3, 360, 360]).transpose(1, 2, 0)
33 |     arr = (arr + 1.0) * (255.0 / 2.0)
34 |     arr = arr.astype(np.uint8)
35 |     im = Image.fromarray(arr)
36 |     im.save(path)
37 | 
38 | 
39 | def hash_from_hex(hex_repr):
40 |     n = int(hex_repr, 16)
41 |     h = np.zeros(96)
42 |     for i in range(96):
43 |         h[i] = (n >> (95 - i)) & 1
44 |     return h
45 | 
46 | 
47 | def hash_to_hex(h):
48 |     bits = ''.join(['1' if i >= 0.5 else '0' for i in h])
49 |     return '{:0{}x}'.format(int(bits, 2), len(bits) // 4)
50 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | # NeuralHash Collider
  2 | 
  3 | Find target [hash collisions] for Apple's [NeuralHash] perceptual hash function.
  4 | 
  5 | For example, starting from a picture of [this
  6 | cat](https://github.com/anishathalye/neural-hash-collider/raw/assets/cat.jpg),
  7 | we can find an adversarial image that has the same hash as the
  8 | [picture](https://user-images.githubusercontent.com/1328/129860794-e7eb0132-d929-4c9d-b92e-4e4faba9e849.png)
  9 | of the dog in [this post][hash collisions]:
 10 | 
 11 | ```bash
 12 | python collide.py --image cat.jpg --target 59a34eabe31910abfb06f308
 13 | ```
 14 | 
 15 | ![Cat image with NeuralHash 59a34eabe31910abfb06f308](https://raw.githubusercontent.com/anishathalye/assets/master/neural-hash-collider/cat-adv.png) ![Dog image with NeuralHash 59a34eabe31910abfb06f308](https://raw.githubusercontent.com/anishathalye/assets/master/neural-hash-collider/dog.png)
 16 | 
 17 | We can confirm the hash collision using `nnhash.py` from
 18 | [AsuharietYgvar/AppleNeuralHash2ONNX]:
 19 | 
 20 | ```console
 21 | $ python nnhash.py dog.png
 22 | 59a34eabe31910abfb06f308
 23 | $ python nnhash.py adv.png
 24 | 59a34eabe31910abfb06f308
 25 | ```
 26 | 
 27 | [hash collisions]: https://github.com/AsuharietYgvar/AppleNeuralHash2ONNX/issues/1
 28 | [NeuralHash]: https://www.apple.com/child-safety/pdf/CSAM_Detection_Technical_Summary.pdf
 29 | 
 30 | ## How it works
 31 | 
 32 | NeuralHash is a [perceptual hash
 33 | function](https://en.wikipedia.org/wiki/Perceptual_hashing) that uses a neural
 34 | network. Images are resized to 360x360 and passed through a neural network to
 35 | produce a 128-dimensional feature vector. Then, the vector is projected onto
 36 | R^96 using a 128x96 "seed" matrix. Finally, to produce a 96-bit hash, the
 37 | 96-dimensional vector is thresholded: negative entries turn into a `0` bit, and
 38 | non-negative entries turn into a `1` bit.
 39 | 
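As a rough illustration of the last two steps (projection and thresholding), here is a minimal pure-Python sketch. The feature vector and seed matrix are random stand-ins, not Apple's model output or seed file, and `neuralhash_bits` and `bits_to_hex` are hypothetical helper names. The seed is laid out as 96 rows of 128 values, matching the reshape in `util.py`.

```python
import random


def neuralhash_bits(features, seed):
    """Project a feature vector with the seed matrix and threshold at zero."""
    # seed has one row per output bit; each row is dotted with the features
    proj = [sum(w * f for w, f in zip(row, features)) for row in seed]
    # negative entries become 0 bits, non-negative entries become 1 bits
    return [1 if p >= 0 else 0 for p in proj]


def bits_to_hex(bits):
    """Pack a list of bits (most significant first) into a hex string."""
    value = int(''.join(str(b) for b in bits), 2)
    return '{:0{}x}'.format(value, len(bits) // 4)


rng = random.Random(0)
# stand-in for the 128-dimensional network output (assumption, not real data)
features = [rng.gauss(0, 1) for _ in range(128)]
# stand-in for the seed matrix: 96 rows, 128 columns (assumption, not real data)
seed = [[rng.gauss(0, 1) for _ in range(128)] for _ in range(96)]

print(bits_to_hex(neuralhash_bits(features, seed)))  # a 24-character hex string
```

A 96-bit hash packs into 24 hex characters, which is the length of the hashes shown in this README.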
 40 | This entire process, except for the thresholding, is differentiable, so we can
 41 | use gradient descent to find hash collisions. This exploits a well-known
 42 | weakness of neural networks: they are vulnerable to [adversarial
 43 | examples](https://arxiv.org/abs/1312.6199).
 44 | 
 45 | We can define a loss that captures how close an image is to a given target
 46 | hash: this loss is basically just the NeuralHash algorithm as described above,
 47 | but with the final "hard" thresholding step tweaked so that it is "soft" (in
 48 | particular, differentiable). Exactly how this is done (choices of activation
 49 | functions, parameters, etc.) can affect convergence, so it can require some
 50 | experimentation. After choosing the loss function, we can follow the standard
 51 | method to find adversarial examples for neural networks: gradient descent.
 52 | 
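One possible "soft" thresholding, sketched in pure Python: normalize the projected vector, pass it through a sigmoid with steepness `k`, clip and rescale, then take the squared distance to the target bits. This follows the steps in `collide.py` but is only an illustration; `soft_hash_loss` is a hypothetical name, and the sketch assumes the projection is not the zero vector.

```python
import math


def soft_hash_loss(proj, target_bits, k=10.0, clip_range=0.1):
    """Differentiable stand-in for 'number of wrong bits'."""
    norm = math.sqrt(sum(p * p for p in proj))  # assumes proj is non-zero
    loss = 0.0
    for p, t in zip(proj, target_bits):
        # soft bit in (0, 1); values >= 0.5 correspond to a '1' bit
        s = 1.0 / (1.0 + math.exp(-k * p / norm))
        # clip so that already-strong bits stop contributing gradient
        s = min(max(s, clip_range), 1.0 - clip_range)
        # rescale the clipped range back to [0, 1]
        s = (s - 0.5) * (0.5 / (0.5 - clip_range)) + 0.5
        loss += (s - t) ** 2
    return loss


proj = [3.0, -4.0]
print(soft_hash_loss(proj, [1, 0]))  # bits agree with the signs: loss near 0
print(soft_hash_loss(proj, [0, 1]))  # both bits wrong: loss near 2
```

Larger `k` makes the sigmoid closer to the hard threshold, while the clipping flattens the loss for bits that are already far on the correct side.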
 53 | ### Details
 54 | 
 55 | The implementation currently does an alternating projections style attack to
 56 | find an adversarial example that has the intended hash and also looks similar
 57 | to the original. See `collide.py` for the full details. The implementation uses
 58 | two different loss functions: one measures the distance to the target hash, and
 59 | the other measures the quality of the perturbation (l2 norm + total variation).
 60 | We first optimize for a collision, focusing only on matching the target hash.
 61 | Once we find a collision, we alternate between minimizing the perturbation and
 62 | ensuring that the hash value does not change. The attack has a number of
 63 | parameters; run `python collide.py --help` or refer to the code for a full
 64 | list. Tweaking these parameters can make a big difference in convergence time
 65 | and the quality of the output.
 66 | 
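The switching between losses can be summarized in a few lines. This sketch mirrors the branch in the main loop of `collide.py`; `choose_loss` is a hypothetical helper name, `dist` is the number of hash bits that currently differ from the target, and `best_dist` is the smallest such distance seen so far.

```python
def choose_loss(dist, best_dist, combined_threshold=2):
    """Pick which loss to descend on for the next step."""
    if dist == 0:
        # currently colliding: shrink the perturbation
        return 'image'
    if best_dist == 0 and dist <= combined_threshold:
        # collided before and still close: optimize both at once
        return 'combined'
    # no collision yet, or drifted too far: chase the target hash
    return 'hash'


# a made-up trajectory of bit distances, for illustration only
best = float('inf')
for dist in [12, 5, 1, 0, 2, 0, 3]:
    print(dist, choose_loss(dist, best))
    best = min(best, dist)
```

In the trajectory above, a distance of 1 before any collision still selects the hash loss, while a distance of 2 after a collision selects the combined loss.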
 67 | The implementation also supports a flag `--blur [sigma]` that blurs the
 68 | perturbation on every step of the search. This can slow down or break
 69 | convergence, but on some examples, it can be helpful for getting results that
 70 | look more natural and less like glitch art.
 71 | 
 72 | ## Examples
 73 | 
 74 | Reproducing the [Lena](https://raw.githubusercontent.com/anishathalye/assets/master/neural-hash-collider/lena.png)/[Barbara](https://raw.githubusercontent.com/anishathalye/assets/master/neural-hash-collider/barbara.png) result from [this post](https://github.com/AsuharietYgvar/AppleNeuralHash2ONNX/issues/1#issuecomment-903094036):
 75 | 
 76 | <img width="200" src="https://raw.githubusercontent.com/anishathalye/assets/master/neural-hash-collider/lena.png"></img> <img width="200" src="https://raw.githubusercontent.com/anishathalye/assets/master/neural-hash-collider/lena-adv.png"></img> <img width="200" src="https://raw.githubusercontent.com/anishathalye/assets/master/neural-hash-collider/lena-adv-blur-1.0.png"></img> <img width="200" src="https://raw.githubusercontent.com/anishathalye/assets/master/neural-hash-collider/barbara.png"></img>
 77 | 
 78 | The first image above is the original Lena image. The second was produced with `--target a426dae78cc63799d01adc32` to collide with Barbara. The third was produced with the additional argument `--blur 1.0`. The fourth is the original Barbara image. Checking their hashes:
 79 | 
 80 | ```console
 81 | $ python nnhash.py lena.png
 82 | 32dac883f7b91bbf45a48296
 83 | $ python nnhash.py lena-adv.png
 84 | a426dae78cc63799d01adc32
 85 | $ python nnhash.py lena-adv-blur-1.0.png
 86 | a426dae78cc63799d01adc32
 87 | $ python nnhash.py barbara.png
 88 | a426dae78cc63799d01adc32
 89 | ```
 90 | 
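To quantify how far two hashes are apart, one can count differing bits, which is the same distance `collide.py` reports while searching. A small sketch, with `hamming_distance` as a hypothetical helper name:

```python
def hamming_distance(hex_a, hex_b):
    """Number of differing bits between two equal-length hex hashes."""
    if len(hex_a) != len(hex_b):
        raise ValueError('hashes must have the same length')
    # XOR leaves a 1 bit wherever the two hashes disagree
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count('1')


lena = '32dac883f7b91bbf45a48296'
barbara = 'a426dae78cc63799d01adc32'
print(hamming_distance(lena, barbara))     # bits the attack had to flip
print(hamming_distance(barbara, barbara))  # 0 for a collision
```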
 91 | Reproducing the [Picard](https://raw.githubusercontent.com/anishathalye/assets/master/neural-hash-collider/picard.png)/[Sidious](https://raw.githubusercontent.com/anishathalye/assets/master/neural-hash-collider/sidious.png) result from [this post](https://github.com/anishathalye/neural-hash-collider/issues/4):
 92 | 
 93 | <img width="200" src="https://raw.githubusercontent.com/anishathalye/assets/master/neural-hash-collider/picard.png"></img> <img width="200" src="https://raw.githubusercontent.com/anishathalye/assets/master/neural-hash-collider/picard-adv.png"></img> <img width="200" src="https://raw.githubusercontent.com/anishathalye/assets/master/neural-hash-collider/picard-adv-blur-0.5.png"></img> <img width="200" src="https://raw.githubusercontent.com/anishathalye/assets/master/neural-hash-collider/sidious.png"></img>
 94 | 
 95 | The first image above is the original Picard image. The second was produced with `--target e34b2da852103c3c0828fbd1 --tv-weight 3e-4` to collide with Sidious. The third was produced with the additional argument `--blur 0.5`. The fourth is the original Sidious image. Checking their hashes:
 96 | 
 97 | ```console
 98 | $ python nnhash.py picard.png
 99 | 73fae120ad3191075efd5580
100 | $ python nnhash.py picard-adv.png
101 | e34b2da852103c3c0828fbd1
102 | $ python nnhash.py picard-adv-blur-0.5.png
103 | e34b2da852103c3c0828fbd1
104 | $ python nnhash.py sidious.png
105 | e34b2da852103c3c0828fbd1
106 | ```
107 | 
108 | ## Prerequisites
109 | 
110 | - Get Apple's NeuralHash model following the instructions in
111 |   [AsuharietYgvar/AppleNeuralHash2ONNX] and either put all the
112 |   files in this directory or supply the `--model` / `--seed` arguments
113 | - Install Python dependencies: `pip install -r requirements.txt`
114 | 
115 | [AsuharietYgvar/AppleNeuralHash2ONNX]: https://github.com/AsuharietYgvar/AppleNeuralHash2ONNX
116 | 
117 | ## Usage
118 | 
119 | Run `python collide.py --image [path to image] --target [target hash]` to
120 | generate a hash collision. Run `python collide.py --help` to see all the
121 | options, including some knobs you can tweak, like the learning rate and some
122 | other parameters.
123 | 
124 | ## Limitations
125 | 
126 | The code in this repository is intended to be a demonstration, and perhaps a
127 | starting point for other exploration. Tweaking the implementation (choice of
128 | loss function, choice of parameters, etc.) might produce much better results
129 | than this code currently achieves.
130 | 
131 | ## Citation
132 | 
133 | ```bibtex
134 | @misc{athalye2021neuralhashcollider,
135 |   author = {Anish Athalye},
136 |   title = {NeuralHash Collider},
137 |   year = {2021},
138 |   howpublished = {\url{https://github.com/anishathalye/neural-hash-collider}},
139 | }
140 | ```
141 | 


--------------------------------------------------------------------------------
/collide.py:
--------------------------------------------------------------------------------
  1 | # Copyright (c) Anish Athalye. Released under the MIT license.
  2 | 
  3 | import numpy as np
  4 | import tensorflow as tf
  5 | from scipy.ndimage import gaussian_filter
  6 | import argparse
  7 | import os
  8 | 
  9 | from util import load_model, load_seed, load_image, save_image, hash_from_hex, hash_to_hex
 10 | 
 11 | 
 12 | DEFAULT_MODEL_PATH = 'model.onnx'
 13 | DEFAULT_SEED_PATH = 'neuralhash_128x96_seed1.dat'
 14 | DEFAULT_TARGET_HASH = '59a34eabe31910abfb06f308'
 15 | DEFAULT_ITERATIONS = 10000
 16 | DEFAULT_SAVE_ITERATIONS = 0
 17 | DEFAULT_LR = 2.0
 18 | DEFAULT_COMBINED_THRESHOLD = 2
 19 | DEFAULT_K = 10.0
 20 | DEFAULT_CLIP_RANGE = 0.1
 21 | DEFAULT_W_L2 = 2e-3
 22 | DEFAULT_W_TV = 1e-4
 23 | DEFAULT_W_HASH = 0.8
 24 | DEFAULT_BLUR = 0
 25 | 
 26 | 
 27 | def main():
 28 |     tf.compat.v1.disable_eager_execution()
 29 |     options = get_options()
 30 | 
 31 |     model = load_model(options.model)
 32 |     image = model.tensor_dict['image']
 33 |     logits = model.tensor_dict['leaf/logits']
 34 |     seed = load_seed(options.seed)
 35 | 
 36 |     target = hash_from_hex(options.target)
 37 | 
 38 |     original = load_image(options.image)
 39 |     h = target  # reuse the target hash parsed above
 40 | 
 41 |     with model.graph.as_default():
 42 |         with tf.compat.v1.Session() as sess:
 43 |             sess.run(tf.compat.v1.global_variables_initializer())
 44 | 
 45 |             proj = tf.reshape(tf.linalg.matmul(seed, tf.reshape(logits, (128, 1))), (96,))
 46 |             # proj is in R^96; it's interpreted as a 96-bit hash by mapping
 47 |             # entries < 0 to the bit '0', and entries >= 0 to the bit '1'
 48 |             normalized, _ = tf.linalg.normalize(proj)
 49 |             hash_output = tf.sigmoid(normalized * options.k)
 50 |             # now, hash_output has entries in (0, 1); it's interpreted by
 51 |             # mapping entries < 0.5 to the bit '0' and entries >= 0.5 to the
 52 |             # bit '1'
 53 | 
 54 |             # we clip hash_output to (clip_range, 1-clip_range); this seems to
 55 |             # improve the search (we don't "waste" perturbation tweaking
 56 |             # "strong" bits); the sigmoid already does this to some degree, but
 57 |             # this seems to help
 58 |             hash_output = tf.clip_by_value(hash_output, options.clip_range, 1.0 - options.clip_range) - 0.5
 59 |             hash_output = hash_output * (0.5 / (0.5 - options.clip_range))
 60 |             hash_output = hash_output + 0.5
 61 | 
 62 |             # hash loss: how far away we are from the target hash
 63 |             hash_loss = tf.math.reduce_sum(tf.math.squared_difference(hash_output, h))
 64 | 
 65 |             perturbation = image - original
 66 |             # image loss: how big / noticeable is the perturbation?
 67 |             img_loss = options.l2_weight * tf.nn.l2_loss(perturbation) + options.tv_weight * tf.image.total_variation(perturbation)[0]
 68 | 
 69 |             # combined loss: try to minimize both at once
 70 |             combined_loss = options.hash_weight * hash_loss + (1 - options.hash_weight) * img_loss
 71 | 
 72 |             # gradients of all the losses
 73 |             g_hash_loss, = tf.gradients(hash_loss, image)
 74 |             g_img_loss, = tf.gradients(img_loss, image)
 75 |             g_combined_loss, = tf.gradients(combined_loss, image)
 76 | 
 77 |             # perform attack
 78 | 
 79 |             x = original
 80 |             best = (float('inf'), 0)  # (distance, image quality loss)
 81 |             dist = float('inf')
 82 | 
 83 |             for i in range(options.iterations):
 84 |                 # we do an alternating projections style attack here; if we
 85 |                 # haven't found a colliding image yet, only optimize for that;
 86 |                 # if we have a colliding image, then minimize the size of the
 87 |                 # perturbation; if we're close, then do both at once
 88 |                 if dist == 0:
 89 |                     loss_name, loss, g = 'image', img_loss, g_img_loss
 90 |                 elif best[0] == 0 and dist <= options.combined_threshold:
 91 |                     loss_name, loss, g = 'combined', combined_loss, g_combined_loss
 92 |                 else:
 93 |                     loss_name, loss, g = 'hash', hash_loss, g_hash_loss
 94 | 
 95 |                 # compute loss values and gradient
 96 |                 xq = quantize(x)  # take derivatives wrt the quantized version of the image
 97 |                 hash_output_v, img_loss_v, loss_v, g_v = sess.run([hash_output, img_loss, loss, g], feed_dict={image: xq})
 98 |                 dist = np.sum((hash_output_v >= 0.5) != (h >= 0.5))
 99 | 
100 |                 # if it's better than any image found so far, save it
101 |                 score = (dist, img_loss_v)
102 |                 if score < best or (options.save_iterations > 0 and (i+1) % options.save_iterations == 0):
103 |                     save_image(x, os.path.join(options.save_directory, 'out_iter={:05d}_dist={:02d}_q={:.3f}.png'.format(i+1, dist, img_loss_v)))
104 |                 if score < best:
105 |                     best = score
106 | 
107 |                 # gradient descent step
108 |                 g_v_norm = g_v / np.linalg.norm(g_v)
109 |                 x = x - options.learning_rate * g_v_norm
110 |                 if options.blur > 0:
111 |                     x = blur_perturbation(original, x, options.blur)
112 |                 x = x.clip(-1, 1)
113 |                 print('iteration: {}/{}, best: ({}, {:.3f}), hash: {}, distance: {}, loss: {:.3f} ({})'.format(
114 |                     i+1,
115 |                     options.iterations,
116 |                     best[0],
117 |                     best[1],
118 |                     hash_to_hex(hash_output_v),
119 |                     dist,
120 |                     loss_v,
121 |                     loss_name
122 |                 ))
123 | 
124 | 
125 | def quantize(x):
126 |     x = (x + 1.0) * (255.0 / 2.0)
127 |     x = x.astype(np.uint8).astype(np.float32)
128 |     x = x / (255.0 / 2.0) - 1.0
129 |     return x
130 | 
131 | 
132 | def blur_perturbation(original, x, sigma):
133 |     perturbation = x - original
134 |     perturbation = gaussian_filter_by_channel(perturbation, sigma=sigma)
135 |     return original + perturbation
136 | 
137 | 
138 | def gaussian_filter_by_channel(x, sigma):
139 |     return np.stack([gaussian_filter(x[0,ch,:,:], sigma) for ch in range(x.shape[1])])[np.newaxis]
140 | 
141 | 
142 | def get_options():
143 |     parser = argparse.ArgumentParser()
144 |     parser.add_argument('--image', type=str, help='path to starting image', required=True)
145 |     parser.add_argument('--model', type=str, help='path to model', default=DEFAULT_MODEL_PATH)
146 |     parser.add_argument('--seed', type=str, help='path to seed', default=DEFAULT_SEED_PATH)
147 |     parser.add_argument('--target', type=str, help='target hash', default=DEFAULT_TARGET_HASH)
148 |     parser.add_argument('--learning-rate', type=float, help='learning rate', default=DEFAULT_LR)
149 |     parser.add_argument('--combined-threshold', type=int, help='threshold to start using combined loss', default=DEFAULT_COMBINED_THRESHOLD)
150 |     parser.add_argument('--k', type=float, help='sigmoid steepness applied to the normalized projection', default=DEFAULT_K)
151 |     parser.add_argument('--l2-weight', type=float, help='perturbation l2 loss weight', default=DEFAULT_W_L2)
152 |     parser.add_argument('--tv-weight', type=float, help='perturbation total variation loss weight', default=DEFAULT_W_TV)
153 |     parser.add_argument('--hash-weight', type=float, help='relative weight (0.0 to 1.0) of hash in combined loss', default=DEFAULT_W_HASH)
154 |     parser.add_argument('--clip-range', type=float, help='clip soft hash bits to (clip_range, 1 - clip_range)', default=DEFAULT_CLIP_RANGE)
155 |     parser.add_argument('--iterations', type=int, help='max number of iterations', default=DEFAULT_ITERATIONS)
156 |     parser.add_argument('--save-directory', type=str, help='directory to save output images', default='.')
157 |     parser.add_argument('--save-iterations', type=int, help='save this frequently, regardless of improvement', default=DEFAULT_SAVE_ITERATIONS)
158 |     parser.add_argument('--blur', type=float, help='apply Gaussian blur with this sigma on every step', default=DEFAULT_BLUR)
159 |     return parser.parse_args()
160 | 
161 | 
162 | if __name__ == '__main__':
163 |     main()
164 | 


--------------------------------------------------------------------------------