├── .editorconfig
├── .gitignore
├── LICENSES.md
├── README.md
├── image_match
│   ├── __init__.py
│   ├── elasticsearch_driver.py
│   ├── goldberg.py
│   ├── mongodb_driver.py
│   └── signature_database_base.py
└── setup.py

/.editorconfig:
--------------------------------------------------------------------------------
1 | # EditorConfig is awesome: http://EditorConfig.org
2 | 
3 | # top-most EditorConfig file
4 | root = true
5 | 
6 | # Unix-style newlines with a newline ending every file
7 | [*]
8 | end_of_line = lf
9 | charset = utf-8
10 | 
11 | [*.{md,py}]
12 | max_line_length = 79
13 | 
14 | # 4 space indentation
15 | [*.py]
16 | indent_style = space
17 | indent_size = 4
18 | trim_trailing_whitespace = true
19 | insert_final_newline = true
20 | 
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | # Byte-compiled / optimized / DLL files
2 | __pycache__/
3 | *.py[cod]
4 | 
5 | # C extensions
6 | *.so
7 | 
8 | # Distribution / packaging
9 | .Python
10 | env/
11 | build/
12 | develop-eggs/
13 | dist/
14 | downloads/
15 | eggs/
16 | .eggs/
17 | lib/
18 | lib64/
19 | parts/
20 | sdist/
21 | var/
22 | *.egg-info/
23 | .installed.cfg
24 | *.egg
25 | 
26 | # PyInstaller
27 | # Usually these files are written by a python script from a template
28 | # before PyInstaller builds the exe, so as to inject date/other infos into it.
29 | *.manifest
30 | *.spec
31 | 
32 | # Installer logs
33 | pip-log.txt
34 | pip-delete-this-directory.txt
35 | 
36 | # Unit test / coverage reports
37 | htmlcov/
38 | .tox/
39 | .coverage
40 | .coverage.*
41 | .cache
42 | nosetests.xml
43 | coverage.xml
44 | 
45 | # Translations
46 | *.mo
47 | *.pot
48 | 
49 | # Django stuff:
50 | *.log
51 | 
52 | # Sphinx documentation
53 | docs/_build/
54 | 
55 | # PyBuilder
56 | target/
57 | image_match/web/static/tmp/
58 | 
--------------------------------------------------------------------------------
/LICENSES.md:
--------------------------------------------------------------------------------
1 | # Code Licenses
2 | 
3 | All code is licensed under the Apache License, Version 2.0, the full text of which can be found at [http://www.apache.org/licenses/LICENSE-2.0](http://www.apache.org/licenses/LICENSE-2.0).
4 | 
5 | # Documentation Licenses
6 | 
7 | The official image-match documentation, _except for the short code snippets embedded within it_, is licensed under a Creative Commons Attribution-ShareAlike 4.0 International license, the full text of which can be found at [http://creativecommons.org/licenses/by-sa/4.0/legalcode](http://creativecommons.org/licenses/by-sa/4.0/legalcode).
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # This repository is no longer maintained! You can now find image-match at https://github.com/ascribe/image-match
2 | 
3 | # image-match
4 | image-match is a simple package for finding approximate image matches from a
5 | corpus. It is similar, for instance, to [pHash](http://www.phash.org/), but
6 | includes a database backend that easily scales to billions of images and
7 | supports sustained high rates of image insertion: up to 10,000 images/s on our
8 | cluster!
9 | 
10 | Based on the paper [_An image signature for any kind of image_, Goldberg et
11 | al](http://www.cs.cmu.edu/~hcwong/Pdfs/icip02.ps). There is an existing
12 | [reference implementation](https://www.pureftpd.org/project/libpuzzle) which
13 | may be more suited to your needs.
14 | 
15 | ## Getting started
16 | You'll need a scientific Python distribution and a database backend. Currently
17 | we use Elasticsearch as a backend.
18 | 
19 | 
20 | ### numpy, PIL, skimage, etc.
21 | Image-match requires several scientific Python packages. Although they can be
22 | installed and built individually, they are often bundled in a custom Python
23 | distribution, for instance [Anaconda](https://www.continuum.io/why-anaconda).
24 | Installation instructions can be found
25 | [here](https://www.continuum.io/downloads#_unix).
26 | 
27 | 
28 | ### Elasticsearch
29 | If you just want to generate and compare image signatures, you can skip this
30 | step. If you want to search over a corpus of millions or billions of image
31 | signatures, you will need a database backend. We built image-match around
32 | [Elasticsearch](https://www.elastic.co/). See download and installation
33 | instructions [here](https://www.elastic.co/downloads/elasticsearch).
34 | 
35 | 
36 | ### Install image-match
37 | 1. Clone this repository:
38 | 
39 |     ```
40 |     $ git clone https://github.com/ascribe/image-match.git
41 |     ```
42 | 
43 | 2. Install image-match
44 | 
45 |     ```
46 |     $ pip install numpy
47 |     $ pip install .
48 |     ```
49 | 
50 | 3. Make sure elasticsearch is running (optional):
51 | 
52 |     For example, on Ubuntu you can check with:
53 | 
54 |     ```
55 |     $ sudo service elasticsearch status
56 |     ```
57 | 
58 |     If it's not running, simply run:
59 | 
60 |     ```
61 |     $ sudo service elasticsearch start
62 |     ```
63 | 
64 | ## Image signatures and distances
65 | Consider these two photographs of the [Mona
66 | Lisa](https://en.wikipedia.org/wiki/Mona_Lisa):
67 | 
68 | ![](https://upload.wikimedia.org/wikipedia/commons/thumb/e/ec/Mona_Lisa,_by_Leonardo_da_Vinci,_from_C2RMF_retouched.jpg/687px-Mona_Lisa,_by_Leonardo_da_Vinci,_from_C2RMF_retouched.jpg)
69 | 
70 | (credit:
71 | [Wikipedia](https://en.wikipedia.org/wiki/Mona_Lisa#/media/File:Mona_Lisa,_by_Leonardo_da_Vinci,_from_C2RMF_retouched.jpg)
72 | Public domain)
73 | 
74 | ![](https://pixabay.com/static/uploads/photo/2012/11/28/08/56/mona-lisa-67506_960_720.jpg)
75 | 
76 | (credit:
77 | [WikiImages](https://pixabay.com/en/mona-lisa-painting-art-oil-painting-67506/)
78 | Public domain)
79 | 
80 | Though it's obvious to any human observer that these show the same painting,
81 | we can find a number of subtle differences: the dimensions, palette, lighting
82 | and so on are different in each image. image-match will give us a numerical comparison:
83 | 
84 | ```python
85 | from image_match.goldberg import ImageSignature
86 | gis = ImageSignature()
87 | a = gis.generate_signature('https://upload.wikimedia.org/wikipedia/commons/thumb/e/ec/Mona_Lisa,_by_Leonardo_da_Vinci,_from_C2RMF_retouched.jpg/687px-Mona_Lisa,_by_Leonardo_da_Vinci,_from_C2RMF_retouched.jpg')
88 | b = gis.generate_signature('https://pixabay.com/static/uploads/photo/2012/11/28/08/56/mona-lisa-67506_960_720.jpg')
89 | gis.normalized_distance(a, b)
90 | ```
91 | 
92 | Returns `0.22095170140933634`. Normalized distances of less than `0.40` are
93 | very likely matches.
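If all you need is a yes/no answer, you can apply that rule of thumb directly (a minimal sketch; `0.40` is just the heuristic cutoff mentioned above, not a constant defined by the library):

```python
# using the signatures a and b computed above
threshold = 0.40
is_match = gis.normalized_distance(a, b) < threshold  # True for these two photographs
```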
94 | If we try this again against a dissimilar image, say, Caravaggio's [Supper at
95 | Emmaus](https://en.wikipedia.org/wiki/Supper_at_Emmaus_(Caravaggio),_London):
96 | ![](https://upload.wikimedia.org/wikipedia/commons/e/e0/Caravaggio_-_Cena_in_Emmaus.jpg)
97 | 
98 | (credit: [Wikipedia](https://en.wikipedia.org/wiki/Caravaggio#/media/File:Caravaggio_-_Cena_in_Emmaus.jpg) Public domain)
99 | 
100 | against one of the Mona Lisa photographs:
101 | ```python
102 | c = gis.generate_signature('https://upload.wikimedia.org/wikipedia/commons/e/e0/Caravaggio_-_Cena_in_Emmaus.jpg')
103 | gis.normalized_distance(a, c)
104 | ```
105 | 
106 | Returns `0.68446275381507249`, almost certainly not a match. image-match
107 | doesn't have to generate a signature from a URL; a file path or even an
108 | in-memory bytestream will do (be sure to specify `bytestream=True` in the
109 | latter case).
110 | 
111 | Now consider this subtly-modified version of the Mona Lisa:
112 | 
113 | ![https://www.flickr.com/photos/planetrussell/6814444991](https://c2.staticflickr.com/8/7158/6814444991_08d82de57e_z.jpg)
114 | 
115 | (credit: [Michael Russell](https://www.flickr.com/photos/planetrussell/6814444991) [Attribution-ShareAlike 2.0 Generic](https://creativecommons.org/licenses/by-sa/2.0/))
116 | 
117 | How similar is it to our original Mona Lisa?
118 | ```python
119 | d = gis.generate_signature('https://c2.staticflickr.com/8/7158/6814444991_08d82de57e_z.jpg')
120 | gis.normalized_distance(a, d)
121 | ```
122 | 
123 | This gives us `0.42557196987336648` -- markedly different from the two
124 | original Mona Lisas, but considerably closer than the Caravaggio.
125 | 
126 | 
127 | ## Storing and searching the Signatures
128 | In addition to generating image signatures, image-match also facilitates
129 | storing and efficient lookup of images -- even for a billion or more
130 | images. Your Instagram account only has a few million images? Don't worry, you can
131 | get 80M images [here](http://horatio.cs.nyu.edu/mit/tiny/data/index.html) to
132 | play with.
133 | 
134 | A signature database wraps an Elasticsearch index, so you'll need Elasticsearch
135 | up and running. Once that's done, you can set it up like so:
136 | 
137 | ```python
138 | from elasticsearch import Elasticsearch
139 | from image_match.elasticsearch_driver import SignatureES
140 | 
141 | es = Elasticsearch()
142 | ses = SignatureES(es)
143 | ```
144 | 
145 | By default, the Elasticsearch index name is "images" and the document type
146 | "image," but you can change these via the `index` and `doc_type` parameters.
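For example (a minimal sketch; `my_images` and `my_doc_type` are arbitrary illustrative names, not library defaults):

```python
ses = SignatureES(es, index='my_images', doc_type='my_doc_type')
```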
147 | 
148 | Now, let's store those pictures from before in the database:
149 | 
150 | ```python
151 | ses.add_image('https://upload.wikimedia.org/wikipedia/commons/thumb/e/ec/Mona_Lisa,_by_Leonardo_da_Vinci,_from_C2RMF_retouched.jpg/687px-Mona_Lisa,_by_Leonardo_da_Vinci,_from_C2RMF_retouched.jpg')
152 | ses.add_image('https://pixabay.com/static/uploads/photo/2012/11/28/08/56/mona-lisa-67506_960_720.jpg')
153 | ses.add_image('https://upload.wikimedia.org/wikipedia/commons/e/e0/Caravaggio_-_Cena_in_Emmaus.jpg')
154 | ses.add_image('https://c2.staticflickr.com/8/7158/6814444991_08d82de57e_z.jpg')
155 | ```
156 | 
157 | Now let's search for one of those Mona Lisas:
158 | 
159 | ```python
160 | ses.search_image('https://pixabay.com/static/uploads/photo/2012/11/28/08/56/mona-lisa-67506_960_720.jpg')
161 | ```
162 | 
163 | The result is a list of hits:
164 | 
165 | ```python
166 | [
167 |  {'dist': 0.0,
168 |   'id': u'AVM37oZq0osmmAxpPvx7',
169 |   'path': u'https://pixabay.com/static/uploads/photo/2012/11/28/08/56/mona-lisa-67506_960_720.jpg',
170 |   'score': 7.937254},
171 |  {'dist': 0.22095170140933634,
172 |   'id': u'AVM37nMg0osmmAxpPvx6',
173 |   'path': u'https://upload.wikimedia.org/wikipedia/commons/thumb/e/ec/Mona_Lisa,_by_Leonardo_da_Vinci,_from_C2RMF_retouched.jpg/687px-Mona_Lisa,_by_Leonardo_da_Vinci,_from_C2RMF_retouched.jpg',
174 |   'score': 0.28797293},
175 |  {'dist': 0.42557196987336648,
176 |   'id': u'AVM37p530osmmAxpPvx9',
177 |   'path': u'https://c2.staticflickr.com/8/7158/6814444991_08d82de57e_z.jpg',
178 |   'score': 0.0499953}
179 | ]
180 | ```
181 | 
182 | `dist` is the normalized distance, like we computed above. Hence, lower numbers
183 | are better with `0.0` being a perfect match. `id` is an identifier assigned by
184 | the database. `score` is computed by Elasticsearch, and higher numbers are
185 | better here. `path` is the original path (URL or file path).
186 | 
187 | Notice all three Mona Lisa images appear in the results, with the identical
188 | image being a perfect (`'dist': 0.0`) match. If we search instead for the
189 | Caravaggio,
190 | 
191 | ```python
192 | ses.search_image('https://upload.wikimedia.org/wikipedia/commons/e/e0/Caravaggio_-_Cena_in_Emmaus.jpg')
193 | ```
194 | 
195 | we get:
196 | 
197 | ```python
198 | [
199 |  {'dist': 0.0,
200 |   'id': u'AVMyXQFw0osmmAxpPvxz',
201 |   'path': u'https://upload.wikimedia.org/wikipedia/commons/e/e0/Caravaggio_-_Cena_in_Emmaus.jpg',
202 |   'score': 7.937254}
203 | ]
204 | ```
205 | 
206 | It only finds the Caravaggio, which makes sense! But what if we wanted an even
207 | more restrictive search? For instance, maybe we only want unmodified Mona Lisas
208 | -- just photographs of the original.
209 | We can restrict our search with a hard cutoff using the `distance_cutoff` keyword argument:
210 | 
211 | ```python
212 | ses = SignatureES(es, distance_cutoff=0.3)
213 | ses.search_image('https://pixabay.com/static/uploads/photo/2012/11/28/08/56/mona-lisa-67506_960_720.jpg')
214 | ```
215 | 
216 | Which now returns only the unmodified, catless Mona Lisas:
217 | 
218 | ```python
219 | [
220 |  {'dist': 0.0,
221 |   'id': u'AVMyXOz30osmmAxpPvxy',
222 |   'path': u'https://pixabay.com/static/uploads/photo/2012/11/28/08/56/mona-lisa-67506_960_720.jpg',
223 |   'score': 7.937254},
224 |  {'dist': 0.23889600350807427,
225 |   'id': u'AVMyXMpV0osmmAxpPvxx',
226 |   'path': u'https://upload.wikimedia.org/wikipedia/commons/thumb/e/ec/Mona_Lisa,_by_Leonardo_da_Vinci,_from_C2RMF_retouched.jpg/687px-Mona_Lisa,_by_Leonardo_da_Vinci,_from_C2RMF_retouched.jpg',
227 |   'score': 0.28797293}
228 | ]
229 | ```
230 | 
231 | ### Distorted and transformed images
232 | 
233 | image-match is also robust against basic image transforms. Take this squashed
234 | Mona Lisa:
235 | 
236 | ![](http://i.imgur.com/CVYBCCy.jpg)
237 | 
238 | No problem, just search as usual:
239 | 
240 | ```python
241 | ses.search_image('http://i.imgur.com/CVYBCCy.jpg')
242 | ```
243 | 
244 | returns
245 | 
246 | ```python
247 | [
248 |  {'dist': 0.15454905655638429,
249 |   'id': u'AVM37oZq0osmmAxpPvx7',
250 |   'path': u'https://pixabay.com/static/uploads/photo/2012/11/28/08/56/mona-lisa-67506_960_720.jpg',
251 |   'score': 1.6818419},
252 |  {'dist': 0.24980626832071956,
253 |   'id': u'AVM37nMg0osmmAxpPvx6',
254 |   'path': u'https://upload.wikimedia.org/wikipedia/commons/thumb/e/ec/Mona_Lisa,_by_Leonardo_da_Vinci,_from_C2RMF_retouched.jpg/687px-Mona_Lisa,_by_Leonardo_da_Vinci,_from_C2RMF_retouched.jpg',
255 |   'score': 0.16198477},
256 |  {'dist': 0.43387141782958921,
257 |   'id': u'AVM37p530osmmAxpPvx9',
258 |   'path': u'https://c2.staticflickr.com/8/7158/6814444991_08d82de57e_z.jpg',
259 |   'score': 0.031996995}
260 | ]
261 | ```
262 | 
263 | as expected. Now, consider this rotated version:
264 | 
265 | ![](http://i.imgur.com/T5AusYd.jpg)
266 | 
267 | image-match doesn't search for rotations and mirror images by default.
268 | Searching for this image will return no results, unless you search with
269 | `all_orientations=True`:
270 | 
271 | ```python
272 | ses.search_image('http://i.imgur.com/T5AusYd.jpg', all_orientations=True)
273 | ```
274 | 
275 | Then you get the expected matches.
276 | 
277 | 
278 | ## Other database backends
279 | Though we designed image-match with Elasticsearch in mind, other database
280 | backends are possible. For demonstration purposes we also include a
281 | [MongoDB](https://www.mongodb.org/) driver:
282 | 
283 | ```python
284 | from image_match.mongodb_driver import SignatureMongo
285 | from pymongo import MongoClient
286 | 
287 | client = MongoClient(connect=False)
288 | c = client.images.images
289 | 
290 | ses = SignatureMongo(c)
291 | ```
292 | 
293 | Now you can use the same functionality as above, like `ses.add_image(...)`.
294 | 
295 | We tried to separate signature logic from the database insertion/search as much
296 | as possible.
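For instance, adding and searching against MongoDB look just like the Elasticsearch examples earlier (a short sketch reusing one of the URLs from above, with the `ses` instance just created):

```python
ses.add_image('https://pixabay.com/static/uploads/photo/2012/11/28/08/56/mona-lisa-67506_960_720.jpg')
matches = ses.search_image('https://pixabay.com/static/uploads/photo/2012/11/28/08/56/mona-lisa-67506_960_720.jpg')
```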
297 | To write your own database backend, you can inherit from the `SignatureDatabaseBase` class and override the appropriate methods:
298 | 
299 | ```python
300 | from image_match.signature_database_base import SignatureDatabaseBase
301 | # other relevant imports
302 | 
303 | 
304 | class MySignatureBackend(SignatureDatabaseBase):
305 | 
306 |     # if you need to do some setup, override __init__
307 |     def __init__(self, myarg1, myarg2, *args, **kwargs):
308 |         # do some initializing stuff here if necessary
309 |         # ...
310 |         super(MySignatureBackend, self).__init__(*args, **kwargs)
311 | 
312 |     # you MUST implement these two functions
313 |     def search_single_record(self, rec):
314 |         # should query your database given a record generated from signature_database_base.make_record
315 |         # ...
316 |         # should return a list of dicts like [{'id': 'some_unique_id_from_db', 'dist': 0.109234, 'path': 'url/or/filepath'}, {...} ...]
317 |         # you can have other keys, but you need at least id and dist
318 |         return formatted_results
319 | 
320 |     def insert_single_record(self, rec):
321 |         # if your database driver or instance can accept a dict as input, this should be very simple
322 | 
323 |         # ...
324 | ```
325 | 
326 | Unfortunately, implementing a good `search_single_record` function does require
327 | some knowledge of [the search
328 | algorithm](http://www.cs.cmu.edu/~hcwong/Pdfs/icip02.ps). You can also look at
329 | the two included database drivers for guidelines.
330 | 
331 | 
--------------------------------------------------------------------------------
/image_match/__init__.py:
--------------------------------------------------------------------------------
1 | __author__ = 'ryan'
2 | 
--------------------------------------------------------------------------------
/image_match/elasticsearch_driver.py:
--------------------------------------------------------------------------------
1 | from signature_database_base import SignatureDatabaseBase
2 | from signature_database_base import normalized_distance
3 | from datetime import datetime
4 | import numpy as np
5 | 
6 | 
7 | class SignatureES(SignatureDatabaseBase):
8 |     """Elasticsearch driver for image-match
9 | 
10 |     """
11 | 
12 |     def __init__(self, es, index='images', doc_type='image', timeout=10, size=100,
13 |                  *args, **kwargs):
14 |         """Extra setup for Elasticsearch
15 | 
16 |         Args:
17 |             es (elasticsearch): an instance of the elasticsearch python driver
18 |             index (Optional[string]): a name for the Elasticsearch index (default 'images')
19 |             doc_type (Optional[string]): a name for the document type (default 'image')
20 |             timeout (Optional[int]): how long to wait on an Elasticsearch query, in seconds (default 10)
21 |             size (Optional[int]): maximum number of Elasticsearch results (default 100)
22 |             *args (Optional): Variable length argument list to pass to base constructor
23 |             **kwargs (Optional): Arbitrary keyword arguments to pass to base constructor
24 | 
25 |         Examples:
26 |             >>> from elasticsearch import Elasticsearch
27 |             >>> from image_match.elasticsearch_driver import SignatureES
28 |             >>> es = Elasticsearch()
29 |             >>> ses = SignatureES(es)
30 |             >>> ses.add_image('https://upload.wikimedia.org/wikipedia/commons/thumb/e/ec/Mona_Lisa,_by_Leonardo_da_Vinci,_from_C2RMF_retouched.jpg/687px-Mona_Lisa,_by_Leonardo_da_Vinci,_from_C2RMF_retouched.jpg')
31 |             >>> ses.search_image('https://upload.wikimedia.org/wikipedia/commons/thumb/e/ec/Mona_Lisa,_by_Leonardo_da_Vinci,_from_C2RMF_retouched.jpg/687px-Mona_Lisa,_by_Leonardo_da_Vinci,_from_C2RMF_retouched.jpg')
32 |             [
33 |                 {'dist': 0.0,
34 |                  'id': u'AVM37nMg0osmmAxpPvx6',
35 |                  'path': u'https://upload.wikimedia.org/wikipedia/commons/thumb/e/ec/Mona_Lisa,_by_Leonardo_da_Vinci,_from_C2RMF_retouched.jpg/687px-Mona_Lisa,_by_Leonardo_da_Vinci,_from_C2RMF_retouched.jpg',
36 |                  'score': 0.28797293}
37 |             ]
38 | 
39 |         """
40 |         self.es = es
41 |         self.index = index
42 |         self.doc_type = doc_type
43 |         self.timeout = timeout
44 |         self.size = size
45 | 
46 |         super(SignatureES, self).__init__(*args, **kwargs)
47 | 
48 |     def search_single_record(self, rec):
49 |         path = rec.pop('path')
50 |         signature = rec.pop('signature')
51 | 
52 |         fields = ['path', 'signature']
53 | 
54 |         # build the 'should' list
55 |         should = [{'term': {word: rec[word]}} for word in rec]
56 |         res = self.es.search(index=self.index,
57 |                              doc_type=self.doc_type,
58 |                              body={'query':
59 |                                    {
60 |                                        'filtered': {
61 |                                            'query': {
62 |                                                'bool': {'should': should}
63 |                                            }
64 |                                        }
65 |                                    }},
66 |                              fields=fields,
67 |                              size=self.size,
68 |                              timeout=self.timeout)['hits']['hits']
69 | 
70 |         sigs = np.array([x['fields']['signature'] for x in res])
71 | 
72 |         if sigs.size == 0:
73 |             return []
74 | 
75 |         dists = normalized_distance(sigs, np.array(signature))
76 | 
77 |         formatted_res = [{'id': x['_id'],
78 |                           'score': x['_score'],
79 |                           'path': x['fields'].get('url', x['fields'].get('path'))[0]}
80 |                          for x in res]
81 | 
82 |         for i, row in enumerate(formatted_res):
83 |             row['dist'] = dists[i]
84 |         formatted_res = filter(lambda y: y['dist'] < self.distance_cutoff, formatted_res)
85 | 
86 |         return formatted_res
87 | 
88 |     def insert_single_record(self, rec):
89 |         rec['timestamp'] = datetime.now()
90 |         self.es.index(index=self.index, doc_type=self.doc_type, body=rec)
91 | 
92 | 
--------------------------------------------------------------------------------
/image_match/goldberg.py:
--------------------------------------------------------------------------------
1 | from skimage.color import rgb2gray
2 | from skimage.io import imread
3 | from PIL import Image
4 | from PIL.MpoImagePlugin import MpoImageFile
5 | from cairosvg import svg2png
6 | from cStringIO import StringIO
7 | import numpy as np
8 | 
9 | 
10 | class ImageSignature(object):
11 |     """Image signature generator.
12 | 
13 |     Based on the method of Goldberg, et al. Available at http://www.cs.cmu.edu/~hcwong/Pdfs/icip02.ps
14 |     """
15 | 
16 |     def __init__(self, n=9, crop_percentiles=(5, 95), P=None, diagonal_neighbors=True,
17 |                  identical_tolerance=2/255., n_levels=2, fix_ratio=False):
18 |         """Initialize the signature generator.
19 | 
20 |         The default parameters match those given in Goldberg's paper.
21 | 
22 |         Note:
23 |             Non-default parameters have not been extensively tested. Use carefully.
24 | 
25 |         Args:
26 |             n (Optional[int]): size of grid imposed on image. Grid is n x n (default 9)
27 |             crop_percentiles (Optional[Tuple[int]]): lower and upper bounds when considering how much
28 |                 variance to keep in the image (default (5, 95))
29 |             P (Optional[int]): size of sample region, P x P. If none, uses a sample region based
30 |                 on the size of the image (default None)
31 |             diagonal_neighbors (Optional[boolean]): whether to include diagonal grid neighbors
32 |                 (default True)
33 |             identical_tolerance (Optional[float]): cutoff difference for declaring two adjacent
34 |                 grid points identical (default 2/255)
35 |             n_levels (Optional[int]): number of positive and negative groups to stratify neighbor
36 |                 differences into. n = 2 -> [-2, -1, 0, 1, 2] (default 2)
37 | 
38 |         """
39 | 
40 |         # check inputs
41 |         assert crop_percentiles is None or len(crop_percentiles) == 2,\
42 |             'crop_percentiles should be a two-value tuple, or None'
43 |         if crop_percentiles is not None:
44 |             assert crop_percentiles[0] >= 0,\
45 |                 'Lower crop_percentiles limit should be >= 0 (%r given)'\
46 |                 % crop_percentiles[0]
47 |             assert crop_percentiles[1] <= 100,\
48 |                 'Upper crop_percentiles limit should be <= 100 (%r given)'\
49 |                 % crop_percentiles[1]
50 |             assert crop_percentiles[0] < crop_percentiles[1],\
51 |                 'Upper crop_percentile limit should be greater than lower limit.'
52 |             self.lower_percentile = crop_percentiles[0]
53 |             self.upper_percentile = crop_percentiles[1]
54 |             self.crop_percentiles = crop_percentiles
55 |         else:
56 |             self.crop_percentiles = crop_percentiles
57 |             self.lower_percentile = 0
58 |             self.upper_percentile = 100
59 | 
60 |         assert type(n) is int, 'n should be an integer > 1'
61 |         assert n > 1, 'n should be greater than 1 (%r given)' % n
62 |         self.n = n
63 | 
64 |         assert type(P) is int or P is None, 'P should be an integer >= 1, or None'
65 |         if P is not None:
66 |             assert P >= 1, 'P should be greater than 0 (%r given)' % P
67 |         self.P = P
68 | 
69 |         assert type(diagonal_neighbors) is bool, 'diagonal_neighbors should be boolean'
70 |         self.diagonal_neighbors = diagonal_neighbors
71 |         self.sig_length = self.n ** 2 * (4 + self.diagonal_neighbors * 4)
72 | 
73 |         assert type(fix_ratio) is bool, 'fix_ratio should be boolean'
74 |         self.fix_ratio = fix_ratio
75 | 
76 |         assert type(identical_tolerance) is float or type(identical_tolerance) is int,\
77 |             'identical_tolerance should be a number between 0 and 1'
78 |         assert 0. <= identical_tolerance <= 1.,\
79 |             'identical_tolerance should be between 0 and 1 (%r given)' % identical_tolerance
80 |         self.identical_tolerance = identical_tolerance
81 | 
82 |         assert type(n_levels) is int, 'n_levels should be an integer'
83 |         assert n_levels > 0,\
84 |             'n_levels should be > 0 (%r given)' % n_levels
85 |         self.n_levels = n_levels
86 | 
87 |         self.handle_mpo = True
88 | 
89 |     def generate_signature(self, path_or_image, bytestream=False):
90 |         """Generates an image signature.
91 | 
92 |         See section 3 of Goldberg, et al.
93 | 
94 |         Args:
95 |             path_or_image (string or numpy.ndarray): image path, or image array
96 |             bytestream (Optional[boolean]): will the image be passed as raw bytes?
97 |                 That is, is the 'path_or_image' argument an in-memory image?
98 |                 (default False)
99 | 
100 |         Returns:
101 |             The image signature: A rank 1 numpy array of length n x n x 8
102 |             (or n x n x 4 if diagonal_neighbors == False)
103 | 
104 |         Examples:
105 |             >>> from image_match.goldberg import ImageSignature
106 |             >>> gis = ImageSignature()
107 |             >>> gis.generate_signature('https://pixabay.com/static/uploads/photo/2012/11/28/08/56/mona-lisa-67506_960_720.jpg')
108 |             array([ 0,  0,  0,  0,  0,  0,  2,  2,  0,  0,  0,  0,  0,  2,  2,  2,  0,
109 |                     0,  0,  0,  2,  2,  2,  2,  0,  0,  0, -2,  2,  2,  1,  2,  0,  0,
110 |                     0, -2,  2, -1, -1,  2,  0,  0,  0, -2, -2, -2, -2, -1,  0,  0,  0,
111 |                     2, -1,  2,  2,  2,  0,  0,  0,  1, -1,  2,  2, -1,  0,  0,  0,  1,
112 |                     0,  2, -1,  0,  0, -2, -2,  0, -2,  0,  2,  2, -2, -2, -2,  2,  2,
113 |                     2,  2,  2, -2, -2, -2, -2, -2,  1,  2, -2, -2, -1,  1,  2,  1,  2,
114 |                    -1,  1, -2,  1,  2, -1,  2, -1,  0,  2, -2,  2, -2, -2,  1, -2,  1,
115 |                     2,  1, -2, -2, -1, -2,  1,  1, -1, -2, -2, -2,  2, -2,  2,  2,  2,
116 |                     1,  1,  0,  2,  0,  2,  2,  0,  0, -2, -2,  0,  1,  0, -1,  1, -2,
117 |                    -2, -1, -1,  1, -1,  1,  1, -2, -2, -2, -1, -2, -1,  1, -1,  2,  1,
118 |                     1,  2,  1,  2,  2, -1, -1,  0,  2, -1,  2,  2, -1, -1, -2, -1, -1,
119 |                    -2,  1, -2, -2, -1, -2, -1, -2, -1, -2, -2, -1, -1,  1, -2, -2,  2,
120 |                    -1,  1,  1, -2, -2, -2,  0,  1,  0,  1, -1,  0,  0,  1,  1,  0,  1,
121 |                     0,  1,  1, -1, -1,  1, -1,  1, -1,  1, -1, -1, -1, -2, -1, -1, -1,
122 |                    -2, -2,  1, -2, -2,  1, -2, -2, -2, -2,  1,  1,  2,  2,  1, -2, -2,
123 |                    -1,  1,  2,  2, -1,  2, -2, -1,  1,  1,  1, -1, -2, -1, -2, -2,  0,
124 |                     1, -1, -1,  1, -2, -2,  0,  1,  2,  1,  0,  2,  0,  2,  2,  0,  0,
125 |                    -1,  1,  0,  1,  0,  1,  2, -1, -1,  1, -1, -1, -1,  2,  1,  1,  2,
126 |                     2,  1, -2,  2,  2,  1,  2,  2,  2,  2, -1,  2,  2,  2,  2,  2,  2,
127 |                     1,  2,  2,  2,  2,  1,  1,  2, -2,  2,  2,  2,  2, -1,  2,  2, -2,
128 |                     2,  2,  2,  2,  0,  0, -2, -2,  1,  0, -1,  1, -1, -2,  0, -1,  0,
129 |                    -1,  1,  0,  0, -1,  1,  0,  2,  0,  2,  2, -2, -2, -2, -2, -1, -1,
130 |                    -1,  0, -1, -2, -2,  1, -1, -1,  1,  1, -1, -2, -2,  1,  1,  1,  1,
131 |                     2, -2, -2, -2, -1,  1,  1,  1,  2, -2, -2, -2, -1, -1,  0,  1,  1,
132 |                    -2, -2,  0,  1, -1,  1,  1,  1, -2,  1,  1,  1,  2,  2,  2,  2, -1,
133 |                    -1,  0, -2,  0,  0,  1,  0,  0, -2,  1,  0, -1,  0, -1, -2, -2,  1,
134 |                     1,  1,  1, -1, -1, -2,  0, -1, -1, -1, -1, -2, -2, -2, -1, -1, -1,
135 |                     1,  1, -2, -2,  1, -2, -1,  0, -1,  0, -2, -1,  1, -2, -1, -1,  0,
136 |                     0, -1,  0,  0, -1, -1, -2,  0, -1,  0,  0, -1, -1, -2,  0,  1,  1,
137 |                     1,  0,  1, -2, -1,  0, -1,  0, -1,  0,  0,  0,  1,  1,  0, -1,  0,
138 |                     2, -1,  2,  1,  2,  1, -2,  2, -1, -2,  2,  2,  2,  2,  2,  2,  2,
139 |                     2,  2,  2,  2, -2,  2,  1,  2,  2, -1,  1,  1, -2,  1, -2, -2, -1,
140 |                    -1,  0,  0, -1,  0, -2, -1, -1,  0,  0, -1,  0, -1, -1, -1, -1,  1,
141 |                     0,  1,  1,  1, -1,  0,  1, -1,  0,  0, -1,  0, -1,  0,  0,  0, -2,
142 |                    -2,  0, -2,  0,  0,  0,  1,  1, -2,  2, -2,  0,  0,  0,  2, -2, -1,
143 |                     2,  2,  0,  0,  0, -2, -2,  2, -2,  1,  0,  0,  0, -2,  2,  2, -1,
144 |                     2,  0,  0,  0,  1,  1,  1, -2,  1,  0,  0,  0,  1,  1,  1, -1,  1,
145 |                     0,  0,  0,  1,  0,  1, -1,  1,  0,  0,  0, -1,  0,  0, -1,  0,  0,
146 |                     0,  0], dtype=int8)
147 | 
148 |         """
149 | 
150 |         # Step 1:    Load image as array of grey-levels
151 |         im_array = self.preprocess_image(path_or_image, handle_mpo=self.handle_mpo, bytestream=bytestream)
152 | 
153 |         # Step 2a:   Determine cropping boundaries
154 |         if self.crop_percentiles is not None:
155 |             image_limits = self.crop_image(im_array,
156 |                                            lower_percentile=self.lower_percentile,
157 |                                            upper_percentile=self.upper_percentile,
158 |                                            fix_ratio=self.fix_ratio)
159 |         else:
160 |             image_limits = None
161 | 
162 |         # Step 2b:   Generate grid centers
163 |         x_coords, y_coords = self.compute_grid_points(im_array,
164 |                                                       n=self.n, window=image_limits)
165 | 
166 |         # Step 3:    Compute grey level mean of each P x P
167 |         #            square centered at each grid point
168 |         avg_grey = self.compute_mean_level(im_array, x_coords, y_coords, P=self.P)
169 | 
170 |         # Step 4a:   Compute array of differences for each
171 |         #            grid point vis-a-vis each neighbor
172 |         diff_mat = self.compute_differentials(avg_grey,
173 |                                               diagonal_neighbors=self.diagonal_neighbors)
174 | 
175 |         # Step 4b:   Bin differences to only 2n+1 values
176 |         self.normalize_and_threshold(diff_mat,
177 |                                      identical_tolerance=self.identical_tolerance,
178 |                                      n_levels=self.n_levels)
179 | 
180 |         # Step 5:    Flatten array and return signature
181 |         return np.ravel(diff_mat).astype('int8')
182 | 
183 |     @staticmethod
184 |     def preprocess_image(image_or_path, bytestream=False, handle_mpo=False):
185 |         """Loads an image and converts to greyscale.
186 | 
187 |         Corresponds to 'step 1' in Goldberg's paper
188 | 
189 |         Args:
190 |             image_or_path (string or numpy.ndarray): image path, or image array
191 |             bytestream (Optional[boolean]): will the image be passed as raw bytes?
192 |                 That is, is the 'path_or_image' argument an in-memory image?
193 |                 (default False)
194 |             handle_mpo (Optional[boolean]): try to compute a signature for stereoscopic
195 |                 images by extracting the first image of the set (default False)
196 | 
197 |         Returns:
198 |             Array of floats corresponding to greyscale level at each pixel
199 | 
200 |         Examples:
201 |             >>> gis.preprocess_image('https://pixabay.com/static/uploads/photo/2012/11/28/08/56/mona-lisa-67506_960_720.jpg')
202 |             array([[ 0.26344431,  0.32423294,  0.30406745, ...,  0.35069725,
203 |                      0.36499961,  0.36361569],
204 |                    [ 0.29676627,  0.28640118,  0.34523255, ...,  0.3703051 ,
205 |                      0.34931333,  0.31655686],
206 |                    [ 0.35305216,  0.31858431,  0.36202   , ...,  0.40588196,
207 |                      0.37284275,  0.30871373],
208 |                    ...,
209 |                    [ 0.05932863,  0.05540706,  0.05540706, ...,  0.01954745,
210 |                      0.01954745,  0.01562588],
211 |                    [ 0.0632502 ,  0.05540706,  0.05148549, ...,  0.01954745,
212 |                      0.02346902,  0.01562588],
213 |                    [ 0.06717176,  0.05540706,  0.05148549, ...,  0.02346902,
214 |                      0.02739059,  0.01954745]])
215 | 
216 |         """
217 |         if bytestream:
218 |             try:
219 |                 img = Image.open(StringIO(image_or_path))
220 |             except IOError:
221 |                 # could be an svg, attempt to convert
222 |                 img = Image.open(StringIO(svg2png(image_or_path)))
223 |             img = img.convert('RGB')
224 |             return rgb2gray(np.asarray(img, dtype=np.uint8))
225 |         elif type(image_or_path) is unicode:
226 |             return imread(image_or_path, as_grey=True)
227 |         elif type(image_or_path) is str:
228 |             try:
229 |                 img = Image.open(image_or_path)
230 |                 arr = np.array(img.convert('RGB'))
231 |             except IOError:
232 |                 # try again due to PIL weirdness
233 |                 return imread(image_or_path, as_grey=True)
234 |             if handle_mpo:
235 |                 # take the first image from the MPO
236 |                 if arr.shape == (2,) and isinstance(arr[1].tolist(), MpoImageFile):
237 |                     return rgb2gray(arr[0])
238 |                 else:
239 |                     return rgb2gray(arr)
240 |             else:
241 |                 return rgb2gray(arr)
242 |         elif type(image_or_path) is np.ndarray:
243 |             return rgb2gray(image_or_path)
244 |         else:
245 |             raise TypeError('Path or image required.')
246 | 
247 |     @staticmethod
248 |     def crop_image(image, lower_percentile=5, upper_percentile=95, fix_ratio=False):
249 |         """Crops an image, removing featureless border regions.
250 | 
251 |         Corresponds to the first part of 'step 2' in Goldberg's paper
252 | 
253 |         Args:
254 |             image (numpy.ndarray): n x m array of floats -- the greyscale image. Typically, the
255 |                 output of preprocess_image
256 |             lower_percentile (Optional[int]): crop image by percentage of difference (default 5)
257 |             upper_percentile (Optional[int]): as lower_percentile (default 95)
258 |             fix_ratio (Optional[boolean]): use the larger ratio for both directions. This is useful
259 |                 for using the fast signature transforms on sparse but very similar images (e.g.
260 |                 renderings from fixed directions). Use with care -- only use if you can guarantee the
261 |                 incoming image is square (default False).
262 | 
263 |         Returns:
264 |             A pair of tuples describing the 'window' of the image to use in analysis: [(top, bottom), (left, right)]
265 | 
266 |         Examples:
267 |             >>> img = gis.preprocess_image('https://pixabay.com/static/uploads/photo/2012/11/28/08/56/mona-lisa-67506_960_720.jpg')
268 |             >>> gis.crop_image(img)
269 |             [(36, 684), (24, 452)]
270 | 
271 |         """
272 |         # row-wise differences
273 |         rw = np.cumsum(np.sum(np.abs(np.diff(image, axis=1)), axis=1))
274 |         # column-wise differences
275 |         cw = np.cumsum(np.sum(np.abs(np.diff(image, axis=0)), axis=0))
276 | 
277 |         # compute percentiles
278 |         upper_column_limit = np.searchsorted(cw,
279 |                                              np.percentile(cw, upper_percentile),
280 |                                              side='left')
281 |         lower_column_limit = np.searchsorted(cw,
282 |                                              np.percentile(cw, lower_percentile),
283 |                                              side='right')
284 |         upper_row_limit = np.searchsorted(rw,
285 |                                           np.percentile(rw, upper_percentile),
286 |                                           side='left')
287 |         lower_row_limit = np.searchsorted(rw,
288 |                                           np.percentile(rw, lower_percentile),
289 |                                           side='right')
290 | 
291 |         # if image is nearly featureless, use default region
292 |         if lower_row_limit > upper_row_limit:
293 |             lower_row_limit = int(lower_percentile/100.*image.shape[0])
294 |             upper_row_limit = int(upper_percentile/100.*image.shape[0])
295 |         if lower_column_limit > upper_column_limit:
296 |             lower_column_limit = int(lower_percentile/100.*image.shape[1])
297 |             upper_column_limit = int(upper_percentile/100.*image.shape[1])
298 | 
299 |         # if fix_ratio, return both limits as the larger range
300 |         if fix_ratio:
301 |             if (upper_row_limit - lower_row_limit) > (upper_column_limit - lower_column_limit):
302 |                 return [(lower_row_limit, upper_row_limit),
303 |                         (lower_row_limit, upper_row_limit)]
304 |             else:
305 |                 return [(lower_column_limit, upper_column_limit),
306 |                         (lower_column_limit, upper_column_limit)]
307 | 
308 |         # otherwise, proceed as normal
309 |         return [(lower_row_limit, upper_row_limit),
310 |                 (lower_column_limit, upper_column_limit)]
311 | 
312 |     @staticmethod
313 |     def compute_grid_points(image, n=9, window=None):
314 |         """Computes grid points for image analysis.
315 | 
316 |         Corresponds to the second part of 'step 2' in the paper
317 | 
318 |         Args:
319 |             image (numpy.ndarray): n x m array of floats -- the greyscale image. Typically,
320 |                 the output of preprocess_image
321 |             n (Optional[int]): number of gridpoints in each direction (default 9)
322 |             window (Optional[List[Tuple[int]]]): limiting coordinates [(t, b), (l, r)], typically the
323 |                 output of crop_image (default None)
324 | 
325 |         Returns:
326 |             tuple of arrays indicating the vertical and horizontal locations of the grid points
327 | 
328 |         Examples:
329 |             >>> img = gis.preprocess_image('https://pixabay.com/static/uploads/photo/2012/11/28/08/56/mona-lisa-67506_960_720.jpg')
330 |             >>> window = gis.crop_image(img)
331 |             >>> gis.compute_grid_points(img, window=window)
332 |             (array([100, 165, 230, 295, 360, 424, 489, 554, 619]),
333 |              array([ 66, 109, 152, 195, 238, 280, 323, 366, 409]))
334 | 
335 |         """
336 | 
337 |         # if no limits are provided, use the entire image
338 |         if window is None:
339 |             window = [(0, image.shape[0]), (0, image.shape[1])]
340 | 
341 |         x_coords = np.linspace(window[0][0], window[0][1], n + 2, dtype=int)[1:-1]
342 |         y_coords = np.linspace(window[1][0], window[1][1], n + 2, dtype=int)[1:-1]
343 | 
344 |         return x_coords, y_coords  # return pairs
345 | 
346 |     @staticmethod
347 |     def compute_mean_level(image, x_coords, y_coords, P=None):
348 |         """Computes array of greyness means.
349 | 
350 |         Corresponds to 'step 3'
351 | 
352 |         Args:
353 |             image (numpy.ndarray): n x m array of floats -- the greyscale image. Typically,
354 |                 the output of preprocess_image
355 |             x_coords (numpy.ndarray): array of row numbers
356 |             y_coords (numpy.ndarray): array of column numbers
357 |             P (Optional[int]): size of boxes in pixels (default None)
358 | 
359 |         Returns:
360 |             an N x N array of average greyscale around the gridpoint, where N is the
361 |             number of grid points
362 | 
363 |         Examples:
364 |             >>> img = gis.preprocess_image('https://pixabay.com/static/uploads/photo/2012/11/28/08/56/mona-lisa-67506_960_720.jpg')
365 |             >>> window = gis.crop_image(img)
366 |             >>> grid = gis.compute_grid_points(img, window=window)
367 |             >>> gis.compute_mean_level(img, grid[0], grid[1])
368 |             array([[ 0.62746325,  0.62563642,  0.62348078,  0.50651686,  0.37438874,
369 |                      0.0644063 ,  0.55968952,  0.59356148,  0.60473832],
370 |                    [ 0.35337797,  0.50272543,  0.27711346,  0.42384226,  0.39006181,
371 |                      0.16773968,  0.10471924,  0.33647144,  0.62902124],
372 |                    [ 0.20307514,  0.19021892,  0.12435402,  0.44990121,  0.38527996,
373 |                      0.08339507,  0.05530059,  0.18469107,  0.21125228],
374 |                    [ 0.25727387,  0.1669419 ,  0.08964046,  0.1372754 ,  0.48529236,
375 |                      0.39894004,  0.10387907,  0.11282135,  0.30014612],
376 |                    [ 0.23447867,  0.15702549,  0.25232943,  0.75172715,  0.79488688,
377 |                      0.4943538 ,  0.29645163,  0.10714578,  0.0629376 ],
378 |                    [ 0.22167555,  0.04839472,  0.10125833,  0.1550749 ,  0.14346914,
379 |                      0.04713144,  0.10095568,  0.15349296,  0.04456733],
380 |                    [ 0.09233709,  0.11210942,  0.05361996,  0.07066566,  0.04191625,
381 |                      0.03548839,  0.03420656,  0.05025029,  0.03519956],
382 |                    [ 0.19226873,  0.20647194,  0.62972106,  0.45514529,  0.05620413,
383 |                      0.03383168,  0.03413588,  0.04741828,  0.02987698],
384 |                    [ 0.05799523,  0.23310153,  0.43719717,  0.27666873,  0.25106573,
385 |                      0.11094163,  0.10180622,  0.04633349,  0.02704855]])
386 | 
387 |         """
388 | 
389 |         if P is None:
390 |             P = max([2.0, int(0.5 + min(image.shape)/20.)])  # per the paper
391 | 
392 |         avg_grey = np.zeros((x_coords.shape[0], y_coords.shape[0]))
393 | 
394 |         for i, x in enumerate(x_coords):  # not the fastest implementation
395 |             lower_x_lim = max([x - P/2, 0])
396 |             upper_x_lim = min([lower_x_lim + P, image.shape[0]])
397 |             for j, y in enumerate(y_coords):
398 |                 lower_y_lim = max([y - P/2, 0])
399 |                 upper_y_lim = min([lower_y_lim + P, image.shape[1]])
400 | 
401 |                 avg_grey[i, j] = np.mean(image[lower_x_lim:upper_x_lim,
402 |                                                lower_y_lim:upper_y_lim])  # no smoothing here as in the paper
403 | 
404 |         return avg_grey
405 | 
406 |     @staticmethod
407 |     def compute_differentials(grey_level_matrix, diagonal_neighbors=True):
408 |         """Computes differences in greylevels for neighboring grid points.
409 | 
410 |         First part of 'step 4' in the paper.
411 | 
412 |         Returns n x n x 8 rank 3 array for an n x n grid (if diagonal_neighbors == True)
413 | 
414 |         The n x nth coordinate corresponds to a grid point. The eight values are
415 |         the differences between neighboring grid points, in this order:
416 | 
417 |         upper left
418 |         upper
419 |         upper right
420 |         left
421 |         right
422 |         lower left
423 |         lower
424 |         lower right
425 | 
426 |         Args:
427 |             grey_level_matrix (numpy.ndarray): grid of values sampled from image
428 |             diagonal_neighbors (Optional[boolean]): whether or not to use diagonal
429 |                 neighbors (default True)
430 | 
431 |         Returns:
432 |             a n x n x 8 rank 3 numpy array for an n x n grid (if diagonal_neighbors == True)
433 | 
434 |         Examples:
435 |             >>> img = gis.preprocess_image('https://pixabay.com/static/uploads/photo/2012/11/28/08/56/mona-lisa-67506_960_720.jpg')
436 |             >>> window = gis.crop_image(img)
437 |             >>> grid = gis.compute_grid_points(img, window=window)
438 |             >>> grey_levels = gis.compute_mean_level(img, grid[0], grid[1])
439 |             >>> gis.compute_differentials(grey_levels)
440 |             array([[[ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
441 |                       0.00000000e+00,  1.82683143e-03, -0.00000000e+00,
442 |                       2.74085276e-01,  1.24737821e-01],
443 |                     [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
444 |                      -1.82683143e-03,  2.15563930e-03,  2.72258444e-01,
445 |                       1.22910990e-01,  3.48522956e-01],
446 |                     [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
447 |                      -2.15563930e-03,  1.16963917e-01,  1.20755351e-01,
448 |                       3.46367317e-01,  1.99638513e-01],
449 |                     [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
450 |                      -1.16963917e-01,  1.32128118e-01,  2.29403399e-01,
451 |                       8.26745956e-02,  1.16455050e-01],
452 |             ...
453 | 
454 |         """
455 |         right_neighbors = -np.concatenate((np.diff(grey_level_matrix),
456 |                                            np.zeros(grey_level_matrix.shape[0]).
457 |                                            reshape((grey_level_matrix.shape[0], 1))),
458 |                                           axis=1)
459 |         left_neighbors = -np.concatenate((right_neighbors[:, -1:],
460 |                                           right_neighbors[:, :-1]),
461 |                                          axis=1)
462 | 
463 |         down_neighbors = -np.concatenate((np.diff(grey_level_matrix, axis=0),
464 |                                           np.zeros(grey_level_matrix.shape[1]).
465 |                                           reshape((1, grey_level_matrix.shape[1]))))
466 | 
467 |         up_neighbors = -np.concatenate((down_neighbors[-1:], down_neighbors[:-1]))
468 | 
469 |         if diagonal_neighbors:
470 |             # this implementation will only work for a square (m x m) grid
471 |             diagonals = np.arange(-grey_level_matrix.shape[0] + 1,
472 |                                   grey_level_matrix.shape[0])
473 | 
474 |             upper_left_neighbors = sum(
475 |                 [np.diagflat(np.insert(np.diff(np.diag(grey_level_matrix, i)), 0, 0), i)
476 |                  for i in diagonals])
477 |             lower_right_neighbors = -np.pad(upper_left_neighbors[1:, 1:],
478 |                                             (0, 1), mode='constant')
479 | 
480 |             # flip for anti-diagonal differences
481 |             flipped = np.fliplr(grey_level_matrix)
482 |             upper_right_neighbors = sum([np.diagflat(np.insert(
483 |                 np.diff(np.diag(flipped, i)), 0, 0), i) for i in diagonals])
484 |             lower_left_neighbors = -np.pad(upper_right_neighbors[1:, 1:],
485 |                                            (0, 1), mode='constant')
486 | 
487 |             return np.dstack(np.array([
488 |                 upper_left_neighbors,
489 |                 up_neighbors,
490 |                 np.fliplr(upper_right_neighbors),
491 |                 left_neighbors,
492 |                 right_neighbors,
493 |                 np.fliplr(lower_left_neighbors),
494 |                 down_neighbors,
495 |                 lower_right_neighbors]))
496 | 
497 |         return np.dstack(np.array([
498 |             up_neighbors,
499 |             left_neighbors,
500 |             right_neighbors,
501 |             down_neighbors]))
502 | 
503 |     @staticmethod
504 |     def normalize_and_threshold(difference_array,
505 |                                 identical_tolerance=2/255., n_levels=2):
506 |         """Normalizes difference matrix in place.
507 | 
508 |         'Step 4' of the paper. The flattened version of this array is the image signature.
509 | 
510 |         Args:
511 |             difference_array (numpy.ndarray): n x n x l array, where l are the differences between
512 |                 the grid point and its neighbors. Typically the output of compute_differentials
513 |             identical_tolerance (Optional[float]): maximum amount two gray values can differ and
514 |                 still be considered equivalent (default 2/255)
515 |             n_levels (Optional[int]): bin differences into 2 n + 1 bins (e.g. n_levels=2 -> [-2, -1,
516 |                 0, 1, 2])
517 | 
518 |         Examples:
519 |             >>> img = gis.preprocess_image('https://pixabay.com/static/uploads/photo/2012/11/28/08/56/mona-lisa-67506_960_720.jpg')
520 |             >>> window = gis.crop_image(img)
521 |             >>> grid = gis.compute_grid_points(img, window=window)
522 |             >>> grey_levels = gis.compute_mean_level(img, grid[0], grid[1])
523 |             >>> m = gis.compute_differentials(grey_levels)
524 |             >>> gis.normalize_and_threshold(m); m  # normalization happens in place
525 |             array([[[ 0.,  0.,  0.,  0.,  0.,  0.,  2.,  2.],
526 |                     [ 0.,  0.,  0.,  0.,  0.,  2.,  2.,  2.],
527 |                     [ 0.,  0.,  0.,  0.,  2.,  2.,  2.,  2.],
528 |                     [ 0.,  0.,  0., -2.,  2.,  2.,  1.,  2.],
529 |                     [ 0.,  0.,  0., -2.,  2., -1., -1.,  2.],
530 |                     [ 0.,  0.,  0., -2., -2., -2., -2., -1.],
531 |                     [ 0.,  0.,  0.,  2., -1.,  2.,  2.,  2.],
532 |                     [ 0.,  0.,  0.,  1., -1.,  2.,  2., -1.],
533 |                     [ 0.,  0.,  0.,  1.,  0.,  2., -1.,  0.]],
534 | 
535 |                    [[ 0., -2., -2.,  0., -2.,  0.,  2.,  2.],
536 |                     [-2., -2., -2.,  2.,  2.,  2.,  2.,  2.],
537 |                     [-2., -2., -2., -2., -2.,  1.,  2., -2.],
538 |                     [-2., -1.,  1.,  2.,  1.,  2., -1.,  1.],
539 |                     [-2.,  1.,  2., -1.,  2., -1.,  0.,  2.],
540 |             ...
541 | 
542 |         """
543 | 
544 |         # set very close values as equivalent
545 |         mask = np.abs(difference_array) < identical_tolerance
546 |         difference_array[mask] = 0.
547 | 
548 |         # if image is essentially featureless, exit here
549 |         if np.all(mask):
550 |             return None
551 | 
552 |         # bin so that the sizes of bins on each side of zero are equivalent
553 |         positive_cutoffs = np.percentile(difference_array[difference_array > 0.],
554 |                                          np.linspace(0, 100, n_levels+1))
555 |         negative_cutoffs = np.percentile(difference_array[difference_array < 0.],
556 |                                          np.linspace(100, 0, n_levels+1))
557 | 
558 |         for level, interval in enumerate([positive_cutoffs[i:i+2]
559 |                                           for i in range(positive_cutoffs.shape[0] - 1)]):
560 |             difference_array[(difference_array >= interval[0]) &
561 |                              (difference_array <= interval[1])] = level + 1
562 | 
563 |         for level, interval in enumerate([negative_cutoffs[i:i+2]
564 |                                           for i in range(negative_cutoffs.shape[0] - 1)]):
565 |             difference_array[(difference_array <= interval[0]) &
566 |                              (difference_array >= interval[1])] = -(level + 1)
567 | 
568 |         return None
569 | 
570 |     @staticmethod
571 |     def normalized_distance(_a, _b):
572 |         """Compute normalized distance between two points.
573 | 
574 |         Computes || b - a || / ( ||b|| + ||a||)
575 | 
576 |         Args:
577 |             _a (numpy.ndarray): array of size m
578 |             _b (numpy.ndarray): array of size m
579 | 
580 |         Returns:
581 |             normalized distance between signatures (float)
582 | 
583 |         Examples:
584 |             >>> a = gis.generate_signature('https://upload.wikimedia.org/wikipedia/commons/thumb/e/ec/Mona_Lisa,_by_Leonardo_da_Vinci,_from_C2RMF_retouched.jpg/687px-Mona_Lisa,_by_Leonardo_da_Vinci,_from_C2RMF_retouched.jpg')
585 |             >>> b = gis.generate_signature('https://pixabay.com/static/uploads/photo/2012/11/28/08/56/mona-lisa-67506_960_720.jpg')
586 |             >>> gis.normalized_distance(a, b)
587 |             0.22095170140933634
588 | 
589 |         """
590 |         b = _b.astype(int)
591 |         a = _a.astype(int)
592 |         norm_diff = np.linalg.norm(b - a)
593 |         norm1 = np.linalg.norm(b)
594 |         norm2 = np.linalg.norm(a)
595 |         return norm_diff / (norm1 + norm2)
--------------------------------------------------------------------------------
/image_match/mongodb_driver.py:
--------------------------------------------------------------------------------
1 | from signature_database_base import SignatureDatabaseBase
2 | from signature_database_base import normalized_distance
3 | from multiprocessing import cpu_count, Process, Queue
4 | from multiprocessing.managers import Queue as managerQueue
5 | import numpy as np
6 | 
7 | 
8 | class SignatureMongo(SignatureDatabaseBase):
9 |     """MongoDB driver for image-match
10 | 
11 |     """
12 |     def __init__(self, collection, *args, **kwargs):
13 |         """Additional MongoDB setup
14 | 
15 |         Args:
16 |             collection (collection): a MongoDB collection instance
17 |             args (Optional): Variable length argument list to pass to base constructor
18 |             kwargs (Optional): Arbitrary keyword arguments to pass to base constructor
19 | 
20 |         Examples:
21 |             >>> from image_match.mongodb_driver import SignatureMongo
22 |             >>> from pymongo import MongoClient
23 |             >>> client = MongoClient(connect=False)
24 |             >>> c = client.images.images
25 |             >>> ses = SignatureMongo(c)
26 |             >>> ses.add_image('https://upload.wikimedia.org/wikipedia/commons/thumb/e/ec/Mona_Lisa,_by_Leonardo_da_Vinci,_from_C2RMF_retouched.jpg/687px-Mona_Lisa,_by_Leonardo_da_Vinci,_from_C2RMF_retouched.jpg')
27 |             >>> ses.search_image('https://upload.wikimedia.org/wikipedia/commons/thumb/e/ec/Mona_Lisa,_by_Leonardo_da_Vinci,_from_C2RMF_retouched.jpg/687px-Mona_Lisa,_by_Leonardo_da_Vinci,_from_C2RMF_retouched.jpg')
28 |             [
29 |                 {'dist': 0.0,
30 |                  'id': u'AVM37nMg0osmmAxpPvx6',
31 |                  'path': u'https://upload.wikimedia.org/wikipedia/commons/thumb/e/ec/Mona_Lisa,_by_Leonardo_da_Vinci,_from_C2RMF_retouched.jpg/687px-Mona_Lisa,_by_Leonardo_da_Vinci,_from_C2RMF_retouched.jpg',
32 |                  'score': 0.28797293}
33 |             ]
34 | 
35 |         """
36 |         self.collection = collection
37 |         # Extract index fields, if any exist yet
38 |         if self.collection.count() > 0:
39 |             self.index_names = [field for field in self.collection.find_one({}).keys()
40 |                                 if field.find('simple') > -1]
41 | 
42 |         super(SignatureMongo, self).__init__(*args, **kwargs)
43 | 
44 |     def search_single_record(self, rec, n_parallel_words=1, word_limit=None,
45 |                              process_timeout=None, maximum_matches=1000):
46 |         if n_parallel_words is None:
47 |             n_parallel_words = cpu_count()
48 | 
49 |         if word_limit is None:
50 |             word_limit = self.N
51 | 
52 |         initial_q = managerQueue.Queue()
53 | 
54 |         [initial_q.put({field_name: rec[field_name]}) for field_name in self.index_names[:word_limit]]
55 | 
56 |         # enqueue a sentinel value so we know we have reached the end of the queue
57 |         initial_q.put('STOP')
58 |         queue_empty = False
59 | 
60 |         # create an empty queue for results
61 |         results_q = Queue()
62 | 
63 |         # create a set of unique results, using MongoDB _id field
64 |         unique_results = set()
65 | 
66 |         l = list()
67 | 
68 |         while True:
69 | 
70 |             # build children processes, taking cursors from in_process queue first, then initial queue
71 |             p = list()
72 |             while len(p) < n_parallel_words:
73 |                 word_pair = initial_q.get()
74 |                 if word_pair == 'STOP':
75 |                     # if we reach the sentinel value, set the flag and stop queuing processes
76 |                     queue_empty = True
77 |                     break
78 |                 if not initial_q.empty():
79 |                     p.append(Process(target=get_next_match,
80 |                                      args=(results_q,
81 |                                            word_pair,
82 |                                            self.collection,
83 |                                            np.array(rec['signature']),
84 |                                            self.distance_cutoff,
85 |                                            maximum_matches)))
86 | 
87 |             if len(p) > 0:
88 |                 for process in p:
89 |                     process.start()
90 |             else:
91 |                 break
92 | 
93 |             # collect results, taking care not to return the same result twice
94 | 
95 |             num_processes = len(p)
96 | 
97 |             while num_processes:
98 |                 results = results_q.get()
99 |                 if results == 'STOP':
100 |                     num_processes -= 1
101 |                 else:
102 |                     for key in results.keys():
103 |                         if key not in unique_results:
104 |                             unique_results.add(key)
105 |                             l.append(results[key])
106 | 
107 |             for process in p:
108 |                 process.join()
109 | 
110 |         # yield a set of results
111 |             if queue_empty:
112 |                 break
113 | 
114 |         return l
115 | 
116 | 
117 |     def insert_single_record(self, rec):
118 |         self.collection.insert(rec)
119 | 
120 |         # if the collection has no indexes (except possibly '_id'), build them
121 |         if len(self.collection.index_information()) <= 1:
122 |             self.index_collection()
123 | 
124 |     def index_collection(self):
125 |         """Index a collection on words.
126 | 
127 |         """
128 |         # Index on words
129 |         self.index_names = [field for field in self.collection.find_one({}).keys()
130 |                             if field.find('simple') > -1]
131 |         for name in self.index_names:
132 |             self.collection.create_index(name)
133 | 
134 | 
135 | def get_next_match(result_q, word, collection, signature, cutoff=0.5, max_in_cursor=100):
136 |     """Given a cursor, iterate through matches
137 | 
138 |     Scans a cursor for word matches below a distance threshold.
139 |     Exhausts a cursor, possibly enqueuing many matches
140 |     Note that placing this function outside the SignatureMongo
141 |     class breaks encapsulation. This is done for compatibility with
142 |     multiprocessing.
143 | 
144 |     Args:
145 |         result_q (multiprocessing.Queue): a multiprocessing queue in which to queue results
146 |         word (dict): {word_name: word_value} dict to scan against
147 |         collection (collection): a pymongo collection
148 |         signature (numpy.ndarray): signature array to match against
149 |         cutoff (Optional[float]): normalized distance limit (default 0.5)
150 |         max_in_cursor (Optional[int]): if more than max_in_cursor matches are in the cursor,
151 |             ignore this cursor; this column is not discriminatory (default 100)
152 | 
153 |     """
154 |     curs = collection.find(word, projection=['_id', 'signature', 'path'])
155 | 
156 |     # if the cursor has many matches, then it's probably not a huge help. Get the next one.
157 |     if curs.count() > max_in_cursor:
158 |         result_q.put('STOP')
159 |         return
160 | 
161 |     matches = dict()
162 |     while True:
163 |         try:
164 |             rec = curs.next()
165 |             dist = normalized_distance(np.reshape(signature, (1, signature.size)), np.array(rec['signature']))[0]
166 |             if dist < cutoff:
167 |                 matches[rec['_id']] = {'dist': dist, 'path': rec['path'], 'id': rec['_id']}
168 |                 result_q.put(matches)
169 |         except StopIteration:
170 |             # do nothing...the cursor is exhausted
171 |             break
172 |     result_q.put('STOP')
173 | 
174 | 
--------------------------------------------------------------------------------
/image_match/signature_database_base.py:
--------------------------------------------------------------------------------
1 | from image_match.goldberg import ImageSignature
2 | from itertools import product
3 | from operator import itemgetter
4 | import numpy as np
5 | 
6 | 
7 | class SignatureDatabaseBase(object):
8 |     """Base class for storing and searching image signatures in a database
9 | 
10 |     Note:
11 |         You must implement the methods search_single_record and insert_single_record
12 |         in a derived class
13 | 
14 |     """
15 | 
16 |     def search_single_record(self, rec):
17 |         """Search for a matching image record.
18 | 
19 |         Must be implemented by derived class.
20 | 
21 |         Args:
22 |             rec (dict): an image record. Will be in the format returned by
23 |                 make_record
24 | 
25 |                 For example, rec could have the form:
26 | 
27 |                 {'path': 'https://pixabay.com/static/uploads/photo/2012/11/28/08/56/mona-lisa-67506_960_720.jpg',
28 |                  'signature': [0, 0, 0, 0, 0, 0, 2, 2, 0, 0, 0, 0, 0, 2, 2, 2, 0, 0, 0, 0, 2, 2, 2, 2, 0 ... ]
29 |                  'simple_word_0': 42252475,
30 |                  'simple_word_1': 23885671,
31 |                  'simple_word_10': 9967839,
32 |                  'simple_word_11': 4257902,
33 |                  'simple_word_12': 28651959,
34 |                  'simple_word_13': 33773597,
35 |                  'simple_word_14': 39331441,
36 |                  'simple_word_15': 39327300,
37 |                  'simple_word_16': 11337345,
38 |                  'simple_word_17': 9571961,
39 |                  'simple_word_18': 28697868,
40 |                  'simple_word_19': 14834907,
41 |                  'simple_word_2': 7434746,
42 |                  'simple_word_20': 37985525,
43 |                  'simple_word_21': 10753207,
44 |                  'simple_word_22': 9566120,
45 |                  ...
46 |                 }
47 | 
48 |                 The number of simple words corresponds to the attribute N
49 | 
50 |         Returns:
51 |             a formatted list of dicts representing matches.
52 | 
53 |             For example, if three matches are found:
54 | 
55 |             [
56 |                 {'dist': 0.069116439263706961,
57 |                  'id': u'AVM37oZq0osmmAxpPvx7',
58 |                  'path': u'https://pixabay.com/static/uploads/photo/2012/11/28/08/56/mona-lisa-67506_960_720.jpg'},
59 |                 {'dist': 0.22484320805049718,
60 |                  'id': u'AVM37nMg0osmmAxpPvx6',
61 |                  'path': u'https://upload.wikimedia.org/wikipedia/commons/thumb/e/ec/Mona_Lisa,_by_Leonardo_da_Vinci,_from_C2RMF_retouched.jpg/687px-Mona_Lisa,_by_Leonardo_da_Vinci,_from_C2RMF_retouched.jpg'},
62 |                 {'dist': 0.42529792112113302,
63 |                  'id': u'AVM37p530osmmAxpPvx9',
64 |                  'path': u'https://c2.staticflickr.com/8/7158/6814444991_08d82de57e_z.jpg'}
65 |             ]
66 | 
67 |             You can return any fields you like, but must include at least dist and id. Duplicate entries are ok,
68 |             and they do not need to be sorted
69 | 
70 |         """
71 |         raise NotImplementedError
72 | 
73 |     def insert_single_record(self, rec):
74 |         """Insert an image record.
75 | 
76 |         Must be implemented by derived class.
77 | 
78 |         Args:
79 |             rec (dict): an image record. Will be in the format returned by
80 |                 make_record
81 | 
82 |                 For example, rec could have the form:
83 | 
84 |                 {'path': 'https://pixabay.com/static/uploads/photo/2012/11/28/08/56/mona-lisa-67506_960_720.jpg',
85 |                  'signature': [0, 0, 0, 0, 0, 0, 2, 2, 0, 0, 0, 0, 0, 2, 2, 2, 0, 0, 0, 0, 2, 2, 2, 2, 0 ... ]
86 |                  'simple_word_0': 42252475,
87 |                  'simple_word_1': 23885671,
88 |                  'simple_word_10': 9967839,
89 |                  'simple_word_11': 4257902,
90 |                  'simple_word_12': 28651959,
91 |                  'simple_word_13': 33773597,
92 |                  'simple_word_14': 39331441,
93 |                  'simple_word_15': 39327300,
94 |                  'simple_word_16': 11337345,
95 |                  'simple_word_17': 9571961,
96 |                  'simple_word_18': 28697868,
97 |                  'simple_word_19': 14834907,
98 |                  'simple_word_2': 7434746,
99 |                  'simple_word_20': 37985525,
100 |                  'simple_word_21': 10753207,
101 |                  'simple_word_22': 9566120,
102 |                  ...
103 |                 }
104 | 
105 |                 The number of simple words corresponds to the attribute N
106 | 
107 |         """
108 |         raise NotImplementedError
109 | 
110 |     def __init__(self, k=16, N=63, n_grid=9,
111 |                  crop_percentile=(5, 95), distance_cutoff=0.45,
112 |                  *signature_args, **signature_kwargs):
113 |         """Set up storage scheme for images
114 | 
115 |         Central to the speed of this approach is transforming the image
116 |         signature into something that can be speedily indexed and matched.
117 |         In our case, that means splitting the image signature into N words
118 |         of length k, then encoding those words as integers. The idea here is
119 |         that integer indices are more efficient than array indices.
120 | 
121 |         For example, say your image signature is [0, 1, 2, 0, -1, -2, 0, 1] and
122 |         k=3 and N=4. That means we want 4 words of length 3. For this signa-
123 |         ture, that gives us:
124 | 
125 |         [0, 1, 2]
126 |         [2, 0, -1]
127 |         [-1, -2, 0]
128 |         [0, 1]
129 | 
130 |         Note that signature elements can be repeated, and any mismatch in length
131 |         is chopped off in the last word (which will be padded with zeros). Since
132 |         these numbers run from -2..2, there are 5 possibilities. Adding 2 to each
133 |         element makes the words strictly non-negative, and interpreting each word
134 |         as a base-5 number yields a unique integer. For the first word:
135 | 
136 |         [0, 1, 2] + 2 = [2, 3, 4]
137 |         [5**0, 5**1, 5**2] = [1, 5, 25]
138 |         dot([2, 3, 4], [1, 5, 25]) = 2 + 15 + 100 = 117
139 | 
140 |         So the integer word is 117. Storing all the integer words as different
141 |         database columns or fields gives us the speedy lookup. In practice, word
142 |         arrays are 'squeezed' to between -1..1 before encoding.
143 | 
144 |         Args:
145 |             k (Optional[int]): the width of a word (default 16)
146 |             N (Optional[int]): the number of words (default 63)
147 |             n_grid (Optional[int]): the n_grid x n_grid size to use in determining
148 |                 the image signature (default 9)
149 |             crop_percentile (Optional[Tuple[int]]): lower and upper percentile bounds
150 |                 for how much variance to keep in the image (default (5, 95))
151 |             distance_cutoff (Optional[float]): maximum image signature distance to
152 |                 be considered a match (default 0.45)
153 |             *signature_args: Variable length argument list to pass to ImageSignature
154 |             **signature_kwargs: Arbitrary keyword arguments to pass to ImageSignature
155 | 
156 |         """
157 |         # Check integer inputs
158 |         if type(k) is not int:
159 |             raise TypeError('k should be an integer')
160 |         if type(N) is not int:
161 |             raise TypeError('N should be an integer')
162 |         if type(n_grid) is not int:
163 |             raise TypeError('n_grid should be an integer')
164 | 
165 |         self.k = k
166 |         self.N = N
167 |         self.n_grid = n_grid
168 | 
169 |         # Check float input
170 |         if type(distance_cutoff) is not float:
171 |             raise TypeError('distance_cutoff should be a float')
172 |         if distance_cutoff < 0.:
173 |             raise ValueError('distance_cutoff should be non-negative (got %r)' % distance_cutoff)
174 | 
175 |         self.distance_cutoff = distance_cutoff
176 | 
177 |         self.crop_percentile = crop_percentile
178 | 
179 |         self.gis = ImageSignature(n=n_grid, crop_percentiles=crop_percentile, *signature_args, **signature_kwargs)
180 | 
181 |     def add_image(self, path, img=None):
182 |         """Add a single image to the database
183 | 
184 |         Args:
185 |             path (string): path or identifier for image. If img is None, then path is assumed to be
186 |                 a URL or filesystem path
187 |             img (Optional[string]): raw image data. In this case, path will still be stored, but
188 |                 a signature will be generated from data in img (default None)
189 | 
190 |         """
191 |         rec = make_record(path, self.gis, self.k, self.N, img=img)
192 |         self.insert_single_record(rec)
193 | 
194 |     def search_image(self, path, all_orientations=False, bytestream=False):
195 |         """Search for matches
196 | 
197 |         Args:
198 |             path (string): path or image data. If bytestream=False, then path is assumed to be
199 |                 a URL or filesystem path. Otherwise, it's assumed to be raw image data
200 |             all_orientations (Optional[boolean]): if True, search for all combinations of mirror
201 |                 images, rotations, and color inversions (default False)
202 |             bytestream (Optional[boolean]): will the image be passed as raw bytes?
203 |                 That is, is the 'path' argument an in-memory image?
204 |                 (default False)
205 | 
206 |         Returns:
207 |             a formatted list of dicts representing unique matches, sorted by dist
208 | 
209 |         For example, if three matches are found:
210 | 
211 |         [
212 |          {'dist': 0.069116439263706961,
213 |           'id': u'AVM37oZq0osmmAxpPvx7',
214 |           'path': u'https://pixabay.com/static/uploads/photo/2012/11/28/08/56/mona-lisa-67506_960_720.jpg'},
215 |          {'dist': 0.22484320805049718,
216 |           'id': u'AVM37nMg0osmmAxpPvx6',
217 |           'path': u'https://upload.wikimedia.org/wikipedia/commons/thumb/e/ec/Mona_Lisa,_by_Leonardo_da_Vinci,_from_C2RMF_retouched.jpg/687px-Mona_Lisa,_by_Leonardo_da_Vinci,_from_C2RMF_retouched.jpg'},
218 |          {'dist': 0.42529792112113302,
219 |           'id': u'AVM37p530osmmAxpPvx9',
220 |           'path': u'https://c2.staticflickr.com/8/7158/6814444991_08d82de57e_z.jpg'}
221 |         ]
222 | 
223 |         """
224 |         img = self.gis.preprocess_image(path, bytestream)
225 | 
226 |         if all_orientations:
227 |             # build every combination of inversion, rotation and mirroring
228 |             inversions = [lambda x: x, lambda x: -x]
229 | 
230 |             mirrors = [lambda x: x, np.fliplr]
231 | 
232 |             rotations = [lambda x: x,
233 |                          np.rot90,
234 |                          lambda x: np.rot90(x, 2),
235 |                          lambda x: np.rot90(x, 3)]
236 | 
237 |             # cartesian product of all possible orientations
238 |             orientations = product(inversions, rotations, mirrors)
239 | 
240 |         else:
241 |             # otherwise just use the identity transformation
242 |             orientations = [(lambda x: x,)]
243 | 
244 |         # try every combination of transformations; if all_orientations=False,
245 |         # this will only take one iteration
246 |         result = []
247 | 
248 |         for transforms in orientations:
249 |             # compose the transformations by applying them in sequence
250 |             transformed_img = img
251 |             for transform in transforms:
252 |                 transformed_img = transform(transformed_img)
253 | 
254 |             # generate the signature and search against the backend
255 |             transformed_record = make_record(transformed_img, self.gis, self.k, self.N)
256 |             matches = self.search_single_record(transformed_record)
257 |             result.extend(matches)
258 | 
259 |         # keep only the closest match for each unique id, sorted by dist
260 |         result = sorted(result, key=itemgetter('dist'))
261 |         unique_results, seen = [], set()
262 |         for match in result:
263 |             if match['id'] not in seen:
264 |                 seen.add(match['id'])
265 |                 unique_results.append(match)
266 |         return unique_results
267 | 
268 | 
269 | def make_record(path, gis, k, N, img=None):
270 |     """Makes a record suitable for database insertion.
271 | 
272 |     Note:
273 |         This non-class version of make_record is provided for
274 |         CPU pooling. Functions passed to worker processes must
275 |         be picklable.
276 | 
277 |     Args:
278 |         path (string): path or identifier for the image. If img is None, then
279 |             path is assumed to be a URL or filesystem path
280 |         gis (ImageSignature): an instance of ImageSignature for generating the
281 |             signature
282 |         k (int): width of words for encoding
283 |         N (int): number of words for encoding
284 |         img (Optional[string]): raw image data. In this case, path will still be stored, but
285 |             a signature will be generated from data in img (default None)
286 | 
287 |     Returns:
288 |         An image record.
289 | 
290 |     For example:
291 | 
292 |     {'path': 'https://pixabay.com/static/uploads/photo/2012/11/28/08/56/mona-lisa-67506_960_720.jpg',
293 |      'signature': [0, 0, 0, 0, 0, 0, 2, 2, 0, 0, 0, 0, 0, 2, 2, 2, 0, 0, 0, 0, 2, 2, 2, 2, 0 ... ],
294 |      'simple_word_0': 42252475,
295 |      'simple_word_1': 23885671,
296 |      'simple_word_10': 9967839,
297 |      'simple_word_11': 4257902,
298 |      'simple_word_12': 28651959,
299 |      'simple_word_13': 33773597,
300 |      'simple_word_14': 39331441,
301 |      'simple_word_15': 39327300,
302 |      'simple_word_16': 11337345,
303 |      'simple_word_17': 9571961,
304 |      'simple_word_18': 28697868,
305 |      'simple_word_19': 14834907,
306 |      'simple_word_2': 7434746,
307 |      'simple_word_20': 37985525,
308 |      'simple_word_21': 10753207,
309 |      'simple_word_22': 9566120,
310 |      ...
311 |     }
312 | 
313 |     """
314 |     record = dict()
315 |     record['path'] = path
316 |     if img is not None:
317 |         signature = gis.generate_signature(img, bytestream=True)
318 |     else:
319 |         signature = gis.generate_signature(path)
320 | 
321 |     record['signature'] = signature.tolist()
322 | 
323 |     words = get_words(signature, k, N)
324 |     max_contrast(words)
325 | 
326 |     words = words_to_int(words)
327 | 
328 |     for i in range(N):
329 |         record['simple_word_' + str(i)] = words[i].tolist()
330 | 
331 |     return record
332 | 
333 | 
334 | def get_words(array, k, N):
335 |     """Gets N words of length k from an array.
336 | 
337 |     Words may overlap.
338 | 
339 |     For example, say your image signature is [0, 1, 2, 0, -1, -2, 0, 1] and
340 |     k=3 and N=4. That means we want 4 words of length 3. For this signature,
341 |     that gives us:
342 | 
343 |     [0, 1, 2]
344 |     [2, 0, -1]
345 |     [-1, -2, 0]
346 |     [0, 1] (zero-padded to [0, 1, 0])
347 | 
348 |     Args:
349 |         array (numpy.ndarray): array to split into words
350 |         k (int): word length
351 |         N (int): number of words
352 | 
353 |     Returns:
354 |         an array with N rows of length k
355 | 
356 |     """
357 |     # generate starting positions of each word
358 |     word_positions = np.linspace(0, array.shape[0],
359 |                                  N, endpoint=False).astype('int')
360 | 
361 |     # check that inputs make sense
362 |     if k > array.shape[0]:
363 |         raise ValueError('Word length cannot be longer than array length')
364 |     if word_positions.shape[0] > array.shape[0]:
365 |         raise ValueError('Number of words cannot be more than array length')
366 | 
367 |     # create empty words array
368 |     words = np.zeros((N, k)).astype('int8')
369 | 
370 |     for i, pos in enumerate(word_positions):
371 |         if pos + k <= array.shape[0]:
372 |             words[i] = array[pos:pos+k]
373 |         else:
374 |             temp = array[pos:].copy()
375 |             temp.resize(k)
376 |             words[i] = temp
377 | 
378 |     return words
379 | 
380 | 
381 | def words_to_int(word_array):
382 |     """Converts a simplified word to an integer
383 | 
384 |     Encodes a length-k word (as returned by max_contrast) as an integer.
385 |     The first digit is the least significant.
386 | 
387 |     Returns dot(word + 1, [1, 3, 9, 27 ...]) for each word in word_array
388 | 
389 |     e.g.:
390 |         [-1, -1, -1] -> 0
391 |         [ 0,  0,  0] -> 13
392 |         [ 0,  1,  0] -> 16
393 | 
394 |     Args:
395 |         word_array (numpy.ndarray): N x k array
396 | 
397 |     Returns:
398 |         an array of integers of length N (the integer word encodings)
399 | 
400 |     """
401 |     width = word_array.shape[1]
402 | 
403 |     # Three states (-1, 0, 1)
404 |     coding_vector = 3**np.arange(width)
405 | 
406 |     # The 'plus one' here makes all digits non-negative, so that the
407 |     # integer representation is strictly non-negative and unique
408 |     return np.dot(word_array + 1, coding_vector)
409 | 
410 | 
411 | def max_contrast(array):
412 |     """Sets all positive values to 1 and all negative values to -1.
413 | 
414 |     Needed for the first-pass lookup on the word table.
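    For example (an illustrative sketch; note the operation is in place
    and the function itself returns None):

        >>> import numpy as np
        >>> a = np.array([3, 0, -5])
        >>> max_contrast(a)
        >>> a
        array([ 1,  0, -1])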
415 | 
416 |     Args:
417 |         array (numpy.ndarray): target array
418 |     """
419 |     array[array > 0] = 1
420 |     array[array < 0] = -1
421 | 
422 |     return None
423 | 
424 | 
425 | def normalized_distance(_target_array, _vec, nan_value=1.0):
426 |     """Compute normalized distance to many points.
427 | 
428 |     Computes ||vec - b|| / (||vec|| + ||b||) for every b in target_array
429 | 
430 |     Args:
431 |         _target_array (numpy.ndarray): N x m array
432 |         _vec (numpy.ndarray): array of size m
433 |         nan_value (Optional[float]): value to replace 0.0/0.0 = nan with
434 |             (default 1.0, to take those featureless images out of contention)
435 | 
436 |     Returns:
437 |         an array of normalized distances, one float per row of _target_array
438 |     """
439 |     target_array = _target_array.astype(int)
440 |     vec = _vec.astype(int)
441 |     topvec = np.linalg.norm(vec - target_array, axis=1)
442 |     norm1 = np.linalg.norm(vec, axis=0)
443 |     norm2 = np.linalg.norm(target_array, axis=1)
444 |     finvec = topvec / (norm1 + norm2)
445 |     finvec[np.isnan(finvec)] = nan_value
446 | 
447 |     return finvec
448 | 
--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
1 | """
2 | image_match is a simple package for finding approximate image matches from a
3 | corpus. It is similar, for instance, to pHash, but
4 | includes a database backend that easily scales to billions of images and
5 | supports sustained high rates of image insertion: up to 10,000 images/s on our
6 | cluster!
7 | 
8 | Based on the paper An image signature for any kind of image, Goldberg et
9 | al.
10 | """
11 | 
12 | from setuptools import setup, find_packages
13 | 
14 | tests_require = [
15 |     'coverage',
16 |     'pep8',
17 |     'pyflakes',
18 |     'pylint',
19 |     'pytest',
20 |     'pytest-cov',
21 |     'pytest-xdist',
22 | ]
23 | 
24 | dev_require = [
25 |     'ipdb',
26 |     'ipython',
27 | ]
28 | 
29 | docs_require = [
30 |     'recommonmark>=0.4.0',
31 |     'Sphinx>=1.3.5',
32 |     'sphinxcontrib-napoleon>=0.4.4',
33 |     'sphinx-rtd-theme>=0.1.9',
34 | ]
35 | 
36 | 
37 | def check_if_numpy_is_installed():
38 |     try:
39 |         import numpy
40 |     except ImportError:
41 |         print('There is an issue installing numpy automatically as a '
42 |               'dependency. Please install it manually using\n'
43 |               '    $ pip install numpy\n'
44 |               'or try Anaconda: https://www.continuum.io/')
45 |         exit(1)
46 | 
47 | check_if_numpy_is_installed()
48 | 
49 | setup(
50 |     name='image_match',
51 |     version='0.1.0',
52 |     description='image_match is a simple package for finding approximate '
53 |                 'image matches from a corpus.',
54 |     long_description=__doc__,
55 |     url='https://github.com/ascribe/image-match/',
56 |     author='Ryan Henderson',
57 |     author_email='ryan@ascribe.io',
58 |     license='Apache License 2.0',
59 |     zip_safe=True,
60 | 
61 |     classifiers=[
62 |         'Development Status :: 3 - Alpha',
63 |         'Intended Audience :: Developers',
64 |         'Topic :: Database',
65 |         'Topic :: Database :: Database Engines/Servers',
66 |         'Topic :: Software Development',
67 |         'Natural Language :: English',
68 |         'License :: OSI Approved :: Apache Software License',
69 |         'Programming Language :: Python :: 2',
70 |         'Programming Language :: Python :: 2.7',
71 |         'Operating System :: MacOS :: MacOS X',
72 |         'Operating System :: POSIX :: Linux',
73 |         'Topic :: Multimedia :: Graphics',
74 |     ],
75 | 
76 |     packages=find_packages(),
77 | 
78 |     setup_requires=[
79 |         'pytest-runner',
80 |     ],
81 |     install_requires=[
82 |         'numpy>=1.10,<1.11',
83 |         'scipy>=0.17,<0.18',
84 |         'scikit-image>=0.12,<0.13',
85 |         'cairosvg>1,<2',
86 |         'elasticsearch>=2.3,<2.4',
87 |     ],
88 |     tests_require=tests_require,
89 |     extras_require={
90 |         'test': tests_require,
91 |         'dev': dev_require + tests_require + docs_require,
92 |         'docs': docs_require,
93 |     },
94 | )
95 | 
--------------------------------------------------------------------------------
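As a closing illustration, here is a minimal end-to-end sketch of the database-backed workflow. It assumes a local Elasticsearch node is running and that the bundled driver in image_match/elasticsearch_driver.py exposes a SignatureES class implementing SignatureDatabaseBase:

```python
from elasticsearch import Elasticsearch
from image_match.elasticsearch_driver import SignatureES

# connect to a local Elasticsearch node (assumed to be on localhost:9200)
ses = SignatureES(Elasticsearch())

# index one image by URL, then search with a different rendition of it
ses.add_image('https://upload.wikimedia.org/wikipedia/commons/thumb/e/ec/Mona_Lisa,_by_Leonardo_da_Vinci,_from_C2RMF_retouched.jpg/687px-Mona_Lisa,_by_Leonardo_da_Vinci,_from_C2RMF_retouched.jpg')
matches = ses.search_image('https://pixabay.com/static/uploads/photo/2012/11/28/08/56/mona-lisa-67506_960_720.jpg')

# each match carries at least 'dist' and 'id'; lower dist means more similar
for match in matches:
    print(match['dist'], match.get('path'))
```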