├── .gitignore
├── LICENSE.txt
├── README.md
├── datasets
│   └── .gitignore
├── outputs
│   └── .gitignore
├── randcolor.py
└── smartgrid.py

/.gitignore:
--------------------------------------------------------------------------------
*.pyc
--------------------------------------------------------------------------------

/LICENSE.txt:
--------------------------------------------------------------------------------
            DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
                    Version 2, December 2004

 Copyright (C) 2004 Sam Hocevar

 Everyone is permitted to copy and distribute verbatim or modified
 copies of this license document, and changing it is allowed as long
 as the name is changed.

            DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION

  0. You just DO WHAT THE FUCK YOU WANT TO.

--------------------------------------------------------------------------------

/README.md:
--------------------------------------------------------------------------------
# Intelligent grid layout from a set of images.

# basics

You can get started with the [someflags dataset](https://github.com/vusd/smartgrid/releases/download/someflags/someflags.zip) - unzip it in the `datasets` subdirectory. Then:

```bash
python smartgrid.py \
  --input-glob 'datasets/someflags/*.png' \
  --output-path outputs/flag_grid
```
Output is `outputs/flag_grid/grid.jpg`:
![default flag grid](https://cloud.githubusercontent.com/assets/945979/26386053/08cb8076-4098-11e7-8caa-9449cd241e85.jpg)

The output is a set of files in the provided directory. These files include images that show the original embedding layout (UMAP by default, or t-SNE with `--do-tsne`) and the grid assignments.

# options

## different models

The arrangement is based on features extracted by a trained neural network, and many model options are available via keras. They are:
 * vgg16
 * vgg19
 * resnet50
 * inceptionv3
 * xception

In addition, the specific layer of the network to be used for feature extraction can be provided, and for inceptionv3 a pooling option is available. (The code also supports the densenet variants, resnet101/resnet152, inceptionresnetv2, BiT models such as `bit/m-r101x1`, and CLIP via `clip:ViT-B/32`.) When no model is given, `bit/m-r101x1` is used; if only `--layer` is given, the model falls back to `vgg16`.
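To find a valid layer name, passing `--layer list` (or `--layer show`) prints the model's layers and exits without building a grid; a quick sketch, reusing the someflags dataset from above:

```bash
python smartgrid.py \
  --model vgg16 \
  --layer list \
  --input-glob 'datasets/someflags/*.png' \
  --output-path outputs/scratch
```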
For example, to use inceptionv3 with average pooling:
```bash
python smartgrid.py \
  --model inceptionv3 \
  --pooling avg \
  --input-glob 'datasets/someflags/*.png' \
  --output-path outputs/flags_inception_avgpool
```

![inception avg pooling](https://cloud.githubusercontent.com/assets/945979/26386117/9afb5c3c-4098-11e7-96dc-2bc444a2c982.jpg)

Grouping by color is also possible by using the `color` or `color_lab` models:

```bash
python smartgrid.py \
  --model color \
  --input-glob 'datasets/someflags/*.png' \
  --output-path outputs/flag_grid_colors
```
Output is `outputs/flag_grid_colors/grid.jpg`:
![flag color grid](https://cloud.githubusercontent.com/assets/945979/26386050/08b8c170-4098-11e7-885f-787fd17e31b3.jpg)

## tile, aspect-ratio, left-right anchors

Optional command line arguments are also available that change the aspect ratio and influence the layout. For example:

```bash
python smartgrid.py \
  --tile 24x12 \
  --input-glob 'datasets/someflags/*.png' \
  --left-image 'datasets/someflags/FR.png' \
  --right-image 'datasets/someflags/NL.png' \
  --output-path outputs/someflags_horizvert_s10
```
![flag color grid](https://cloud.githubusercontent.com/assets/945979/26386049/08b85140-4098-11e7-8f80-d0158fd22b11.jpg)

This set of arguments creates a non-square grid and also suggests that the `FR.png` image (🇫🇷) should be laid out to the left of the `NL.png` image (🇳🇱). The left/right anchors also try to influence the groupings by exaggerating the differences between these anchor images (this stretching can be disabled by setting `--left-right-scale 0.0`).

The tile argument specifies the number of columns and rows. You can also specify `--aspect-ratio` to have the grid image fit a specific format.

## filtering, output format and filename

An experimental `--min-distance` argument has been added that removes duplicates based on how close together they are in feature space. Additionally, the output file name can be overridden, and the file format will be inferred from the name. So to output the flag grid in png format without duplicates:

```bash
python smartgrid.py \
  --aspect-ratio 1.778 \
  --input-glob 'datasets/someflags/*.png' \
  --min-distance 5 \
  --grid-file grid_min_dist_5.png \
  --output-path outputs/someflags_nodupes
```
Output this time is `outputs/someflags_nodupes/grid_min_dist_5.png`:
![filtered png file](https://cloud.githubusercontent.com/assets/945979/26386051/08cae440-4098-11e7-9fde-3ac0ad8ccbde.png)

Note the duplicates were removed (eg: there were 2 US flags before). The threshold has to be fiddled with per dataset, but there is an output folder `rejects_min` which shows the duplicates that were found.
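There is also a complementary `--max-distance` filter, which drops any image that is not within the given distance of at least one other image; `--max-group-size` sets how many similar neighbors are needed before a group is kept, and dropped images land in a `rejects_max` folder. A sketch (the threshold values here are only illustrative and need tuning per dataset):

```bash
python smartgrid.py \
  --input-glob 'datasets/someflags/*.png' \
  --max-distance 20 \
  --max-group-size 2 \
  --output-path outputs/someflags_grouped
```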
## rerunning, grid spacing, and visualizing link strength

Rerunning can be done more quickly by keeping the same output-path and adding the `--do-reload` flag. You probably also want to remove the `--model` and `--layer` arguments and perhaps change the `--grid-file` output.

The `--grid-spacing` option puts space between elements on the grid. For example, `--grid-spacing 1` will add a one pixel border to the grid elements.

Additionally, there is an experimental option to use the grid spacing to visualize the strength between adjacent entries. For this you add `--show-links`.

Putting that together, we can reuse the result from the section above and output to a different filename showing link strength.

```bash
python smartgrid.py \
  --do-reload \
  --aspect-ratio 1.778 \
  --input-glob 'datasets/someflags/*.png' \
  --min-distance 5 \
  --show-links \
  --grid-spacing 24 \
  --grid-file grid_with_links.jpg \
  --output-path outputs/someflags_nodupes
```
![reload with spacing and links](https://cloud.githubusercontent.com/assets/945979/26386054/08cb8e54-4098-11e7-9fec-5fb6a553ac79.jpg)

In the current visualization, the closer the entries are in feature space the *thinner* the line between them (think of the line as a wall that wants to separate them). Also, this runs much faster because the neural net is no longer needed when `--do-reload` is used.

## imagemagick (for the big grids)

Extremely large grids blow up because of a PIL memory limitation. In this case you can fall back to using imagemagick (montage) to construct the grid. So if you have 4700 images to group perceptually by color:

```bash
python smartgrid.py \
  --model color_lab \
  --use-imagemagick \
  --aspect-ratio 1.778 \
  --input-glob 'datasets/new_yorker/*.jpg' \
  --output-path outputs/ny_color_lab

# resize output with imagemagick
convert outputs/ny_color_lab/grid.jpg \
  -resize 5% \
  outputs/ny_color_lab/grid_scaled.jpg
```
Go get a coffee. Then come back to find `outputs/ny_color_lab/grid_scaled.jpg`:
![huge grid shrunken](https://cloud.githubusercontent.com/assets/945979/26386052/08cb66e0-4098-11e7-8222-7ec9afaf50ed.jpg)

# dependencies

Currently requires keras 2.x, scipy, scikit-learn, scikit-image, matplotlib, imageio, umap-learn, Pillow, tensorflow (with tensorflow-hub for the `bit` models), braceexpand, tqdm, and either [python-lap](https://github.com/dribnet/python-lap/tree/rename_lap) (seems to work everywhere but sometimes hangs when processing >600 images) or [lapjv](https://github.com/src-d/lapjv) (runs much faster for me and provides verbose output). The `clip` models additionally need torch, torchvision, and OpenAI's clip package. Also requires imagemagick (montage) when using the `--use-imagemagick` option.
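One possible way to install the Python dependencies (package names here are my best guess; the project does not pin them):

```bash
pip install keras scipy scikit-learn scikit-image matplotlib \
  imageio umap-learn pillow braceexpand tqdm tensorflow tensorflow-hub
pip install lap    # or: pip install lapjv
```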
# credits

Code originally adapted from genekogan's [tSNE-images.py](https://github.com/ml4a/ml4a-ofx/blob/master/scripts/tSNE-images.py) and kylemcdonald's [CloudToGrid](https://github.com/kylemcdonald/CloudToGrid)

# license

WTFPL
--------------------------------------------------------------------------------

/datasets/.gitignore:
--------------------------------------------------------------------------------
*
--------------------------------------------------------------------------------

/outputs/.gitignore:
--------------------------------------------------------------------------------
*
--------------------------------------------------------------------------------

/randcolor.py:
--------------------------------------------------------------------------------
import argparse
import os
import random

import numpy as np
import tensorflow as tf
from PIL import Image


def generateRandColor(width, height, outpath):
    # keep drawing colors until we find one that hasn't been saved yet
    stale_color = True
    while stale_color:
        r = random.randint(0, 255)
        g = random.randint(0, 255)
        b = random.randint(0, 255)
        rgbstr = '0x{:02x}{:02x}{:02x}'.format(r, g, b)
        outfile = os.path.join(outpath, "{}.png".format(rgbstr))
        stale_color = os.path.exists(outfile)

    im_array = np.zeros([height, width, 3]).astype(np.uint8)
    im_array[:,:] = [r, g, b]
    im = Image.fromarray(im_array)
    im.save(outfile)

def main():
    parser = argparse.ArgumentParser(description="Make N random color images, save to outdir")
    parser.add_argument('--num-colors', default=100, type=int,
                        help="how many images to generate")
    parser.add_argument('--width', default=10, type=int,
                        help="image width")
    parser.add_argument('--height', default=10, type=int,
                        help="image height")
    parser.add_argument('--output-path', default="outputs/colors/rand100_01", type=str,
                        help='path to where to put output files')
    parser.add_argument('--random-seed', default=1, type=int,
                        help='Use a specific random seed (for repeatability)')
    args = parser.parse_args()

    if args.random_seed:
        print("Setting random seed: ", args.random_seed)
        random.seed(args.random_seed)
        np.random.seed(args.random_seed)
        tf.set_random_seed(args.random_seed)

    # make output directory if needed
    if not os.path.exists(args.output_path):
        os.makedirs(args.output_path)

    for i in range(args.num_colors):
        generateRandColor(args.width, args.height, args.output_path)

if __name__ == '__main__':
    main()
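
# Example invocation (hypothetical paths), generating swatches that can then
# be laid out with smartgrid.py's color model:
#   python randcolor.py --num-colors 100 --output-path datasets/rand100
#   python smartgrid.py --model color --input-glob 'datasets/rand100/*.png' \
#       --output-path outputs/rand100_grid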
--------------------------------------------------------------------------------

/smartgrid.py:
--------------------------------------------------------------------------------
import argparse
import sys
import numpy as np
import json
import os
from os.path import isfile, join
import keras
from keras.preprocessing import image
from keras.models import Model
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from scipy.spatial import distance
from skimage import color
import imageio
import math
import numbers
import time
from tqdm import tqdm
from PIL import Image
import tensorflow as tf
import tensorflow_hub as hub
import random
import umap

import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt

from braceexpand import braceexpand
import glob

# import with fallback behavior
using_python_lap = True
try:
    # https://github.com/gatagat/lap
    import lap
except ImportError:
    try:
        # https://github.com/src-d/lapjv
        import lapjv
        using_python_lap = False
    except ImportError:
        print("Error: could not find lapjv or python-lap, cannot continue")
        sys.exit(1)

def real_glob(rglob):
    glob_list = braceexpand(rglob)
    files = []
    for g in glob_list:
        files = files + glob.glob(g)
    return sorted(files)

def center_crop(img, target_size):
    width, height = img.size
    smaller = width
    if height < width:
        smaller = height

    # TODO: this is off by one when (width - smaller) is odd:
    # right - left comes out as smaller-1 instead of smaller
    left = np.ceil((width - smaller)/2.)
    top = np.ceil((height - smaller)/2.)
    right = np.floor((width + smaller)/2.)
    bottom = np.floor((height + smaller)/2.)
    img = img.crop((left, top, right, bottom))
    # print("resizing from {} to {}".format([width, height], target_size))
    img = img.resize(target_size)
    return img

def get_image(path, input_shape, do_crop=False, is_bit=False):
    if do_crop:
        # cropping version
        img = image.load_img(path)
        # print(path)
        img = center_crop(img, target_size=input_shape)
    else:
        # scaling version
        img = image.load_img(path, target_size=input_shape)

    # img.save("sized.png")
    # print("DONE")
    x = image.img_to_array(img)
    if not is_bit:
        x = np.expand_dims(x, axis=0)
    return x

def get_average_color_classic(path, colorspace='rgb'):
    c = imageio.imread(path, pilmode='RGB')
    if colorspace == 'lab':
        # print("CONVERTING TO LAB")
        # old_color = c
        c = color.rgb2lab(c)
        # print("Converted from {} to {}".format(old_color[0], c[0]))
        c = c.mean(axis=(0,1))
    else:
        c = c.mean(axis=(0,1))
        c = c / 255.0

    # WTF indeed (this happens for black (rgb))
    if isinstance(c, numbers.Number):
        c = [c, c, c]
    return c

np.seterr(all='raise')
def get_average_color(path, colorspace='rgb', subsampling=None):
    im = imageio.imread(path, pilmode='RGB')
    w, h, c = im.shape
    colors = []
    if subsampling is None:
        subsampling = "1"
    if subsampling.endswith("+"):
        sample_from = int(subsampling[:-1])
        sample_downto = 0
    else:
        sample_from = int(subsampling)
        sample_downto = sample_from-1
    for gridsize in range(sample_from, sample_downto, -1):
        for y in range(gridsize):
            h1 = int(y*h/gridsize)
            h2 = int((y+1)*h/gridsize)
            for x in range(gridsize):
                w1 = int(x*w/gridsize)
                w2 = int((x+1)*w/gridsize)
                quadrant = im[w1:w2, h1:h2, :]

                if colorspace == 'lab':
                    try:
                        c = color.rgb2lab(quadrant)
                        c = c.mean(axis=(0,1))
                    except RuntimeWarning:
                        print("problem with ", path)
                        print(quadrant.shape)
                        c = np.array([0.0, 0.0, 0.0])
                else:
                    c = quadrant.mean(axis=(0,1))
                    c = c / 255.0

                # WTF indeed (this happens for black (rgb))
                if isinstance(c, numbers.Number):
                    c = [c, c, c]

                colors.append(c)

    return np.array(colors).flatten()

def read_file_list(filelist):
    lines = []
    with open(filelist) as file:
        for line in file:
            line = line.strip()  # or some other preprocessing
            line = line.strip('"')  # remove quotes
            lines.append(line)
    return lines

def read_json_vectors(filename):
    """Return np array of vectors from json sources"""
    vectors = []
    with open(filename) as json_file:
        json_data = json.load(json_file)
        for v in json_data:
            vectors.append(v)
    print("Read {} vectors from {}".format(len(vectors), filename))
    np_array = np.array(vectors)
    return np_array

def get_image_list(input_glob):
    if input_glob.startswith('@'):
        images = read_file_list(input_glob[1:])
    else:
        images = real_glob(input_glob)
    num_images = len(images)
    print("Found {} images".format(num_images))
    return images
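# A note on the aspect-ratio fit below: with tiles of aspect
# t = im_width/im_height, a grid of `width` columns and `height` rows has
# overall aspect (width * im_width) / (height * im_height). Setting that
# equal to the requested aspect_ratio with width * height = num_images
# gives height = sqrt(num_images * t / aspect_ratio), which is the
# raw_height computed below before rounding up to fit all images.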
def set_grid_size(images, width, height, aspect_ratio, drop_to_fit):
    num_images = len(images)
    if width is None and aspect_ratio is None:
        # just have width == height
        max_side_extent = math.sqrt(num_images)
        if max_side_extent.is_integer() or drop_to_fit:
            width = int(max_side_extent)
            height = width
        else:
            width = int(max_side_extent) + 1
            print("Checking: ", width*(width-1), num_images)
            if width*(width-1) >= num_images:
                height = width-1
            else:
                height = width
    elif width is None:
        # sniff the aspect ratio of the first file
        with Image.open(images[0]) as img:
            im_width = img.size[0]
            im_height = img.size[1]
        tile_aspect_ratio = im_width / im_height
        raw_height = math.sqrt((num_images * tile_aspect_ratio) / aspect_ratio)
        raw_width = num_images / raw_height
        int_height = int(raw_height)
        int_width = int(raw_width)
        if (raw_height.is_integer() and raw_width.is_integer()) or drop_to_fit:
            height = int_height
            width = int_width
            if not drop_to_fit:
                print("--> {} images fits exactly as {}x{}".format(num_images, width, height))
        else:
            if not raw_height.is_integer():
                int_height = int_height + 1
            if not raw_width.is_integer():
                int_width = int_width + 1
            if int_width*(int_height-1) >= num_images:
                width = int_width
                height = int_height-1
            else:
                width = int_width
                height = int_height
            print("--> {} images best fits as {}x{}".format(num_images, width, height))
        print("tile size is {}x{} so aspect of {:.3f} is {}x{} (final: {}x{})".format(
            im_width, im_height, aspect_ratio, width, height, width*im_width, height*im_height))

    num_grid_spaces = width * height
    if drop_to_fit:
        grid_images = images[:num_grid_spaces]
        num_images = len(grid_images)
    else:
        grid_images = images

    if num_grid_spaces < num_images:
        print("Error: {} images is too many for {}x{} grid.".format(num_images, width, height))
        sys.exit(0)
    elif num_images == 0:
        print("Error: no images to process")
        sys.exit(0)
    elif num_grid_spaces == 0:
        print("Error: no spaces for images")
        sys.exit(0)

    print("Using {} images to build {}x{} montage".format(num_images, width, height))
    return grid_images, width, height

def normalize_columns(rawpoints, low=0, high=1):
    mins = np.min(rawpoints, axis=0)
    maxs = np.max(rawpoints, axis=0)
    rng = maxs - mins
    scaled_points = high - (((high - low) * (maxs - rawpoints)) / rng)
    return scaled_points

def analyze_images_colors(images, colorspace='rgb', subsampling=None):
    # analyze images and grab average colors
    colors = []
    for image_path in images:
        try:
            if subsampling is None:
                c = get_average_color_classic(image_path, colorspace)
            else:
                c = get_average_color(image_path, colorspace, subsampling)
        except Exception as e:
            print("Problem reading {}: {}".format(image_path, e))
            c = [0, 0, 0]
        # print(image_path, c)
        colors.append(c)
    # colors = normalize_columns(colors)
    return colors

def bit_preprocess_image(image):
    image = np.array(image)
    # reshape into shape [batch_size, height, width, num_channels]
    img_reshaped = tf.reshape(image, [1, image.shape[0], image.shape[1], image.shape[2]])
    # Use `convert_image_dtype` to convert to floats in the [0,1] range.
    image = tf.image.convert_image_dtype(img_reshaped, tf.float32)
    return image
def analyze_images(images, model_name, layer_name=None, pooling=None, do_crop=False, subsampling=None, do_pca=False):
    if model_name == 'color_lab':
        return analyze_images_colors(images, colorspace='lab', subsampling=subsampling)
    elif model_name == 'color' or model_name == 'color_rgb':
        return analyze_images_colors(images, colorspace='rgb', subsampling=subsampling)

    num_images = len(images)
    include_top = (layer_name is not None)

    model_lookup_table = {
        'densenet121': {
            'model_class': keras.applications.densenet.DenseNet121,
            'input_shape': (224, 224),
            'preprocess_input': keras.applications.densenet.preprocess_input
        },
        'densenet169': {
            'model_class': keras.applications.densenet.DenseNet169,
            'input_shape': (224, 224),
            'preprocess_input': keras.applications.densenet.preprocess_input
        },
        'densenet201': {
            'model_class': keras.applications.densenet.DenseNet201,
            'input_shape': (224, 224),
            'preprocess_input': keras.applications.densenet.preprocess_input
        },
        'inceptionresnetv2': {
            'model_class': keras.applications.inception_resnet_v2.InceptionResNetV2,
            'input_shape': (299, 299),
            'preprocess_input': keras.applications.inception_resnet_v2.preprocess_input
        },
        'inceptionv3': {
            'model_class': keras.applications.inception_v3.InceptionV3,
            'input_shape': (299, 299),
            'preprocess_input': keras.applications.inception_v3.preprocess_input
        },
        'resnet50': {
            'model_class': keras.applications.resnet.ResNet50,
            'input_shape': (224, 224),
            'preprocess_input': keras.applications.resnet.preprocess_input
        },
        'resnet101': {
            'model_class': keras.applications.resnet.ResNet101,
            'input_shape': (224, 224),
            'preprocess_input': keras.applications.resnet.preprocess_input
        },
        'resnet152': {
            'model_class': keras.applications.resnet.ResNet152,
            'input_shape': (224, 224),
            'preprocess_input': keras.applications.resnet.preprocess_input
        },
        'vgg16': {
            'model_class': keras.applications.vgg16.VGG16,
            'input_shape': (224, 224),
            'preprocess_input': keras.applications.vgg16.preprocess_input
        },
        'vgg19': {
            'model_class': keras.applications.vgg19.VGG19,
            'input_shape': (224, 224),
            'preprocess_input': keras.applications.vgg19.preprocess_input
        },
        'xception': {
            'model_class': keras.applications.xception.Xception,
            'input_shape': (299, 299),
            'preprocess_input': keras.applications.xception.preprocess_input
        },
    }

    is_bit = False
    is_clip = False
    if model_name.startswith("bit"):
        model_url = f"https://tfhub.dev/google/{model_name}/1"
        input_shape = None
        preprocess_input = bit_preprocess_image
        model = hub.KerasLayer(model_url)
        is_bit = True
    elif model_name.startswith("clip"):
        import clip
        import torch
        from torchvision.transforms import Compose, Resize, CenterCrop, ToTensor, Normalize

        model_type = "ViT-B/32"
        if len(model_name) > 4:
            parts = model_name.split(":")
            model_type = parts[1]
        model, preprocess = clip.load(model_type)
        print(preprocess)
        input_size = model.input_resolution.item()
        input_shape = (input_size, input_size)
        preprocess_input = None
        is_clip = True
    elif model_name in model_lookup_table:
        model_class = model_lookup_table[model_name]['model_class']
        input_shape = model_lookup_table[model_name]['input_shape']
        preprocess_input = model_lookup_table[model_name]['preprocess_input']
        model = model_class(weights='imagenet', include_top=include_top)
    else:
        print("Error: model {} not found".format(model_name))
        sys.exit(1)

    if layer_name is None:
        feat_extractor = model
    elif layer_name == "show" or layer_name == "list":
        for i, layer in enumerate(model.layers):
            print("{} layer {:03d}: {}".format(model_name, i, layer.name))
        sys.exit(0)
    else:
        feat_extractor = Model(inputs=model.input, outputs=model.get_layer(layer_name).output)

    # analyze images and grab activations
    activations = []
    for idx in tqdm(range(len(images))):
        file_path = images[idx]
        img = get_image(file_path, input_shape, do_crop, is_bit)
        if img is not None:
            # preprocess
            if preprocess_input is not None:
                img = preprocess_input(img)
            # print("getting activations for %s %d/%d" % (file_path,idx,num_images))
            if is_bit:
                acts = model(img)[0].numpy()
            elif is_clip:
                batch_item = img[0]/255.0
                transform2 = Compose([
                    ToTensor(),
                    Normalize((0.48145466, 0.4578275, 0.40821073), (0.26862954, 0.26130258, 0.27577711)),
                ])
                zimages = []
                im = transform2(batch_item)
                zimages.append(im)
                im_batch = torch.stack(zimages)
                acts = model.encode_image(im_batch)[0].detach().cpu().numpy()
            else:
                acts = feat_extractor.predict(img)[0]
            if len(activations) == 0:
                print("Collecting vectors of size {}".format(acts.flatten().shape))
            activations.append(acts.flatten())
    # optionally run PCA first
    features = np.array(activations)
    if do_pca:
        print("Running PCA on features: {}".format(features.shape))
        pca = PCA(n_components=300)
        pca.fit(features)
        pca_features = pca.transform(features)
        return np.asarray(pca_features)
    else:
        return features
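# fit_to_unit_square normalizes the point cloud into [0,1] per axis, then
# compresses the axis corresponding to the grid's shorter side, so that a
# 24x12 grid gets points in a 1.0 x 0.5 box matching the grid's aspect.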
def fit_to_unit_square(points, width, height):
    x_scale = 1.0
    y_scale = 1.0
    if (width > height):
        y_scale = height / width
    elif (width < height):
        x_scale = width / height
    points -= points.min(axis=0)
    points /= points.max(axis=0)
    points = points * [x_scale, y_scale]
    return points

def index_from_substring(images, substr):
    index = None
    for i in range(len(images)):
        # print("{} and {} and {}".format(images[i], substr, images[i].find(substr)))
        if images[i].find(substr) != -1:
            if index is None:
                index = i
            else:
                raise ValueError("The substring {} is ambiguous: {} and {}".format(
                    substr, images[index], images[i]))
    if index is None:
        raise ValueError("The substring {} was not found in {} images".format(substr, len(images)))
    else:
        print("Resolved {} to image {}".format(substr, images[index]))
    return index

def write_list(list, output_path, output_file, quote=False):
    filelist = os.path.join(output_path, output_file)
    with open(filelist, "w") as text_file:
        for item in list:
            if isinstance(item, np.ndarray):
                text_file.write("{}\n".format(",".join(map(str, item))))
            elif quote:
                text_file.write("\"{}\"\n".format(item))
            else:
                text_file.write("{}\n".format(item))
    return filelist

def read_list(output_path, output_file, numeric=False):
    filelist = os.path.join(output_path, output_file)
    lines = []
    with open(filelist) as file:
        for line in file:
            line = line.strip()  # or some other preprocessing
            if numeric:
                lines.append(list(map(float, line.split(","))))
            else:
                lines.append(line)
    if numeric:
        return np.array(lines)
    else:
        return lines
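# In make_grid_image below, `links` (when given) is a rows x cols array of
# [dist_right, dist_down] pairs: the normalized feature-space distance from
# each cell to its right and lower neighbor, with -1 meaning "no link".
# Each link is drawn as a black band sized max_link_size * (1.0 - dist),
# so it fills more of the grid spacing the more similar the neighbors are.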
def make_grid_image(filelist, cols=None, rows=None, spacing=0, links=None, do_hexgrid=False):
    """Convert an image grid to a single image"""
    N = len(filelist)

    with Image.open(filelist[0]) as img:
        width = img.size[0]
        height = img.size[1]
    if width > height:
        max_link_size = int(1.0 * height)
    else:
        max_link_size = int(1.0 * width)

    if rows == None:
        sq_num = math.sqrt(N)
        sq_dim = int(sq_num)
        if sq_num != sq_dim:
            sq_dim = sq_dim + 1
        rows = sq_dim
        cols = sq_dim

    total_height = rows * height
    total_width = cols * width

    total_height = total_height + spacing * (rows - 1)
    total_width = total_width + spacing * (cols - 1)
    # shift every other row this much in x
    hex_space = int(width / 2 + spacing)
    if do_hexgrid:
        total_width = total_width + hex_space

    im_array = np.zeros([total_height, total_width, 3]).astype(np.uint8)
    im_array.fill(255)

    if links is not None:
        print("Rows: {}".format(len(links)))
        for r in range(len(links)):
            row = links[r]
            for c in range(len(row)):
                cell = row[c]
                offset_y, offset_x = r*height+spacing*r, c*width+spacing*c
                cy = int(offset_y + height / 2)
                cx = int(offset_x + width / 2)
                if cell[0] >= 0:
                    link_right_height = max_link_size * (1.0 - cell[0])
                    oy = int(link_right_height / 2)
                    ldw = int(link_right_height)
                    im_array[(cy-oy):(cy-oy+ldw), cx:(cx+width), :] = 0
                if cell[1] >= 0:
                    link_down_width = max_link_size * (1.0 - cell[1])
                    ox = int(link_down_width / 2)
                    lrw = int(link_down_width)
                    im_array[cy:(cy+height), (cx-ox):(cx-ox+lrw), :] = 0

    for i in range(rows*cols):
        if i < N:
            r = i // cols
            c = i % cols

            with Image.open(filelist[i]) as img:
                rgb_im = img.convert('RGB')
                offset_y, offset_x = r*height+spacing*r, c*width+spacing*c
                if do_hexgrid and (r%2 == 0):
                    offset_x += hex_space
                im_array[offset_y:(offset_y+height), offset_x:(offset_x+width), :] = rgb_im

    return Image.fromarray(im_array)

def filter_distance_min(images, X, min_distance, reject_dir=None):
    num_images = len(images)
    keepers = [True] * num_images
    cur_pos = 0
    assignments = []
    min_distance2 = min_distance * min_distance
    for i in range(num_images):
        if not keepers[i]:
            continue
        rejects = []
        assignments.append(i)
        cur_v = X[i]
        for j in range(i+1, num_images):
            if keepers[j]:
                # if np.linalg.norm(cur_v - X[j]) < min_distance:
                diff = cur_v - X[j]
                if np.dot(diff, diff) < min_distance2:
                    rejects.append(j)
                    keepers[j] = False
        if len(rejects) > 0:
            print("rejecting {} images similar to entry {}".format(len(rejects), i))
            if reject_dir:
                reject_grid = [images[i]]
                for ix in rejects:
                    reject_grid.append(images[ix])
                img = make_grid_image(reject_grid)
                reject_file_path = os.path.join(reject_dir,
                    "dist_{:04f}_{:03d}.jpg".format(min_distance, i))
                img.save(reject_file_path)

    print("Keeping {} of {} images".format(len(assignments), num_images))
    im_array = np.array(images)
    X_array = np.array(X)
    return im_array[assignments].tolist(), X_array[assignments]

def filter_distance_max(images, X, max_distance, reject_dir=None, max_group_size=1):
    num_images = len(images)
    keepers = [False] * num_images
    cur_pos = 0
    assignments = []
    max_distance2 = max_distance * max_distance
    for i in range(num_images):
        if keepers[i]:
            assignments.append(i)
            continue
        accepts = []
        cur_v = X[i]
        for j in range(i+1, num_images):
            if not keepers[j]:
                # if np.linalg.norm(cur_v - X[j]) < max_distance:
                diff = cur_v - X[j]
                if np.dot(diff, diff) < max_distance2:
                    keepers[i] = True
                    keepers[j] = True
                    accepts.append(j)

        if len(accepts) >= max_group_size:
            print("accepting {} images similar to entry {}".format(len(accepts), i))
            assignments.append(i)
            if reject_dir:
                reject_grid = [images[i]]
                for ix in accepts:
                    reject_grid.append(images[ix])
                img = make_grid_image(reject_grid)
                reject_file_path = os.path.join(reject_dir,
                    "dist_{:04f}_{:03d}.jpg".format(max_distance, i))
                img.save(reject_file_path)

    print("Keeping {} of {} images".format(len(assignments), num_images))
    im_array = np.array(images)
    X_array = np.array(X)
    return im_array[assignments].tolist(), X_array[assignments]

def reduce_grid_targets(grid, num_grid_images, do_reduce_hack):
    if do_reduce_hack:
        num_grid_images = len(grid) - 1
    print("reducing grid from {} to {}".format(len(grid), num_grid_images))
    mean_point = np.mean(grid, axis=0)
    newList = grid - mean_point
    sort = np.sum(np.power(newList, 2), axis=1)
    indexed_order = sort.argsort()
    sorted_list = grid[indexed_order]
    return sorted_list[:num_grid_images], indexed_order

def run_prune(filelist, vectorlist):
    new_filelist = []
    new_vectorlist = []
    for i in range(len(vectorlist)):
        # if vectorlist[i] is not None:
        if vectorlist[i] is not None and os.path.exists(filelist[i]):
            new_filelist.append(filelist[i])
            new_vectorlist.append(vectorlist[i])
    print("Pruned filelist from {} to {} entries".format(len(filelist), len(new_filelist)))
    return new_filelist, np.array(new_vectorlist)

# def run_filecheck(filelist, vectorlist):
#     new_filelist = []
#     new_vectorlist = []
#     for i in range(len(vectorlist)):
#         if os.path.exists(filelist[i]):
#             new_filelist.append(filelist[i])
#             new_vectorlist.append(vectorlist[i])
#     print("Pruned filelist from {} to {} entries".format(len(filelist), len(new_filelist)))
#     return new_filelist, np.array(new_vectorlist)

# in the future the clip_range could be smarter,
# like 1-4,100-200 etc.
# for now, just doing head
def run_clip(filelist, vectorlist, clip_range):
    clip_number = int(clip_range)
    new_filelist = filelist[:clip_number]
    new_vectorlist = vectorlist[:clip_number]
    return new_filelist, np.array(new_vectorlist)

from sklearn.cluster import KMeans
from sklearn.metrics import pairwise_distances_argmin_min

def run_kmeans(images, X):
    km = KMeans(n_clusters=100).fit(X)
    closest, _ = pairwise_distances_argmin_min(km.cluster_centers_, X)
    np_filelist = np.array(images)
    return np_filelist[closest].tolist(), X[closest]
def run_grid(input_glob, left_image, right_image, left_right_scale,
             output_path, tsne_dimensions, tsne_perplexity,
             tsne_learning_rate, width, height, aspect_ratio, drop_to_fit, fill_shade,
             vectors_file, do_prune, clip_range, subsampling,
             model, layer, pooling, do_crop, grid_file, use_imagemagick,
             grid_spacing, show_links, links_max_threshold,
             min_distance, max_distance, max_group_size, do_reload=False,
             do_tsne=False, do_reduce_hack=False, do_pca=False, do_hexgrid=False):

    # make output directory if needed
    if output_path != '' and not os.path.exists(output_path):
        os.makedirs(output_path)

    if do_reload:
        images = read_list(output_path, "image_files.txt", numeric=False)
        X = read_list(output_path, "image_vectors.txt", numeric=True)
        print("Reloaded {} images and {} vectors".format(len(images), X.shape))
        num_images = len(images)
        avg_colors = analyze_images_colors(images, 'rgb')
    else:
        ## compute width, height from image list and provided defaults
        if input_glob is not None:
            images = get_image_list(input_glob)
            num_images = len(images)

        if vectors_file is not None:
            X = read_json_vectors(vectors_file)
        else:
            X = analyze_images(images, model, layer, pooling, do_crop, subsampling)

        if do_prune:
            images, X = run_prune(images, X)

        if clip_range:
            images, X = run_clip(images, X, clip_range)

        # images, X = run_kmeans(images, X)

        # save data
        write_list(images, output_path, "image_files.txt")
        write_list(X, output_path, "image_vectors.txt")

    ## Lookup left/right images
    left_image_index = None
    right_image_index = None
    # scale X by left/right axis
    if left_image is not None and right_image is not None:
        left_image_index = index_from_substring(images, left_image)
        right_image_index = index_from_substring(images, right_image)
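    # The left/right stretch below scales each feature vector's length by
    # 1 + left_right_scale * (1 + cos(theta)), where theta is the angle
    # between the vector and the left->right axis: vectors pointing toward
    # the "right" anchor grow by up to (1 + 2*left_right_scale), while
    # vectors pointing toward the "left" anchor stay nearly unchanged.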
    if left_image_index is not None:
        # todo: confirm this is how to stretch by a vector
        lr_vector = X[right_image_index] - X[left_image_index]
        lr_vector = lr_vector / np.linalg.norm(lr_vector)
        X_new = np.zeros_like(X)
        for i in range(len(X)):
            len_x = np.linalg.norm(X[i])
            norm_x = X[i] / len_x
            scale_factor = 1.0 + left_right_scale * (1.0 + np.dot(norm_x, lr_vector))
            new_length = len_x * scale_factor
            # print("Vector {}: length went from {} to {}".format(i, len_x, new_length))
            X_new[i] = new_length * norm_x
        X = X_new

    # TODO: filtering here
    if min_distance is not None and min_distance > 0:
        reject_dir = os.path.join(output_path, "rejects_min")
        if reject_dir != '' and not os.path.exists(reject_dir):
            os.makedirs(reject_dir)
        images, X = filter_distance_min(images, X, min_distance, reject_dir)

    if max_distance is not None and max_distance > 0:
        reject_dir = os.path.join(output_path, "rejects_max")
        if reject_dir != '' and not os.path.exists(reject_dir):
            os.makedirs(reject_dir)
        images, X = filter_distance_max(images, X, max_distance, reject_dir, max_group_size)

    grid_images, width, height = set_grid_size(images, width, height, aspect_ratio, drop_to_fit)
    num_grid_images = len(grid_images)
    print("Compare: {} and {}".format(num_grid_images, width*height))

    # this line is a hack for now
    X = np.asarray(X[:num_grid_images])

    print("SO X {}".format(X.shape))
    if do_tsne:
        print("Running tsne on {} images...".format(num_grid_images))
        tsne = TSNE(n_components=tsne_dimensions, learning_rate=tsne_learning_rate, perplexity=tsne_perplexity, verbose=2).fit_transform(X)
    else:
        print("Running umap on {} images...".format(num_grid_images))
        tsne = umap.UMAP(metric='cosine', min_dist=0.9).fit_transform(X)
    print("EMBEDDING SHAPE {}".format(tsne.shape))

    avg_colors = analyze_images_colors(images, 'rgb')

    data = []
    for i, f in enumerate(grid_images):
        point = [ ((tsne[i,k] - np.min(tsne[:,k]))/(np.max(tsne[:,k]) - np.min(tsne[:,k]))).tolist() for k in range(tsne_dimensions) ]
        data.append({"path": grid_images[i], "point": point})
    with open(os.path.join(output_path, "points.json"), 'w') as outfile:
        json.dump(data, outfile)

    if left_image_index is not None:
        data2d = fit_to_unit_square(tsne, 1, 1)
    else:
        data2d = fit_to_unit_square(tsne, width, height)
    plt.figure(figsize=(12, 12))
    plt.xlim(-0.1, 1.1)
    plt.ylim(-0.1, 1.1)
    plt.gca().invert_yaxis()
    grays = np.linspace(0, 0.8, len(data2d))
    plt.scatter(data2d[:,0], data2d[:,1], c=avg_colors, edgecolors='none', marker='o', s=24)
    if left_image_index is not None:
        plt.scatter(data2d[left_image_index:left_image_index+1,0],
                    data2d[left_image_index:left_image_index+1,1],
                    facecolors='none', edgecolors='r', marker='o', s=24*3)
        plt.scatter(data2d[right_image_index:right_image_index+1,0],
                    data2d[right_image_index:right_image_index+1,1],
                    facecolors='none', edgecolors='g', marker='o', s=24*3)
    plt.savefig(os.path.join(output_path, "embedding.png"), bbox_inches='tight')

    # this is an experimental section where left/right image can be given
    if left_image_index is not None:
        origin = data2d[left_image_index]
        data2d = data2d - origin
        dest = data2d[right_image_index]
        x_axis = np.array([1, 0])
        theta = np.arctan2(dest[1], dest[0])
        print("Spin angle is {}".format(np.rad2deg(theta)))
        # theta = np.deg2rad(90)
        # print("Spin angle is {}".format(np.rad2deg(theta)))
        # http://scipython.com/book/chapter-6-numpy/examples/creating-a-rotation-matrix-in-numpy/
        a_c, a_s = np.cos(theta), np.sin(theta)
        R = np.matrix([[a_c, -a_s], [a_s, a_c]])
        data2d = np.array(data2d * R)
        # print("IS: ", data2d.shape)
        data2d = fit_to_unit_square(data2d, width, height)

        # TODO: this is a nasty cut-n-paste of above with different filename
        plt.figure(figsize=(8, 8))
        plt.xlim(-0.1, 1.1)
        plt.ylim(-0.1, 1.1)
        plt.gca().invert_yaxis()

        plt.scatter(data2d[:,0], data2d[:,1], c=avg_colors, edgecolors='none', marker='o', s=24)
        if left_image_index is not None:
            plt.scatter(data2d[left_image_index:left_image_index+1,0],
                        data2d[left_image_index:left_image_index+1,1],
                        facecolors='none', edgecolors='r', marker='o', s=48)
            plt.scatter(data2d[right_image_index:right_image_index+1,0],
                        data2d[right_image_index:right_image_index+1,1],
                        facecolors='none', edgecolors='g', marker='o', s=48)
        plt.savefig(os.path.join(output_path, "embedding_spun.png"), bbox_inches='tight')

    write_list(data2d, output_path, "embedding_coords.txt")
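    # Grid assignment: each embedded 2-D point is matched to a unique grid
    # cell by solving a linear assignment problem (Jonker-Volgenant, via
    # lap.lapjv or lapjv.lapjv) over the euclidean cost matrix between the
    # grid cell centers and the embedded points.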
    # TSNE is done, setup layout for grid assignment
    max_width, max_height = 1, 1
    if (width > height):
        max_height = height / width
    elif (width < height):
        max_width = width / height
    xv, yv = np.meshgrid(np.linspace(0, max_width, width), np.linspace(0, max_height, height))
    if do_hexgrid:
        half_space = max_width / (2 * width)
        # print("RUNNING THE HEXGRID ", half_space, xv)
        xv[::2, :] += half_space
        # print("RAN ", xv)
    grid = np.dstack((xv, yv)).reshape(-1, 2)
    # this strange step removes corners
    grid, indexed_lookup = reduce_grid_targets(grid, num_grid_images, do_reduce_hack)

    # print("G", grid.shape, grid[0])
    # print("D2D", data2d.shape)

    cost = distance.cdist(grid, data2d, 'euclidean')
    # cost = distance.cdist(grid, data2d, 'sqeuclidean')
    cost = cost * (100000. / cost.max())
    # print("C", cost.shape, cost[0][0])

    if using_python_lap:
        print("Starting assignment (this can take a few minutes)")
        min_cost2, row_assigns2, col_assigns2 = lap.lapjv(cost, extend_cost=do_reduce_hack)
        print("Assignment complete")
    else:
        # note slightly different API
        row_assigns2, col_assigns2, min_cost2 = lapjv.lapjv(cost, verbose=True, force_doubles=False)
    grid_jv2 = grid[col_assigns2]
    # print(col_assigns2.shape)
    plt.figure(figsize=(20, 20))
    plt.xlim(-0.1, 1.1)
    plt.ylim(-0.1, 1.1)
    plt.gca().invert_yaxis()
    for start, end, c in zip(data2d, grid_jv2, avg_colors):
        plt.arrow(start[0], start[1], end[0] - start[0], end[1] - start[1],
                  color=c, head_length=0.01, head_width=0.01)
    if left_image_index is not None:
        plt.scatter(data2d[left_image_index:left_image_index+1,0],
                    data2d[left_image_index:left_image_index+1,1],
                    facecolors='none', edgecolors='r', marker='o', s=48)
        plt.scatter(data2d[right_image_index:right_image_index+1,0],
                    data2d[right_image_index:right_image_index+1,1],
                    facecolors='none', edgecolors='g', marker='o', s=48)
    plt.savefig(os.path.join(output_path, 'movement.png'), bbox_inches='tight')

    num_grid_spaces = len(indexed_lookup)
    num_actual_images = len(row_assigns2)
    num_missing = num_grid_spaces - num_actual_images
    using_placeholder = False

    if num_missing > 0:
        # make a note that placeholder is in use
        using_placeholder = True

        # add a blank entry to the vectors
        _, v_len = X.shape
        X2 = np.append(X, [np.zeros(v_len)], axis=0)
        print("Updating vectors from {} to {}".format(X.shape, X2.shape))
        X = X2

        # add blank entry to images
        # sniff the aspect ratio of the first file
        with Image.open(grid_images[0]) as img:
            im_width = img.size[0]
            im_height = img.size[1]

        im_array = np.full([im_height, im_width, 3], [fill_shade, fill_shade, fill_shade]).astype(np.uint8)
        # im_array = np.zeros([im_width, im_height, 3]).astype(np.uint8)
        blank_img = Image.fromarray(im_array)
        blank_image_path = os.path.join(output_path, "blank.png")
        blank_img.save(blank_image_path)
        blank_index = len(grid_images)
        grid_images.append(blank_image_path)

        # now grow row assignments, giving all remaining to new blanks
        residuals = np.full([num_missing], blank_index)
        row_assigns2 = np.append(row_assigns2, residuals)
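    # reverse_lookup is the inverse permutation of indexed_lookup: it maps
    # each original grid position back to its rank in the center-out
    # ordering that reduce_grid_targets produced.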
    reverse_lookup = np.zeros(num_grid_spaces, dtype=int)
    reverse_lookup[indexed_lookup] = np.arange(num_grid_spaces)

    image_indexes = row_assigns2[reverse_lookup]
    img_grid_vectors = X[image_indexes]
    g_len, g_dim = img_grid_vectors.shape
    img_grid_shaped = img_grid_vectors.reshape(height, width, g_dim)
    with open(os.path.join(output_path, "grid_vectors.json"), 'w') as outfile:
        json.dump(img_grid_shaped.tolist(), outfile)

    n_images = np.asarray(grid_images)
    image_grid = n_images[image_indexes]
    montage_filelist = write_list(image_grid, output_path,
                                  "montage_{}x{}.txt".format(width, height), quote=True)
    grid_file_path = os.path.join(output_path, grid_file)
    grid_im_file_path = os.path.join(output_path, "{}".format(grid_file))
    left_right_path = os.path.join(output_path, "left_right.jpg")
    if use_imagemagick:
        command = "montage @{} -geometry +0+0 -tile {}x{} {}".format(
            montage_filelist, width, height, grid_im_file_path)
        # print("running imagemagick montage: {}".format(command))
        os.system(command)

        # if left_image_index is not None:
        #     command = "montage '{}' '{}' -geometry +0+0 -tile 2x1 {}".format(
        #         images[left_image_index], images[right_image_index], left_right_path)
        #     os.system(command)

    else:
        # image vectors are in X
        links = None
        if show_links:
            links = []
            img_grid_vectors = X[image_indexes]
            for r in range(height):
                row = []
                links.append(row)
                for c in range(width):
                    idx = r * width + c
                    cur_v = img_grid_vectors[idx]
                    if c < width - 1:
                        left_v = img_grid_vectors[idx+1]
                        if using_placeholder and (not cur_v.any() or not left_v.any()):
                            dist_left = -1
                        else:
                            dist_left = np.linalg.norm(cur_v - left_v)
                    else:
                        dist_left = -1
                    if r < height - 1:
                        down_v = img_grid_vectors[idx+width]
                        if using_placeholder and (not cur_v.any() or not down_v.any()):
                            dist_down = -1
                        else:
                            dist_down = np.linalg.norm(cur_v - down_v)
                    else:
                        dist_down = -1
                    cell = [dist_left, dist_down]
                    row.append(cell)
            links = np.array(links)
            # normalize to 0-1
            if links_max_threshold is not None:
                num_removed = (links > links_max_threshold).sum()
                links[links > links_max_threshold] = -1
                num_left = (links > 0).sum()
                print("removed {} links, {} left".format(num_removed, num_left))
            links_max = np.amax(links)
            valid_vals = np.where(links > 0)
            links_min = np.amin(links[valid_vals])
            print("Normalizing to {}/{}".format(links_min, links_max))
            links = ((links - links_min) / (links_max - links_min))
            print("Links is {}".format(links.shape))
        img = make_grid_image(image_grid, width, height, grid_spacing, links, do_hexgrid)
        img.save(grid_file_path)
        if left_image_index is not None:
            img = make_grid_image([grid_images[left_image_index], grid_images[right_image_index]], 2, 1, 1)
            img.save(left_right_path)

def main():
    parser = argparse.ArgumentParser(description="Deep learning grid layout")
    parser.add_argument('--input-glob', default=None,
                        help="use file glob source of images")
    parser.add_argument('--left-image', default=None,
                        help="use file as example of left")
    parser.add_argument('--right-image', default=None,
                        help="use file as example of right")
    parser.add_argument('--vectors', default=None,
                        help="read vectors directly instead of running model")
    parser.add_argument('--do-prune', default=False, action='store_true',
                        help="Prune filelist, filtering out entries with missing vectors")
    parser.add_argument('--clip-range', default=None,
                        help="only show range of images given (eg: 100)")
    parser.add_argument('--model', default=None,
                        help="model to use, one of: vgg16 vgg19 resnet50 inceptionv3 xception")
    parser.add_argument('--layer', default=None,
                        help="optional override to set custom model layer")
    parser.add_argument('--pooling', default=None,
                        help="optional override to control inceptionv3 pooling (avg or max)")
    parser.add_argument('--subsampling', default=None,
                        help="subsampling specifier for tiles (for some models). eg: 2+")
    parser.add_argument('--left-right-scale', default=4.0, type=float,
                        help="scaling factor for left-right axis")
    parser.add_argument('--output-path',
                        help='path to where to put output files')
    parser.add_argument('--grid-file', default="grid.jpg",
                        help='name (and format) of grid output file')
    parser.add_argument('--num-dimensions', default=2, type=int,
                        help='dimensionality of t-SNE points')
    parser.add_argument('--perplexity', default=30, type=int,
                        help='perplexity of t-SNE')
    parser.add_argument('--learning-rate', default=150, type=int,
                        help='learning rate of t-SNE')
    parser.add_argument('--do-crop', default=False, action='store_true',
                        help="Center crop instead of scale")
    parser.add_argument('--drop-to-fit', default=False, action='store_true',
                        help="Drop extra images to fit to aspect ratio")
    parser.add_argument('--fill-shade', default=0, type=int,
                        help='shade of gray for filling in blanks')
    parser.add_argument('--use-imagemagick', default=False, action='store_true',
                        help="generate grid using imagemagick (montage)")
    parser.add_argument('--tile', default=None,
                        help="Grid size WxH (eg: 12x12)")
    parser.add_argument('--grid-spacing', default=0, type=int,
                        help='whitespace between images in grid')
    parser.add_argument('--show-links', default=False, action='store_true',
                        help="visualize link strength in whitespace")
    parser.add_argument('--links-max-threshold', default=None, type=float,
                        help="drop links past this threshold")
    parser.add_argument('--aspect-ratio', default=None, type=float,
                        help="Instead of square, fit image to given aspect ratio")
    parser.add_argument('--min-distance', default=None, type=float,
                        help="Removes duplicates that are closer than this distance")
    parser.add_argument('--max-distance', default=None, type=float,
                        help="Removes items if they are beyond max from all others")
    parser.add_argument('--max-group-size', default=1, type=int,
                        help='with max-distance, minimum number of similar neighbors needed to keep an image')
    parser.add_argument('--do-reload', default=False, action='store_true',
                        help="Reload file list and vectors from saved state")
    parser.add_argument('--do-tsne', default=False, action='store_true',
                        help="Run tsne instead of umap")
    parser.add_argument('--do-reduce-hack', default=False, action='store_true',
                        help="allow holes (and remove one entry)")
    parser.add_argument('--do-pca', default=False, action='store_true',
                        help="run PCA on features before dimensionality reduction")
    parser.add_argument('--do-hexgrid', default=False, action='store_true',
                        help="shift even rows by half a cell size to make the grid a hex grid")
    parser.add_argument('--random-seed', default=None, type=int,
                        help='Use a specific random seed (for repeatability)')
    args = parser.parse_args()

    if args.random_seed:
        print("Setting random seed: ", args.random_seed)
        random.seed(args.random_seed)
        np.random.seed(args.random_seed)
        # tf.set_random_seed(args.random_seed)

    width, height = None, None
    if args.tile is not None:
        width, height = map(int, args.tile.split("x"))

    if args.model is None and args.layer is None:
        model = "bit/m-r101x1"
        layer = None
    elif args.model is None:
        model = "vgg16"
        layer = args.layer
    else:
        model = args.model
        layer = args.layer
    # this obviously needs refactoring
    run_grid(args.input_glob, args.left_image, args.right_image, args.left_right_scale,
             args.output_path, args.num_dimensions,
             args.perplexity, args.learning_rate, width, height, args.aspect_ratio,
             args.drop_to_fit, args.fill_shade, args.vectors, args.do_prune, args.clip_range,
             args.subsampling,
             model, layer, args.pooling, args.do_crop, args.grid_file, args.use_imagemagick,
             args.grid_spacing, args.show_links, args.links_max_threshold,
             args.min_distance, args.max_distance,
             args.max_group_size, args.do_reload, args.do_tsne,
             args.do_reduce_hack, args.do_pca, args.do_hexgrid)

if __name__ == '__main__':
    main()
--------------------------------------------------------------------------------