├── .gitignore
├── LICENSE.txt
├── README.md
├── datasets
│   └── .gitignore
├── outputs
│   └── .gitignore
├── randcolor.py
└── smartgrid.py

/.gitignore:
--------------------------------------------------------------------------------
*.pyc
--------------------------------------------------------------------------------

/LICENSE.txt:
--------------------------------------------------------------------------------
            DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
                    Version 2, December 2004

 Copyright (C) 2004 Sam Hocevar

 Everyone is permitted to copy and distribute verbatim or modified
 copies of this license document, and changing it is allowed as long
 as the name is changed.

            DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION

  0. You just DO WHAT THE FUCK YOU WANT TO.

--------------------------------------------------------------------------------

/README.md:
--------------------------------------------------------------------------------
# Intelligent grid layout from a set of images.

# basics

You can get started with the [someflags dataset](https://github.com/vusd/smartgrid/releases/download/someflags/someflags.zip) - unzip it in the `datasets` subdirectory. Then:

```bash
python smartgrid.py \
  --input-glob 'datasets/someflags/*.png' \
  --output-path outputs/flag_grid
```
Output is `outputs/flag_grid/grid.jpg`:
![default flag grid](https://cloud.githubusercontent.com/assets/945979/26386053/08cb8076-4098-11e7-8caa-9449cd241e85.jpg)

The output is a set of files in the provided directory. These files include images that show the original embedding layout (UMAP by default, or t-SNE with `--do-tsne`) and the grid assignments.

# options

## different models

The arrangement is based on features extracted by a trained neural network, and many model options are available via keras. They are:
 * vgg16
 * vgg19
 * resnet50
 * inceptionv3
 * xception

In addition, the specific layer of the network to be used for feature extraction can be provided, and for inceptionv3 a pooling option is available. (The code also supports the densenet variants, resnet101/resnet152, inceptionresnetv2, BiT models such as `bit/m-r101x1`, and CLIP via `clip:ViT-B/32`.) When no model is given, `bit/m-r101x1` is used; if only `--layer` is given, the model falls back to `vgg16`.
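To find a valid layer name, passing `--layer list` (or `--layer show`) prints the model's layers and exits without building a grid; a quick sketch, reusing the someflags dataset from above:

```bash
python smartgrid.py \
  --model vgg16 \
  --layer list \
  --input-glob 'datasets/someflags/*.png' \
  --output-path outputs/scratch
```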
For example, to use inceptionv3 with average pooling:
```bash
python smartgrid.py \
  --model inceptionv3 \
  --pooling avg \
  --input-glob 'datasets/someflags/*.png' \
  --output-path outputs/flags_inception_avgpool
```

![inception avg pooling](https://cloud.githubusercontent.com/assets/945979/26386117/9afb5c3c-4098-11e7-96dc-2bc444a2c982.jpg)

Grouping by color is also possible by using the `color` or `color_lab` models:

```bash
python smartgrid.py \
  --model color \
  --input-glob 'datasets/someflags/*.png' \
  --output-path outputs/flag_grid_colors
```
Output is `outputs/flag_grid_colors/grid.jpg`:
![flag color grid](https://cloud.githubusercontent.com/assets/945979/26386050/08b8c170-4098-11e7-885f-787fd17e31b3.jpg)

## tile, aspect-ratio, left-right anchors

Optional command line arguments are also available that change the aspect ratio and influence the layout. For example:

```bash
python smartgrid.py \
  --tile 24x12 \
  --input-glob 'datasets/someflags/*.png' \
  --left-image 'datasets/someflags/FR.png' \
  --right-image 'datasets/someflags/NL.png' \
  --output-path outputs/someflags_horizvert_s10
```
![flag color grid](https://cloud.githubusercontent.com/assets/945979/26386049/08b85140-4098-11e7-8f80-d0158fd22b11.jpg)

This set of arguments creates a non-square grid and also suggests that the `FR.png` image (🇫🇷) should be laid out to the left of the `NL.png` image (🇳🇱). The left/right anchors also try to influence the groupings by exaggerating the differences between these anchor images (this stretching can be disabled by setting `--left-right-scale 0.0`).

The tile argument specifies the number of columns and rows. You can also specify `--aspect-ratio` to have the grid image fit a specific format.

## filtering, output format and filename

An experimental `--min-distance` argument has been added that removes duplicates based on how close together they are in feature space. Additionally, the output file name can be overridden, and the file format will be inferred from the name. So to output the flag grid in png format without duplicates:

```bash
python smartgrid.py \
  --aspect-ratio 1.778 \
  --input-glob 'datasets/someflags/*.png' \
  --min-distance 5 \
  --grid-file grid_min_dist_5.png \
  --output-path outputs/someflags_nodupes
```
Output this time is `outputs/someflags_nodupes/grid_min_dist_5.png`:
![filtered png file](https://cloud.githubusercontent.com/assets/945979/26386051/08cae440-4098-11e7-9fde-3ac0ad8ccbde.png)

Note the duplicates were removed (eg: there were 2 US flags before). The threshold has to be fiddled with per dataset, but there is an output folder `rejects_min` which shows the duplicates that were found.
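There is also a complementary `--max-distance` filter, which drops any image that is not within the given distance of at least one other image; `--max-group-size` sets how many similar neighbors are needed before a group is kept, and dropped images land in a `rejects_max` folder. A sketch (the threshold values here are only illustrative and need tuning per dataset):

```bash
python smartgrid.py \
  --input-glob 'datasets/someflags/*.png' \
  --max-distance 20 \
  --max-group-size 2 \
  --output-path outputs/someflags_grouped
```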
## rerunning, grid spacing, and visualizing link strength

Rerunning can be done more quickly by keeping the same output-path and adding the `--do-reload` flag. You probably also want to remove the `--model` and `--layer` arguments and perhaps change the `--grid-file` output.

The `--grid-spacing` option puts space between elements on the grid. For example, `--grid-spacing 1` will add a one pixel border to the grid elements.

Additionally, there is an experimental option to use the grid spacing to visualize the strength between adjacent entries. For this you add `--show-links`.

Putting that together, we can reuse the result from the section above and output to a different filename showing link strength.

```bash
python smartgrid.py \
  --do-reload \
  --aspect-ratio 1.778 \
  --input-glob 'datasets/someflags/*.png' \
  --min-distance 5 \
  --show-links \
  --grid-spacing 24 \
  --grid-file grid_with_links.jpg \
  --output-path outputs/someflags_nodupes
```
![reload with spacing and links](https://cloud.githubusercontent.com/assets/945979/26386054/08cb8e54-4098-11e7-9fec-5fb6a553ac79.jpg)

In the current visualization, the closer the entries are in feature space the *thinner* the line between them (think of the line as a wall that wants to separate them). Also, this runs much faster because the neural net is no longer needed when `--do-reload` is used.

## imagemagick (for the big grids)

Extremely large grids blow up because of a PIL memory limitation. In this case you can fall back to using imagemagick (montage) to construct the grid. So if you have 4700 images to group perceptually by color:

```bash
python smartgrid.py \
  --model color_lab \
  --use-imagemagick \
  --aspect-ratio 1.778 \
  --input-glob 'datasets/new_yorker/*.jpg' \
  --output-path outputs/ny_color_lab

# resize output with imagemagick
convert outputs/ny_color_lab/grid.jpg \
  -resize 5% \
  outputs/ny_color_lab/grid_scaled.jpg
```
Go get a coffee. Then come back to find `outputs/ny_color_lab/grid_scaled.jpg`:
![huge grid shrunken](https://cloud.githubusercontent.com/assets/945979/26386052/08cb66e0-4098-11e7-8222-7ec9afaf50ed.jpg)

# dependencies

Currently requires keras 2.x, scipy, scikit-learn, scikit-image, matplotlib, imageio, umap-learn, Pillow, tensorflow (with tensorflow-hub for the `bit` models), braceexpand, tqdm, and either [python-lap](https://github.com/dribnet/python-lap/tree/rename_lap) (seems to work everywhere but sometimes hangs when processing >600 images) or [lapjv](https://github.com/src-d/lapjv) (runs much faster for me and provides verbose output). The `clip` models additionally need torch, torchvision, and OpenAI's clip package. Also requires imagemagick (montage) when using the `--use-imagemagick` option.
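One possible way to install the Python dependencies (package names here are my best guess; the project does not pin them):

```bash
pip install keras scipy scikit-learn scikit-image matplotlib \
  imageio umap-learn pillow braceexpand tqdm tensorflow tensorflow-hub
pip install lap    # or: pip install lapjv
```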
# credits

Code originally adapted from genekogan's [tSNE-images.py](https://github.com/ml4a/ml4a-ofx/blob/master/scripts/tSNE-images.py) and kylemcdonald's [CloudToGrid](https://github.com/kylemcdonald/CloudToGrid)

# license

WTFPL
--------------------------------------------------------------------------------

/datasets/.gitignore:
--------------------------------------------------------------------------------
*
--------------------------------------------------------------------------------

/outputs/.gitignore:
--------------------------------------------------------------------------------
*
--------------------------------------------------------------------------------

/randcolor.py:
--------------------------------------------------------------------------------
import argparse
import os
import random

import numpy as np
import tensorflow as tf
from PIL import Image


def generateRandColor(width, height, outpath):
    # keep drawing colors until we find one that hasn't been saved yet
    stale_color = True
    while stale_color:
        r = random.randint(0, 255)
        g = random.randint(0, 255)
        b = random.randint(0, 255)
        rgbstr = '0x{:02x}{:02x}{:02x}'.format(r, g, b)
        outfile = os.path.join(outpath, "{}.png".format(rgbstr))
        stale_color = os.path.exists(outfile)

    im_array = np.zeros([height, width, 3]).astype(np.uint8)
    im_array[:,:] = [r, g, b]
    im = Image.fromarray(im_array)
    im.save(outfile)

def main():
    parser = argparse.ArgumentParser(description="Make N random color images, save to outdir")
    parser.add_argument('--num-colors', default=100, type=int,
                        help="how many images to generate")
    parser.add_argument('--width', default=10, type=int,
                        help="image width")
    parser.add_argument('--height', default=10, type=int,
                        help="image height")
    parser.add_argument('--output-path', default="outputs/colors/rand100_01", type=str,
                        help='path to where to put output files')
    parser.add_argument('--random-seed', default=1, type=int,
                        help='Use a specific random seed (for repeatability)')
    args = parser.parse_args()

    if args.random_seed:
        print("Setting random seed: ", args.random_seed)
        random.seed(args.random_seed)
        np.random.seed(args.random_seed)
        tf.set_random_seed(args.random_seed)

    # make output directory if needed
    if not os.path.exists(args.output_path):
        os.makedirs(args.output_path)

    for i in range(args.num_colors):
        generateRandColor(args.width, args.height, args.output_path)

if __name__ == '__main__':
    main()
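
# Example invocation (hypothetical paths), generating swatches that can then
# be laid out with smartgrid.py's color model:
#   python randcolor.py --num-colors 100 --output-path datasets/rand100
#   python smartgrid.py --model color --input-glob 'datasets/rand100/*.png' \
#       --output-path outputs/rand100_grid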
--------------------------------------------------------------------------------

/smartgrid.py:
--------------------------------------------------------------------------------
import argparse
import sys
import numpy as np
import json
import os
from os.path import isfile, join
import keras
from keras.preprocessing import image
from keras.models import Model
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from scipy.spatial import distance
from skimage import color
import imageio
import math
import numbers
import time
from tqdm import tqdm
from PIL import Image
import tensorflow as tf
import tensorflow_hub as hub
import random
import umap

import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt

from braceexpand import braceexpand
import glob

# import with fallback behavior
using_python_lap = True
try:
    # https://github.com/gatagat/lap
    import lap
except ImportError:
    try:
        # https://github.com/src-d/lapjv
        import lapjv
        using_python_lap = False
    except ImportError:
        print("Error: could not find lapjv or python-lap, cannot continue")
        sys.exit(1)

def real_glob(rglob):
    glob_list = braceexpand(rglob)
    files = []
    for g in glob_list:
        files = files + glob.glob(g)
    return sorted(files)

def center_crop(img, target_size):
    width, height = img.size
    smaller = width
    if height < width:
        smaller = height

    # TODO: this is off by one when (width - smaller) is odd:
    # right - left comes out as smaller-1 instead of smaller
    left = np.ceil((width - smaller)/2.)
    top = np.ceil((height - smaller)/2.)
    right = np.floor((width + smaller)/2.)
    bottom = np.floor((height + smaller)/2.)
    img = img.crop((left, top, right, bottom))
    # print("resizing from {} to {}".format([width, height], target_size))
    img = img.resize(target_size)
    return img

def get_image(path, input_shape, do_crop=False, is_bit=False):
    if do_crop:
        # cropping version
        img = image.load_img(path)
        # print(path)
        img = center_crop(img, target_size=input_shape)
    else:
        # scaling version
        img = image.load_img(path, target_size=input_shape)

    # img.save("sized.png")
    # print("DONE")
    x = image.img_to_array(img)
    if not is_bit:
        x = np.expand_dims(x, axis=0)
    return x

def get_average_color_classic(path, colorspace='rgb'):
    c = imageio.imread(path, pilmode='RGB')
    if colorspace == 'lab':
        # print("CONVERTING TO LAB")
        # old_color = c
        c = color.rgb2lab(c)
        # print("Converted from {} to {}".format(old_color[0], c[0]))
        c = c.mean(axis=(0,1))
    else:
        c = c.mean(axis=(0,1))
        c = c / 255.0

    # WTF indeed (this happens for black (rgb))
    if isinstance(c, numbers.Number):
        c = [c, c, c]
    return c

np.seterr(all='raise')
def get_average_color(path, colorspace='rgb', subsampling=None):
    im = imageio.imread(path, pilmode='RGB')
    w, h, c = im.shape
    colors = []
    if subsampling is None:
        subsampling = "1"
    if subsampling.endswith("+"):
        sample_from = int(subsampling[:-1])
        sample_downto = 0
    else:
        sample_from = int(subsampling)
        sample_downto = sample_from-1
    for gridsize in range(sample_from, sample_downto, -1):
        for y in range(gridsize):
            h1 = int(y*h/gridsize)
            h2 = int((y+1)*h/gridsize)
            for x in range(gridsize):
                w1 = int(x*w/gridsize)
                w2 = int((x+1)*w/gridsize)
                quadrant = im[w1:w2, h1:h2, :]

                if colorspace == 'lab':
                    try:
                        c = color.rgb2lab(quadrant)
                        c = c.mean(axis=(0,1))
                    except RuntimeWarning:
                        print("problem with ", path)
                        print(quadrant.shape)
                        c = np.array([0.0, 0.0, 0.0])
                else:
                    c = quadrant.mean(axis=(0,1))
                    c = c / 255.0

                # WTF indeed (this happens for black (rgb))
                if isinstance(c, numbers.Number):
                    c = [c, c, c]

                colors.append(c)

    return np.array(colors).flatten()

def read_file_list(filelist):
    lines = []
    with open(filelist) as file:
        for line in file:
            line = line.strip()  # or some other preprocessing
            line = line.strip('"')  # remove quotes
            lines.append(line)
    return lines

def read_json_vectors(filename):
    """Return np array of vectors from json sources"""
    vectors = []
    with open(filename) as json_file:
        json_data = json.load(json_file)
        for v in json_data:
            vectors.append(v)
    print("Read {} vectors from {}".format(len(vectors), filename))
    np_array = np.array(vectors)
    return np_array

def get_image_list(input_glob):
    if input_glob.startswith('@'):
        images = read_file_list(input_glob[1:])
    else:
        images = real_glob(input_glob)
    num_images = len(images)
    print("Found {} images".format(num_images))
    return images
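# A note on the aspect-ratio fit below: with tiles of aspect
# t = im_width/im_height, a grid of `width` columns and `height` rows has
# overall aspect (width * im_width) / (height * im_height). Setting that
# equal to the requested aspect_ratio with width * height = num_images
# gives height = sqrt(num_images * t / aspect_ratio), which is the
# raw_height computed below before rounding up to fit all images.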
def set_grid_size(images, width, height, aspect_ratio, drop_to_fit):
    num_images = len(images)
    if width is None and aspect_ratio is None:
        # just have width == height
        max_side_extent = math.sqrt(num_images)
        if max_side_extent.is_integer() or drop_to_fit:
            width = int(max_side_extent)
            height = width
        else:
            width = int(max_side_extent) + 1
            print("Checking: ", width*(width-1), num_images)
            if width*(width-1) >= num_images:
                height = width-1
            else:
                height = width
    elif width is None:
        # sniff the aspect ratio of the first file
        with Image.open(images[0]) as img:
            im_width = img.size[0]
            im_height = img.size[1]
        tile_aspect_ratio = im_width / im_height
        raw_height = math.sqrt((num_images * tile_aspect_ratio) / aspect_ratio)
        raw_width = num_images / raw_height
        int_height = int(raw_height)
        int_width = int(raw_width)
        if (raw_height.is_integer() and raw_width.is_integer()) or drop_to_fit:
            height = int_height
            width = int_width
            if not drop_to_fit:
                print("--> {} images fits exactly as {}x{}".format(num_images, width, height))
        else:
            if not raw_height.is_integer():
                int_height = int_height + 1
            if not raw_width.is_integer():
                int_width = int_width + 1
            if int_width*(int_height-1) >= num_images:
                width = int_width
                height = int_height-1
            else:
                width = int_width
                height = int_height
            print("--> {} images best fits as {}x{}".format(num_images, width, height))
        print("tile size is {}x{} so aspect of {:.3f} is {}x{} (final: {}x{})".format(
            im_width, im_height, aspect_ratio, width, height, width*im_width, height*im_height))

    num_grid_spaces = width * height
    if drop_to_fit:
        grid_images = images[:num_grid_spaces]
        num_images = len(grid_images)
    else:
        grid_images = images

    if num_grid_spaces < num_images:
        print("Error: {} images is too many for {}x{} grid.".format(num_images, width, height))
        sys.exit(0)
    elif num_images == 0:
        print("Error: no images to process")
        sys.exit(0)
    elif num_grid_spaces == 0:
        print("Error: no spaces for images")
        sys.exit(0)

    print("Using {} images to build {}x{} montage".format(num_images, width, height))
    return grid_images, width, height

def normalize_columns(rawpoints, low=0, high=1):
    mins = np.min(rawpoints, axis=0)
    maxs = np.max(rawpoints, axis=0)
    rng = maxs - mins
    scaled_points = high - (((high - low) * (maxs - rawpoints)) / rng)
    return scaled_points

def analyze_images_colors(images, colorspace='rgb', subsampling=None):
    # analyze images and grab average colors
    colors = []
    for image_path in images:
        try:
            if subsampling is None:
                c = get_average_color_classic(image_path, colorspace)
            else:
                c = get_average_color(image_path, colorspace, subsampling)
        except Exception as e:
            print("Problem reading {}: {}".format(image_path, e))
            c = [0, 0, 0]
        # print(image_path, c)
        colors.append(c)
    # colors = normalize_columns(colors)
    return colors

def bit_preprocess_image(image):
    image = np.array(image)
    # reshape into shape [batch_size, height, width, num_channels]
    img_reshaped = tf.reshape(image, [1, image.shape[0], image.shape[1], image.shape[2]])
    # Use `convert_image_dtype` to convert to floats in the [0,1] range.
    image = tf.image.convert_image_dtype(img_reshaped, tf.float32)
    return image
def analyze_images(images, model_name, layer_name=None, pooling=None, do_crop=False, subsampling=None, do_pca=False):
    if model_name == 'color_lab':
        return analyze_images_colors(images, colorspace='lab', subsampling=subsampling)
    elif model_name == 'color' or model_name == 'color_rgb':
        return analyze_images_colors(images, colorspace='rgb', subsampling=subsampling)

    num_images = len(images)
    include_top = (layer_name is not None)

    model_lookup_table = {
        'densenet121': {
            'model_class': keras.applications.densenet.DenseNet121,
            'input_shape': (224, 224),
            'preprocess_input': keras.applications.densenet.preprocess_input
        },
        'densenet169': {
            'model_class': keras.applications.densenet.DenseNet169,
            'input_shape': (224, 224),
            'preprocess_input': keras.applications.densenet.preprocess_input
        },
        'densenet201': {
            'model_class': keras.applications.densenet.DenseNet201,
            'input_shape': (224, 224),
            'preprocess_input': keras.applications.densenet.preprocess_input
        },
        'inceptionresnetv2': {
            'model_class': keras.applications.inception_resnet_v2.InceptionResNetV2,
            'input_shape': (299, 299),
            'preprocess_input': keras.applications.inception_resnet_v2.preprocess_input
        },
        'inceptionv3': {
            'model_class': keras.applications.inception_v3.InceptionV3,
            'input_shape': (299, 299),
            'preprocess_input': keras.applications.inception_v3.preprocess_input
        },
        'resnet50': {
            'model_class': keras.applications.resnet.ResNet50,
            'input_shape': (224, 224),
            'preprocess_input': keras.applications.resnet.preprocess_input
        },
        'resnet101': {
            'model_class': keras.applications.resnet.ResNet101,
            'input_shape': (224, 224),
            'preprocess_input': keras.applications.resnet.preprocess_input
        },
        'resnet152': {
            'model_class': keras.applications.resnet.ResNet152,
            'input_shape': (224, 224),
            'preprocess_input': keras.applications.resnet.preprocess_input
        },
        'vgg16': {
            'model_class': keras.applications.vgg16.VGG16,
            'input_shape': (224, 224),
            'preprocess_input': keras.applications.vgg16.preprocess_input
        },
        'vgg19': {
            'model_class': keras.applications.vgg19.VGG19,
            'input_shape': (224, 224),
            'preprocess_input': keras.applications.vgg19.preprocess_input
        },
        'xception': {
            'model_class': keras.applications.xception.Xception,
            'input_shape': (299, 299),
            'preprocess_input': keras.applications.xception.preprocess_input
        },
    }

    is_bit = False
    is_clip = False
    if model_name.startswith("bit"):
        model_url = f"https://tfhub.dev/google/{model_name}/1"
        input_shape = None
        preprocess_input = bit_preprocess_image
        model = hub.KerasLayer(model_url)
        is_bit = True
    elif model_name.startswith("clip"):
        import clip
        import torch
        from torchvision.transforms import Compose, Resize, CenterCrop, ToTensor, Normalize

        model_type = "ViT-B/32"
        if len(model_name) > 4:
            parts = model_name.split(":")
            model_type = parts[1]
        model, preprocess = clip.load(model_type)
        print(preprocess)
        input_size = model.input_resolution.item()
        input_shape = (input_size, input_size)
        preprocess_input = None
        is_clip = True
    elif model_name in model_lookup_table:
        model_class = model_lookup_table[model_name]['model_class']
        input_shape = model_lookup_table[model_name]['input_shape']
        preprocess_input = model_lookup_table[model_name]['preprocess_input']
        model = model_class(weights='imagenet', include_top=include_top)
    else:
        print("Error: model {} not found".format(model_name))
        sys.exit(1)

    if layer_name is None:
        feat_extractor = model
    elif layer_name == "show" or layer_name == "list":
        for i, layer in enumerate(model.layers):
            print("{} layer {:03d}: {}".format(model_name, i, layer.name))
        sys.exit(0)
    else:
        feat_extractor = Model(inputs=model.input, outputs=model.get_layer(layer_name).output)

    # analyze images and grab activations
    activations = []
    for idx in tqdm(range(len(images))):
        file_path = images[idx]
        img = get_image(file_path, input_shape, do_crop, is_bit)
        if img is not None:
            # preprocess
            if preprocess_input is not None:
                img = preprocess_input(img)
            # print("getting activations for %s %d/%d" % (file_path,idx,num_images))
            if is_bit:
                acts = model(img)[0].numpy()
            elif is_clip:
                batch_item = img[0]/255.0
                transform2 = Compose([
                    ToTensor(),
                    Normalize((0.48145466, 0.4578275, 0.40821073), (0.26862954, 0.26130258, 0.27577711)),
                ])
                zimages = []
                im = transform2(batch_item)
                zimages.append(im)
                im_batch = torch.stack(zimages)
                acts = model.encode_image(im_batch)[0].detach().cpu().numpy()
            else:
                acts = feat_extractor.predict(img)[0]
            if len(activations) == 0:
                print("Collecting vectors of size {}".format(acts.flatten().shape))
            activations.append(acts.flatten())
    # optionally run PCA first
    features = np.array(activations)
    if do_pca:
        print("Running PCA on features: {}".format(features.shape))
        pca = PCA(n_components=300)
        pca.fit(features)
        pca_features = pca.transform(features)
        return np.asarray(pca_features)
    else:
        return features
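# fit_to_unit_square normalizes the point cloud into [0,1] per axis, then
# compresses the axis corresponding to the grid's shorter side, so that a
# 24x12 grid gets points in a 1.0 x 0.5 box matching the grid's aspect.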
def fit_to_unit_square(points, width, height):
    x_scale = 1.0
    y_scale = 1.0
    if (width > height):
        y_scale = height / width
    elif (width < height):
        x_scale = width / height
    points -= points.min(axis=0)
    points /= points.max(axis=0)
    points = points * [x_scale, y_scale]
    return points

def index_from_substring(images, substr):
    index = None
    for i in range(len(images)):
        # print("{} and {} and {}".format(images[i], substr, images[i].find(substr)))
        if images[i].find(substr) != -1:
            if index is None:
                index = i
            else:
                raise ValueError("The substring {} is ambiguous: {} and {}".format(
                    substr, images[index], images[i]))
    if index is None:
        raise ValueError("The substring {} was not found in {} images".format(substr, len(images)))
    else:
        print("Resolved {} to image {}".format(substr, images[index]))
    return index

def write_list(list, output_path, output_file, quote=False):
    filelist = os.path.join(output_path, output_file)
    with open(filelist, "w") as text_file:
        for item in list:
            if isinstance(item, np.ndarray):
                text_file.write("{}\n".format(",".join(map(str, item))))
            elif quote:
                text_file.write("\"{}\"\n".format(item))
            else:
                text_file.write("{}\n".format(item))
    return filelist

def read_list(output_path, output_file, numeric=False):
    filelist = os.path.join(output_path, output_file)
    lines = []
    with open(filelist) as file:
        for line in file:
            line = line.strip()  # or some other preprocessing
            if numeric:
                lines.append(list(map(float, line.split(","))))
            else:
                lines.append(line)
    if numeric:
        return np.array(lines)
    else:
        return lines
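# In make_grid_image below, `links` (when given) is a rows x cols array of
# [dist_right, dist_down] pairs: the normalized feature-space distance from
# each cell to its right and lower neighbor, with -1 meaning "no link".
# Each link is drawn as a black band sized max_link_size * (1.0 - dist),
# so it fills more of the grid spacing the more similar the neighbors are.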
def make_grid_image(filelist, cols=None, rows=None, spacing=0, links=None, do_hexgrid=False):
    """Convert an image grid to a single image"""
    N = len(filelist)

    with Image.open(filelist[0]) as img:
        width = img.size[0]
        height = img.size[1]
    if width > height:
        max_link_size = int(1.0 * height)
    else:
        max_link_size = int(1.0 * width)

    if rows == None:
        sq_num = math.sqrt(N)
        sq_dim = int(sq_num)
        if sq_num != sq_dim:
            sq_dim = sq_dim + 1
        rows = sq_dim
        cols = sq_dim

    total_height = rows * height
    total_width = cols * width

    total_height = total_height + spacing * (rows - 1)
    total_width = total_width + spacing * (cols - 1)
    # shift every other row this much in x
    hex_space = int(width / 2 + spacing)
    if do_hexgrid:
        total_width = total_width + hex_space

    im_array = np.zeros([total_height, total_width, 3]).astype(np.uint8)
    im_array.fill(255)

    if links is not None:
        print("Rows: {}".format(len(links)))
        for r in range(len(links)):
            row = links[r]
            for c in range(len(row)):
                cell = row[c]
                offset_y, offset_x = r*height+spacing*r, c*width+spacing*c
                cy = int(offset_y + height / 2)
                cx = int(offset_x + width / 2)
                if cell[0] >= 0:
                    link_right_height = max_link_size * (1.0 - cell[0])
                    oy = int(link_right_height / 2)
                    ldw = int(link_right_height)
                    im_array[(cy-oy):(cy-oy+ldw), cx:(cx+width), :] = 0
                if cell[1] >= 0:
                    link_down_width = max_link_size * (1.0 - cell[1])
                    ox = int(link_down_width / 2)
                    lrw = int(link_down_width)
                    im_array[cy:(cy+height), (cx-ox):(cx-ox+lrw), :] = 0

    for i in range(rows*cols):
        if i < N:
            r = i // cols
            c = i % cols

            with Image.open(filelist[i]) as img:
                rgb_im = img.convert('RGB')
                offset_y, offset_x = r*height+spacing*r, c*width+spacing*c
                if do_hexgrid and (r%2 == 0):
                    offset_x += hex_space
                im_array[offset_y:(offset_y+height), offset_x:(offset_x+width), :] = rgb_im

    return Image.fromarray(im_array)

def filter_distance_min(images, X, min_distance, reject_dir=None):
    num_images = len(images)
    keepers = [True] * num_images
    cur_pos = 0
    assignments = []
    min_distance2 = min_distance * min_distance
    for i in range(num_images):
        if not keepers[i]:
            continue
        rejects = []
        assignments.append(i)
        cur_v = X[i]
        for j in range(i+1, num_images):
            if keepers[j]:
                # if np.linalg.norm(cur_v - X[j]) < min_distance:
                diff = cur_v - X[j]
                if np.dot(diff, diff) < min_distance2:
                    rejects.append(j)
                    keepers[j] = False
        if len(rejects) > 0:
            print("rejecting {} images similar to entry {}".format(len(rejects), i))
            if reject_dir:
                reject_grid = [images[i]]
                for ix in rejects:
                    reject_grid.append(images[ix])
                img = make_grid_image(reject_grid)
                reject_file_path = os.path.join(reject_dir,
                    "dist_{:04f}_{:03d}.jpg".format(min_distance, i))
                img.save(reject_file_path)

    print("Keeping {} of {} images".format(len(assignments), num_images))
    im_array = np.array(images)
    X_array = np.array(X)
    return im_array[assignments].tolist(), X_array[assignments]

def filter_distance_max(images, X, max_distance, reject_dir=None, max_group_size=1):
    num_images = len(images)
    keepers = [False] * num_images
    cur_pos = 0
    assignments = []
    max_distance2 = max_distance * max_distance
    for i in range(num_images):
        if keepers[i]:
            assignments.append(i)
            continue
        accepts = []
        cur_v = X[i]
        for j in range(i+1, num_images):
            if not keepers[j]:
                # if np.linalg.norm(cur_v - X[j]) < max_distance:
                diff = cur_v - X[j]
                if np.dot(diff, diff) < max_distance2:
                    keepers[i] = True
                    keepers[j] = True
                    accepts.append(j)

        if len(accepts) >= max_group_size:
            print("accepting {} images similar to entry {}".format(len(accepts), i))
            assignments.append(i)
            if reject_dir:
                reject_grid = [images[i]]
                for ix in accepts:
                    reject_grid.append(images[ix])
                img = make_grid_image(reject_grid)
                reject_file_path = os.path.join(reject_dir,
                    "dist_{:04f}_{:03d}.jpg".format(max_distance, i))
                img.save(reject_file_path)

    print("Keeping {} of {} images".format(len(assignments), num_images))
    im_array = np.array(images)
    X_array = np.array(X)
    return im_array[assignments].tolist(), X_array[assignments]

def reduce_grid_targets(grid, num_grid_images, do_reduce_hack):
    if do_reduce_hack:
        num_grid_images = len(grid) - 1
    print("reducing grid from {} to {}".format(len(grid), num_grid_images))
    mean_point = np.mean(grid, axis=0)
    newList = grid - mean_point
    sort = np.sum(np.power(newList, 2), axis=1)
    indexed_order = sort.argsort()
    sorted_list = grid[indexed_order]
    return sorted_list[:num_grid_images], indexed_order

def run_prune(filelist, vectorlist):
    new_filelist = []
    new_vectorlist = []
    for i in range(len(vectorlist)):
        # if vectorlist[i] is not None:
        if vectorlist[i] is not None and os.path.exists(filelist[i]):
            new_filelist.append(filelist[i])
            new_vectorlist.append(vectorlist[i])
    print("Pruned filelist from {} to {} entries".format(len(filelist), len(new_filelist)))
    return new_filelist, np.array(new_vectorlist)

# def run_filecheck(filelist, vectorlist):
#     new_filelist = []
#     new_vectorlist = []
#     for i in range(len(vectorlist)):
#         if os.path.exists(filelist[i]):
#             new_filelist.append(filelist[i])
#             new_vectorlist.append(vectorlist[i])
#     print("Pruned filelist from {} to {} entries".format(len(filelist), len(new_filelist)))
#     return new_filelist, np.array(new_vectorlist)

# in the future the clip_range could be smarter,
# like 1-4,100-200 etc.
# for now, just doing head
def run_clip(filelist, vectorlist, clip_range):
    clip_number = int(clip_range)
    new_filelist = filelist[:clip_number]
    new_vectorlist = vectorlist[:clip_number]
    return new_filelist, np.array(new_vectorlist)

from sklearn.cluster import KMeans
from sklearn.metrics import pairwise_distances_argmin_min

def run_kmeans(images, X):
    km = KMeans(n_clusters=100).fit(X)
    closest, _ = pairwise_distances_argmin_min(km.cluster_centers_, X)
    np_filelist = np.array(images)
    return np_filelist[closest].tolist(), X[closest]
def run_grid(input_glob, left_image, right_image, left_right_scale,
             output_path, tsne_dimensions, tsne_perplexity,
             tsne_learning_rate, width, height, aspect_ratio, drop_to_fit, fill_shade,
             vectors_file, do_prune, clip_range, subsampling,
             model, layer, pooling, do_crop, grid_file, use_imagemagick,
             grid_spacing, show_links, links_max_threshold,
             min_distance, max_distance, max_group_size, do_reload=False,
             do_tsne=False, do_reduce_hack=False, do_pca=False, do_hexgrid=False):

    # make output directory if needed
    if output_path != '' and not os.path.exists(output_path):
        os.makedirs(output_path)

    if do_reload:
        images = read_list(output_path, "image_files.txt", numeric=False)
        X = read_list(output_path, "image_vectors.txt", numeric=True)
        print("Reloaded {} images and {} vectors".format(len(images), X.shape))
        num_images = len(images)
        avg_colors = analyze_images_colors(images, 'rgb')
    else:
        ## compute width, height from image list and provided defaults
        if input_glob is not None:
            images = get_image_list(input_glob)
            num_images = len(images)

        if vectors_file is not None:
            X = read_json_vectors(vectors_file)
        else:
            X = analyze_images(images, model, layer, pooling, do_crop, subsampling)

        if do_prune:
            images, X = run_prune(images, X)

        if clip_range:
            images, X = run_clip(images, X, clip_range)

        # images, X = run_kmeans(images, X)

        # save data
        write_list(images, output_path, "image_files.txt")
        write_list(X, output_path, "image_vectors.txt")

    ## Lookup left/right images
    left_image_index = None
    right_image_index = None
    # scale X by left/right axis
    if left_image is not None and right_image is not None:
        left_image_index = index_from_substring(images, left_image)
        right_image_index = index_from_substring(images, right_image)
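    # The left/right stretch below scales each feature vector's length by
    # 1 + left_right_scale * (1 + cos(theta)), where theta is the angle
    # between the vector and the left->right axis: vectors pointing toward
    # the "right" anchor grow by up to (1 + 2*left_right_scale), while
    # vectors pointing toward the "left" anchor stay nearly unchanged.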
    if left_image_index is not None:
        # todo: confirm this is how to stretch by a vector
        lr_vector = X[right_image_index] - X[left_image_index]
        lr_vector = lr_vector / np.linalg.norm(lr_vector)
        X_new = np.zeros_like(X)
        for i in range(len(X)):
            len_x = np.linalg.norm(X[i])
            norm_x = X[i] / len_x
            scale_factor = 1.0 + left_right_scale * (1.0 + np.dot(norm_x, lr_vector))
            new_length = len_x * scale_factor
            # print("Vector {}: length went from {} to {}".format(i, len_x, new_length))
            X_new[i] = new_length * norm_x
        X = X_new

    # TODO: filtering here
    if min_distance is not None and min_distance > 0:
        reject_dir = os.path.join(output_path, "rejects_min")
        if reject_dir != '' and not os.path.exists(reject_dir):
            os.makedirs(reject_dir)
        images, X = filter_distance_min(images, X, min_distance, reject_dir)

    if max_distance is not None and max_distance > 0:
        reject_dir = os.path.join(output_path, "rejects_max")
        if reject_dir != '' and not os.path.exists(reject_dir):
            os.makedirs(reject_dir)
        images, X = filter_distance_max(images, X, max_distance, reject_dir, max_group_size)

    grid_images, width, height = set_grid_size(images, width, height, aspect_ratio, drop_to_fit)
    num_grid_images = len(grid_images)
    print("Compare: {} and {}".format(num_grid_images, width*height))

    # this line is a hack for now
    X = np.asarray(X[:num_grid_images])

    print("SO X {}".format(X.shape))
    if do_tsne:
        print("Running tsne on {} images...".format(num_grid_images))
        tsne = TSNE(n_components=tsne_dimensions, learning_rate=tsne_learning_rate, perplexity=tsne_perplexity, verbose=2).fit_transform(X)
    else:
        print("Running umap on {} images...".format(num_grid_images))
        tsne = umap.UMAP(metric='cosine', min_dist=0.9).fit_transform(X)
    print("EMBEDDING SHAPE {}".format(tsne.shape))

    avg_colors = analyze_images_colors(images, 'rgb')

    data = []
    for i, f in enumerate(grid_images):
        point = [ ((tsne[i,k] - np.min(tsne[:,k]))/(np.max(tsne[:,k]) - np.min(tsne[:,k]))).tolist() for k in range(tsne_dimensions) ]
        data.append({"path": grid_images[i], "point": point})
    with open(os.path.join(output_path, "points.json"), 'w') as outfile:
        json.dump(data, outfile)

    if left_image_index is not None:
        data2d = fit_to_unit_square(tsne, 1, 1)
    else:
        data2d = fit_to_unit_square(tsne, width, height)
    plt.figure(figsize=(12, 12))
    plt.xlim(-0.1, 1.1)
    plt.ylim(-0.1, 1.1)
    plt.gca().invert_yaxis()
    grays = np.linspace(0, 0.8, len(data2d))
    plt.scatter(data2d[:,0], data2d[:,1], c=avg_colors, edgecolors='none', marker='o', s=24)
    if left_image_index is not None:
        plt.scatter(data2d[left_image_index:left_image_index+1,0],
                    data2d[left_image_index:left_image_index+1,1],
                    facecolors='none', edgecolors='r', marker='o', s=24*3)
        plt.scatter(data2d[right_image_index:right_image_index+1,0],
                    data2d[right_image_index:right_image_index+1,1],
                    facecolors='none', edgecolors='g', marker='o', s=24*3)
    plt.savefig(os.path.join(output_path, "embedding.png"), bbox_inches='tight')

    # this is an experimental section where left/right image can be given
    if left_image_index is not None:
        origin = data2d[left_image_index]
        data2d = data2d - origin
        dest = data2d[right_image_index]
        x_axis = np.array([1, 0])
        theta = np.arctan2(dest[1], dest[0])
        print("Spin angle is {}".format(np.rad2deg(theta)))
        # theta = np.deg2rad(90)
        # print("Spin angle is {}".format(np.rad2deg(theta)))
        # http://scipython.com/book/chapter-6-numpy/examples/creating-a-rotation-matrix-in-numpy/
        a_c, a_s = np.cos(theta), np.sin(theta)
        R = np.matrix([[a_c, -a_s], [a_s, a_c]])
        data2d = np.array(data2d * R)
        # print("IS: ", data2d.shape)
        data2d = fit_to_unit_square(data2d, width, height)

        # TODO: this is a nasty cut-n-paste of above with different filename
        plt.figure(figsize=(8, 8))
        plt.xlim(-0.1, 1.1)
        plt.ylim(-0.1, 1.1)
        plt.gca().invert_yaxis()

        plt.scatter(data2d[:,0], data2d[:,1], c=avg_colors, edgecolors='none', marker='o', s=24)
        if left_image_index is not None:
            plt.scatter(data2d[left_image_index:left_image_index+1,0],
                        data2d[left_image_index:left_image_index+1,1],
                        facecolors='none', edgecolors='r', marker='o', s=48)
            plt.scatter(data2d[right_image_index:right_image_index+1,0],
                        data2d[right_image_index:right_image_index+1,1],
                        facecolors='none', edgecolors='g', marker='o', s=48)
        plt.savefig(os.path.join(output_path, "embedding_spun.png"), bbox_inches='tight')

    write_list(data2d, output_path, "embedding_coords.txt")
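    # Grid assignment: each embedded 2-D point is matched to a unique grid
    # cell by solving a linear assignment problem (Jonker-Volgenant, via
    # lap.lapjv or lapjv.lapjv) over the euclidean cost matrix between the
    # grid cell centers and the embedded points.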
    # TSNE is done, setup layout for grid assignment
    max_width, max_height = 1, 1
    if (width > height):
        max_height = height / width
    elif (width < height):
        max_width = width / height
    xv, yv = np.meshgrid(np.linspace(0, max_width, width), np.linspace(0, max_height, height))
    if do_hexgrid:
        half_space = max_width / (2 * width)
        # print("RUNNING THE HEXGRID ", half_space, xv)
        xv[::2, :] += half_space
        # print("RAN ", xv)
    grid = np.dstack((xv, yv)).reshape(-1, 2)
    # this strange step removes corners
    grid, indexed_lookup = reduce_grid_targets(grid, num_grid_images, do_reduce_hack)

    # print("G", grid.shape, grid[0])
    # print("D2D", data2d.shape)

    cost = distance.cdist(grid, data2d, 'euclidean')
    # cost = distance.cdist(grid, data2d, 'sqeuclidean')
    cost = cost * (100000. / cost.max())
    # print("C", cost.shape, cost[0][0])

    if using_python_lap:
        print("Starting assignment (this can take a few minutes)")
        min_cost2, row_assigns2, col_assigns2 = lap.lapjv(cost, extend_cost=do_reduce_hack)
        print("Assignment complete")
    else:
        # note slightly different API
        row_assigns2, col_assigns2, min_cost2 = lapjv.lapjv(cost, verbose=True, force_doubles=False)
    grid_jv2 = grid[col_assigns2]
    # print(col_assigns2.shape)
    plt.figure(figsize=(20, 20))
    plt.xlim(-0.1, 1.1)
    plt.ylim(-0.1, 1.1)
    plt.gca().invert_yaxis()
    for start, end, c in zip(data2d, grid_jv2, avg_colors):
        plt.arrow(start[0], start[1], end[0] - start[0], end[1] - start[1],
                  color=c, head_length=0.01, head_width=0.01)
    if left_image_index is not None:
        plt.scatter(data2d[left_image_index:left_image_index+1,0],
                    data2d[left_image_index:left_image_index+1,1],
                    facecolors='none', edgecolors='r', marker='o', s=48)
        plt.scatter(data2d[right_image_index:right_image_index+1,0],
                    data2d[right_image_index:right_image_index+1,1],
                    facecolors='none', edgecolors='g', marker='o', s=48)
    plt.savefig(os.path.join(output_path, 'movement.png'), bbox_inches='tight')

    num_grid_spaces = len(indexed_lookup)
    num_actual_images = len(row_assigns2)
    num_missing = num_grid_spaces - num_actual_images
    using_placeholder = False

    if num_missing > 0:
        # make a note that placeholder is in use
        using_placeholder = True

        # add a blank entry to the vectors
        _, v_len = X.shape
        X2 = np.append(X, [np.zeros(v_len)], axis=0)
        print("Updating vectors from {} to {}".format(X.shape, X2.shape))
        X = X2

        # add blank entry to images
        # sniff the aspect ratio of the first file
        with Image.open(grid_images[0]) as img:
            im_width = img.size[0]
            im_height = img.size[1]

        im_array = np.full([im_height, im_width, 3], [fill_shade, fill_shade, fill_shade]).astype(np.uint8)
        # im_array = np.zeros([im_width, im_height, 3]).astype(np.uint8)
        blank_img = Image.fromarray(im_array)
        blank_image_path = os.path.join(output_path, "blank.png")
        blank_img.save(blank_image_path)
        blank_index = len(grid_images)
        grid_images.append(blank_image_path)

        # now grow row assignments, giving all remaining to new blanks
        residuals = np.full([num_missing], blank_index)
        row_assigns2 = np.append(row_assigns2, residuals)
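    # reverse_lookup is the inverse permutation of indexed_lookup: it maps
    # each original grid position back to its rank in the center-out
    # ordering that reduce_grid_targets produced.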
    reverse_lookup = np.zeros(num_grid_spaces, dtype=int)
    reverse_lookup[indexed_lookup] = np.arange(num_grid_spaces)

    image_indexes = row_assigns2[reverse_lookup]
    img_grid_vectors = X[image_indexes]
    g_len, g_dim = img_grid_vectors.shape
    img_grid_shaped = img_grid_vectors.reshape(height, width, g_dim)
    with open(os.path.join(output_path, "grid_vectors.json"), 'w') as outfile:
        json.dump(img_grid_shaped.tolist(), outfile)

    n_images = np.asarray(grid_images)
    image_grid = n_images[image_indexes]
    montage_filelist = write_list(image_grid, output_path,
                                  "montage_{}x{}.txt".format(width, height), quote=True)
    grid_file_path = os.path.join(output_path, grid_file)
    grid_im_file_path = os.path.join(output_path, "{}".format(grid_file))
    left_right_path = os.path.join(output_path, "left_right.jpg")
    if use_imagemagick:
        command = "montage @{} -geometry +0+0 -tile {}x{} {}".format(
            montage_filelist, width, height, grid_im_file_path)
        # print("running imagemagick montage: {}".format(command))
        os.system(command)

        # if left_image_index is not None:
        #     command = "montage '{}' '{}' -geometry +0+0 -tile 2x1 {}".format(
        #         images[left_image_index], images[right_image_index], left_right_path)
        #     os.system(command)

    else:
        # image vectors are in X
        links = None
        if show_links:
            links = []
            img_grid_vectors = X[image_indexes]
            for r in range(height):
                row = []
                links.append(row)
                for c in range(width):
                    idx = r * width + c
                    cur_v = img_grid_vectors[idx]
                    if c < width - 1:
                        left_v = img_grid_vectors[idx+1]
                        if using_placeholder and (not cur_v.any() or not left_v.any()):
                            dist_left = -1
                        else:
                            dist_left = np.linalg.norm(cur_v - left_v)
                    else:
                        dist_left = -1
                    if r < height - 1:
                        down_v = img_grid_vectors[idx+width]
                        if using_placeholder and (not cur_v.any() or not down_v.any()):
                            dist_down = -1
                        else:
                            dist_down = np.linalg.norm(cur_v - down_v)
                    else:
                        dist_down = -1
                    cell = [dist_left, dist_down]
                    row.append(cell)
            links = np.array(links)
            # normalize to 0-1
            if links_max_threshold is not None:
                num_removed = (links > links_max_threshold).sum()
                links[links > links_max_threshold] = -1
                num_left = (links > 0).sum()
                print("removed {} links, {} left".format(num_removed, num_left))
            links_max = np.amax(links)
            valid_vals = np.where(links > 0)
            links_min = np.amin(links[valid_vals])
            print("Normalizing to {}/{}".format(links_min, links_max))
            links = ((links - links_min) / (links_max - links_min))
            print("Links is {}".format(links.shape))
        img = make_grid_image(image_grid, width, height, grid_spacing, links, do_hexgrid)
        img.save(grid_file_path)
        if left_image_index is not None:
            img = make_grid_image([grid_images[left_image_index], grid_images[right_image_index]], 2, 1, 1)
            img.save(left_right_path)

def main():
    parser = argparse.ArgumentParser(description="Deep learning grid layout")
    parser.add_argument('--input-glob', default=None,
                        help="use file glob source of images")
    parser.add_argument('--left-image', default=None,
                        help="use file as example of left")
    parser.add_argument('--right-image', default=None,
                        help="use file as example of right")
    parser.add_argument('--vectors', default=None,
                        help="read vectors directly instead of running model")
    parser.add_argument('--do-prune', default=False, action='store_true',
                        help="Prune filelist, filtering out entries with missing vectors")
    parser.add_argument('--clip-range', default=None,
                        help="only show range of images given (eg: 100)")
    parser.add_argument('--model', default=None,
                        help="model to use, one of: vgg16 vgg19 resnet50 inceptionv3 xception")
    parser.add_argument('--layer', default=None,
                        help="optional override to set custom model layer")
    parser.add_argument('--pooling', default=None,
                        help="optional override to control inceptionv3 pooling (avg or max)")
    parser.add_argument('--subsampling', default=None,
                        help="subsampling specifier for tiles (for some models). eg: 2+")
    parser.add_argument('--left-right-scale', default=4.0, type=float,
                        help="scaling factor for left-right axis")
    parser.add_argument('--output-path',
                        help='path to where to put output files')
    parser.add_argument('--grid-file', default="grid.jpg",
                        help='name (and format) of grid output file')
    parser.add_argument('--num-dimensions', default=2, type=int,
                        help='dimensionality of t-SNE points')
    parser.add_argument('--perplexity', default=30, type=int,
                        help='perplexity of t-SNE')
    parser.add_argument('--learning-rate', default=150, type=int,
                        help='learning rate of t-SNE')
    parser.add_argument('--do-crop', default=False, action='store_true',
                        help="Center crop instead of scale")
    parser.add_argument('--drop-to-fit', default=False, action='store_true',
                        help="Drop extra images to fit to aspect ratio")
    parser.add_argument('--fill-shade', default=0, type=int,
                        help='shade of gray for filling in blanks')
    parser.add_argument('--use-imagemagick', default=False, action='store_true',
                        help="generate grid using imagemagick (montage)")
    parser.add_argument('--tile', default=None,
                        help="Grid size WxH (eg: 12x12)")
    parser.add_argument('--grid-spacing', default=0, type=int,
                        help='whitespace between images in grid')
    parser.add_argument('--show-links', default=False, action='store_true',
                        help="visualize link strength in whitespace")
    parser.add_argument('--links-max-threshold', default=None, type=float,
                        help="drop links past this threshold")
    parser.add_argument('--aspect-ratio', default=None, type=float,
                        help="Instead of square, fit image to given aspect ratio")
    parser.add_argument('--min-distance', default=None, type=float,
                        help="Removes duplicates that are closer than this distance")
    parser.add_argument('--max-distance', default=None, type=float,
                        help="Removes items if they are beyond max from all others")
    parser.add_argument('--max-group-size', default=1, type=int,
                        help='with max-distance, minimum number of similar neighbors needed to keep an image')
    parser.add_argument('--do-reload', default=False, action='store_true',
                        help="Reload file list and vectors from saved state")
    parser.add_argument('--do-tsne', default=False, action='store_true',
                        help="Run tsne instead of umap")
    parser.add_argument('--do-reduce-hack', default=False, action='store_true',
                        help="allow holes (and remove one entry)")
    parser.add_argument('--do-pca', default=False, action='store_true',
                        help="run PCA on features before dimensionality reduction")
    parser.add_argument('--do-hexgrid', default=False, action='store_true',
                        help="shift even rows by half a cell size to make the grid a hex grid")
    parser.add_argument('--random-seed', default=None, type=int,
                        help='Use a specific random seed (for repeatability)')
    args = parser.parse_args()

    if args.random_seed:
        print("Setting random seed: ", args.random_seed)
        random.seed(args.random_seed)
        np.random.seed(args.random_seed)
        # tf.set_random_seed(args.random_seed)

    width, height = None, None
    if args.tile is not None:
        width, height = map(int, args.tile.split("x"))

    if args.model is None and args.layer is None:
        model = "bit/m-r101x1"
        layer = None
    elif args.model is None:
        model = "vgg16"
        layer = args.layer
    else:
        model = args.model
        layer = args.layer
    # this obviously needs refactoring
    run_grid(args.input_glob, args.left_image, args.right_image, args.left_right_scale,
             args.output_path, args.num_dimensions,
             args.perplexity, args.learning_rate, width, height, args.aspect_ratio,
             args.drop_to_fit, args.fill_shade, args.vectors, args.do_prune, args.clip_range,
             args.subsampling,
             model, layer, args.pooling, args.do_crop, args.grid_file, args.use_imagemagick,
             args.grid_spacing, args.show_links, args.links_max_threshold,
             args.min_distance, args.max_distance,
             args.max_group_size, args.do_reload, args.do_tsne,
             args.do_reduce_hack, args.do_pca, args.do_hexgrid)

if __name__ == '__main__':
    main()
--------------------------------------------------------------------------------