├── README.md
└── resizer.py

/README.md:
--------------------------------------------------------------------------------
# Deprecated! See ResizeRight: https://github.com/assafshocher/ResizeRight
# Resizer
#### The only way to resize images

--------

This image-resize function is used for image enhancement and restoration challenges, especially super-resolution.
It is made to address some crucial issues that are not solved by other resizing packages.
The code is mostly based on MATLAB's imresize function, but with a few crucial extra features. It is specifically useful for the following reasons:

1. None of the Python packages I am aware of currently resizes images with results similar to MATLAB's imresize (which is a common benchmark for image restoration tasks, especially super-resolution).

2. You can be accurate and consistent by specifying **both scale-factor and output-size**. This is an important feature for super-resolution and learning, because the same output-size can result from different scale-factors. It is best explained by example: say you have an image of size 9x9 and you resize it by a scale-factor of 0.5. The result size is 5x5. Now you resize with a scale-factor of 2 and get a result of size 10x10. "No big deal", you must be thinking now, "I can resize it to output-size 9x9", right? But then you will not get the correct scale-factor, because it is calculated as output-size / input-size = 1.8.
This is one of the main reasons for creating this, as this consistency is often crucial for learning-based tasks.

3. In addition to the common resizing methods (linear, cubic, lanczos etc.), you can specify a numeric array as a resizing kernel and use it to resize the image.

4. You can resize N-D images.

5. Some existing packages return misaligned results.
It is not visually apparent but can cause great damage to image enhancement tasks. (https://hackernoon.com/how-tensorflows-tf-image-resize-stole-60-days-of-my-life-aba5eb093f35)

6. You can specify a list of scale-factors to resize each dimension using a different scale-factor.

--------

### Cite / credit
If you find our work useful in your research or publication, please cite this work:
```
@misc{Resizer,
  author = {Assaf Shocher},
  title = {Resizer: Only way to resize},
  year = {2018},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/assafshocher/resizer}},
}
```

--------

### Usage:
```
resizer.imresize(im, scale_factor=None, output_shape=None, kernel=None, antialiasing=True, kernel_shift_flag=False)
```

__im__ :
the input image

__scale_factor__:
can be specified as-
1. one scalar scale - then it will be assumed that you want to resize the first two dims with this scale.
2. a list or array of scales - one for each dimension you want to resize. Note: if the length of the list is L then the first L dims will be rescaled.
3. not specified - then it will be calculated using output_shape. This is not recommended (see advantage 2 in the list above).

__output_shape__:
A list or array with the same length as im.shape. If not specified, it can be calculated from scale_factor.

__kernel__:
Can be one of the following strings: "cubic", "lanczos2", "lanczos3", "box", "linear" (or other methods you may add).
Or a 2D numpy array if the numeric kernel option is wanted.

__antialiasing__:
This is an option similar to MATLAB's default. Only relevant for downscaling.
If true, it basically means that the kernel is stretched by 1/scale_factor to prevent aliasing (low-pass filtering).

__kernel_shift_flag__:
This option automatically fixes misalignment of the kernel's center of mass by shifting the kernel so that the resized result is aligned exactly (resizing with any regular method will align it perfectly to the input image; see the comment inside the function for further info). The drawback is that shifting the kernel uses interpolation, which can potentially harm accuracy. However, it is sometimes crucial, and this damage was negligible in every application I tested.

--------------------------------------------------------------------------------
/resizer.py:
--------------------------------------------------------------------------------
import numpy as np
from scipy.ndimage import filters, measurements, interpolation
from math import pi


def imresize(im, scale_factor=None, output_shape=None, kernel=None, antialiasing=True, kernel_shift_flag=False):
    # First standardize values and fill missing arguments (if needed) by deriving scale from output shape or vice versa
    scale_factor, output_shape = fix_scale_and_size(im.shape, output_shape, scale_factor)

    # For a given numeric kernel case, just do convolution and sub-sampling (downscaling only)
    if isinstance(kernel, np.ndarray) and scale_factor[0] <= 1:
        return numeric_kernel(im, kernel, scale_factor, output_shape, kernel_shift_flag)

    # Choose interpolation method, each method has the matching kernel size
    method, kernel_width = {
        "cubic": (cubic, 4.0),
        "lanczos2": (lanczos2, 4.0),
        "lanczos3": (lanczos3, 6.0),
        "box": (box, 1.0),
        "linear": (linear, 2.0),
        None: (cubic, 4.0)  # set default interpolation method as cubic
    }.get(kernel)

    # Antialiasing is only used when downscaling
    antialiasing *= (scale_factor[0] < 1)

    # Sort indices of dimensions according
    # to scale of each dimension. Since we are going dim by dim, this is efficient
    sorted_dims = np.argsort(np.array(scale_factor)).tolist()

    # Iterate over dimensions to calculate local weights for resizing and resize each time in one direction
    out_im = np.copy(im)
    for dim in sorted_dims:
        # No point doing calculations for scale-factor 1. Nothing will happen anyway
        if scale_factor[dim] == 1.0:
            continue

        # For each coordinate (along 1 dim), calculate which coordinates in the input image affect its result and the
        # weights that multiply the values there to get its result.
        weights, field_of_view = contributions(im.shape[dim], output_shape[dim], scale_factor[dim],
                                               method, kernel_width, antialiasing)

        # Use the affecting position values and the set of weights to calculate the result of resizing along this 1 dim
        out_im = resize_along_dim(out_im, dim, weights, field_of_view)

    return out_im


def fix_scale_and_size(input_shape, output_shape, scale_factor):
    # First standardize the scale-factor (if given) into the form the function expects (a list of scale factors of
    # the same size as the number of input dimensions)
    if scale_factor is not None:
        # By default, if scale-factor is a scalar we assume 2d resizing and duplicate it.
        if np.isscalar(scale_factor):
            scale_factor = [scale_factor, scale_factor]

        # We extend the size of scale-factor list to the size of the input by assigning 1 to all the unspecified scales
        scale_factor = list(scale_factor)
        scale_factor.extend([1] * (len(input_shape) - len(scale_factor)))

    # Fixing output-shape (if given): extending it to the size of the input-shape, by assigning the original input-size
    # to all the unspecified dimensions
    if output_shape is not None:
        output_shape = list(np.uint(np.array(output_shape))) + list(input_shape[len(output_shape):])

    # Dealing with the case of a non-given scale-factor, calculating according to output-shape. Note that this is
    # sub-optimal, because there can be different scales to the same output-shape.
    if scale_factor is None:
        scale_factor = 1.0 * np.array(output_shape) / np.array(input_shape)

    # Dealing with missing output-shape. Calculating according to scale-factor
    if output_shape is None:
        output_shape = np.uint(np.ceil(np.array(input_shape) * np.array(scale_factor)))

    return scale_factor, output_shape


def contributions(in_length, out_length, scale, kernel, kernel_width, antialiasing):
    # This function calculates a set of 'filters' and a set of field_of_view that will later on be applied
    # such that each position from the field_of_view will be multiplied with a matching filter from the
    # 'weights' based on the interpolation method and the distance of the sub-pixel location from the pixel centers
    # around it. This is only done for one dimension of the image.

    # When anti-aliasing is activated (default and only for downscaling) the receptive field is stretched to size of
    # 1/sf. This means filtering is more 'low-pass filter'.
    fixed_kernel = (lambda arg: scale * kernel(scale * arg)) if antialiasing else kernel
    kernel_width *= 1.0 / scale if antialiasing else 1.0

    # These are the coordinates of the output image
    out_coordinates = np.arange(1, out_length+1)

    # These are the matching positions of the output-coordinates on the input image coordinates.
    # Best explained by example: say we have 4 horizontal pixels for HR and we downscale by SF=2 and get 2 pixels:
    # [1,2,3,4] -> [1,2]. Remember each pixel number is the middle of the pixel.
    # The scaling is done between the distances and not pixel numbers (the right boundary of pixel 4 is transformed to
    # the right boundary of pixel 2. Pixel 1 in the small image matches the boundary between pixels 1 and 2 in the big
    # one and not to pixel 2. This means the position is not just multiplication of the old pos by scale-factor).
    # So if we measure distance from the left border, middle of pixel 1 is at distance d=0.5, border between 1 and 2 is
    # at d=1, and so on (d = p - 0.5). We calculate (d_new = d_old / sf) which means:
    # (p_new-0.5 = (p_old-0.5) / sf) -> p_new = p_old/sf + 0.5 * (1-1/sf)
    match_coordinates = 1.0 * out_coordinates / scale + 0.5 * (1 - 1.0 / scale)

    # This is the left boundary to start multiplying the filter from, it depends on the size of the filter
    left_boundary = np.floor(match_coordinates - kernel_width / 2)

    # Kernel width needs to be enlarged because when covering has sub-pixel borders, it must 'see' the pixel centers
    # of the pixels it only covered a part from. So we add one pixel at each side to consider (weights can zeroize them)
    expanded_kernel_width = np.ceil(kernel_width) + 2

    # Determine a set of field_of_view for each output position, these are the pixels in the input image
    # that the pixel in the output image 'sees'.
    # We get a matrix whose horizontal dim is the output pixels (big) and the
    # vertical dim is the pixels it 'sees' (kernel_size + 2)
    field_of_view = np.squeeze(np.uint(np.expand_dims(left_boundary, axis=1) + np.arange(expanded_kernel_width) - 1))

    # Assign weight to each pixel in the field of view. A matrix whose horizontal dim is the output pixels and the
    # vertical dim is a list of weights matching to the pixel in the field of view (that are specified in
    # 'field_of_view')
    weights = fixed_kernel(1.0 * np.expand_dims(match_coordinates, axis=1) - field_of_view - 1)

    # Normalize weights to sum up to 1. Be careful not to divide by 0
    sum_weights = np.sum(weights, axis=1)
    sum_weights[sum_weights == 0] = 1.0
    weights = 1.0 * weights / np.expand_dims(sum_weights, axis=1)

    # We use this mirror structure as a trick for reflection padding at the boundaries
    mirror = np.uint(np.concatenate((np.arange(in_length), np.arange(in_length - 1, -1, step=-1))))
    field_of_view = mirror[np.mod(field_of_view, mirror.shape[0])]

    # Get rid of weights and pixel positions that are of zero weight
    non_zero_out_pixels = np.nonzero(np.any(weights, axis=0))
    weights = np.squeeze(weights[:, non_zero_out_pixels])
    field_of_view = np.squeeze(field_of_view[:, non_zero_out_pixels])

    # Final products are the relative positions and the matching weights, both are output_size X fixed_kernel_size
    return weights, field_of_view


def resize_along_dim(im, dim, weights, field_of_view):
    # To be able to act on each dim, we swap so that dim 0 is the wanted dim to resize
    tmp_im = np.swapaxes(im, dim, 0)

    # We add singleton dimensions to the weight matrix so we can multiply it with the big tensor we get for
    # tmp_im[field_of_view.T], (bsxfun style)
    weights = np.reshape(weights.T, list(weights.T.shape) + (np.ndim(im) - 1) * [1])

    # This is a bit of a complicated multiplication: tmp_im[field_of_view.T] is a tensor of order image_dims+1.
    # For each pixel in the output-image it matches the positions that influence it from the input image (along 1 dim
    # only, this is why it only adds 1 dim to the shape). We then multiply, for each pixel, its set of positions with
    # the matching set of weights. We do this by this big tensor element-wise multiplication (MATLAB bsxfun style:
    # matching dims are multiplied element-wise while singletons mean that the matching dim is all multiplied by the
    # same number)
    tmp_out_im = np.sum(tmp_im[field_of_view.T] * weights, axis=0)

    # Finally we swap back the axes to the original order
    return np.swapaxes(tmp_out_im, dim, 0)


def numeric_kernel(im, kernel, scale_factor, output_shape, kernel_shift_flag):
    # See kernel_shift function to understand what this is
    if kernel_shift_flag:
        kernel = kernel_shift(kernel, scale_factor)

    # First run a correlation (convolution with flipped kernel).
    # The loop runs over the channels (last axis), so a 3D (height, width, channels) image is assumed here.
    out_im = np.zeros_like(im)
    for channel in range(im.shape[2]):
        out_im[:, :, channel] = filters.correlate(im[:, :, channel], kernel)

    # Then subsample and return
    return out_im[np.round(np.linspace(0, im.shape[0] - 1 / scale_factor[0], output_shape[0])).astype(int)[:, None],
                  np.round(np.linspace(0, im.shape[1] - 1 / scale_factor[1], output_shape[1])).astype(int), :]


def kernel_shift(kernel, sf):
    # There are two reasons for shifting the kernel:
    # 1. Center of mass is not in the center of the kernel which creates ambiguity. There is no possible way to know
    #    the degradation process included shifting so we always assume center of mass is center of the kernel.
    # 2. We further shift kernel center so that top left result pixel corresponds to the middle of the sfXsf first
    #    pixels.
    # Default is for an odd-sized kernel to be in the middle of the first pixel and for an even-sized kernel to be
    # at the top left corner of the first pixel. That is why a different shift size is needed for odd and even sizes.
    # Given that these two conditions are fulfilled, we are happy and aligned. The way to test it is as follows:
    # The input image, when interpolated (regular bicubic), is exactly aligned with ground truth.

    # First calculate the current center of mass for the kernel
    current_center_of_mass = measurements.center_of_mass(kernel)

    # The second term ("+ 0.5 * ....") is for applying condition 2 from the comments above
    wanted_center_of_mass = np.array(kernel.shape) / 2 + 0.5 * (sf - (kernel.shape[0] % 2))

    # Define the shift vector for the kernel shifting (x,y)
    shift_vec = wanted_center_of_mass - current_center_of_mass

    # Before applying the shift, we first pad the kernel so that nothing is lost due to the shift
    # (biggest shift among dims + 1 for safety). The built-in int is used because np.int was removed from NumPy.
    kernel = np.pad(kernel, int(np.ceil(np.max(shift_vec))) + 1, 'constant')

    # Finally shift the kernel and return
    return interpolation.shift(kernel, shift_vec)


# These next functions are all interpolation methods.
# x is the distance from the left pixel center


def cubic(x):
    absx = np.abs(x)
    absx2 = absx ** 2
    absx3 = absx ** 3
    return ((1.5*absx3 - 2.5*absx2 + 1) * (absx <= 1) +
            (-0.5*absx3 + 2.5*absx2 - 4*absx + 2) * ((1 < absx) & (absx <= 2)))


def lanczos2(x):
    return (((np.sin(pi*x) * np.sin(pi*x/2) + np.finfo(np.float32).eps) /
             ((pi**2 * x**2 / 2) + np.finfo(np.float32).eps))
            * (abs(x) < 2))


def box(x):
    return ((-0.5 <= x) & (x < 0.5)) * 1.0


def lanczos3(x):
    return (((np.sin(pi*x) * np.sin(pi*x/3) + np.finfo(np.float32).eps) /
             ((pi**2 * x**2 / 3) + np.finfo(np.float32).eps))
            * (abs(x) < 3))


def linear(x):
    return (x + 1) * ((-1 <= x) & (x < 0)) + (1 - x) * ((0 <= x) & (x <= 1))

--------------------------------------------------------------------------------
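
The arithmetic behind README advantage 2 and the coordinate mapping commented in `contributions` can be checked by hand. The following is a minimal pure-Python sketch, not part of resizer.py: the helper names `output_length` and `match_coordinates` are hypothetical and only mirror the formulas used in `fix_scale_and_size` (ceil of input size times scale) and `contributions` (p_new = p_old/sf + 0.5 * (1 - 1/sf), with 1-based pixel centers).

```python
import math


def output_length(in_length, scale):
    # Hypothetical helper: same rule as fix_scale_and_size uses when
    # output_shape is missing, i.e. ceil(input_size * scale_factor).
    return math.ceil(in_length * scale)


def match_coordinates(out_length, scale):
    # Hypothetical helper: position on the input grid of each output pixel
    # center (1-based), following the formula in contributions().
    return [p / scale + 0.5 * (1 - 1 / scale) for p in range(1, out_length + 1)]


# README example: 9 pixels downscaled by 0.5, then upscaled by 2.
small = output_length(9, 0.5)      # ceil(4.5)
back = output_length(small, 2)     # not 9, so the round trip changes the size
implied_scale = 9 / small          # scale implied by forcing output size 9

# Code-comment example: 4 pixels downscaled to 2 (scale 0.5).
coords = match_coordinates(2, 0.5)

print(small, back, implied_scale, coords)
```

Under these assumptions, output pixel 1 lands on the boundary between input pixels 1 and 2 (coordinate 1.5), which is the alignment the comment in `contributions` describes, and forcing the 5-pixel image back to 9 pixels implies a scale of 1.8 rather than 2.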