├── METHOD.md
├── README.md
├── config.py
├── data
│   ├── augmented
│   │   ├── csv_files
│   │   │   └── .gitkeep
│   │   └── images
│   │       └── .gitkeep
│   └── raw
│       ├── csv_files
│       │   └── train_labels.csv
│       └── images
│           ├── COCO_test2014_000000291295.jpg
│           └── COCO_test2014_000000389792.jpg
├── data_aug
│   ├── __init__.py
│   ├── box_utils.py
│   ├── no_label_change.py
│   └── with_label_change.py
├── example.py
└── samples
    ├── COCO_test2014_000000291295_blured.jpg
    ├── COCO_test2014_000000291295_brightness.jpg
    ├── COCO_test2014_000000291295_colored.jpg
    ├── COCO_test2014_000000291295_contrasted.jpg
    ├── COCO_test2014_000000291295_gau_noise.jpg
    ├── COCO_test2014_000000291295_hf.jpg
    ├── COCO_test2014_000000291295_rotated_180.jpg
    ├── COCO_test2014_000000291295_rotated_270.jpg
    ├── COCO_test2014_000000291295_rotated_90.jpg
    ├── COCO_test2014_000000291295_scale.jpg
    ├── COCO_test2014_000000291295_vf.jpg
    └── FotoJet.jpg

/METHOD.md:
--------------------------------------------------------------------------------
1 | ### 1. Commonly used augmentation utilities in the Python PIL library
2 | 
3 | #### 1.1 PIL.ImageEnhance
4 | 
5 | 1. PIL.ImageEnhance.Color()
6 | 
7 | 2. PIL.ImageEnhance.Contrast()
8 | 
9 | 3. PIL.ImageEnhance.Brightness()
10 | 
11 | 4. PIL.ImageEnhance.Sharpness()
12 | 
13 | #### 1.2 PIL.ImageFilter
14 | 
15 | 1. PIL.ImageFilter.EDGE_ENHANCE: edge-enhancement filter
16 | 2. PIL.ImageFilter.EDGE_ENHANCE_MORE: stronger edge-enhancement filter
17 | 3. PIL.ImageFilter.EMBOSS: emboss filter
18 | 4. PIL.ImageFilter.CONTOUR: contour filter
19 | 5. PIL.ImageFilter.BLUR: blur filter
20 | 6. PIL.ImageFilter.DETAIL: detail filter
21 | 7. PIL.ImageFilter.FIND_EDGES: edge-detection filter (extracts the edge information of the image)
22 | 8. PIL.ImageFilter.SMOOTH: smoothing filter
23 | 9. PIL.ImageFilter.SMOOTH_MORE: stronger smoothing filter
24 | 10. PIL.ImageFilter.SHARPEN: sharpening filter
25 | 11. PIL.ImageFilter.GaussianBlur: Gaussian blur filter
26 | 12. PIL.ImageFilter.UnsharpMask(radius=2, percent=150, threshold=3): unsharp-mask filter
27 |     ① radius: blur radius ② percent: unsharp strength, in percent ③ threshold: minimum brightness change that will be sharpened
28 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | **Background**
2 | 
3 | **The first of the three staples of deep learning: "data augmentation"**
4 | 
5 | Blog post: [Object Detection Series, Part 2: Data Augmentation](http://www.spytensor.com/index.php/archives/50/)
6 | 
7 | Code for this part: [image_aug_for_detection](https://github.com/spytensor/image_aug_for_detection)
8 | 
9 | ### 1. Changelog
10 | 
11 | `version 0.1.0`
12 | 
13 | First release. Covers augmentations that change the image width/height as well as augmentations that leave them unchanged.
14 | 
15 | ### 2. Introduction
16 | 
17 | #### 2.1 Dependencies
18 | 
19 | `PIL` (Pillow), `opencv` (opencv-python), `skimage` (scikit-image)
20 | 
21 | #### 2.2 Data augmentation
22 | 
23 | ##### 2.2.1 Features
24 | 
25 | The augmentations fall into two groups, keep_size and change_size, depending on whether they change the image size.
26 | 
27 | Augmentations supported by keep_size:
28 | 
29 | 1. Color balance adjustment: PIL.ImageEnhance.Color()
30 | 2. Contrast adjustment: PIL.ImageEnhance.Contrast()
31 | 3. Brightness adjustment: PIL.ImageEnhance.Brightness()
32 | 4. Adding noise: implemented with skimage.util.random_noise(); supports Gaussian, salt/pepper, Poisson, and multiplicative (speckle) noise
33 | 5. Blurring: PIL.ImageFilter
34 | 
35 | Augmentations supported by change_size:
36 | 
37 | 1. Rotation: any angle in the 0-360 degree range
38 | 2. Flipping: horizontal flip, vertical flip
39 | 3. Scaling: shrink the image by a given ratio
40 | 
41 | ##### 2.2.2 Usage
42 | 
43 | Step 1: Prepare a CSV annotation file `train_labels.csv`, one box per row, for example:
44 | ```
45 | /mfs/home/zhuchaojie/ds/data/000.jpg,145,245,324,654,helmet
46 | ```
47 | Step 2: Adjust the settings in `config.py`:
48 | ```
49 | class DefaultConfigs(object):
50 |     raw_images = "./data/raw/images/"                                      # directory of the original images
51 |     raw_csv_files = "./data/raw/csv_files/train_labels.csv"                # original CSV annotation file
52 |     augmented_images = "./data/augmented/images/"                          # directory where augmented images are saved
53 |     augmented_csv_file = "./data/augmented/csv_files/augmented_labels.csv" # CSV annotation file for the augmented images
54 |     image_format = "jpg"                                                   # default image format
55 | config = DefaultConfigs()
56 | ```
57 | 
58 | Step 3: Run `python example.py`
59 | 
60 | **Tip: merge the augmented annotations with the original ones before converting to another annotation format.**
61 | 
62 | #### 2.3 Annotation format conversion
63 | 
64 | For details see: [Object Detection Series, Part 1: How to Build a Dataset?](http://www.spytensor.com/index.php/archives/48/)
65 | 
66 | Note: because the csv and txt formats are so simple, no conversion script is provided for them; plain Python file I/O (`open`) is enough.
67 | 
68 | Conversions currently supported:
69 | 
70 | - csv to coco2017
71 | - csv to voc2007
72 | - labelme to coco2017
73 | - labelme to voc2007
74 | - txt to coco2017
75 | 
76 | #### 2.4 Augmentation results
77 | 
78 | ![](https://github.com/spytensor/image_aug_for_detection/blob/master/samples/FotoJet.jpg?raw=true)
79 | 
80 | ### 3. Object Detection Series
81 | 
82 | 1. [Object Detection Series, Part 1: How to Build a Dataset?](http://www.spytensor.com/index.php/archives/48/)
83 | 2. [Object Detection Series, Part 2: Data Augmentation](http://www.spytensor.com/index.php/archives/50/)
84 | 
85 | 
86 | ### TODO
87 | 
88 | 1. Switch the augmentations to `imgaug` to support more augmentation types.
89 | 2. Nothing else is planned for now; open an `issue` if you need something.
90 | 3. Next post: mastering object detection through a worked example.
--------------------------------------------------------------------------------
/config.py:
--------------------------------------------------------------------------------
1 | class DefaultConfigs(object):
2 |     raw_images = "./data/raw/images/"                                      # directory of the original images
3 |     raw_csv_files = "./data/raw/csv_files/train_labels.csv"                # original CSV annotation file
4 |     augmented_images = "./data/augmented/images/"                          # directory where augmented images are saved
5 |     augmented_csv_file = "./data/augmented/csv_files/augmented_labels.csv" # CSV annotation file for the augmented images
6 |     image_format = "jpg"                                                   # default image format
7 | config = DefaultConfigs()
--------------------------------------------------------------------------------
/data/augmented/csv_files/.gitkeep:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/spytensor/image_aug_for_detection/a2ee39b0bb549b2ea1e8770efe343596f628c855/data/augmented/csv_files/.gitkeep
--------------------------------------------------------------------------------
/data/augmented/images/.gitkeep:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/spytensor/image_aug_for_detection/a2ee39b0bb549b2ea1e8770efe343596f628c855/data/augmented/images/.gitkeep
--------------------------------------------------------------------------------
/data/raw/csv_files/train_labels.csv:
--------------------------------------------------------------------------------
1 | ./data/raw/images/COCO_test2014_000000291295.jpg,434,121,535,434,"person"
2 | ./data/raw/images/COCO_test2014_000000291295.jpg,527,198,604,407,"person"
3 | ./data/raw/images/COCO_test2014_000000291295.jpg,49,126,604,379,"truck"
4 | ./data/raw/images/COCO_test2014_000000291295.jpg,281,243,358,458,"person"
5 | ./data/raw/images/COCO_test2014_000000389792.jpg,0,98,88,417,"person"
6 | ./data/raw/images/COCO_test2014_000000389792.jpg,249,78,528,425,"person"
7 | 
--------------------------------------------------------------------------------
/data/raw/images/COCO_test2014_000000291295.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/spytensor/image_aug_for_detection/a2ee39b0bb549b2ea1e8770efe343596f628c855/data/raw/images/COCO_test2014_000000291295.jpg -------------------------------------------------------------------------------- /data/raw/images/COCO_test2014_000000389792.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spytensor/image_aug_for_detection/a2ee39b0bb549b2ea1e8770efe343596f628c855/data/raw/images/COCO_test2014_000000389792.jpg -------------------------------------------------------------------------------- /data_aug/__init__.py: -------------------------------------------------------------------------------- 1 | from .no_label_change import * 2 | from .with_label_change import * 3 | 4 | -------------------------------------------------------------------------------- /data_aug/box_utils.py: -------------------------------------------------------------------------------- 1 | ### 原作者及链接:Paperspace https://github.com/Paperspace/DataAugmentationForObjectDetection/blob/master/data_aug/bbox_util.py 2 | import cv2 3 | import numpy as np 4 | 5 | 6 | def draw_rect(im, cords, color = None): 7 | """Draw the rectangle on the image 8 | 9 | Parameters 10 | ---------- 11 | 12 | im : numpy.ndarray 13 | numpy image 14 | 15 | cords: numpy.ndarray 16 | Numpy array containing bounding boxes of shape `N X 4` where N is the 17 | number of bounding boxes and the bounding boxes are represented in the 18 | format `x1 y1 x2 y2` 19 | 20 | Returns 21 | ------- 22 | 23 | numpy.ndarray 24 | numpy image with bounding boxes drawn on it 25 | 26 | """ 27 | 28 | im = im.copy() 29 | 30 | cords = cords[:,:4] 31 | cords = cords.reshape(-1,4) 32 | if not color: 33 | color = [255,255,255] 34 | for cord in cords: 35 | 36 | pt1, pt2 = (cord[0], cord[1]) , (cord[2], cord[3]) 37 | 38 | pt1 = int(pt1[0]), int(pt1[1]) 39 | pt2 = int(pt2[0]), int(pt2[1]) 40 | 41 | im = cv2.rectangle(im.copy(), pt1, pt2, color, int(max(im.shape[:2])/200)) 42 | return im 43 | 44 | def bbox_area(bbox): 45 | return (bbox[:,2] - bbox[:,0])*(bbox[:,3] - bbox[:,1]) 46 | 47 | def clip_box(bbox, clip_box, alpha): 48 | """Clip the bounding boxes to the borders of an image 49 | 50 | Parameters 51 | ---------- 52 | 53 | bbox: numpy.ndarray 54 | Numpy array containing bounding boxes of shape `N X 4` where N is the 55 | number of bounding boxes and the bounding boxes are represented in the 56 | format `x1 y1 x2 y2` 57 | 58 | clip_box: numpy.ndarray 59 | An array of shape (4,) specifying the diagonal co-ordinates of the image 60 | The coordinates are represented in the format `x1 y1 x2 y2` 61 | 62 | alpha: float 63 | If the fraction of a bounding box left in the image after being clipped is 64 | less than `alpha` the bounding box is dropped. 
65 | 66 | Returns 67 | ------- 68 | 69 | numpy.ndarray 70 | Numpy array containing **clipped** bounding boxes of shape `N X 4` where N is the 71 | number of bounding boxes left are being clipped and the bounding boxes are represented in the 72 | format `x1 y1 x2 y2` 73 | 74 | """ 75 | ar_ = (bbox_area(bbox)) 76 | x_min = np.maximum(bbox[:,0], clip_box[0]).reshape(-1,1) 77 | y_min = np.maximum(bbox[:,1], clip_box[1]).reshape(-1,1) 78 | x_max = np.minimum(bbox[:,2], clip_box[2]).reshape(-1,1) 79 | y_max = np.minimum(bbox[:,3], clip_box[3]).reshape(-1,1) 80 | 81 | bbox = np.hstack((x_min, y_min, x_max, y_max, bbox[:,4:])) 82 | 83 | delta_area = ((ar_ - bbox_area(bbox))/ar_) 84 | 85 | mask = (delta_area < (1 - alpha)).astype(int) 86 | 87 | bbox = bbox[mask == 1,:] 88 | 89 | 90 | return bbox 91 | 92 | 93 | def rotate_im(image, angle): 94 | """Rotate the image. 95 | 96 | Rotate the image such that the rotated image is enclosed inside the tightest 97 | rectangle. The area not occupied by the pixels of the original image is colored 98 | black. 99 | 100 | Parameters 101 | ---------- 102 | 103 | image : numpy.ndarray 104 | numpy image 105 | 106 | angle : float 107 | angle by which the image is to be rotated 108 | 109 | Returns 110 | ------- 111 | 112 | numpy.ndarray 113 | Rotated Image 114 | 115 | """ 116 | # grab the dimensions of the image and then determine the 117 | # centre 118 | (h, w) = image.shape[:2] 119 | (cX, cY) = (w // 2, h // 2) 120 | 121 | # grab the rotation matrix (applying the negative of the 122 | # angle to rotate clockwise), then grab the sine and cosine 123 | # (i.e., the rotation components of the matrix) 124 | M = cv2.getRotationMatrix2D((cX, cY), angle, 1.0) 125 | cos = np.abs(M[0, 0]) 126 | sin = np.abs(M[0, 1]) 127 | 128 | # compute the new bounding dimensions of the image 129 | nW = int((h * sin) + (w * cos)) 130 | nH = int((h * cos) + (w * sin)) 131 | 132 | # adjust the rotation matrix to take into account translation 133 | M[0, 2] += (nW / 2) - cX 134 | M[1, 2] += (nH / 2) - cY 135 | 136 | # perform the actual rotation and return the image 137 | image = cv2.warpAffine(image, M, (nW, nH)) 138 | 139 | # image = cv2.resize(image, (w,h)) 140 | return image 141 | 142 | def get_corners(bboxes): 143 | 144 | """Get corners of bounding boxes 145 | 146 | Parameters 147 | ---------- 148 | 149 | bboxes: numpy.ndarray 150 | Numpy array containing bounding boxes of shape `N X 4` where N is the 151 | number of bounding boxes and the bounding boxes are represented in the 152 | format `x1 y1 x2 y2` 153 | 154 | returns 155 | ------- 156 | 157 | numpy.ndarray 158 | Numpy array of shape `N x 8` containing N bounding boxes each described by their 159 | corner co-ordinates `x1 y1 x2 y2 x3 y3 x4 y4` 160 | 161 | """ 162 | width = (bboxes[:,2] - bboxes[:,0]).reshape(-1,1) 163 | height = (bboxes[:,3] - bboxes[:,1]).reshape(-1,1) 164 | 165 | x1 = bboxes[:,0].reshape(-1,1) 166 | y1 = bboxes[:,1].reshape(-1,1) 167 | 168 | x2 = x1 + width 169 | y2 = y1 170 | 171 | x3 = x1 172 | y3 = y1 + height 173 | 174 | x4 = bboxes[:,2].reshape(-1,1) 175 | y4 = bboxes[:,3].reshape(-1,1) 176 | 177 | corners = np.hstack((x1,y1,x2,y2,x3,y3,x4,y4)) 178 | 179 | return corners 180 | 181 | def rotate_box(corners,angle, cx, cy, h, w): 182 | 183 | """Rotate the bounding box. 
184 | 185 | 186 | Parameters 187 | ---------- 188 | 189 | corners : numpy.ndarray 190 | Numpy array of shape `N x 8` containing N bounding boxes each described by their 191 | corner co-ordinates `x1 y1 x2 y2 x3 y3 x4 y4` 192 | 193 | angle : float 194 | angle by which the image is to be rotated 195 | 196 | cx : int 197 | x coordinate of the center of image (about which the box will be rotated) 198 | 199 | cy : int 200 | y coordinate of the center of image (about which the box will be rotated) 201 | 202 | h : int 203 | height of the image 204 | 205 | w : int 206 | width of the image 207 | 208 | Returns 209 | ------- 210 | 211 | numpy.ndarray 212 | Numpy array of shape `N x 8` containing N rotated bounding boxes each described by their 213 | corner co-ordinates `x1 y1 x2 y2 x3 y3 x4 y4` 214 | """ 215 | 216 | corners = corners.reshape(-1,2) 217 | corners = np.hstack((corners, np.ones((corners.shape[0],1), dtype = type(corners[0][0])))) 218 | 219 | M = cv2.getRotationMatrix2D((cx, cy), angle, 1.0) 220 | 221 | 222 | cos = np.abs(M[0, 0]) 223 | sin = np.abs(M[0, 1]) 224 | 225 | nW = int((h * sin) + (w * cos)) 226 | nH = int((h * cos) + (w * sin)) 227 | # adjust the rotation matrix to take into account translation 228 | M[0, 2] += (nW / 2) - cx 229 | M[1, 2] += (nH / 2) - cy 230 | # Prepare the vector to be transformed 231 | calculated = np.dot(M,corners.T).T 232 | 233 | calculated = calculated.reshape(-1,8) 234 | 235 | return calculated 236 | 237 | 238 | def get_enclosing_box(corners): 239 | """Get an enclosing box for ratated corners of a bounding box 240 | 241 | Parameters 242 | ---------- 243 | 244 | corners : numpy.ndarray 245 | Numpy array of shape `N x 8` containing N bounding boxes each described by their 246 | corner co-ordinates `x1 y1 x2 y2 x3 y3 x4 y4` 247 | 248 | Returns 249 | ------- 250 | 251 | numpy.ndarray 252 | Numpy array containing enclosing bounding boxes of shape `N X 4` where N is the 253 | number of bounding boxes and the bounding boxes are represented in the 254 | format `x1 y1 x2 y2` 255 | 256 | """ 257 | x_ = corners[:,[0,2,4,6]] 258 | y_ = corners[:,[1,3,5,7]] 259 | 260 | xmin = np.min(x_,1).reshape(-1,1) 261 | ymin = np.min(y_,1).reshape(-1,1) 262 | xmax = np.max(x_,1).reshape(-1,1) 263 | ymax = np.max(y_,1).reshape(-1,1) 264 | 265 | final = np.hstack((xmin, ymin, xmax, ymax,corners[:,8:])) 266 | 267 | return final 268 | 269 | 270 | def letterbox_image(img, inp_dim): 271 | '''resize image with unchanged aspect ratio using padding 272 | 273 | Parameters 274 | ---------- 275 | 276 | img : numpy.ndarray 277 | Image 278 | 279 | inp_dim: tuple(int) 280 | shape of the reszied image 281 | 282 | Returns 283 | ------- 284 | 285 | numpy.ndarray: 286 | Resized image 287 | 288 | ''' 289 | 290 | inp_dim = (inp_dim, inp_dim) 291 | img_w, img_h = img.shape[1], img.shape[0] 292 | w, h = inp_dim 293 | new_w = int(img_w * min(w/img_w, h/img_h)) 294 | new_h = int(img_h * min(w/img_w, h/img_h)) 295 | resized_image = cv2.resize(img, (new_w,new_h)) 296 | 297 | canvas = np.full((inp_dim[1], inp_dim[0], 3), 0) 298 | 299 | canvas[(h-new_h)//2:(h-new_h)//2 + new_h,(w-new_w)//2:(w-new_w)//2 + new_w, :] = resized_image 300 | 301 | return canvas -------------------------------------------------------------------------------- /data_aug/no_label_change.py: -------------------------------------------------------------------------------- 1 | """ 2 | 不改变标注信息的数据扩充方式,通过python PIL.ImageEnhance来实现, 3 | 包括: 4 | 5 | 1. 图像色彩平衡调节:PIL.ImageEnhance.Color() 6 | 2. 图像对比度调节:PIL.ImageEnhance.Contrast() 7 | 3. 
图像亮度调节:PIL.ImageEnhance.Brightness() 8 | 4. 图像加噪声:通过skimage.util.random_noise()实现,支持:高斯噪声、盐/椒噪声、泊松噪声、乘法噪声 9 | 10 | """ 11 | import PIL 12 | import numpy as np 13 | from PIL import Image,ImageEnhance,ImageFilter 14 | from skimage.util import random_noise 15 | from IPython import embed 16 | class keep_size(object): 17 | 18 | def color(self,image,boxes,num=1.3): 19 | """ 20 | func: 对图像进行色彩平衡调节 21 | input: 22 | image: 待增强原始图像的路径,PIL格式 23 | num: 亮度调节的程度,0表示黑白,1.0表示原始色彩,默认设置1.3 24 | boxes: 图像中待检测物体的标注框信息,以list格式传入 25 | output: 26 | image: PIL格式的图像 27 | boxes_changed: 改变后的标注框,无改变时为原始框,list格式 28 | """ 29 | assert isinstance(image,PIL.JpegImagePlugin.JpegImageFile) 30 | image = image.copy() 31 | enh_color = ImageEnhance.Color(image) 32 | image_enhanced = enh_color.enhance(num) 33 | 34 | return image_enhanced,boxes 35 | 36 | def contrast(self,image,boxes,num=1.3): 37 | """ 38 | func: 对图像进行对比度调节 39 | input: 40 | image: 待增强原始图像的路径,PIL格式 41 | num: 对比度的调节,0表示纯灰色,1.0表示原始对比,默认设置1.3 42 | boxes: 图像中待检测物体的标注框信息,以list格式传入 43 | output: 44 | image: PIL格式的图像 45 | boxes_changed: 改变后的标注框,无改变时为原始框,list格式 46 | """ 47 | assert isinstance(image,PIL.JpegImagePlugin.JpegImageFile) 48 | image = image.copy() 49 | enh_contrast = ImageEnhance.Contrast(image) 50 | image_enhanced = enh_contrast.enhance(num) 51 | 52 | return image_enhanced,boxes 53 | 54 | def brightness(self,image,boxes,num=1.3): 55 | """ 56 | func: 对图像进行亮度调节 57 | input: 58 | image: 待增强原始图像的路径,PIL格式 59 | num: 对比度的调节,0表示黑色,1.0表示原始亮度,默认设置1.3 60 | boxes: 图像中待检测物体的标注框信息,以list格式传入 61 | output: 62 | image: PIL格式的图像 63 | boxes_changed: 改变后的标注框,无改变时为原始框,list格式 64 | """ 65 | assert isinstance(image,PIL.JpegImagePlugin.JpegImageFile) 66 | image = image.copy() 67 | enh_brightness = ImageEnhance.Brightness(image) 68 | image_enhanced = enh_brightness.enhance(num) 69 | 70 | return image_enhanced,boxes 71 | 72 | def noise(self,image,boxes,noise_type="gaussian"): 73 | """ 74 | func: 对图像增加噪声 75 | input: 76 | image: 待增强原始图像的路径,PIL格式 77 | noise_type: 噪声类别,包括高斯/盐椒噪声/泊松噪声等 78 | boxes: 图像中待检测物体的标注框信息,以list格式传入 79 | output: 80 | image: PIL格式的图像 81 | boxes_changed: 改变后的标注框,无改变时为原始框,list格式 82 | """ 83 | assert isinstance(image,PIL.JpegImagePlugin.JpegImageFile) 84 | image = image.copy() 85 | noised_image = (random_noise(np.array(image),mode=noise_type,seed=2020)*255).astype(np.uint8) 86 | changed_image = Image.fromarray(noised_image) 87 | 88 | return changed_image,boxes 89 | 90 | def blur(self,image,boxes,filter_type="gaussian",radius=5): 91 | """ 92 | func: 对图像进行模糊滤波 93 | input: 94 | image: 待增强原始图像的路径,PIL格式 95 | filter_type: 模糊方式,包括"original"和"gaussian" 96 | boxes: 图像中待检测物体的标注框信息,以list格式传入 97 | output: 98 | image: PIL格式的图像 99 | boxes_changed: 改变后的标注框,无改变时为原始框,list格式 100 | """ 101 | assert isinstance(image,PIL.JpegImagePlugin.JpegImageFile) 102 | image = image.copy() 103 | if filter_type == "original": 104 | blured_image = image.filter(ImageFilter.BLUR) 105 | elif filter_type == "gaussian": 106 | blured_image = image.filter(ImageFilter.GaussianBlur(radius=radius)) 107 | else: 108 | print("ERROR!Blur type not support,please check it later!") 109 | pass 110 | return blured_image,boxes -------------------------------------------------------------------------------- /data_aug/with_label_change.py: -------------------------------------------------------------------------------- 1 | """ 2 | 需要改变标注框大小、位置等信息的数据扩充方式,支持的方式如下: 3 | 主要依赖opencv,numpy完成 4 | 5 | 1. 旋转:选择0-360度范围内的旋转 6 | 2. 翻转:水平翻转,垂直翻转 7 | 3. 
缩放:按照一定比例缩放图片 8 | 9 | """ 10 | import cv2 11 | import PIL 12 | import numpy as np 13 | from .box_utils import * 14 | from PIL import Image 15 | 16 | class change_size(object): 17 | def __init__(self,config): 18 | self.config = config 19 | 20 | def rotate(self,image,boxes,angle): 21 | """ 22 | 按给定角度旋转图像 23 | input: 24 | image: PIL格式图像 25 | boxes: 原始标注框信息,list格式 26 | angle: 待旋转的角度 27 | output: 28 | image: 旋转后的图像,PIL格式 29 | boxes: 翻转后的框 30 | """ 31 | assert isinstance(image,PIL.JpegImagePlugin.JpegImageFile) 32 | assert isinstance(boxes,list) 33 | boxes = np.array(boxes).astype(np.float64) 34 | image = np.array(image.copy()) 35 | ## get image info 36 | weight,height = image.shape[1],image.shape[0] 37 | cx,cy = weight//2,height//2 38 | corners = get_corners(boxes) 39 | corners = np.hstack((corners,boxes[:,4:])) 40 | img = rotate_im(image,angle) 41 | corners[:,:8] = rotate_box(corners[:,:8],angle,cx,cy,height,weight) 42 | ## create new box 43 | new_bbox = get_enclosing_box(corners) 44 | scale_factor_x = img.shape[1] / weight 45 | scale_factor_y = img.shape[0] / height 46 | img = cv2.resize(img,(weight,height)) 47 | new_bbox[:,:4] /= [scale_factor_x,scale_factor_y,scale_factor_x,scale_factor_y] 48 | boxes = new_bbox 49 | boxes = clip_box(boxes,[0,0,weight,height],0.25) 50 | 51 | return Image.fromarray(img),boxes.astype(np.int64) 52 | 53 | def horizontal_flip(self,image,boxes): 54 | """ 55 | 水平翻转 56 | input: 57 | image: PIL格式图像 58 | boxes: 原始标注框信息,list格式 59 | output: 60 | image: 水平翻转后的图像,PIL格式 61 | boxes: 翻转后的框 62 | """ 63 | assert isinstance(image,PIL.JpegImagePlugin.JpegImageFile) 64 | assert isinstance(boxes,list) 65 | boxes = np.array(boxes).astype(np.float64) 66 | image = np.array(image.copy()) 67 | #get image center 68 | img_center = np.array(image.shape[:2])[::-1]/2 69 | img_center = np.hstack((img_center,img_center)) 70 | #horizontal flip image 71 | img_hor_flip = image[:,::-1,:] 72 | #change boxes for horizontal direction 73 | boxes[:,[0,2]] += 2 * (img_center[[0,2]] - boxes[:,[0,2]]) 74 | box_w = abs(boxes[:,0] - boxes[:,2]) 75 | #finetune box 76 | boxes[:,0] -= box_w 77 | boxes[:,2] += box_w 78 | 79 | return Image.fromarray(img_hor_flip),boxes.astype(np.int64) 80 | 81 | def vertical_flip(self,image,boxes): 82 | """ 83 | 垂直翻转 84 | input: 85 | image: PIL格式图像 86 | boxes: 原始标注框信息,list格式 87 | output: 88 | image: 垂直翻转后的图像,PIL格式 89 | boxes: 翻转后的框 90 | """ 91 | assert isinstance(image,PIL.JpegImagePlugin.JpegImageFile) 92 | assert isinstance(boxes,list) 93 | boxes = np.array(boxes).astype(np.float64) 94 | image = np.array(image.copy()) 95 | #get image center 96 | img_center = np.array(image.shape[:2])[::-1]/2 97 | img_center = np.hstack((img_center,img_center)) 98 | #horizontal flip image 99 | img_hor_flip = image[::-1,:,:] 100 | #change boxes for horizontal direction 101 | boxes[:,[1,3]] += 2 * (img_center[[1,3]] - boxes[:,[1,3]]) 102 | box_h = abs(boxes[:,1] - boxes[:,3]) 103 | #finetune box 104 | boxes[:,1] -= box_h 105 | boxes[:,3] += box_h 106 | 107 | return Image.fromarray(img_hor_flip),boxes.astype(np.int64) 108 | 109 | def scale(self,image,boxes,ratio=[0.2,0.2]): 110 | """ 111 | 按给定尺度压缩图像 112 | input: 113 | image: PIL格式图像 114 | boxes: 原始标注框信息,list格式 115 | ratio: 压缩比例,[x,y] 两个方向 116 | output: 117 | image: 旋转后的图像,PIL格式 118 | boxes: 翻转后的框 119 | """ 120 | assert isinstance(image,PIL.JpegImagePlugin.JpegImageFile) 121 | assert isinstance(boxes,list) 122 | assert isinstance(ratio,list) 123 | assert ratio[0] < 1 124 | assert ratio[1] < 1 125 | boxes = np.array(boxes).astype(np.float64) 126 | 
        scale_x,scale_y = ratio
127 |         img = np.array(image)
128 |         img_shape = img.shape
129 |         # resize
130 |         resize_scale_x = 1 - scale_x
131 |         resize_scale_y = 1 - scale_y
132 |         img = cv2.resize(img,None,fx=resize_scale_x,fy=resize_scale_y)
133 |         boxes[:,:4] *= [resize_scale_x,resize_scale_y,resize_scale_x,resize_scale_y]
134 |         canvas = np.zeros(img_shape,dtype=np.uint8)
135 | 
136 |         y_lim = int(min(resize_scale_y,1) * img_shape[0])
137 |         x_lim = int(min(resize_scale_x,1) * img_shape[1])
138 | 
139 |         canvas[:y_lim,:x_lim,:] = img[:y_lim,:x_lim,:]
140 |         img = canvas
141 |         boxes = clip_box(boxes,[0,0,1+img_shape[1],img_shape[0]],0.25)
142 | 
143 |         return Image.fromarray(canvas),boxes.astype(np.int64)
144 | 
145 | 
146 | 
--------------------------------------------------------------------------------
/example.py:
--------------------------------------------------------------------------------
1 | import os
2 | import cv2
3 | import numpy as np
4 | import pandas as pd
5 | from tqdm import tqdm
6 | from glob import glob
7 | from PIL import Image
8 | from data_aug import *
9 | from config import config
10 | from IPython import embed
11 | all_augumentors = ["color","contrast","brightness","noise","blur","rotate",
12 |                    "horizontal_flip","vertical_flip","scale"]
13 | all_noise_type = ["gaussian","localvar","poisson","salt","pepper","s&p","speckle"]
14 | 
15 | class Augumentor():
16 |     def __init__(self):
17 |         self.image_paths = glob(config.raw_images+"/*.%s"%config.image_format) # default image format is jpg
18 |         self.annotations_path = config.raw_csv_files
19 |         self.nlc = keep_size()           # augmentations that keep the original image size
20 |         self.wlc = change_size(config)   # augmentations that change the original image size
21 | 
22 |     def fit(self):
23 |         total_boxes = {}
24 |         # read box info from the csv annotation file
25 |         annotations = pd.read_csv(self.annotations_path,header=None).values
26 |         for annotation in annotations:
27 |             key = annotation[0].split(os.sep)[-1]
28 |             value = np.array([annotation[1:]])
29 |             if key in total_boxes.keys():
30 |                 total_boxes[key] = np.concatenate((total_boxes[key],value),axis=0)
31 |             else:
32 |                 total_boxes[key] = value
33 |         # read each image and process its boxes
34 |         for image_path in tqdm(self.image_paths):
35 |             image = Image.open(image_path)
36 |             #embed()
37 |             raw_boxes = total_boxes[image_path.split(os.sep)[-1]].tolist() # convert csv boxes to a list
38 | 
39 |             # do augmentation: keep size
40 |             img_file_name = config.augmented_images+image_path.split(os.sep)[-1].split("."+config.image_format)[0]
41 |             # color balance
42 |             colored_image,colored_box = self.nlc.color(image,raw_boxes,num=1.4)
43 |             self.write_csv(img_file_name+"_colored.%s"%config.image_format,colored_box)
44 |             self.write_image(colored_image,img_file_name+"_colored.%s"%config.image_format)
45 |             # contrast enhance
46 |             contrasted_image,contrasted_box = self.nlc.contrast(image,raw_boxes,num=1.4)
47 |             self.write_csv(img_file_name+"_contrasted.%s"%config.image_format,contrasted_box)
48 |             self.write_image(contrasted_image,img_file_name+"_contrasted.%s"%config.image_format)
49 |             # brightness change
50 |             brightness_image,brightness_box = self.nlc.brightness(image,raw_boxes,num=1.4)
51 |             self.write_csv(img_file_name+"_brightness.%s"%config.image_format,brightness_box)
52 |             self.write_image(brightness_image,img_file_name+"_brightness.%s"%config.image_format)
53 |             # add noise, gaussian by default
54 |             gau_noise_image,gau_noise_box = self.nlc.noise(image,raw_boxes,noise_type="gaussian")
55 |             self.write_csv(img_file_name+"_gau_noise.%s"%config.image_format,gau_noise_box)
56 |             self.write_image(gau_noise_image,img_file_name+"_gau_noise.%s"%config.image_format)
57 |             # blur, gaussian by default
58 |             blured_image,blured_box = self.nlc.blur(image,raw_boxes,filter_type="gaussian",radius=0.9)
59 |             self.write_csv(img_file_name+"_blured.%s"%config.image_format,blured_box)
60 |             self.write_image(blured_image,img_file_name+"_blured.%s"%config.image_format)
61 | 
62 |             # change image size
63 |             # rotate image
64 |             raw_labels = np.array(raw_boxes)[:,-1]
65 |             for angle in (90,180,270):
66 |                 trans_boxes = np.array(raw_boxes)[:,:-1].tolist() # boxes without labels for the size-changing augmenters
67 |                 rotated_image,rotated_box = self.wlc.rotate(image,trans_boxes,angle)
68 | 
69 |                 self.write_csv(img_file_name+"_rotated_{}.{}".format(str(angle),config.image_format),[raw_labels,rotated_box],original=False)
70 |                 self.write_image(rotated_image,img_file_name+"_rotated_{}.{}".format(str(angle),config.image_format))
71 | 
72 |             # horizontal_flip
73 |             hf_image,hf_box = self.wlc.horizontal_flip(image,np.array(raw_boxes)[:,:-1].tolist())
74 |             self.write_csv(img_file_name+"_hf.%s"%config.image_format,[raw_labels,hf_box],original=False)
75 |             self.write_image(hf_image,img_file_name+"_hf.%s"%config.image_format)
76 | 
77 |             # vertical_flip
78 |             vf_image,vf_box = self.wlc.vertical_flip(image,np.array(raw_boxes)[:,:-1].tolist())
79 |             self.write_csv(img_file_name+"_vf.%s"%config.image_format,[raw_labels,vf_box],original=False)
80 |             self.write_image(vf_image,img_file_name+"_vf.%s"%config.image_format)
81 | 
82 |             # scale
83 |             scale_image,scale_box = self.wlc.scale(image,np.array(raw_boxes)[:,:-1].tolist(),ratio=[0.3,0.3])
84 |             self.write_csv(img_file_name+"_scale.%s"%config.image_format,[raw_labels,scale_box],original=False)
85 |             self.write_image(scale_image,img_file_name+"_scale.%s"%config.image_format)
86 | 
87 | 
88 |     def write_csv(self,filename,boxes,original=True):
89 |         saved_file = open(config.augmented_csv_file,"a+")
90 |         if original:
91 |             new_boxes = boxes
92 |             for new_box in new_boxes:
93 |                 label = new_box[-1]
94 |                 saved_file.write(filename+","+str(new_box[0])+","+str(new_box[1])+","+str(new_box[2])+","+str(new_box[3]) + ","+label+"\n")
95 |         else:
96 |             labels, new_boxes = boxes[0],boxes[1]
97 |             for label,new_box in zip(labels,new_boxes):
98 |                 saved_file.write(filename+","+str(new_box[0])+","+str(new_box[1])+","+str(new_box[2])+","+str(new_box[3]) + ","+label+"\n")
99 |         saved_file.close()
100 | 
101 |     def write_image(self,image,filename):
102 |         image.save(filename)
103 | 
104 | if __name__ == "__main__":
105 |     if not os.path.exists(config.augmented_images):
106 |         os.makedirs(config.augmented_images)
107 |     if not os.path.exists(os.path.dirname(config.augmented_csv_file)):
108 |         os.makedirs(os.path.dirname(config.augmented_csv_file))
109 |     detection_augumentor = Augumentor()
110 |     detection_augumentor.fit()
111 | 
--------------------------------------------------------------------------------
/samples/COCO_test2014_000000291295_blured.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/spytensor/image_aug_for_detection/a2ee39b0bb549b2ea1e8770efe343596f628c855/samples/COCO_test2014_000000291295_blured.jpg
--------------------------------------------------------------------------------
/samples/COCO_test2014_000000291295_brightness.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/spytensor/image_aug_for_detection/a2ee39b0bb549b2ea1e8770efe343596f628c855/samples/COCO_test2014_000000291295_brightness.jpg
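The README's usage notes remind you to merge the augmented annotations with the original ones before converting to another format. A minimal sketch of that merge step with `pandas` (already used by `example.py`), assuming the default paths from `config.py`; the `merged_labels.csv` output name is only an example:

```python
import pandas as pd

# both files use the same header-less layout: path,x1,y1,x2,y2,label
cols = ["path", "x1", "y1", "x2", "y2", "label"]
raw = pd.read_csv("./data/raw/csv_files/train_labels.csv", header=None, names=cols)
aug = pd.read_csv("./data/augmented/csv_files/augmented_labels.csv", header=None, names=cols)

# concatenate and write a single annotation file to feed the format converters
merged = pd.concat([raw, aug], ignore_index=True)
merged.to_csv("./data/merged_labels.csv", header=False, index=False)
```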
-------------------------------------------------------------------------------- /samples/COCO_test2014_000000291295_colored.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spytensor/image_aug_for_detection/a2ee39b0bb549b2ea1e8770efe343596f628c855/samples/COCO_test2014_000000291295_colored.jpg -------------------------------------------------------------------------------- /samples/COCO_test2014_000000291295_contrasted.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spytensor/image_aug_for_detection/a2ee39b0bb549b2ea1e8770efe343596f628c855/samples/COCO_test2014_000000291295_contrasted.jpg -------------------------------------------------------------------------------- /samples/COCO_test2014_000000291295_gau_noise.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spytensor/image_aug_for_detection/a2ee39b0bb549b2ea1e8770efe343596f628c855/samples/COCO_test2014_000000291295_gau_noise.jpg -------------------------------------------------------------------------------- /samples/COCO_test2014_000000291295_hf.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spytensor/image_aug_for_detection/a2ee39b0bb549b2ea1e8770efe343596f628c855/samples/COCO_test2014_000000291295_hf.jpg -------------------------------------------------------------------------------- /samples/COCO_test2014_000000291295_rotated_180.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spytensor/image_aug_for_detection/a2ee39b0bb549b2ea1e8770efe343596f628c855/samples/COCO_test2014_000000291295_rotated_180.jpg -------------------------------------------------------------------------------- /samples/COCO_test2014_000000291295_rotated_270.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spytensor/image_aug_for_detection/a2ee39b0bb549b2ea1e8770efe343596f628c855/samples/COCO_test2014_000000291295_rotated_270.jpg -------------------------------------------------------------------------------- /samples/COCO_test2014_000000291295_rotated_90.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spytensor/image_aug_for_detection/a2ee39b0bb549b2ea1e8770efe343596f628c855/samples/COCO_test2014_000000291295_rotated_90.jpg -------------------------------------------------------------------------------- /samples/COCO_test2014_000000291295_scale.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spytensor/image_aug_for_detection/a2ee39b0bb549b2ea1e8770efe343596f628c855/samples/COCO_test2014_000000291295_scale.jpg -------------------------------------------------------------------------------- /samples/COCO_test2014_000000291295_vf.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/spytensor/image_aug_for_detection/a2ee39b0bb549b2ea1e8770efe343596f628c855/samples/COCO_test2014_000000291295_vf.jpg -------------------------------------------------------------------------------- /samples/FotoJet.jpg: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/spytensor/image_aug_for_detection/a2ee39b0bb549b2ea1e8770efe343596f628c855/samples/FotoJet.jpg --------------------------------------------------------------------------------
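For quick experiments, the augmenters in `data_aug` can also be driven directly instead of through `example.py`. A minimal sketch under the repository's own conventions (a JPEG opened with PIL, boxes as `[x1, y1, x2, y2]` lists, class labels kept aside as `example.py` does); the box values and the output file name below are only illustrative:

```python
from PIL import Image

from config import config
from data_aug import keep_size, change_size

# the augmenters assert a PIL JPEG image, so open one of the raw .jpg files
image = Image.open("./data/raw/images/COCO_test2014_000000291295.jpg")
boxes = [[434, 121, 535, 434], [527, 198, 604, 407]]  # x1, y1, x2, y2 (no labels)

nlc = keep_size()            # size-preserving augmentations: boxes are returned unchanged
bright_img, bright_boxes = nlc.brightness(image, boxes, num=1.4)
noisy_img, noisy_boxes = nlc.noise(image, boxes, noise_type="salt")

wlc = change_size(config)    # size-changing augmentations: boxes are recomputed
rotated_img, rotated_boxes = wlc.rotate(image, boxes, 90)
flipped_img, flipped_boxes = wlc.horizontal_flip(image, boxes)

rotated_img.save("rotated_example.jpg")
print(rotated_boxes)         # adjusted x1, y1, x2, y2; boxes left mostly outside the image are dropped
```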