├── README.md
├── figures
│   └── Method.png
├── requirements.txt
├── segmentation.py
└── segmentation_utils.py


/README.md:
--------------------------------------------------------------------------------
# Weakly-Supervised Surface Crack Segmentation by Generating Pseudo-Labels using Localization with a Classifier and Thresholding
This repository contains the code implementation of our weakly-supervised pseudo-labeling method from the paper: _Weakly-Supervised Surface Crack Segmentation by Generating Pseudo-Labels using Localization with a Classifier and Thresholding_.

König, J., Jenkins, M.D., Mannion, M., Barrie, P. and Morison, G., 2021. Weakly-Supervised Surface Crack Segmentation by Generating Pseudo-Labels using Localization with a Classifier and Thresholding. [[arXiv]](https://arxiv.org/abs/2109.00456)

#### Abstract
Surface cracks are a common sight on public infrastructure nowadays. Recent work has been addressing this problem by supporting structural maintenance measures using machine learning methods. Those methods are used to segment surface cracks from their background, making them easier to localize. However, a common issue is that to create a well-functioning algorithm, the training data needs to have detailed annotations of pixels that belong to cracks. Our work proposes a weakly supervised approach that leverages a CNN classifier in a novel way to create surface crack pseudo labels. First, we use the classifier to create a rough crack localization map by using its class activation maps and a patch based classification approach and fuse this with a thresholding based approach to segment the mostly darker crack pixels. The classifier assists in suppressing noise from the background regions, which commonly are incorrectly highlighted as cracks by standard thresholding methods. Then, the pseudo labels can be used in an end-to-end approach when training a standard CNN for surface crack segmentation.
Our method is shown to yield sufficiently accurate pseudo labels. Those labels, incorporated into segmentation CNN training using multiple recent crack segmentation architectures, achieve comparable performance to fully supervised methods on four popular crack segmentation datasets.

# Overview
![Weakly-Supervised-Crack-Segmentation](figures/Method.png)


# This Repository
This repository contains the code to run our proposed weakly-supervised pseudo-label creation method (`segmentation.py`).

**NOTE: Pretrained weights to run the pseudo-labeling procedure will be provided in due course.**

### Requirements

The main requirements for this project are listed in `requirements.txt` and can be installed with pip. We used Ubuntu 16.04, Python 3.8.11 and CUDA 10.1.



### Generating Pseudo Labels

To generate pseudo-labels with our weakly supervised method, run `segmentation.py`:

```
python segmentation.py --img_path=data/images_to_pseudo_label --prediction_path=data/pseudo_labels --classifier_type=R50 --classifier_weight_path=weights_to_classifier.h5
```
The `classifier_type` can be set to R50, R101 or R152. In every case, pretrained weights in the `.h5` format are required and must be passed via the `classifier_weight_path` argument. `img_path` should point to the directory of images to pseudo-label (`.jpg` and `.png` files are processed), and `prediction_path` is the directory where the predictions are stored as `.png` files.
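The script also accepts `--patch_size`, `--stride_classifier` and `--stride_thresholding` (defaults 32, 16 and 8 in `segmentation.py`). As a rough guide to how these settings scale the workload, the sketch below mirrors the padding and sliding-window arithmetic of `_split_into_patches` to count the patches cut from one image. The helper name `count_patches` is our own illustration and is not part of the repository.

```python
def count_patches(height, width, patch_size=32, stride=16):
    """Count sliding-window patches for one image.

    Hypothetical helper (not part of the repository) that mirrors the
    arithmetic in `_split_into_patches`: pad each side up to the next
    multiple of `patch_size`, then slide a window with the given stride.
    """
    pad_h = (patch_size - height % patch_size) % patch_size
    pad_w = (patch_size - width % patch_size) % patch_size
    rows = (height + pad_h - patch_size) // stride + 1
    cols = (width + pad_w - patch_size) // stride + 1
    return rows * cols


if __name__ == "__main__":
    # a 100x150 image is padded to 128x160 before patches are cut
    print(count_patches(100, 150))            # 63 patches (7 rows x 9 columns)
    print(count_patches(100, 150, stride=8))  # 221 patches (13 rows x 17 columns)
```

Halving the stride roughly quadruples the number of patches, so smaller strides cost noticeably more compute per image.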


# Reference

If you use our proposed model or code, please cite our paper:

```
@article{konig2021aweakly,
  title={A Weakly-Supervised Surface Crack Segmentation Method using Localisation with a Classifier and Thresholding},
  author={K{\"o}nig, Jacob and Jenkins, Mark and Mannion, Mike and Barrie, Peter and Morison, Gordon},
  journal={arXiv preprint arXiv:2109.00456},
  year={2021}
}
```


# Acknowledgements:

This project uses code and data from the following:

- [DeepCrack Dataset](https://github.com/yhlleo/DeepCrack)
- [CrackForest Dataset](https://github.com/cuilimeng/CrackForest-dataset)
- [AigleRN + ESAR + LCMS Datasets](https://www.irit.fr/~Sylvie.Chambon/Crack_Detection_Database.html)
- [Retina U-Net](https://github.com/orobix/retina-unet)

--------------------------------------------------------------------------------
/figures/Method.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jacobkoenig/WeaklySupervisedCrackSeg/79bce54582f62fe3fc238eb6cb7f2eff81276b56/figures/Method.png
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
numpy
Pillow
tensorflow==2.3.0
tensorflow-gpu==2.3.0
albumentations==0.5.2
matplotlib
opencv-python
scipy
scikit-learn
# provides skimage.filters.threshold_multiotsu, imported in segmentation_utils.py
scikit-image

--------------------------------------------------------------------------------
/segmentation.py:
--------------------------------------------------------------------------------
import os
import argparse

import cv2
import numpy as np
import tensorflow as tf
from PIL import Image
from tensorflow.keras.preprocessing.image import load_img

from segmentation_utils import (
    model_factory,
    make_gradcam_plus_heatmap,
    _split_into_patches,
    _merge_out_preds,
    _norm_threshold_patches,
)

np.set_printoptions(suppress=True)

# let TensorFlow grow GPU memory on demand instead of reserving it all upfront
gpus = tf.config.experimental.list_physical_devices("GPU")
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)


class WeaklySupervisedCrackSeg:
    def __init__(
        self,
        classifier_type="R50",
        classifier_weight_path="./",
        patch_size=32,
        stride_classifier=16,
        stride_thresholding=8,
    ):

        self.classifier_type = classifier_type
        self.classifier_weight_path = classifier_weight_path
        self.patch_size = patch_size
        self.stride_classifier = stride_classifier
        self.stride_thresholding = stride_thresholding

        self.classifier = model_factory(classifier_type=self.classifier_type)
        self.classifier.load_weights(classifier_weight_path)
        print("--- Classification Model -----")
        # summary() prints the table itself and returns None
        self.classifier.summary()

    def predict(self, img, detailed_output=False):
        """Predicts the segmentation map for a single image

        Args:
            img (np.array): input image in range of [0,255] with len(img.shape)==3
            detailed_output (bool, optional): whether to also return the grad_cam, classification, merged_localisation and threshold map. Defaults to False.
        Returns:
            [np.array]: uint8 output prediction in range of [0, 255] with shape (H, W)
        """
        morph_kernel = np.ones((3, 3), np.uint8)

        # --- Coarse Localisation ---
        img_patches = _split_into_patches(
            img / 255.0, self.patch_size, self.stride_classifier
        )

        # one classification score per patch, arranged back into a coarse 2D grid
        classifier_pred = self.classifier(img_patches)
        merged_classifier_pred = _merge_out_preds(
            classifier_pred,
            img.shape[0],
            img.shape[1],
            self.patch_size,
            self.stride_classifier,
        )
        # upsample the coarse grid to the image size (5 is PIL's HAMMING filter)
        merged_classifier_pred = (
            np.array(
                Image.fromarray(
                    np.squeeze(merged_classifier_pred * 255).astype("uint8")
                ).resize((img.shape[1], img.shape[0]), 5)
            )
            / 255.0
        )

        grad_cam_plus = make_gradcam_plus_heatmap(
            img / 255.0, self.classifier, "global_average_pooling2d"
        )
        grad_cam_plus = (
            np.array(
                Image.fromarray(np.squeeze(grad_cam_plus * 255).astype("uint8")).resize(
                    (img.shape[1], img.shape[0]), 5
                )
            )
            / 255.0
        )
        # average and only keep very confident localisations
        merge_cam_class = (grad_cam_plus + merged_classifier_pred) / 2.0
        merge_cam_class *= merge_cam_class > 0.5
        # perform erosion to narrow the predicted crack-regions
        merge_cam_class = cv2.erode(merge_cam_class, morph_kernel, iterations=4)

        # --- Thresholding ---
        bilateral = cv2.bilateralFilter((img).astype("uint8"), 5, 120, 120)
        thresholded = _norm_threshold_patches(
            bilateral, self.patch_size, self.stride_thresholding
        )

        # --- Merging of Localisation with Thresholded Map ---
        segmentation = merge_cam_class * thresholded
        segmentation = (
            cv2.bilateralFilter((segmentation * 255).astype("uint8"), 5, 120, 120) / 255
        )
        segmentation = cv2.morphologyEx(
            (segmentation * 255).astype("uint8"), cv2.MORPH_CLOSE, morph_kernel
        )
        if detailed_output:
            return (
                segmentation,
                grad_cam_plus,
                merged_classifier_pred,
                merge_cam_class,
                thresholded,
            )
        else:
            return segmentation


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument("--img_path", required=True)
    parser.add_argument("--prediction_path", required=True)

    parser.add_argument("--classifier_type", required=True)
    parser.add_argument("--classifier_weight_path", required=True)

    # sizes are parsed as integers directly by argparse
    parser.add_argument("--patch_size", type=int, default=32)
    parser.add_argument("--stride_classifier", type=int, default=16)
    parser.add_argument("--stride_thresholding", type=int, default=8)

    parser.add_argument("--device", default="gpu:0")

    args = parser.parse_args()
    return args


def main(cl_args):
    classifier_type = cl_args.classifier_type
    classifier_weight_path = cl_args.classifier_weight_path
    patch_size = cl_args.patch_size
    stride_classifier = cl_args.stride_classifier
    stride_thresholding = cl_args.stride_thresholding

    img_path = cl_args.img_path
    prediction_path = cl_args.prediction_path

    device = cl_args.device

    os.makedirs(prediction_path, exist_ok=True)
    assert classifier_type in ["R50", "R101", "R152"]

    with tf.device(f"/{device}"):

        weakly = WeaklySupervisedCrackSeg(
            classifier_type=classifier_type,
            classifier_weight_path=classifier_weight_path,
            patch_size=patch_size,
            stride_classifier=stride_classifier,
            stride_thresholding=stride_thresholding,
        )

        # pseudo-label every .jpg/.png in img_path; outputs are always written as .png
        for filename in sorted(os.listdir(img_path)):
            if filename.endswith((".jpg", ".png")):
                print("Predicting File:", filename)
                img = load_img(os.path.join(img_path, filename), color_mode="rgb",)
                img = np.array(img)

                prediction = weakly.predict(img)
                cv2.imwrite(
                    os.path.join(prediction_path, os.path.splitext(filename)[0]
                    + ".png"),
                    prediction,
                )


if __name__ == "__main__":
    args = parse_args()
    main(args)

--------------------------------------------------------------------------------
/segmentation_utils.py:
--------------------------------------------------------------------------------
import cv2
import numpy as np
import tensorflow as tf
import tensorflow.keras as keras
from skimage.filters import threshold_multiotsu

np.set_printoptions(suppress=True)

# some of the patch code has been adapted from https://github.com/orobix/retina-unet/blob/master/lib/extract_patches.py


def _split_into_patches(ar, patch_size, stride):
    h, w, c = ar.shape

    # pad img so that it is divisible by the patch size
    pad_h = patch_size - h % patch_size if h % patch_size != 0 else 0
    pad_w = patch_size - w % patch_size if w % patch_size != 0 else 0

    padded = np.pad(ar, [(0, pad_h), (0, pad_w), (0, 0)], mode="reflect")
    padded_h, padded_w, _ = padded.shape
    # patches per image
    ppi = ((padded_h - patch_size) // stride + 1) * (
        (padded_w - patch_size) // stride + 1
    )
    patches = np.empty((ppi, patch_size, patch_size, c))
    patch_ix = 0
    for h in range((padded_h - patch_size) // stride + 1):
        for w in range((padded_w - patch_size) // stride + 1):
            patch = padded[
                h * stride : (h * stride) + patch_size,
                w * stride : (w * stride) + patch_size,
            ]
            patches[patch_ix] = patch
            patch_ix += 1
    return patches


def _merge_out_preds(preds, img_h, img_w, patch_size, stride):
    if (img_h - patch_size) % stride != 0:
        img_h = img_h + (stride - ((img_h - patch_size) % stride))
    if (img_w - patch_size) % stride != 0:
        img_w = img_w + (stride - (img_w - patch_size) % stride)

    N_preds_h = (img_h - patch_size) // stride + 1
    N_preds_w = (img_w - patch_size) // stride + 1
    probabilities = np.zeros((N_preds_h, N_preds_w))
    # renamed from `sum` so the Python builtin is not shadowed
    counts = np.zeros((N_preds_h, N_preds_w))

    i = 0
    for h in range(N_preds_h):
        for w in range(N_preds_w):
            probabilities[h, w] += preds[i][0]
            counts[h, w] += 1
            i += 1
    avg = probabilities / counts
    return avg


def _norm_threshold_patches(img, patch_size, stride):
    # NOTE: segmentation.py passes RGB images (from load_img), so BGR2GRAY applies
    # the red and blue weights swapped; kept unchanged from the original code
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    h, w = img.shape
    # pad img so that it is divisible by the patch size
    pad_h = patch_size - h % patch_size if h % patch_size != 0 else 0
    pad_w = patch_size - w % patch_size if w % patch_size != 0 else 0

    padded = np.pad(img, [(0, pad_h), (0, pad_w)], mode="reflect")

    padded_h, padded_w = padded.shape

    ppi = ((padded_h - patch_size) // stride + 1) * (
        (padded_w - patch_size) // stride + 1
    )

    N_patches_h = (padded_h - patch_size) // stride + 1
    N_patches_w = (padded_w - patch_size) // stride + 1

    patches = np.empty((ppi, patch_size, patch_size))

    patch_ix = 0
    for h_temp in range(N_patches_h):
        for w_temp in range(N_patches_w):
            patch = padded[
                h_temp * stride : (h_temp * stride) + patch_size,
                w_temp * stride : (w_temp * stride) + patch_size,
            ]
            patch = normalize(patch)
            # keep only the darkest of three Otsu classes as crack candidates
            thresh = threshold_multiotsu(patch, classes=3)
            patch = patch < thresh[0]
            patches[patch_ix] = patch
            patch_ix += 1

    # merge patches: a pixel stays 1 only if every overlapping patch marked it
    out = np.ones((padded_h, padded_w))
    i = 0
    for h_temp in range(N_patches_h):
        for w_temp in range(N_patches_w):
            out[
                h_temp * stride : (h_temp * stride) + patch_size,
                w_temp * stride : (w_temp * stride) + patch_size,
            ] *= patches[i]
            i += 1

    out = out[:h, :w]
    return out


def normalize(img):
    img = img.astype("float")
    minimum = img.min()
    maximum = img.max()
    return (img - minimum) / (maximum - minimum)


def model_factory(classifier_type="R50"):
    """
    basic binary classification model with pretrained ResNet
    """

    classifier_options = {
        "R50": tf.keras.applications.ResNet50,
        "R101": tf.keras.applications.ResNet101,
        "R152": tf.keras.applications.ResNet152,
    }
    assert classifier_type in classifier_options.keys()

    resnet = classifier_options[classifier_type](
        input_shape=(None, None, 3), include_top=False, weights="imagenet"
    )
    inp = keras.layers.Input(shape=(None, None, 3))
    x = resnet(inp)
    x = keras.layers.GlobalAveragePooling2D()(x)
    x = keras.layers.Dense(1000, activation="relu")(x)
    x = keras.layers.Dense(2, activation="softmax")(x)

    model = keras.models.Model(inputs=inp, outputs=x)
    return model


# slightly adapted from: https://github.com/samson6460/tf_keras_gradcamplusplus/blob/master/gradcam.py
def make_gradcam_plus_heatmap(
    img, model, layer_name="block5_conv3", label_name=None, category_id=None
):
    """Get a heatmap by Grad-CAM++.
    Args:
        model: A model object, built from tf.keras 2.X.
        img: An image ndarray.
        layer_name: A string, layer name in model.
        label_name: A list of all label names;
            if given, the name of the selected category is printed.
        category_id: An integer, index of the class.
            Default is the category with the highest score in the prediction.
    Return:
        A heatmap ndarray(without color).
    """
    img_tensor = np.expand_dims(img, axis=0)

    pool_layer = model.get_layer(layer_name)
    heatmap_model = tf.keras.models.Model(
        [model.inputs], [pool_layer.input, model.output]
    )

    # nested tapes provide the first, second and third order gradients
    with tf.GradientTape() as gtape1:
        with tf.GradientTape() as gtape2:
            with tf.GradientTape() as gtape3:
                conv_output, predictions = heatmap_model(img_tensor)
                if category_id is None:
                    category_id = np.argmax(predictions[0])
                if label_name:
                    print(label_name[category_id])
                output = predictions[:, category_id]
                conv_first_grad = gtape3.gradient(output, conv_output)
            conv_second_grad = gtape2.gradient(conv_first_grad, conv_output)
        conv_third_grad = gtape1.gradient(conv_second_grad, conv_output)

    global_sum = np.sum(conv_output, axis=(0, 1, 2))

    alpha_num = conv_second_grad[0]
    alpha_denom = conv_second_grad[0] * 2.0 + conv_third_grad[0] * global_sum
    # avoid division by zero
    alpha_denom = np.where(alpha_denom != 0.0, alpha_denom, 1e-10)

    alphas = alpha_num / alpha_denom
    alpha_normalization_constant = np.sum(alphas, axis=(0, 1))
    alphas /= alpha_normalization_constant

    weights = np.maximum(conv_first_grad[0], 0.0)

    deep_linearization_weights = np.sum(weights * alphas, axis=(0, 1))
    grad_CAM_map = np.sum(deep_linearization_weights * conv_output[0], axis=2)

    # keep positive evidence only and scale the map to [0, 1]
    heatmap = np.maximum(grad_CAM_map, 0)
    max_heat = np.max(heatmap)
    if max_heat == 0:
        max_heat = 1e-10
    heatmap /= max_heat

    return heatmap

--------------------------------------------------------------------------------