├── README.md
├── figures
    └── Method.png
├── requirements.txt
├── segmentation.py
└── segmentation_utils.py


/README.md:
--------------------------------------------------------------------------------
 1 | # Weakly-Supervised Surface Crack Segmentation by Generating Pseudo-Labels using Localization with a Classifier and Thresholding
 2 | This repository contains the code implementation of our weakly-supervised pseudo-labeling method from the paper: _Weakly-Supervised Surface Crack Segmentation by Generating Pseudo-Labels using Localization with a Classifier and Thresholding_.
 3 | 
 4 | König, J., Jenkins, M.D., Mannion, M., Barrie, P. and Morison, G., 2021. Weakly-Supervised Surface Crack Segmentation by Generating Pseudo-Labels using Localization with a Classifier and Thresholding. [[arXiv]](https://arxiv.org/abs/2109.00456)
 5 | 
 6 | #### Abstract
 7 | Surface cracks are a common sight on public infrastructure nowadays. Recent work has been addressing this problem by supporting structural maintenance measures using machine learning methods. Those methods are used to segment surface cracks from their background, making them easier to localize. However, a common issue is that to create a well-functioning algorithm, the training data needs to have detailed annotations of pixels that belong to cracks. Our work proposes a weakly supervised approach that leverages a CNN classifier in a novel way to create surface crack pseudo labels. First, we use the classifier to create a rough crack localization map by using its class activation maps and a patch based classification approach and fuse this with a thresholding based approach to segment the mostly darker crack pixels. The classifier assists in suppressing noise from the background regions, which commonly are incorrectly highlighted as cracks by standard thresholding methods. Then, the pseudo labels can be used in an end-to-end approach when training a standard CNN for surface crack segmentation. Our method is shown to yield sufficiently accurate pseudo labels. Those labels, incorporated into segmentation CNN training using multiple recent crack segmentation architectures, achieve comparable performance to fully supervised methods on four popular crack segmentation datasets.
 8 | 
 9 | # Overview
10 | ![Weakly-Supervised-Crack-Segmentation](figures/Method.png)
11 | 
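The fusion stage of `predict` in `segmentation.py` averages the Grad-CAM++ map and the patch-classification map, keeps only values above 0.5, and multiplies the result with the thresholded map. A minimal NumPy sketch of that step follows. The function name `fuse_localisation`, its `cutoff` parameter and the toy arrays are illustrative assumptions, and the erosion and bilateral filtering applied in the actual code are left out.

```python
import numpy as np


def fuse_localisation(grad_cam, patch_scores, threshold_mask, cutoff=0.5):
    """Fuse two coarse localisation maps with a binary threshold mask.

    All inputs are 2-D arrays of identical shape with values in [0, 1].
    (Hypothetical helper, not part of the repository.)
    """
    # average the two localisation cues
    coarse = (grad_cam + patch_scores) / 2.0
    # keep only confident localisations, zero out everything else
    coarse = coarse * (coarse > cutoff)
    # dark pixels outside the confident regions are suppressed
    return coarse * threshold_mask


# toy 2x2 maps, chosen only for illustration
grad_cam = np.array([[0.9, 0.2], [0.8, 0.1]])
patch_scores = np.array([[0.7, 0.6], [0.1, 0.0]])
threshold_mask = np.array([[1.0, 1.0], [1.0, 0.0]])

fused = fuse_localisation(grad_cam, patch_scores, threshold_mask)
print(fused)
```

Only the top-left pixel survives in this toy example, because it is the only location where the averaged localisation exceeds the cutoff and the threshold mask is set.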
12 | 
13 | # This Repository
14 | This repository contains the code to run our proposed weakly-supervised pseudo-label creation method (`segmentation.py`).
15 | 
16 | **NOTE: Pretrained weights to run the pseudo-labeling procedure will be provided in due course**
17 | 
18 | ### Requirements
19 | 
20 | The main requirements for this project are listed in `requirements.txt` and can be installed with pip. We used Ubuntu 16.04, Python 3.8.11 and CUDA 10.1.
21 | 
22 | 
23 | 
24 | ### Generating Pseudo Labels
25 | 
26 | To generate pseudo-labels with our weakly-supervised method, run `segmentation.py`:
27 | 
28 | ```
29 | python segmentation.py --img_path=data/images_to_pseudo_label  --prediction_path=data/pseudo_labels --classifier_type=R50 --classifier_weight_path=weights_to_classifier.h5
30 | ```
31 | `classifier_type` can be set to `R50`, `R101` or `R152`. In every case, pretrained classifier weights in the `.h5` format are required and must be passed via `classifier_weight_path`. `img_path` is the directory containing the images (`.jpg` or `.png`) to pseudo-label, and `prediction_path` is the directory where the predictions are stored as `.png` files.
32 | 
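`segmentation.py` also accepts the optional arguments `--patch_size`, `--stride_classifier` and `--stride_thresholding` (defaults 32, 16 and 8). As a rough illustration of how these values determine the number of patches per image, the sketch below mirrors the padding and grid arithmetic of `_split_into_patches`. The helper names and the 70x100 image size are assumptions made for this example only.

```python
def padded_size(length, patch_size):
    """Round `length` up to the next multiple of `patch_size`."""
    remainder = length % patch_size
    return length if remainder == 0 else length + patch_size - remainder


def patches_per_axis(length, patch_size, stride):
    """Number of patch positions along one axis after padding."""
    return (padded_size(length, patch_size) - patch_size) // stride + 1


def patch_grid(height, width, patch_size=32, stride=16):
    """Return (rows, cols) of the patch grid for an image of the given size."""
    return (
        patches_per_axis(height, patch_size, stride),
        patches_per_axis(width, patch_size, stride),
    )


# hypothetical 70x100 image, padded to 96x128 before patch extraction
rows, cols = patch_grid(70, 100, patch_size=32, stride=16)
print(rows, cols, rows * cols)  # prints: 5 7 35
```

A smaller stride increases the overlap between patches and therefore the patch count: with a stride of 8 the same image yields a 9x13 grid.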
33 | 
34 | # Reference
35 | 
36 | If you use our proposed model or code, please cite our paper:
37 | 
38 | ```
39 | @article{konig2021aweakly,
40 |   title={A Weakly-Supervised Surface Crack Segmentation Method using Localisation with a Classifier and Thresholding},
41 |   author={K{\"o}nig, Jacob and Jenkins, Mark and Mannion, Mike and Barrie, Peter and Morison, Gordon},
42 |   journal={arXiv preprint arXiv:2109.00456},
43 |   year={2021}
44 | }
45 | ```
46 | 
47 | 
48 | # Acknowledgements:
49 | 
50 | This project uses code and data from the following:
51 | 
52 | - [DeepCrack Dataset](https://github.com/yhlleo/DeepCrack)
53 | - [CrackForest Dataset](https://github.com/cuilimeng/CrackForest-dataset)
54 | - [AigleRN + ESAR + LCMS Datasets](https://www.irit.fr/~Sylvie.Chambon/Crack_Detection_Database.html)
55 | - [Retina U-Net](https://github.com/orobix/retina-unet)
56 | 


--------------------------------------------------------------------------------
/figures/Method.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jacobkoenig/WeaklySupervisedCrackSeg/79bce54582f62fe3fc238eb6cb7f2eff81276b56/figures/Method.png


--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
 1 | numpy
 2 | Pillow
 3 | tensorflow==2.3.0
 4 | tensorflow-gpu==2.3.0
 5 | albumentations==0.5.2
 6 | matplotlib
 7 | opencv-python
 8 | scipy
 9 | scikit-learn
10 | scikit-image
11 | 


--------------------------------------------------------------------------------
/segmentation.py:
--------------------------------------------------------------------------------
  1 | import os
  2 | import numpy as np
  3 | import argparse
  4 | 
  5 | np.set_printoptions(suppress=True)
  6 | import tensorflow.keras.backend as K
  7 | import tensorflow as tf
  8 | from PIL import Image
  9 | from tensorflow.keras.preprocessing.image import load_img
 10 | import cv2
 11 | import numpy as np
 12 | from segmentation_utils import (
 13 |     model_factory,
 14 |     make_gradcam_plus_heatmap,
 15 |     _split_into_patches,
 16 |     _merge_out_preds,
 17 |     _norm_threshold_patches,
 18 | )
 19 | 
 20 | gpus = tf.config.experimental.list_physical_devices("GPU")
 21 | for gpu in gpus:
 22 |     tf.config.experimental.set_memory_growth(gpu, True)
 23 | import numpy as np
 24 | import tensorflow as tf
 25 | 
 26 | 
 27 | class WeaklySupervisedCrackSeg:
 28 |     def __init__(
 29 |         self,
 30 |         classifier_type="R50",
 31 |         classifier_weight_path="./",
 32 |         patch_size=32,
 33 |         stride_classifier=16,
 34 |         stride_thresholding=8,
 35 |     ):
 36 | 
 37 |         self.classifier_type = classifier_type
 38 |         self.classifier_weight_path = classifier_weight_path
 39 |         self.patch_size = patch_size
 40 |         self.stride_classifier = stride_classifier
 41 |         self.stride_thresholding = stride_thresholding
 42 | 
 43 |         self.classifier = model_factory(classifier_type=self.classifier_type)
 44 |         self.classifier.load_weights(classifier_weight_path)
 45 |         print("--- Classification Model -----")
 46 |         print(self.classifier.summary())
 47 | 
 48 |     def predict(self, img, detailed_output=False):
 49 |         """Predicts the segmentation map for a single image
 50 | 
 51 |         Args:
 52 |             img (np.array): RGB input image in the range [0, 255] with len(img.shape) == 3
 53 |             detailed_output (bool, optional): whether to also return the Grad-CAM++, classification, merged localisation and threshold maps. Defaults to False.
 54 |         Returns:
 55 |             np.array: uint8 segmentation map in the range [0, 255] with len(prediction.shape) == 2
 56 |         """
 57 |         morph_kernel = np.ones((3, 3), np.uint8)
 58 | 
 59 |         # --- Coarse Localisation ---
 60 |         img_patches = _split_into_patches(
 61 |             img / 255.0, self.patch_size, self.stride_classifier
 62 |         )
 63 | 
 64 |         classifier_pred = self.classifier(img_patches)
 65 |         merged_classifier_pred = _merge_out_preds(
 66 |             classifier_pred,
 67 |             img.shape[0],
 68 |             img.shape[1],
 69 |             self.patch_size,
 70 |             self.stride_classifier,
 71 |         )
 72 |         merged_classifier_pred = (
 73 |             np.array(
 74 |                 Image.fromarray(
 75 |                     np.squeeze(merged_classifier_pred * 255).astype("uint8")
 76 |                 ).resize((img.shape[1], img.shape[0]), 5)
 77 |             )
 78 |             / 255.0
 79 |         )
 80 | 
 81 |         grad_cam_plus = make_gradcam_plus_heatmap(
 82 |             img / 255.0, self.classifier, "global_average_pooling2d"
 83 |         )
 84 |         grad_cam_plus = (
 85 |             np.array(
 86 |                 Image.fromarray(np.squeeze(grad_cam_plus * 255).astype("uint8")).resize(
 87 |                     (img.shape[1], img.shape[0]), 5
 88 |                 )
 89 |             )
 90 |             / 255.0
 91 |         )
 92 |         # average and only keep very confident localisations
 93 |         merge_cam_class = (grad_cam_plus + merged_classifier_pred) / 2.0
 94 |         merge_cam_class *= merge_cam_class > 0.5
 95 |         # perform erosion to narrow the predicted crack-regions
 96 |         merge_cam_class = cv2.erode(merge_cam_class, morph_kernel, iterations=4)
 97 | 
 98 |         # --- Thresholding ---
 99 |         bilateral = cv2.bilateralFilter((img).astype("uint8"), 5, 120, 120)
100 |         thresholded = _norm_threshold_patches(
101 |             bilateral, self.patch_size, self.stride_thresholding
102 |         )
103 | 
104 |         # --- Merging of Localisation with Thresholded Map ---
105 |         segmentation = merge_cam_class * thresholded
106 |         # stay in uint8 to avoid an unnecessary float round trip between the filters
107 |         segmentation = cv2.bilateralFilter(
108 |             (segmentation * 255).astype("uint8"), 5, 120, 120
109 |         )
110 |         segmentation = cv2.morphologyEx(segmentation, cv2.MORPH_CLOSE, morph_kernel)
111 | 
112 |         if detailed_output:
113 |             return (
114 |                 segmentation,
115 |                 grad_cam_plus,
116 |                 merged_classifier_pred,
117 |                 merge_cam_class,
118 |                 thresholded,
119 |             )
120 |         else:
121 |             return segmentation
122 | 
123 | 
124 | def parse_args():
125 |     parser = argparse.ArgumentParser()
126 |     parser.add_argument("--img_path", required=True)
127 |     parser.add_argument("--prediction_path", required=True)
128 | 
129 |     parser.add_argument("--classifier_type", required=True)
130 |     parser.add_argument("--classifier_weight_path", required=True)
131 | 
132 |     parser.add_argument("--patch_size", type=int, default=32)
133 |     parser.add_argument("--stride_classifier", type=int, default=16)
134 |     parser.add_argument("--stride_thresholding", type=int, default=8)
135 | 
136 |     parser.add_argument("--device", default="gpu:0")
137 | 
138 |     args = parser.parse_args()
139 |     return args
140 | 
141 | 
142 | def main(cl_args):
143 |     classifier_type = cl_args.classifier_type
144 |     classifier_weight_path = cl_args.classifier_weight_path
145 |     patch_size = int(cl_args.patch_size)
146 |     stride_classifier = int(cl_args.stride_classifier)
147 |     stride_thresholding = int(cl_args.stride_thresholding)
148 | 
149 |     img_path = cl_args.img_path
150 |     prediction_path = cl_args.prediction_path
151 | 
152 |     device = cl_args.device
153 | 
154 |     os.makedirs(prediction_path, exist_ok=True)
155 |     assert classifier_type in ["R50", "R101", "R152"]
156 | 
157 |     with tf.device(f"/{device}"):
158 | 
159 |         weakly = WeaklySupervisedCrackSeg(
160 |             classifier_type=classifier_type,
161 |             classifier_weight_path=classifier_weight_path,
162 |             patch_size=patch_size,
163 |             stride_classifier=stride_classifier,
164 |             stride_thresholding=stride_thresholding,
165 |         )
166 | 
167 |         for filename in sorted(os.listdir(img_path)):
168 |             if filename.endswith(".jpg") or filename.endswith(".png"):
169 |                 print("Predicting File:", filename)
170 |                 img = load_img(os.path.join(img_path, filename), color_mode="rgb",)
171 |                 img = np.array(img)
172 | 
173 |                 prediction = weakly.predict(img)
174 |                 cv2.imwrite(
175 |                     os.path.join(prediction_path, os.path.splitext(filename)[0] + ".png"),
176 |                     prediction,
177 |                 )
178 | 
179 | 
180 | if __name__ == "__main__":
181 |     args = parse_args()
182 |     main(args)
183 | 


--------------------------------------------------------------------------------
/segmentation_utils.py:
--------------------------------------------------------------------------------
  1 | import tensorflow.keras as keras
  2 | import numpy as np
  3 | from tensorflow.keras import applications
  4 | 
  5 | np.set_printoptions(suppress=True)
  6 | import tensorflow.keras.backend as K
  7 | import tensorflow as tf
  8 | import numpy as np
  9 | from skimage.filters import threshold_multiotsu
 10 | import cv2
 11 | 
 12 | # some of the patch code has been adapted from https://github.com/orobix/retina-unet/blob/master/lib/extract_patches.py
 13 | 
 14 | 
 15 | def _split_into_patches(ar, patch_size, stride):
 16 |     h, w, c = ar.shape
 17 | 
 18 |     # pad img so that it is divisible by the patch size
 19 |     pad_h = patch_size - h % patch_size if h % patch_size != 0 else 0
 20 |     pad_w = patch_size - w % patch_size if w % patch_size != 0 else 0
 21 | 
 22 |     padded = np.pad(ar, [(0, pad_h), (0, pad_w), (0, 0)], mode="reflect")
 23 |     padded_h, padded_w, _ = padded.shape
 24 |     ppi = ((padded_h - patch_size) // stride + 1) * (
 25 |         (padded_w - patch_size) // stride + 1
 26 |     )
 27 |     patches = np.empty((ppi, patch_size, patch_size, c))
 28 |     patch_ix = 0
 29 |     for h in range((padded_h - patch_size) // stride + 1):
 30 |         for w in range((padded_w - patch_size) // stride + 1):
 31 |             patch = padded[
 32 |                 h * stride : (h * stride) + patch_size,
 33 |                 w * stride : (w * stride) + patch_size,
 34 |             ]
 35 |             patches[patch_ix] = patch
 36 |             patch_ix += 1
 37 |     return patches
 38 | 
 39 | 
 40 | def _merge_out_preds(preds, img_h, img_w, patch_size, stride):
 41 |     # Pad to multiples of patch_size exactly as _split_into_patches does,
 42 |     # so the prediction grid matches the number of extracted patches.
 43 |     img_h += (patch_size - img_h % patch_size) % patch_size
 44 |     img_w += (patch_size - img_w % patch_size) % patch_size
 45 | 
 46 |     N_preds_h = (img_h - patch_size) // stride + 1
 47 |     N_preds_w = (img_w - patch_size) // stride + 1
 48 |     probabilities = np.zeros((N_preds_h, N_preds_w))
 49 |     counts = np.zeros((N_preds_h, N_preds_w))
 50 | 
 51 |     i = 0
 52 |     for h in range(N_preds_h):
 53 |         for w in range(N_preds_w):
 54 |             probabilities[h, w] += preds[i][0]
 55 |             counts[h, w] += 1
 56 |             i += 1
 57 |     avg = probabilities / counts
 58 |     return avg
 59 | 
 60 | 
 61 | def _norm_threshold_patches(img, patch_size, stride):
 62 |     img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # NOTE: segmentation.py passes RGB data, so the R and B weights are swapped
 63 |     h, w = img.shape
 64 |     # pad img so that it is divisible by the patch size
 65 |     pad_h = patch_size - h % patch_size if h % patch_size != 0 else 0
 66 |     pad_w = patch_size - w % patch_size if w % patch_size != 0 else 0
 67 | 
 68 |     padded = np.pad(img, [(0, pad_h), (0, pad_w)], mode="reflect")
 69 | 
 70 |     padded_h, padded_w = padded.shape
 71 | 
 72 |     ppi = ((padded_h - patch_size) // stride + 1) * (
 73 |         (padded_w - patch_size) // stride + 1
 74 |     )
 75 | 
 76 |     N_patches_h = (padded_h - patch_size) // stride + 1
 77 |     N_patches_w = (padded_w - patch_size) // stride + 1
 78 | 
 79 |     patches = np.empty((ppi, patch_size, patch_size))
 80 | 
 81 |     patch_ix = 0
 82 |     for h_temp in range(N_patches_h):
 83 |         for w_temp in range(N_patches_w):
 84 |             patch = padded[
 85 |                 h_temp * stride : (h_temp * stride) + patch_size,
 86 |                 w_temp * stride : (w_temp * stride) + patch_size,
 87 |             ]
 88 |             patch = normalize(patch)
 89 |             thresh = threshold_multiotsu(patch, classes=3)
 90 |             patch = patch < thresh[0]
 91 |             patches[patch_ix] = patch
 92 |             patch_ix += 1
 93 | 
 94 |     # merge patches
 95 |     out = np.ones((padded_h, padded_w))
 96 |     i = 0
 97 |     for h_temp in range(N_patches_h):
 98 |         for w_temp in range(N_patches_w):
 99 |             out[
100 |                 h_temp * stride : (h_temp * stride) + patch_size,
101 |                 w_temp * stride : (w_temp * stride) + patch_size,
102 |             ] *= patches[i]
103 |             i += 1
104 | 
105 |     out = out[:h, :w]
106 |     return out
107 | 
108 | 
109 | def normalize(img):
110 |     img = img.astype("float")
111 |     minimum = img.min()
112 |     maximum = img.max()
113 |     return (img - minimum) / (maximum - minimum)
114 | 
115 | 
116 | def model_factory(classifier_type="R50"):
117 |     """
118 |     basic binary classification model with pretrained ResNet
119 |     """
120 | 
121 |     classifier_options = {
122 |         "R50": tf.keras.applications.ResNet50,
123 |         "R101": tf.keras.applications.ResNet101,
124 |         "R152": tf.keras.applications.ResNet152,
125 |     }
126 |     assert classifier_type in classifier_options.keys()
127 | 
128 |     resnet = classifier_options[classifier_type](
129 |         input_shape=(None, None, 3), include_top=False, weights="imagenet"
130 |     )
131 |     inp = keras.layers.Input(shape=(None, None, 3))
132 |     x = resnet(inp)
133 |     x = keras.layers.GlobalAveragePooling2D()(x)
134 |     x = keras.layers.Dense(1000, activation="relu")(x)
135 |     x = keras.layers.Dense(2, activation="softmax")(x)
136 | 
137 |     model = keras.models.Model(inputs=inp, outputs=x)
138 |     return model
139 | 
140 | 
141 | # slightly adapted from: https://github.com/samson6460/tf_keras_gradcamplusplus/blob/master/gradcam.py
142 | def make_gradcam_plus_heatmap(
143 |     img, model, layer_name="block5_conv3", label_name=None, category_id=None
144 | ):
145 |     """Get a heatmap by Grad-CAM++.
146 |     Args:
147 |         model: A model object, built from tf.keras 2.X.
148 |         img: An image ndarray.
149 |         layer_name: A string, name of the layer whose input feature map is explained.
150 |         label_name: A list of all label names,
151 |             if given, the name of the selected
152 |             category is printed.
153 |         category_id: An integer, index of the class.
154 |             Default is the category with the highest score in the prediction.
155 |     Return:
156 |         A heatmap ndarray (without color).
157 |     """
158 |     img_tensor = np.expand_dims(img, axis=0)
159 | 
160 |     pool_layer = model.get_layer(layer_name)
161 |     heatmap_model = tf.keras.models.Model(
162 |         [model.inputs], [pool_layer.input, model.output]
163 |     )
164 | 
165 |     with tf.GradientTape() as gtape1:
166 |         with tf.GradientTape() as gtape2:
167 |             with tf.GradientTape() as gtape3:
168 |                 conv_output, predictions = heatmap_model(img_tensor)
169 |                 if category_id is None:
170 |                     category_id = np.argmax(predictions[0])
171 |                 if label_name:
172 |                     print(label_name[category_id])
173 |                 output = predictions[:, category_id]
174 |                 conv_first_grad = gtape3.gradient(output, conv_output)
175 |             conv_second_grad = gtape2.gradient(conv_first_grad, conv_output)
176 |         conv_third_grad = gtape1.gradient(conv_second_grad, conv_output)
177 | 
178 |     global_sum = np.sum(conv_output, axis=(0, 1, 2))
179 | 
180 |     alpha_num = conv_second_grad[0]
181 |     alpha_denom = conv_second_grad[0] * 2.0 + conv_third_grad[0] * global_sum
182 |     alpha_denom = np.where(alpha_denom != 0.0, alpha_denom, 1e-10)
183 | 
184 |     alphas = alpha_num / alpha_denom
185 |     alpha_normalization_constant = np.sum(alphas, axis=(0, 1))
186 |     alphas /= alpha_normalization_constant
187 | 
188 |     weights = np.maximum(conv_first_grad[0], 0.0)
189 | 
190 |     deep_linearization_weights = np.sum(weights * alphas, axis=(0, 1))
191 |     grad_CAM_map = np.sum(deep_linearization_weights * conv_output[0], axis=2)
192 | 
193 |     heatmap = np.maximum(grad_CAM_map, 0)
194 |     max_heat = np.max(heatmap)
195 |     if max_heat == 0:
196 |         max_heat = 1e-10
197 |     heatmap /= max_heat
198 | 
199 |     return heatmap
200 | 


--------------------------------------------------------------------------------