├── LICENSE
├── README.md
├── doc
│   └── results_legend.png
├── predict.py
├── predict_template.py
├── segmentation
│   ├── bee_dataset.py
│   ├── dataset.py
│   ├── model.py
│   ├── results_analysis.py
│   ├── results_visualization.py
│   ├── training_config.py
│   └── unet.py
├── train.py
└── train_template.py

/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2019 Okinawa Institute of Science & Technology
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | Segmentation-based bee detection
2 | ================================
3 |
4 | This repository contains code and instructions for a segmentation-based object detection method, initially developed for a beehive environment and described here: [Towards Dense Object Tracking in a 2D Honeybee Hive](http://openaccess.thecvf.com/content_cvpr_2018/html/Bozek_Towards_Dense_Object_CVPR_2018_paper.html).
5 |
6 | The repository is prepared for the beehive dataset available on [this webpage](https://groups.oist.jp/bptu/honeybee-tracking-dataset). The model is a simplified version of the one used in the paper: it processes single frames only, uses no Gaussian weights, and works at a smaller resolution.
7 |
8 |
9 | ## Requirements
10 | - Python 3.5+
11 | - [TensorFlow](https://www.tensorflow.org/) (tested on 1.12)
12 | - [OpenCV](https://opencv.org/)
13 | - [Numpy](http://www.numpy.org/)
14 |
15 | ## Run training and prediction for the beehive dataset
16 | Download one of the two available bee datasets, for example the 30 fps one:
17 | 30 fps: [images](https://beepositions.unit.oist.jp/frame_imgs_30fps.tgz), [annotations](https://beepositions.unit.oist.jp/frame_annotations_30fps.tgz)
18 | 70 fps: [images](https://beepositions.unit.oist.jp/frame_imgs_70fps.tgz), [annotations](https://beepositions.unit.oist.jp/frame_annotations_70fps.tgz)
19 |
20 | Create an empty folder and uncompress the two archives into it; the resulting file structure should look like:
21 | ```
22 | dataset
23 | +-- frames
24 | |   +-- *.png
25 | +-- frames_txt
26 | |   +-- *.txt
27 | ```
28 |
29 | You can train the model by passing the path to your dataset root folder as an argument:
30 |
31 | `python3 train.py path_to_dataset`
32 |
33 | A subset of the images is set aside for testing (parameter `--validation_num_files`).
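The split is recorded in a `training_config.json` file at the dataset root and reused on subsequent runs. A minimal sketch of its layout (the frame names here are placeholders):
```
{
    "test": ["frame_000", "frame_001"],
    "train": ["frame_002", "frame_003"]
}
```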
During training, you should see the loss value decrease over time.
34 |
35 | After training, you can predict the results on the test set by running:
36 |
37 | `python3 predict.py path_to_dataset`
38 |
39 | The results will be stored in a new `predict_results` folder. For each input frame, three images are generated:
40 | ![alt text](doc/results_legend.png)
41 |
42 | An aggregate file of error metrics over all test images is also created: `average_error_metrics.txt`.
43 |
44 |
45 |
46 | ## To adapt to another dataset
47 |
48 | To predict on a new dataset, you can use the script `predict_template.py`, replacing the function `predict_data_generator` so that it loads your own data.
49 |
50 | To train on a new dataset, you can use the script `train_template.py`, replacing the functions `train_data_generator` and `eval_data_generator` (each yields pairs of 2D images: image_data, label_data); a sketch of such generators appears at the end of this document.
51 |
52 | The labels for the beehive dataset were created using a custom [tool](https://github.com/oist/DenseObjectAnnotation).
53 |
54 | ## Contact
55 |
56 | If you have any questions about this project, please contact:
57 | laetitia.hebert at oist.jp
58 | kasia.bozek at oist.jp
59 |
--------------------------------------------------------------------------------
/doc/results_legend.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/oist/DenseObjectDetection/004ebf3d76bd66fcaa7f13ce3acafbf336927ed5/doc/results_legend.png
--------------------------------------------------------------------------------
/predict.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import logging
3 | import os
4 | import shutil
5 | from functools import partial
6 |
7 | import numpy as np
8 | import cv2
9 | import tensorflow as tf
10 |
11 | from segmentation import model, dataset, bee_dataset, training_config
12 | from segmentation.results_analysis import find_positions, compute_error_metrics
13 | from segmentation.results_visualization import plot_positions, plot_segmentation_map, plot_TP_FN_FP
14 |
15 | logging.basicConfig()
16 | logger = logging.getLogger(__name__)
17 | logger.setLevel(logging.DEBUG)
18 | tf.logging.set_verbosity(tf.logging.INFO)
19 |
20 |
21 | def save_error_file(point_d, axis_d, tps, fps, fns, correct_type, out_file):
22 |     total = len(point_d) + fns
23 |     tps_mn = float(tps) / total
24 |     fns_mn = float(fns) / total
25 |     fps_mn = float(fps) / total
26 |     correct_type_pr = float(correct_type) / tps
27 |
28 |     with open(out_file, "w") as f:
29 |         f.write("position error (pixels): mean: {:.2f} ({:.2f}) median: {:.2f}\n".format(np.mean(point_d),
30 |                                                                                          np.std(point_d),
31 |                                                                                          np.median(point_d)))
32 |         f.write("correct class: {:.2f}%\n".format(correct_type_pr * 100))
33 |         axis_d = np.rad2deg(axis_d)
34 |         f.write("axis error (degrees): mean: {:.2f} ({:.2f}) median: {:.2f}\n".format(np.mean(axis_d),
35 |                                                                                       np.std(axis_d),
36 |                                                                                       np.median(axis_d)))
37 |         f.write("True Positives: {:.2f}%\n".format(tps_mn * 100))
38 |         f.write("False Negatives: {:.2f}%\n".format(fns_mn * 100))
39 |         f.write("False Positives: {:.2f}%\n".format(fps_mn * 100))
40 |
41 |
42 | if __name__ == '__main__':
43 |
44 |     parser = argparse.ArgumentParser()
45 |     parser.add_argument('dataset_root_dir', type=str, help="Path to the dataset root folder (containing frames and frames_txt)")
46 |     parser.add_argument('--checkpoint_dir', default='checkpoints', help="Path to trained model folder")
47 |     parser.add_argument('--results_folder', default='predict_results', help="Output folder")
48 |
49 |     # model parameters, should be the same as for training
50 |     parser.add_argument('--num_classes', type=int, default=3, help="How many outputs of the model")
51 |     parser.add_argument('--data_format', type=str, default='channels_last', choices={'channels_last', 'channels_first'})
52 |
53 |     # evaluation metrics
54 |     parser.add_argument('--min_distance_px', type=int, default=20,
55 |                         help="Minimum distance in pixels between prediction and label objects"
56 |                              " to be considered a true positive result."
57 |                              " Use same coordinate system as the predicted image.")
58 |     parser.add_argument('--min_blob_size_px', type=int, default=20,
59 |                         help="Blobs with bounding box sides smaller than min_blob_size_px are discarded."
60 |                              " Use same coordinate system as the predicted image.")
61 |     parser.add_argument('--max_blob_size_px', type=int, default=200,
62 |                         help="Blobs with bounding box sides larger than max_blob_size_px are discarded."
63 |                              " Use same coordinate system as the predicted image.")
64 |
65 |     args = parser.parse_args()
66 |     logger.info('Predicting with settings: {}'.format(vars(args)))
67 |
68 |     output_path = os.path.join(os.getcwd(), args.results_folder)
69 |     if os.path.isdir(output_path):
70 |         shutil.rmtree(output_path)
71 |     os.mkdir(output_path)
72 |
73 |     images_root_dir = os.path.join(args.dataset_root_dir, 'frames')
74 |     labels_root_dir = os.path.join(args.dataset_root_dir, 'frames_txt')
75 |     if not (os.path.exists(images_root_dir) and os.path.exists(labels_root_dir)):
76 |         raise FileNotFoundError('Expected frames and frames_txt folders under {}'.format(args.dataset_root_dir))
77 |
78 |     config = training_config.get(args.dataset_root_dir)
79 |     if config is not None:
80 |         to_predict_filenames = config['test']
81 |         logger.info('Predicting only test images from training_config file.')
82 |     else:
83 |         logger.info(
84 |             "Couldn't find a training_config file, so predicting all images in folder {}".format(images_root_dir))
85 |         to_predict_filenames = [os.path.splitext(x)[0] for x in os.listdir(images_root_dir)]
86 |
87 |     labels = [bee_dataset.read_label_file_globalcoords(os.path.join(labels_root_dir, name + '.txt'))
88 |               for name in to_predict_filenames]
89 |     regions_of_interest = [l[1] for l in labels]
90 |
91 |     estimator = tf.estimator.Estimator(model_fn=partial(model.build_model,
92 |                                                         num_classes=args.num_classes,
93 |                                                         data_format=args.data_format,
94 |                                                         bg_fg_weight=None), model_dir=args.checkpoint_dir)
95 |
96 |     predictions = estimator.predict(input_fn=partial(dataset.make_dataset,
97 |                                                      data_generator=partial(bee_dataset.generate_predict,
98 |                                                                             images_root_dir=images_root_dir,
99 |                                                                             filenames=to_predict_filenames,
100 |                                                                            regions_of_interest=regions_of_interest),
101 |                                                     data_format=args.data_format,
102 |                                                     batch_size=1,
103 |                                                     mode=tf.estimator.ModeKeys.PREDICT))
104 |
105 |     drawing_functions = bee_dataset.get_object_drawing_functions()
106 |
107 |     TP_count, FP_count, FN_count, correct_type_count = 0, 0, 0, 0
108 |     all_pixel_dist, all_axis_diff = [], []
109 |
110 |     for name, prediction, label in zip(to_predict_filenames, predictions, labels):
111 |         logger.info('processing {}'.format(name))
112 |
113 |         input_image = prediction['input_data']
114 |         pred_image = prediction['prediction']
115 |
116 |         channels_axis = 0 if args.data_format == 'channels_first' else -1
117 |         amax = np.argmax(pred_image, axis=channels_axis)
118 |
119 |         input_image = np.uint8(np.squeeze(input_image) * 255)
120 |         input_image = cv2.cvtColor(input_image, cv2.COLOR_GRAY2BGR)
121 |
122 |         plot_segmentation_map(input_image, amax,
123 |                               os.path.join(output_path, "{}_seg_map.png".format(name)), num_classes=args.num_classes)
124 |
125 |         predictions_pos = find_positions(amax, args.min_blob_size_px, args.max_blob_size_px)
126 |         if len(predictions_pos) == 0:
127 |             logger.info("Blob analysis failed to find objects.")
128 |             continue
129 |
130 |         np.savetxt(os.path.join(output_path, "{}_predictions.csv".format(name)), predictions_pos, fmt="%i,%i,%i,%.4f")
131 |         np.savetxt(os.path.join(output_path, "{}_labels.csv".format(name)), label[0], fmt="%i,%i,%i,%.4f")
132 |
133 |         plot_positions(input_image, [label[0], predictions_pos], [(0, 250, 255), (0, 0, 255)],
134 |                        os.path.join(output_path, "{}_mixed.png".format(name)),
135 |                        drawing_params=drawing_functions)
136 |
137 |         pixel_dist, axis_diff, correct_type, TP_results, FN_results, FP_results \
138 |             = compute_error_metrics(np.array(label[0]), np.array(predictions_pos), dist_min=args.min_distance_px)
139 |
140 |         TP_count += len(TP_results)
141 |         FN_count += len(FN_results)
142 |         FP_count += len(FP_results)
143 |         correct_type_count += correct_type
144 |         all_pixel_dist += pixel_dist
145 |         all_axis_diff += axis_diff
146 |
147 |         plot_TP_FN_FP(input_image, TP_results, FN_results, FP_results,
148 |                       os.path.join(output_path, "{}_detail.png".format(name)), drawing_functions)
149 |
150 |     save_error_file(all_pixel_dist, all_axis_diff, TP_count, FP_count, FN_count, correct_type_count,
151 |                     os.path.join(output_path, "average_error_metrics.txt"))
152 |
--------------------------------------------------------------------------------
/predict_template.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import logging
3 | import os
4 | import shutil
5 | from functools import partial
6 |
7 | import numpy as np
8 | import cv2
9 | import tensorflow as tf
10 |
11 | from segmentation import model, dataset, bee_dataset
12 | from segmentation.results_analysis import find_positions
13 | from segmentation.results_visualization import plot_positions, plot_segmentation_map
14 |
15 | logging.basicConfig()
16 | logger = logging.getLogger(__name__)
17 | logger.setLevel(logging.DEBUG)
18 | tf.logging.set_verbosity(tf.logging.INFO)
19 |
20 |
21 | def predict_data_generator():
22 |     # Load your 2D image data here, as uint8 (values between 0 and 255)
23 |     # example:
24 |     # for my_image_path in my_images_paths:
25 |     #     yield cv2.imread(my_image_path, cv2.IMREAD_GRAYSCALE)
26 |     raise NotImplementedError()
27 |
28 |
29 | if __name__ == '__main__':
30 |
31 |     parser = argparse.ArgumentParser()
32 |     parser.add_argument('--checkpoint_dir', default='checkpoints', help="Path to trained model folder")
33 |     parser.add_argument('--results_folder', default='predict_results', help="Output folder")
34 |
35 |     # model parameters, should be the same as for training
36 |     parser.add_argument('--num_classes', type=int, default=3, help="How many outputs of the model")
37 |     parser.add_argument('--data_format', type=str, default='channels_last', choices={'channels_last', 'channels_first'})
38 |
39 |     # criteria to accept a blob as an object
40 |     parser.add_argument('--min_blob_size_px', type=int, default=20,
41 |                         help="Blobs with bounding box sides smaller than min_blob_size_px are discarded."
42 |                              " Use same coordinate system as the predicted image.")
43 |     parser.add_argument('--max_blob_size_px', type=int, default=200,
44 |                         help="Blobs with bounding box sides larger than max_blob_size_px are discarded. "
45 | "Use same coordinate system as the predicted image.") 46 | 47 | args = parser.parse_args() 48 | logger.info('Predicting with settings: {}'.format(vars(args))) 49 | 50 | output_path = os.path.join(os.getcwd(), args.results_folder) 51 | if os.path.isdir(output_path): 52 | shutil.rmtree(output_path) 53 | os.mkdir(output_path) 54 | 55 | estimator = tf.estimator.Estimator(model_fn=partial(model.build_model, 56 | num_classes=args.num_classes, 57 | data_format=args.data_format, 58 | bg_fg_weight=None), model_dir=args.checkpoint_dir) 59 | 60 | predictions = estimator.predict(input_fn=partial(dataset.make_dataset, 61 | data_generator=predict_data_generator, 62 | data_format=args.data_format, 63 | batch_size=1, 64 | mode=tf.estimator.ModeKeys.PREDICT)) 65 | 66 | drawing_functions = bee_dataset.get_object_drawing_functions() 67 | 68 | for index, prediction in enumerate(predictions): 69 | 70 | input_image = prediction['input_data'] 71 | pred_image = prediction['prediction'] 72 | 73 | channels_axis = 0 if args.data_format == 'channels_first' else -1 74 | amax = np.argmax(pred_image, axis=channels_axis) 75 | 76 | input_image = np.uint8(np.squeeze(input_image) * 255) 77 | input_image = cv2.cvtColor(input_image, cv2.COLOR_GRAY2BGR) 78 | 79 | plot_segmentation_map(input_image, amax, 80 | os.path.join(output_path, "{}_seg_map.png".format(index)), num_classes=args.num_classes) 81 | 82 | predictions_pos = find_positions(amax, args.min_blob_size_px, args.max_blob_size_px) 83 | if len(predictions_pos) == 0: 84 | logger.info("Blob analysis failed to find objects.") 85 | continue 86 | 87 | np.savetxt(os.path.join(output_path, "{}_predictions.csv".format(index)), predictions_pos, fmt="%i,%i,%i,%.4f") 88 | 89 | plot_positions(input_image, [predictions_pos], [(0, 250, 255)], 90 | os.path.join(output_path, "{}_positions.png".format(index)), 91 | drawing_params=drawing_functions) 92 | -------------------------------------------------------------------------------- /segmentation/bee_dataset.py: -------------------------------------------------------------------------------- 1 | import csv 2 | import logging 3 | import os 4 | 5 | import cv2 6 | import numpy as np 7 | 8 | logging.basicConfig() 9 | logger = logging.getLogger(__name__) 10 | logger.setLevel(logging.DEBUG) 11 | 12 | 13 | # raw image properties 14 | SUB_IMAGE_SIZE = (512, 512) 15 | BEE_OBJECT_SIZES = {1: (20, 35), # bee class is labeled 1 16 | 2: (20, 20)} # butt class is labeled 2 17 | # pre processing params 18 | SCALE_FACTOR = 2 # downscale images, labels by half 19 | 20 | 21 | def get_object_drawing_functions(): 22 | return {1: draw_bee_body, 23 | 2: draw_bee_butt} 24 | 25 | 26 | def draw_bee_butt(out_image, x, y, a, color): 27 | 28 | r = 30 // SCALE_FACTOR 29 | cv2.circle(out_image, (int(x), int(y)), r, color, thickness=2) 30 | draw_center(out_image, x, y, color) 31 | 32 | 33 | def draw_center(out_image, x, y, color): 34 | 35 | d = 4 // SCALE_FACTOR 36 | cv2.rectangle(out_image, (int(x) - d, int(y) - d), (int(x) + d, int(y) + d), color, thickness=-1) 37 | 38 | 39 | def draw_bee_body(out_image, x, y, a, color): 40 | 41 | d = 60. 
/ SCALE_FACTOR 42 | dx = np.sin(a) * d 43 | dy = np.cos(a) * d 44 | x1, y1, x2, y2 = int(x - dx), int(y + dy), int(x + dx), int(y - dy) 45 | cv2.line(out_image, (x1, y1), (x2, y2), color, thickness=2) 46 | draw_center(out_image, x, y, color) 47 | 48 | 49 | def generate_training(frames_root_dir, labels_root_dir, filenames): 50 | 51 | for name in filenames: 52 | 53 | label_filepath = os.path.join(labels_root_dir, name + '.txt') 54 | image_filepath = os.path.join(frames_root_dir, name + '.png') 55 | 56 | if not os.path.exists(label_filepath) or not os.path.exists(image_filepath): 57 | logger.info('Skipping {}.'.format(name)) 58 | continue 59 | 60 | image = cv2.imread(image_filepath, cv2.IMREAD_GRAYSCALE) 61 | 62 | frame_label = read_label_file(label_filepath) 63 | 64 | all_unique_offsets = np.unique([[x[0], x[1]] for x in frame_label], axis=0) 65 | 66 | sub_label_size = (SUB_IMAGE_SIZE[0] // SCALE_FACTOR, 67 | SUB_IMAGE_SIZE[1] // SCALE_FACTOR) 68 | 69 | for offset_x, offset_y in all_unique_offsets: 70 | 71 | label_image = np.zeros(sub_label_size, dtype=np.uint8) 72 | 73 | sub_labels = [x for x in frame_label if x[0] == offset_x and x[1] == offset_y] 74 | 75 | for _, _, x, y, bee_type, angle in sub_labels: 76 | bee_object_size = BEE_OBJECT_SIZES[bee_type] 77 | 78 | x = x // SCALE_FACTOR 79 | y = y // SCALE_FACTOR 80 | r1 = bee_object_size[0] // SCALE_FACTOR 81 | r2 = bee_object_size[1] // SCALE_FACTOR 82 | 83 | ellipse_around_point(label_image, y, x, angle, r1=r1, r2=r2, value=bee_type) 84 | 85 | sub_image = image[offset_y:offset_y + SUB_IMAGE_SIZE[0], 86 | offset_x:offset_x + SUB_IMAGE_SIZE[1]] 87 | 88 | fx, fy = (1 / float(SCALE_FACTOR),) * 2 89 | sub_image = cv2.resize(sub_image, None, fx=fx, fy=fy, interpolation=cv2.INTER_LINEAR) 90 | 91 | yield sub_image, label_image 92 | 93 | 94 | def generate_predict(images_root_dir, filenames, regions_of_interest): 95 | 96 | for name, roi in zip(filenames, regions_of_interest): 97 | image_filepath = os.path.join(images_root_dir, name + '.png') 98 | image = cv2.imread(image_filepath, cv2.IMREAD_GRAYSCALE) 99 | image = image[roi[2]:roi[3], roi[0]:roi[1]] 100 | fx, fy = (1 / float(SCALE_FACTOR),) * 2 101 | image = cv2.resize(image, None, fx=fx, fy=fy, interpolation=cv2.INTER_LINEAR) 102 | yield image 103 | 104 | 105 | def ellipse_around_point(image, xc, yc, angle, r1, r2, value): 106 | 107 | image_size = image.shape 108 | 109 | ind0 = np.arange(-xc, image_size[0] - xc)[:, np.newaxis] * np.ones((1, image_size[1])) 110 | ind1 = np.arange(-yc, image_size[1] - yc)[np.newaxis, :] * np.ones((image_size[0], 1)) 111 | ind = np.concatenate([ind0[np.newaxis], ind1[np.newaxis]], axis=0) 112 | 113 | sin_a = np.sin(angle) 114 | cos_a = np.cos(angle) 115 | 116 | image[((ind[0, :, :] * sin_a + ind[1, :, :] * cos_a) ** 2 / r1 ** 2 + ( 117 | ind[1, :, :] * sin_a - ind[0, :, :] * cos_a) ** 2 / r2 ** 2) <= 1] = value 118 | 119 | return image 120 | 121 | 122 | def read_label_file(label_filename): 123 | with open(label_filename, 'r') as csvfile: 124 | csv_reader = csv.reader(csvfile, delimiter='\t') 125 | 126 | def parse_row(row): 127 | offset_x, offset_y = int(row[0]), int(row[1]) 128 | bee_type = int(row[2]) 129 | x, y = int(row[3]), int(row[4]) 130 | angle = float(row[5]) 131 | 132 | return offset_x, offset_y, x, y, bee_type, angle 133 | 134 | return list(map(parse_row, csv_reader)) 135 | 136 | 137 | def read_label_file_globalcoords(label_filename): 138 | 139 | rows = read_label_file(label_filename) 140 | 141 | unique_offsets = np.unique([[x[0], x[1]] for x in rows], 
axis=0) 142 | 143 | roi = [np.min(unique_offsets[:, 0]), np.max(unique_offsets[:, 0]) + SUB_IMAGE_SIZE[0], 144 | np.min(unique_offsets[:, 1]), np.max(unique_offsets[:, 1]) + SUB_IMAGE_SIZE[1]] 145 | 146 | labels_global_coordinates = [[offset_x + x - roi[0], offset_y + y - roi[2], bee_type, angle] 147 | for offset_x, offset_y, x, y, bee_type, angle in rows] 148 | 149 | labels_global_coordinates = [[x // SCALE_FACTOR, y // SCALE_FACTOR, bee_type, angle] 150 | for x, y, bee_type, angle in labels_global_coordinates] 151 | 152 | return labels_global_coordinates, roi 153 | -------------------------------------------------------------------------------- /segmentation/dataset.py: -------------------------------------------------------------------------------- 1 | from functools import partial 2 | 3 | import tensorflow as tf 4 | from tensorflow.python.data import Dataset 5 | 6 | 7 | def make_dataset(data_generator, data_format, batch_size, mode): 8 | first = next(data_generator()) 9 | 10 | if mode == tf.estimator.ModeKeys.PREDICT: 11 | types = tf.uint8 12 | shapes = first.shape 13 | else: 14 | types = (tf.uint8, tf.uint8) 15 | shapes = (first[0].shape, first[1].shape) 16 | 17 | dataset = Dataset.from_generator(data_generator, types, shapes) 18 | 19 | if mode == tf.estimator.ModeKeys.TRAIN: 20 | dataset = dataset.shuffle(buffer_size=10) 21 | dataset = dataset.repeat() 22 | 23 | process_data_fn = process_image if mode == tf.estimator.ModeKeys.PREDICT else process_image_label 24 | dataset = dataset.map(partial(process_data_fn, data_format=data_format), 25 | num_parallel_calls=64) 26 | 27 | dataset = dataset.batch(batch_size) 28 | 29 | if mode == tf.estimator.ModeKeys.TRAIN: 30 | dataset = dataset.map(partial(augment_data)) 31 | 32 | dataset = dataset.prefetch(batch_size) 33 | 34 | return dataset 35 | 36 | 37 | def process_image(image_data, data_format): 38 | 39 | image_data = tf.cast(image_data, dtype=tf.float32) / 255. 
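    # Scale uint8 pixels to floats in [0, 1]; the next line adds the explicit channel axis the network expects: (H, W, 1) for channels_last, (1, H, W) for channels_first.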
40 | image_data = image_data[:, :, tf.newaxis] if data_format == 'channels_last' else image_data[tf.newaxis, :, :] 41 | return image_data 42 | 43 | 44 | def process_image_label(image_data, label_data, data_format): 45 | 46 | image_data = process_image(image_data, data_format) 47 | 48 | label_data = tf.cast(label_data, dtype=tf.float32) 49 | label_data = label_data[:, :, tf.newaxis] if data_format == 'channels_last' else label_data[tf.newaxis, :, :] 50 | return image_data, label_data 51 | 52 | 53 | def augment_data(image_data, label_data): 54 | # random flip horizontally 55 | flip_cond = tf.less(tf.random_uniform([], 0, 1.0), .5) 56 | image_data, label_data = tf.cond(flip_cond, 57 | lambda: ( 58 | tf.image.flip_left_right(image_data), tf.image.flip_left_right(label_data)), 59 | lambda: (image_data, label_data)) 60 | 61 | # random rotation 90 62 | k = tf.random_uniform((), minval=0, maxval=4, dtype=tf.int32) 63 | image_data = tf.image.rot90(image_data, k=k) 64 | label_data = tf.image.rot90(label_data, k=k) 65 | 66 | return image_data, label_data 67 | -------------------------------------------------------------------------------- /segmentation/model.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | 3 | from segmentation.unet import create_unet 4 | 5 | 6 | def build_model(features, labels, num_classes, data_format, bg_fg_weight, mode): 7 | # The U-net network generates an output from the images data, network_output is the last layer of the network 8 | network_output = create_unet(features, num_classes, data_format) 9 | 10 | # Prediction step : return the network output, plus the input data to compare 11 | if mode == tf.estimator.ModeKeys.PREDICT: 12 | predictions = {'prediction': network_output, 'input_data': features} 13 | return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions) 14 | 15 | # Calculate the loss function by comparing network output and labels 16 | loss = add_loss(network_output, labels, data_format, num_classes=num_classes, weight=bg_fg_weight) 17 | 18 | # Evaluation step : output only the loss 19 | if mode == tf.estimator.ModeKeys.EVAL: 20 | return tf.estimator.EstimatorSpec(mode=mode, loss=loss) 21 | 22 | # Training step : minimize the loss using an optimizer, outputs the loss too 23 | if mode == tf.estimator.ModeKeys.TRAIN: 24 | optimizer = tf.train.AdamOptimizer() 25 | train_op = optimizer.minimize(loss, global_step=tf.train.get_global_step()) 26 | return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op) 27 | 28 | 29 | def add_loss(logits, labels, data_format, num_classes, weight): 30 | with tf.name_scope('loss'): 31 | # Convert the labels to separate binary channels : one for each class 32 | channels_axis = 1 if data_format == 'channels_first' else -1 33 | labels = tf.squeeze(labels, axis=channels_axis) 34 | oh_labels = tf.one_hot(indices=tf.cast(labels, tf.int32), depth=num_classes, name="one_hot", axis=channels_axis) 35 | 36 | # Compare the network output with the labels, which will produce the most probable class for each pixel 37 | loss_map = tf.nn.softmax_cross_entropy_with_logits_v2(logits=logits, 38 | labels=tf.stop_gradient(oh_labels), 39 | dim=channels_axis) 40 | 41 | # Weigh the loss map 42 | weight_map = tf.where(tf.equal(labels, 0), 43 | tf.fill(tf.shape(labels), 1 - weight), 44 | tf.fill(tf.shape(labels), weight)) 45 | weighted_loss = tf.multiply(loss_map, weight_map) 46 | 47 | # Average of the loss for the full batch 48 | loss = tf.reduce_mean(weighted_loss, 
name="weighted_loss")
49 |         tf.losses.add_loss(loss)
50 |         return loss
51 |
--------------------------------------------------------------------------------
/segmentation/results_analysis.py:
--------------------------------------------------------------------------------
1 | import cv2
2 | import numpy as np
3 |
4 |
5 | def _find_main_axis(regions, region_index):
6 |     xs, ys = np.where(regions == region_index)
7 |     m = np.concatenate([-ys[:, np.newaxis], xs[:, np.newaxis]], axis=1)
8 |
9 |     _, _, v = np.linalg.svd(m - np.mean(m, axis=0), full_matrices=False)
10 |
11 |     return np.arctan2(v[0, 0], v[0, 1])
12 |
13 |
14 | def find_positions(pred, min_blob_size, max_blob_size):
15 |     num_regions, regions, stats, centroids = cv2.connectedComponentsWithStats(pred.astype(np.uint8))
16 |
17 |     result = []
18 |     for region_index in np.arange(1, num_regions):
19 |         region_stats = stats[region_index]
20 |
21 |         # discard blobs that are too big or too small
22 |         region_width, region_height = region_stats[2], region_stats[3]
23 |         if region_width < min_blob_size or region_width > max_blob_size \
24 |                 or region_height < min_blob_size or region_height > max_blob_size:
25 |             continue
26 |
27 |         # get blob properties: main axis and centroid
28 |         ax = _find_main_axis(regions, region_index)
29 |         x, y = centroids[region_index][0], centroids[region_index][1]
30 |
31 |         # find the object type (most frequent class in the region)
32 |         unique_values, count = np.unique(pred[regions == region_index], return_counts=True)
33 |         typ = unique_values[np.argmax(count)]
34 |
35 |         result.append([x, y, typ, ax])
36 |
37 |     return result
38 |
39 |
40 | def _calculate_points_dist(preds, labels):
41 |     # Euclidean distance between each prediction and each label
42 |     res = np.zeros((len(preds), len(labels)))
43 |
44 |     for ip in range(len(preds)):
45 |         for il in range(len(labels)):
46 |             pred = preds[ip]
47 |             label = labels[il]
48 |             res[ip, il] = np.sqrt((pred[0] - label[0]) ** 2 + (pred[1] - label[1]) ** 2)
49 |
50 |     return res
51 |
52 |
53 | def _axis_difference(a1, a2):
54 |     # map both angles into the range [0, pi/2] before comparing with an absolute difference
55 |
56 |     a1 = np.mod(a1, np.pi)
57 |     a2 = np.mod(a2, np.pi)
58 |
59 |     a1 = np.pi - a1 if a1 > np.pi / 2 else a1
60 |     a2 = np.pi - a2 if a2 > np.pi / 2 else a2
61 |
62 |     return np.abs(a2 - a1)
63 |
64 |
65 | def compute_error_metrics(labels, preds, dist_min):
66 |     TP_results, correct_type, pixel_dist, axis_diff = [], 0, [], []
67 |
68 |     dist_matrix = _calculate_points_dist(preds, labels)
69 |
70 |     # greedily match the closest prediction-label pairs until none are within dist_min
71 |     while dist_matrix.shape[0] > 0 and dist_matrix.shape[1] > 0 and np.min(dist_matrix) < dist_min:
72 |
73 |         ip, il = np.argwhere(dist_matrix == np.min(dist_matrix))[0]
74 |
75 |         xp, yp, tp, ap = tuple(preds[ip])
76 |         xl, yl, tl, al = tuple(labels[il])
77 |
78 |         TP_results.append(preds[ip])
79 |         pixel_dist.append(dist_matrix[ip, il])
80 |
81 |         # only calculate the axis difference for class 1 (bee body)
82 |         if tl == 1 and tp == 1:
83 |             axis_diff.append(_axis_difference(ap, al))
84 |
85 |         correct_type += int(tp == tl)
86 |
87 |         dist_matrix = np.delete(dist_matrix, ip, 0)
88 |         dist_matrix = np.delete(dist_matrix, il, 1)
89 |         preds = np.delete(preds, ip, axis=0)
90 |         labels = np.delete(labels, il, axis=0)
91 |
92 |     FN_results = labels  # unmatched labels are false negatives
93 |     FP_results = preds  # unmatched predictions are false positives
94 |
95 |     return pixel_dist, axis_diff, correct_type, TP_results, FN_results, FP_results
96 |
--------------------------------------------------------------------------------
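For reference, a minimal sketch of how the two helpers above fit together on a synthetic segmentation map (the array contents and thresholds below are illustrative only, not taken from the repository):
```
import numpy as np

from segmentation.results_analysis import find_positions, compute_error_metrics

# A synthetic 100x100 class map with a single ~30x30 blob of class 1.
pred_map = np.zeros((100, 100), dtype=np.uint8)
pred_map[30:60, 40:70] = 1

# Each detection is [x, y, class, main_axis_angle].
detections = find_positions(pred_map, min_blob_size=20, max_blob_size=200)

# Ground truth in the same [x, y, class, angle] layout.
labels = np.array([[54.0, 44.0, 1.0, 0.0]])

pixel_dist, axis_diff, correct_type, TP, FN, FP = compute_error_metrics(
    labels, np.array(detections), dist_min=20)
print(len(TP), len(FN), len(FP))  # expected: 1 0 0
```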
/segmentation/results_visualization.py: -------------------------------------------------------------------------------- 1 | import colorsys 2 | 3 | import numpy as np 4 | import cv2 5 | 6 | 7 | def plot_TP_FN_FP(input_image, tps, fns, fps, out_file, drawing_params): 8 | 9 | symbols_image = np.zeros_like(input_image) 10 | 11 | for x, y, typ, a in tps: 12 | drawing_params[typ](symbols_image, x, y, a, (0, 255, 0)) 13 | for x, y, typ, a in fns: 14 | drawing_params[typ](symbols_image, x, y, a, (0, 0, 255)) 15 | for x, y, typ, a in fps: 16 | drawing_params[typ](symbols_image, x, y, a, (0, 255, 255)) 17 | 18 | res = cv2.addWeighted(input_image, 1, symbols_image, 0.5, 0) 19 | cv2.imwrite(out_file, res) 20 | 21 | 22 | def plot_positions(input_image, positions, colors, out_file, drawing_params): 23 | 24 | symbols_image = np.zeros_like(input_image) 25 | 26 | for pos, color in zip(positions, colors): 27 | for x, y, typ, a in pos: 28 | drawing_params[typ](symbols_image, x, y, a, color) 29 | 30 | res = cv2.addWeighted(input_image, 1, symbols_image, 0.5, 0) 31 | cv2.imwrite(out_file, res) 32 | 33 | 34 | def plot_segmentation_map(input_image, prediction, out_file, num_classes): 35 | 36 | label_im = np.zeros_like(input_image) 37 | 38 | prediction = prediction.astype(np.float32) / (num_classes - 1) 39 | 40 | colors = {k: tuple([int(c * 255) for c in reversed(colorsys.hsv_to_rgb(k, 1, 1))]) 41 | for k in np.unique(prediction)} 42 | 43 | values_x, values_y = np.where(prediction > 0) 44 | for x, y in zip(values_x, values_y): 45 | label_im[x, y] = colors[prediction[x, y]] 46 | 47 | res = cv2.addWeighted(input_image, 1, label_im, 0.4, 0) 48 | cv2.imwrite(out_file, res) 49 | -------------------------------------------------------------------------------- /segmentation/training_config.py: -------------------------------------------------------------------------------- 1 | import json 2 | import logging 3 | import os 4 | import random 5 | 6 | import numpy as np 7 | 8 | logging.basicConfig() 9 | logger = logging.getLogger(__name__) 10 | logger.setLevel(logging.DEBUG) 11 | 12 | CONFIG_NAME = 'training_config.json' 13 | 14 | 15 | def get(dataset_root_dir): 16 | config_filepath = os.path.join(dataset_root_dir, CONFIG_NAME) 17 | 18 | if not os.path.exists(config_filepath): 19 | return None 20 | 21 | with open(config_filepath, 'r') as f: 22 | try: 23 | training_config = json.load(f) 24 | logger.info('Loading training config from {}'.format(config_filepath)) 25 | return training_config 26 | except IOError: 27 | return None 28 | 29 | 30 | def create(dataset_root_dir, test_num_files): 31 | logger.info('Creating new training config') 32 | 33 | all_files = [] 34 | for _, _, files in os.walk(dataset_root_dir): 35 | for name in files: 36 | all_files.append(os.path.splitext(name)[0]) 37 | all_files = np.unique(all_files) 38 | 39 | random.shuffle(all_files) 40 | train_names = all_files[test_num_files:] 41 | test_names = all_files[:test_num_files] 42 | with open(os.path.join(dataset_root_dir, CONFIG_NAME), 'w') as outfile: 43 | json.dump({'test': test_names.tolist(), 'train': train_names.tolist()}, outfile, indent=4) 44 | 45 | return get(dataset_root_dir) 46 | -------------------------------------------------------------------------------- /segmentation/unet.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | 3 | 4 | def _create_conv_relu(inputs, data_format, name, filters, strides=1, kernel_size=3, padding='same'): 5 | return tf.layers.conv2d(inputs=inputs, 
filters=filters, strides=strides, 6 | kernel_size=kernel_size, padding=padding, data_format=data_format, 7 | name='{}_conv'.format(name), activation=tf.nn.relu) 8 | 9 | 10 | def _create_pool(data, data_format, name, pool_size=2, strides=2): 11 | return tf.layers.max_pooling2d(inputs=data, pool_size=pool_size, strides=strides, 12 | padding='same', name=name, data_format=data_format) 13 | 14 | 15 | def _contracting_path(data, data_format, num_layers, num_filters): 16 | interim = [] 17 | 18 | dim_out = num_filters 19 | for i in range(num_layers): 20 | name = 'c_{}'.format(i) 21 | conv1 = _create_conv_relu(data, data_format, '{}_1'.format(name), dim_out) 22 | conv2 = _create_conv_relu(conv1, data_format, '{}_2'.format(name), dim_out) 23 | pool = _create_pool(conv2, data_format, name) 24 | data = pool 25 | 26 | dim_out *= 2 27 | interim.append(conv2) 28 | 29 | return interim, data 30 | 31 | 32 | def _expansive_path(data, data_format, interim, num_layers, dim_in): 33 | dim_out = int(dim_in / 2) 34 | for i in range(num_layers): 35 | name = "e_{}".format(i) 36 | upconv = tf.layers.conv2d_transpose(data, filters=dim_out, kernel_size=2, strides=2, 37 | name='{}_upconv'.format(name), data_format=data_format, ) 38 | 39 | channels_axis = 1 if data_format == 'channels_first' else -1 40 | concat = tf.concat([interim[len(interim) - i - 1], upconv], axis=channels_axis) 41 | conv1 = _create_conv_relu(concat, data_format, '{}_1'.format(name), dim_out) 42 | conv2 = _create_conv_relu(conv1, data_format, '{}_2'.format(name), dim_out) 43 | data = conv2 44 | dim_out = int(dim_out / 2) 45 | return data 46 | 47 | 48 | def create_unet(data, num_classes, data_format, num_layers=3, num_filters=32): 49 | 50 | (interim, contracting_data) = _contracting_path(data, data_format, num_layers, num_filters) 51 | 52 | middle_dim = num_filters * (2 ** num_layers) 53 | middle_conv_1 = _create_conv_relu(contracting_data, data_format, 'm_1', middle_dim) 54 | middle_conv_2 = _create_conv_relu(middle_conv_1, data_format, 'm_2', middle_dim) 55 | middle_end = middle_conv_2 56 | 57 | expansive_path = _expansive_path(middle_end, data_format, interim, num_layers, middle_dim) 58 | 59 | conv_last = _create_conv_relu(expansive_path, data_format, 'final', num_classes) 60 | return conv_last 61 | -------------------------------------------------------------------------------- /train.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import logging 3 | import os 4 | from functools import partial 5 | 6 | import tensorflow as tf 7 | 8 | from segmentation import dataset, bee_dataset, training_config 9 | from segmentation.model import build_model 10 | 11 | logging.basicConfig() 12 | logger = logging.getLogger(__name__) 13 | logger.setLevel(logging.DEBUG) 14 | tf.logging.set_verbosity(tf.logging.INFO) 15 | 16 | if __name__ == '__main__': 17 | parser = argparse.ArgumentParser() 18 | parser.add_argument('dataset_root_dir', type=str, 19 | help="path of root folder containing frames and frames_txt folders") 20 | 21 | # model parameters 22 | parser.add_argument('--num_classes', type=int, default=3, help="How many outputs of the model") 23 | parser.add_argument('--data_format', type=str, default='channels_last', choices={'channels_last', 'channels_first'}) 24 | 25 | # training parameters 26 | parser.add_argument('--bg_fg_weight', type=float, default=0.9, 27 | help="How much to weight the foreground objects against the background during training.") 28 | parser.add_argument('--validation_num_files', 
type=int, default=10,
29 |                         help="How many image files are used for validation (chosen randomly).")
30 |     parser.add_argument('--batch_size', type=int, default=8,
31 |                         help="Batch size for training")
32 |     parser.add_argument('--num_steps', type=int, default=5000, help="Number of training steps")
33 |     parser.add_argument('--checkpoint_dir', type=str, default='checkpoints', help="Save model to this path.")
34 |     args = parser.parse_args()
35 |
36 |     logger.info('Training network with settings: {}'.format(vars(args)))
37 |
38 |     images_root_dir = os.path.join(args.dataset_root_dir, 'frames')
39 |     labels_root_dir = os.path.join(args.dataset_root_dir, 'frames_txt')
40 |     if not (os.path.exists(images_root_dir) and os.path.exists(labels_root_dir)):
41 |         raise FileNotFoundError('Expected frames and frames_txt folders under {}'.format(args.dataset_root_dir))
42 |
43 |     config = training_config.get(args.dataset_root_dir)
44 |     if config is None:
45 |         config = training_config.create(args.dataset_root_dir, args.validation_num_files)
46 |
47 |     estimator = tf.estimator.Estimator(model_fn=partial(build_model,
48 |                                                         num_classes=args.num_classes,
49 |                                                         data_format=args.data_format,
50 |                                                         bg_fg_weight=args.bg_fg_weight),
51 |                                        model_dir=args.checkpoint_dir,
52 |                                        config=tf.estimator.RunConfig(save_checkpoints_steps=100,
53 |                                                                      save_summary_steps=100))
54 |
55 |     train_spec = tf.estimator.TrainSpec(input_fn=partial(dataset.make_dataset,
56 |                                                          data_generator=partial(bee_dataset.generate_training,
57 |                                                                                 frames_root_dir=images_root_dir,
58 |                                                                                 labels_root_dir=labels_root_dir,
59 |                                                                                 filenames=config['train']),
60 |                                                          data_format=args.data_format,
61 |                                                          batch_size=args.batch_size,
62 |                                                          mode=tf.estimator.ModeKeys.TRAIN), max_steps=args.num_steps)
63 |
64 |     eval_spec = tf.estimator.EvalSpec(input_fn=partial(dataset.make_dataset,
65 |                                                        data_generator=partial(bee_dataset.generate_training,
66 |                                                                               frames_root_dir=images_root_dir,
67 |                                                                               labels_root_dir=labels_root_dir,
68 |                                                                               filenames=config['test']),
69 |                                                        data_format=args.data_format,
70 |                                                        batch_size=args.batch_size,
71 |                                                        mode=tf.estimator.ModeKeys.EVAL), steps=None)
72 |
73 |     tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
74 |
--------------------------------------------------------------------------------
/train_template.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import logging
3 | from functools import partial
4 |
5 | import tensorflow as tf
6 |
7 | from segmentation import dataset
8 | from segmentation.model import build_model
9 |
10 | logging.basicConfig()
11 | logger = logging.getLogger(__name__)
12 | logger.setLevel(logging.DEBUG)
13 | tf.logging.set_verbosity(tf.logging.INFO)
14 |
15 |
16 | def train_data_generator():
17 |     # Generate your own training data here: pairs of 2D images in uint8 format
18 |     # example:
19 |     # while True:
20 |     #     data = cv2.imread(my_train_data_image_path, cv2.IMREAD_GRAYSCALE)
21 |     #     label = cv2.imread(my_train_label_image_path, cv2.IMREAD_GRAYSCALE)
22 |     #     yield data, label
23 |     raise NotImplementedError()
24 |
25 |
26 | def eval_data_generator():
27 |     # Same as train_data_generator but with evaluation data; should not loop if using steps=None in the EvalSpec
28 |     raise NotImplementedError()
29 |
30 |
31 | if __name__ == '__main__':
32 |     parser = argparse.ArgumentParser()
33 |     # model parameters
34 |     parser.add_argument('--num_classes', type=int, default=3, help="How many outputs of the model")
35 |     parser.add_argument('--data_format', type=str, default='channels_last', choices={'channels_last', 'channels_first'})
36 |
37 |     # training parameters
38 | 
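    # Foreground objects cover far fewer pixels than background; bg_fg_weight re-weights the loss towards foreground (see add_loss in segmentation/model.py).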
parser.add_argument('--bg_fg_weight', type=float, default=0.9, 39 | help="How much to weight the foreground objects against the background during training.") 40 | parser.add_argument('--batch_size', type=int, default=8, 41 | help="Batch size for training") 42 | parser.add_argument('--num_steps', type=int, default=5000, help="Number of training steps") 43 | parser.add_argument('--checkpoint_dir', type=str, default='checkpoints', help="Save model to this path.") 44 | args = parser.parse_args() 45 | 46 | logger.info('Training network with settings: {}'.format(vars(args))) 47 | 48 | estimator = tf.estimator.Estimator(model_fn=partial(build_model, 49 | num_classes=args.num_classes, 50 | data_format=args.data_format, 51 | bg_fg_weight=args.bg_fg_weight), 52 | model_dir=args.checkpoint_dir, 53 | config=tf.estimator.RunConfig(save_checkpoints_steps=100, 54 | save_summary_steps=100)) 55 | 56 | train_spec = tf.estimator.TrainSpec(input_fn=partial(dataset.make_dataset, 57 | data_generator=train_data_generator, 58 | data_format=args.data_format, 59 | batch_size=args.batch_size, 60 | mode=tf.estimator.ModeKeys.TRAIN), max_steps=args.num_steps) 61 | 62 | eval_spec = tf.estimator.EvalSpec(input_fn=partial(dataset.make_dataset, 63 | data_generator=eval_data_generator, 64 | data_format=args.data_format, 65 | batch_size=args.batch_size, 66 | mode=tf.estimator.ModeKeys.EVAL), steps=None) 67 | 68 | tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec) 69 | --------------------------------------------------------------------------------
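As referenced in the README, a minimal sketch of how the `train_template.py` generators could be filled in. The folder layout, file naming and 10-file validation split below are assumptions for illustration, not part of the repository; all images must share a single size (the input pipeline fixes tensor shapes from the first yielded pair), and label maps are uint8 images holding one class index per pixel.
```
import glob
import os

import cv2

# Hypothetical layout: my_dataset/images/*.png with a label map of the same
# name under my_dataset/labels/.
IMAGE_PATHS = sorted(glob.glob('my_dataset/images/*.png'))
EVAL_PATHS, TRAIN_PATHS = IMAGE_PATHS[:10], IMAGE_PATHS[10:]


def _load_pair(image_path):
    label_path = os.path.join('my_dataset', 'labels', os.path.basename(image_path))
    image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    label = cv2.imread(label_path, cv2.IMREAD_GRAYSCALE)
    return image, label


def train_data_generator():
    # Endless stream; the training dataset is shuffled and repeated downstream.
    while True:
        for image_path in TRAIN_PATHS:
            yield _load_pair(image_path)


def eval_data_generator():
    # One finite pass, as expected with steps=None in the EvalSpec.
    for image_path in EVAL_PATHS:
        yield _load_pair(image_path)
```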