├── README.md ├── README_zh.md ├── anchors.json ├── assets ├── dataset.png ├── loss_curve.png ├── net_structure.png └── qrcodes.png ├── data_generator ├── README.md ├── README_zh.md ├── generate_qrcode.py └── generate_training_data.py ├── data_loader └── dataset.py ├── evaluate.py ├── gradio_demo.py ├── models ├── README.md ├── loss.py └── yolov3.py ├── requirements.txt ├── test.py ├── test_images ├── 1.jpg ├── 2.jpg └── 3.jpg ├── train.py └── utils ├── anchor_generator.py ├── kmean.py └── util.py /README.md: -------------------------------------------------------------------------------- 1 | [中文](README_zh.md) 2 | # QRCode Detection 3 | Deep learning based QRCode detection. 4 | 5 | ## Introduction 6 | This project uses a deep learning algorithm to detect QRCodes in images. 7 | We achieve fast, high-precision detection with a yolov3-like detector. 8 | 9 | Features: 10 | 11 | + Fast detection, more than 190 fps on a GTX 1060. 12 | + High precision 13 | Evaluation results on the validation data 14 | |Precision|Recall|Mean IOU| 15 | | ---- | ---- |----| 16 | |0.987|0.819|0.798| 17 | 18 | + Flexible deployment 19 | 20 | ## Installation 21 | Make sure Python 3 is available on your machine. 22 | ```shell 23 | git clone https://github.com/cosimo17/QRCodeDetection.git 24 | cd QRCodeDetection 25 | pip install -r requirements.txt 26 | 27 | ``` 28 | ## Test 29 | To test with the pretrained model, please download the pretrained weight file from [here](https://drive.google.com/file/d/1lqlQySkYehgkVJjZtRnYAICla7qSnxeG/view?usp=sharing). 30 | ```shell 31 | python3 test.py \ 32 | -w yolo_qrcode.h5 \ 33 | -i test_images/1.jpg \ 34 | -o ./result_1.jpg 35 | ``` 36 | 37 | ## Training 38 | * Before starting training, please check [How to prepare the dataset](data_generator/README.md) 39 | * Run the k-means algorithm to generate prior anchor boxes 40 | ```shell 41 | python3 utils/kmean.py \ 42 | --root_dir your_dataset_dir \ 43 | -n 6 44 | ``` 45 | 46 | Execute the following command to start training: 47 | ```shell 48 | python3 train.py \ 49 | -d your_dataset_dir \ 50 | -b 64 \ 51 | -e 80 52 | ``` 53 | You can run ```python3 train.py --help``` to get help. 54 | During training, you can use tensorboard to visualize the loss curve. 55 | ```shell 56 | tensorboard --logdir=./logs 57 | ``` 58 | ![loss](assets/loss_curve.png)
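If you have annotated your own dataset, you can also fine-tune from the pretrained weights instead of training from scratch. A sketch of such an invocation (the dataset directory is a placeholder; `-w` loads an existing weight file and `-lr` sets the initial learning rate, both documented in `python3 train.py --help`):
```shell
python3 train.py \
    -d your_own_dataset_dir \
    -b 64 \
    -e 20 \
    -w yolo_qrcode.h5 \
    -lr 0.0001
```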
59 | 60 | ## Evaluate 61 | Execute the following command to evaluate the model's performance: 62 | ```shell 63 | python3 evaluate.py \ 64 | -d your_dataset_dir \ 65 | -b 64 \ 66 | --score_threshold 0.5 \ 67 | --iou_threshold 0.5 \ 68 | -w yolo_qrcode.h5 69 | ``` 70 | 71 | ## TODO 72 | - [ ] Integrate decode module 73 | - [ ] Support docker container 74 | - [ ] Support openvino 75 | - [ ] Support tensorrt 76 | - [ ] Support tflite 77 | -------------------------------------------------------------------------------- /README_zh.md: -------------------------------------------------------------------------------- 1 | [In English](README.md) 2 | # 二维码检测 3 | 基于深度学习的二维码检测 4 | 5 | ## 介绍 6 | 这是一个基于深度学习算法的二维码检测项目,通过一个类似yolov3的目标检测网络,实现了快速,高精度的二维码检测。 7 | 特性: 8 | + 快速, 在GTX 1060显卡上可以达到大于190的FPS 9 | + 高精度 10 | 在验证集上的测试结果 11 | |Precision|Recall|Mean IOU| 12 | | ---- | ---- |----| 13 | |0.987|0.819|0.798| 14 | 15 | + 多样化部署 16 | 17 | ## 安装 18 | ```shell 19 | git clone https://github.com/cosimo17/QRCodeDetection.git 20 | cd QRCodeDetection 21 | pip install -r requirements.txt 22 | ``` 23 | 24 | ## 测试 25 | 测试前,请先从 [这里](https://drive.google.com/file/d/1lqlQySkYehgkVJjZtRnYAICla7qSnxeG/view?usp=sharing) 下载预训练好的模型。 26 | ```shell 27 | python3 test.py \ 28 | -w yolo_qrcode.h5 \ 29 | -i test_images/1.jpg \ 30 | -o ./result_1.jpg 31 | ``` 32 | 33 | ## 训练 34 | * 训练自己的模型之前,请先查看[如何准备训练数据集](data_generator/README_zh.md) 35 | * 运行聚类算法,为数据集生成先验的锚点(anchor box) 36 | ```shell 37 | python3 utils/kmean.py \ 38 | --root_dir your_dataset_dir \ 39 | -n 6 40 | ``` 41 | 42 | * 使用如下命令启动训练 43 | ```shell 44 | python3 train.py \ 45 | -d your_dataset_dir \ 46 | -b 64 \ 47 | -e 80 48 | ``` 49 | 可以运行```python3 train.py --help```来查看参数含义和帮助信息 50 | 51 | * 在训练过程中,你可以使用tensorboard来监控loss的收敛曲线 52 | ```shell 53 | tensorboard --logdir=./logs 54 | ``` 55 | ![loss](assets/loss_curve.png) 56 | 57 | ## 评估 58 | 运行如下命令来评估模型的性能: 59 | ```shell 60 | python3 evaluate.py \ 61 | -d your_dataset_dir \ 62 | -b 64 \ 63 | --score_threshold 0.5 \ 64 | --iou_threshold 0.5 \ 65 | -w yolo_qrcode.h5 66 | ``` 67 | 68 | ## TODO 69 | - [ ] 集成解码模块 70 | - [ ] 支持docker 71 | - [ ] 支持openvino 72 | - [ ] 支持tensorrt 73 | - [ ] 支持tflite 74 | -------------------------------------------------------------------------------- /anchors.json: -------------------------------------------------------------------------------- 1 | { 2 | "anchors": [ 3 | [ 4 | 0.21708803813877386, 5 | 0.19077461920925146 6 | ], 7 | [ 8 | 0.14457228401955907, 9 | 0.14482037475851203 10 | ], 11 | [ 12 | 0.24115443125042058, 13 | 0.24174995569171495 14 | ], 15 | [ 16 | 0.2798744647627279, 17 | 0.28599593416090546 18 | ], 19 | [ 20 | 0.17124336032308185, 21 | 0.17623757236501922 22 | ], 23 | [ 24 | 0.18998317083452929, 25 | 0.2249816123708392 26 | ] 27 | ] 28 | } -------------------------------------------------------------------------------- /assets/dataset.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/cosimo17/QRCodeDetection/865e5421c44d16db5ceb48e899cecb56823b3db9/assets/dataset.png -------------------------------------------------------------------------------- /assets/loss_curve.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/cosimo17/QRCodeDetection/865e5421c44d16db5ceb48e899cecb56823b3db9/assets/loss_curve.png
-------------------------------------------------------------------------------- /assets/net_structure.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/cosimo17/QRCodeDetection/865e5421c44d16db5ceb48e899cecb56823b3db9/assets/net_structure.png -------------------------------------------------------------------------------- /assets/qrcodes.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/cosimo17/QRCodeDetection/865e5421c44d16db5ceb48e899cecb56823b3db9/assets/qrcodes.png -------------------------------------------------------------------------------- /data_generator/README.md: -------------------------------------------------------------------------------- 1 | [中文](README_zh.md) 2 | # How to prepare training data 3 | 4 | You can use generated fake data to train this model, or collect your own dataset for training. 5 | We suggest training on fake data first, then fine-tuning on your own dataset. 6 | 7 | ## Generate fake dataset 8 | We provide two scripts for data generation. 9 | * Generate QRCode images 10 | ```shell 11 | mkdir qrcodes 12 | python3 data_generator/generate_qrcode.py \ 13 | -n 1500 \ 14 | -o qrcodes 15 | ``` 16 | 17 | * Prepare some background images (such as ImageNet or Open Images). 18 | 19 | * Generate training data 20 | ```shell 21 | python3 generate_training_data.py \ 22 | -fg qrcodes \ 23 | -bg your_dir \ 24 | -o training_ds \ 25 | -n 40000 \ 26 | --shape 256 27 | ``` 28 | The generated data looks like the following: 29 | ![dataset](../assets/dataset.png) 30 | We have already generated 40000 images and labels. You can download them from here: [dataset](https://drive.google.com/file/d/1Mv9fC8e4-IJq3MLQ_QA846o4TTjn-9ui/view?usp=sharing) 31 | 32 | ## Prepare your own dataset 33 | Of course, you can also prepare your own dataset. 34 | * Collect image data 35 | 36 | * Annotate your images 37 | You can use any tool you like to annotate your images. [Labelme](https://github.com/wkentaro/labelme) is a good choice. 38 | * Convert the label format 39 | After annotation, you should convert the label format. 40 | ``` 41 | training_ds 42 | ------------ 43 | | 44 | |---000001.jpg 45 | |---000001.txt 46 | |---000002.jpg 47 | |---000002.txt 48 | |---... 49 | |---... 50 | |---... 51 | |---xxxxxx.jpg 52 | |---xxxxxx.txt 53 | ``` 54 | For each image, there should be a txt file with the same name as the image. 55 | The format of the txt file: 56 | cx,cy,w,h,1.0,1.0 57 | cx,cy,w,h,1.0,1.0 58 | Each line represents one QRCode object. 59 | cx,cy are the center coordinates; w,h are the width and height. All coordinates are normalized to [0, 1].
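For reference, a minimal sketch of writing one label file in this format (the file name and box values below are made-up placeholders; adapt the conversion to your annotation tool's output). Note that the training loader (`data_loader/dataset.py`) only reads the first four values of each line:
```python
# Hypothetical example: write the labels for training_ds/000001.jpg.
# Each line: cx, cy, w, h, confidence, class (confidence and class are written as 1.0).
boxes = [(0.42, 0.37, 0.20, 0.18), (0.71, 0.64, 0.15, 0.16)]  # normalized cx, cy, w, h
with open('training_ds/000001.txt', 'w') as f:
    for cx, cy, w, h in boxes:
        f.write('{},{},{},{},1.0,1.0\n'.format(cx, cy, w, h))
```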
-------------------------------------------------------------------------------- /data_generator/README_zh.md: -------------------------------------------------------------------------------- 1 | [In English](README.md) 2 | # 如何准备训练数据集 3 | 4 | 你可以使用生成的虚拟数据集进行训练,也可以使用自己在实际场景中收集标注的数据进行训练。 5 | 建议先使用虚拟数据集进行训练,然后再在自己的数据集上进行微调。 6 | 7 | ## 生成数据集 8 | 这里提供了两个程序来生成虚拟的训练数据。 9 | * 生成二维码图片 10 | ```shell 11 | mkdir qrcodes 12 | # 生成1500张二维码图片,保存在qrcodes目录下 13 | python3 data_generator/generate_qrcode.py \ 14 | -n 1500 \ 15 | -o qrcodes 16 | ``` 17 | * 准备一些图片作为背景 18 | 19 | * 合成二维码 20 | ```shell 21 | python3 generate_training_data.py \ 22 | -fg qrcodes \ 23 | -bg your_dir \ 24 | -o training_ds \ 25 | -n 40000 \ 26 | --shape 256 27 | ``` 28 | 生成的数据集如下所示: 29 | ![数据集示意](../assets/dataset.png) 30 | 31 | 预先使用这两个脚本生成了40000张图片数据和标签,你可以从这里下载到它: [数据集](https://drive.google.com/file/d/1Mv9fC8e4-IJq3MLQ_QA846o4TTjn-9ui/view?usp=sharing) 32 | 33 | ## 创建自己的数据集 34 | 你也可以创建自己的数据集 35 | * 采集图片 36 | * 标注图片 37 | 你可以使用任何你喜欢的标注工具来标注自己的数据,比如[labelme](https://github.com/wkentaro/labelme) 等 38 | * 确保标注格式符合要求 39 | 请将标注工具生成的标签转换为如下格式: 40 | ``` 41 | training_ds 42 | ------------ 43 | | 44 | |---000001.jpg 45 | |---000001.txt 46 | |---000002.jpg 47 | |---000002.txt 48 | |---... 49 | |---... 50 | |---... 51 | |---xxxxxx.jpg 52 | |---xxxxxx.txt 53 | ``` 54 | 每一张图片,都应该有一个同名的txt文件与之对应 55 | txt的格式如下 56 | cx,cy,w,h,1.0,1.0 57 | cx,cy,w,h,1.0,1.0 58 | 每一行表示一个二维码对象 59 | cx,cy表示边界框的中心点坐标,w,h表示边界框的宽和高,所有的坐标都被归一化到[0-1]之间. 60 | -------------------------------------------------------------------------------- /data_generator/generate_qrcode.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import os 3 | import qrcode 4 | import argparse 5 | 6 | chars = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 7 | 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 8 | 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 9 | 'w', 'x', 'y', 'z', 10 | 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 11 | 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 12 | 'W', 'X', 'Y', 'Z', 13 | '!', '@', '#', '$', '%', '^', '&', '*', '(', ')', 14 | '-', '+', '/', '?', ','] 15 | 16 | def get_args(): 17 | parser = argparse.ArgumentParser() 18 | parser.add_argument('--number', '-n', type=int, 19 | default=1000, help='How many qrcode images will be generated') 20 | parser.add_argument('--min_length', '-min', type=int, 21 | default=10, help='min length of the string encoded in qrcode') 22 | parser.add_argument('--max_length', '-max', type=int, 23 | default=25, help='max length of the string encoded in qrcode') 24 | parser.add_argument('--output_dir', '-o', type=str, required=True, 25 | help='Dir to save the result') 26 | parser.add_argument('--size', '-s', type=int, default=6, 27 | help='qrcode image pixel size') 28 | parser.add_argument('--version', '-v', type=int, default=1, 29 | help='version of the qrcode') 30 | args = parser.parse_args() 31 | return args 32 | 33 | def random_length(min_length, max_length): 34 | return np.random.randint(min_length, max_length) 35 | 36 | def random_index(length): 37 | return np.random.randint(0, len(chars), size=(length,)) 38 | 39 | def string_from_index(index): 40 | s = '' 41 | for i in index: 42 | s += chars[i] 43 | return s 44 | 45 | def string2qrcode(string, version, size): 46 | img = qrcode.make(string, version=version, box_size=size) 47 | return img 48 | 49 | def run(args): 50 | for i in range(args.number): 51 | 
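        # each QRCode encodes a random string whose length is drawn from [min_length, max_length)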
print("Generating {}/{} qrcode image".format(i, args.number)) 52 | length = random_length(args.min_length, args.max_length) 53 | index = random_index(length) 54 | string = string_from_index(index) 55 | img = string2qrcode(string, args.version, args.size) 56 | imgname = '{:04d}.jpg'.format(i) 57 | imgname = os.path.join(args.output_dir, imgname) 58 | img.save(imgname) 59 | 60 | def main(): 61 | args = get_args() 62 | run(args) 63 | 64 | if __name__ == '__main__': 65 | main() -------------------------------------------------------------------------------- /data_generator/generate_training_data.py: -------------------------------------------------------------------------------- 1 | import os 2 | import numpy as np 3 | import cv2 4 | import argparse 5 | import imgaug.augmenters as iaa 6 | import tqdm 7 | 8 | MIN_W = 32 9 | MIN_H = 32 10 | 11 | aug_seq = iaa.Sequential([ 12 | iaa.Crop(px=(0, 10)), 13 | iaa.GaussianBlur(sigma=(0.0, 4)), 14 | iaa.Sometimes(0.5, iaa.AdditiveGaussianNoise(loc=0, scale=(0.0, 0.5 * 255), per_channel=0.5)), 15 | iaa.Affine( 16 | scale={"x": (0.3, 0.5), "y": (0.3, 0.5)}, 17 | rotate=(-25, 25), 18 | shear=(-15, 15) 19 | ), 20 | iaa.PerspectiveTransform(scale=(0.01, 0.1)) 21 | ]) 22 | 23 | 24 | def augment_process_fg(image): 25 | image_aug = aug_seq.augment_image(image) 26 | return image_aug 27 | 28 | 29 | def get_args(): 30 | parser = argparse.ArgumentParser() 31 | parser.add_argument('--fg_dir', '-fg', type=str, help='path to foreground qrcode images') 32 | parser.add_argument('--bg_dir', '-bg', type=str, help='path to background images') 33 | parser.add_argument('--output', '-o', type=str, help='path to save the generated images') 34 | parser.add_argument('--number', '-n', type=int, help='how many images you want to generate') 35 | parser.add_argument('--size', '-s', type=str, default='(32,120)', help='size range of the qrcode image') 36 | parser.add_argument('--alpha', '-a', type=str, default='(10,30)', help='value range of the alpha parameter') 37 | parser.add_argument('--object_number', '-on', type=str, default='(1,5)', 38 | help='the number of qrcode image in one background image') 39 | parser.add_argument('--shape', type=int, default=256, help='training data shape') 40 | parser.add_argument('--debug', type=bool, default=False, help='debug mode') 41 | args = parser.parse_args() 42 | args.size = eval(args.size) # string to tuple 43 | args.alpha = eval(args.alpha) 44 | args.object_number = eval(args.object_number) 45 | return args 46 | 47 | 48 | class ImageLists(object): 49 | def __init__(self, root_dir, shape=None): 50 | imgnames = os.listdir(root_dir) 51 | imgnames = [os.path.join(root_dir, imgname) for imgname in imgnames] 52 | self.imgnames = imgnames 53 | self.shape = shape 54 | 55 | def __getitem__(self, item): 56 | item = item % len(self.imgnames) 57 | imgname = self.imgnames[item] 58 | img = cv2.imread(imgname) 59 | if self.shape is not None: 60 | img = cv2.resize(img, self.shape) 61 | return img 62 | 63 | def __len__(self): 64 | return len(self.imgnames) 65 | 66 | 67 | def random_position(bg_size, fg_size): 68 | bg_w, bg_h = bg_size 69 | w, h = fg_size 70 | xmin = np.random.randint(0, bg_w - w) 71 | ymin = np.random.randint(0, bg_h - h) 72 | xmax = xmin + w 73 | ymax = ymin + h 74 | bbox = [xmin, ymin, xmax, ymax] 75 | return bbox 76 | 77 | 78 | def random_resize(img, size_range): 79 | h, w = img.shape[:2] 80 | min_size, max_size = size_range 81 | new_w = np.random.randint(min_size, max_size) 82 | new_h = int(new_w * h / w) 83 | img = cv2.resize(img, 
(new_w, new_h)) 84 | return img 85 | 86 | 87 | def try_random_resize(fg_img, size_range): 88 | _fg_img = fg_img.copy() 89 | while True: 90 | fg_img = _fg_img.copy() 91 | # augment 92 | fg_img = augment_process_fg(fg_img) 93 | mask = np.argwhere(fg_img > 0) 94 | box = (np.min(mask[..., 0]), 95 | np.min(mask[..., 1]), 96 | np.max(mask[..., 0]), 97 | np.max(mask[..., 1])) 98 | if box[2] - box[0] < MIN_W or box[3] - box[1] < MIN_H: 99 | continue 100 | fg_img = fg_img[box[0]:box[2], box[1]:box[3], ...] 101 | fg_img = random_resize(fg_img, size_range) 102 | 103 | break 104 | return fg_img 105 | 106 | 107 | def try_random_position(fg_img, bg_size, exist_bbox): 108 | fg_size = [fg_img.shape[1], fg_img.shape[0]] 109 | count = 0 110 | max_count = 20 111 | while True: 112 | if count > max_count: 113 | return None 114 | bbox = random_position(bg_size, fg_size) 115 | intersection = False 116 | for bx in exist_bbox: 117 | if is_overlap(bx, bbox): 118 | intersection = True 119 | break 120 | if not intersection: 121 | break 122 | count += 1 123 | return bbox 124 | 125 | 126 | def is_overlap(box1, box2): 127 | if box1[0] > box2[2] or box2[0] > box1[2]: 128 | return False 129 | if box1[1] > box2[3] or box2[1] > box1[3]: 130 | return False 131 | return True 132 | 133 | 134 | def paste(fg_img, bg_img, bbox, alpha): 135 | # crop qrcode image 136 | fg_img = fg_img[0:bbox[3] - bbox[1], 0:bbox[2] - bbox[0], ...] 137 | mask = np.nonzero(fg_img > np.random.randint(15, 50)) 138 | bg_crop = bg_img[bbox[1]:bbox[3], bbox[0]:bbox[2], ...] 139 | # alpha fusion with random parameter 140 | alpha = np.random.randint(alpha[0], alpha[1]) / 100.0 141 | bg_crop[mask] = (bg_crop[mask] * alpha + fg_img[mask] * (1 - alpha)).astype(np.uint8) 142 | bg_img[bbox[1]:bbox[3], bbox[0]:bbox[2], :] = bg_crop 143 | return bg_img 144 | 145 | 146 | def normalize_coordinate(bbox, shape): 147 | """Convert absolute coordinates to relative coordinates""" 148 | xmin, ymin, xmax, ymax = bbox 149 | w = xmax - xmin 150 | h = ymax - ymin 151 | xmin /= shape[1] 152 | ymin /= shape[0] 153 | w /= shape[1] 154 | h /= shape[0] 155 | return xmin, ymin, w, h 156 | 157 | 158 | def save_result(output_dir, img, count, labels): 159 | imgname = '{:06d}.jpg'.format(count) 160 | imgname = os.path.join(output_dir, imgname) 161 | cv2.imwrite(imgname, img) 162 | labels_name = imgname.replace('.jpg', '.txt') 163 | with open(labels_name, 'w') as f: 164 | for i in range(len(labels) - 1): 165 | f.write(labels[i] + '\n') 166 | f.write(labels[-1]) 167 | 168 | 169 | def visualize(img, bbox): 170 | xmin, ymin, xmax, ymax = bbox 171 | cv2.rectangle(img, (xmin, ymin), (xmax, ymax), (0, 255, 0), 1) 172 | cv2.imshow('img', img) 173 | cv2.waitKey(0) 174 | 175 | 176 | def generate_training_data(args): 177 | """ 178 | Generate fake training data. 179 | 1. select one background image 180 | 2. select one or more qrcode images 181 | 3. do some image augment for qrcode image (add noise, blur, affine ...) 182 | 4. resize qrcode image to a random size 183 | 5. 
paste the qrcode image to a random location in the background image (alpha fusion) 184 | """ 185 | if not os.path.exists(args.output): 186 | os.mkdir(args.output) 187 | bg_imgs = ImageLists(args.bg_dir, [args.shape] * 2) 188 | fg_imgs = ImageLists(args.fg_dir) 189 | count = 0 190 | with tqdm.tqdm(total=args.number) as pbar: 191 | pbar.set_description('Generating {}/{} sample'.format(count, args.number)) 192 | while True: 193 | if count >= args.number: 194 | break 195 | # get background image 196 | bg_img = bg_imgs[count] 197 | exist_bbox = [] 198 | labels = [] 199 | for i in range(np.random.randint(args.object_number[0], args.object_number[1])): 200 | # get qrcode image 201 | fg_img = fg_imgs[count] 202 | fg_img = try_random_resize(fg_img, args.size) 203 | bbox = try_random_position(fg_img, [bg_img.shape[1], bg_img.shape[0]], exist_bbox) 204 | if bbox is None: 205 | continue 206 | synth_img = paste(fg_img, bg_img, bbox, args.alpha) 207 | exist_bbox.append(bbox) 208 | l, t, w, h = normalize_coordinate(bbox, bg_img.shape) 209 | cx = l + w / 2 210 | cy = t + h / 2 211 | if args.debug: 212 | visualize(synth_img, bbox) 213 | # cx, cy, w, h, conf, cls 214 | one_label = '{},{},{},{},{},{}'.format(cx, cy, w, h, 1.0, 0) 215 | labels.append(one_label) 216 | bg_img = synth_img 217 | count += 1 218 | save_result(args.output, bg_img, count, labels) 219 | pbar.update(1) 220 | 221 | 222 | def main(): 223 | args = get_args() 224 | generate_training_data(args) 225 | 226 | 227 | if __name__ == '__main__': 228 | main() 229 | -------------------------------------------------------------------------------- /data_loader/dataset.py: -------------------------------------------------------------------------------- 1 | import cv2 2 | import numpy as np 3 | import os 4 | import tensorflow as tf 5 | from functools import partial 6 | from utils import util 7 | 8 | 9 | def preprocess_img(imgname): 10 | """ 11 | Load an image via cv2 12 | """ 13 | img = cv2.imread(imgname.numpy().decode()).astype(np.float32) 14 | img /= 255.0 15 | return img 16 | 17 | 18 | def parse_label(txtname): 19 | """ 20 | Load bboxes from the txt file 21 | """ 22 | labels = [] 23 | with open(txtname.numpy().decode(), 'r') as f: 24 | lines = f.readlines() 25 | for l in lines: 26 | l = l.split(',')[:4] 27 | labels.append(l) 28 | labels = np.array(labels).astype(np.float32) 29 | return labels 30 | 31 | 32 | def tf_preprocess_img(filename): 33 | img = None 34 | [img, ] = tf.py_function(preprocess_img, [filename], [tf.float32]) 35 | return img 36 | 37 | 38 | def tf_preprocess_label(filename): 39 | label = None 40 | [label, ] = tf.py_function(parse_label, [filename], [tf.float32]) 41 | return label 42 | 43 | 44 | def yolo_label(bbox, grids, anchor_ratios, class_number): 45 | return util.bbox2yololabel(bboxs=bbox, 46 | grids=grids, 47 | anchor_ratios=anchor_ratios, 48 | class_number=class_number) 49 | 50 | 51 | def tf_create_yolo_label(bbox, grids, anchor_ratios, class_number): 52 | [label, ] = tf.py_function(yolo_label, [bbox, grids, anchor_ratios, class_number], [tf.float32]) 53 | return label 54 | 55 | 56 | def create_dataset(root_dir, grids, anchor_ratios, class_number, batch_size): 57 | """ 58 | Build dataset pipeline 59 | """ 60 | list_ds = tf.data.Dataset.list_files(root_dir + '/*.jpg', shuffle=False) 61 | imgs_ds = list_ds.map(tf_preprocess_img, num_parallel_calls=4) 62 | list_ds = tf.data.Dataset.list_files(root_dir + '/*.txt', shuffle=False) 63 | label_ds = list_ds.map(tf_preprocess_label, num_parallel_calls=4) 64 | label_ds = label_ds.map( 65 | 
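        # tf_create_yolo_label converts the raw [cx, cy, w, h] boxes of one image into a dense
        # YOLO target tensor of shape [grid_w, grid_h, n_anchors, 1 + class_number + 4]
        # (see utils.util.bbox2yololabel).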
partial(tf_create_yolo_label, grids=grids, anchor_ratios=anchor_ratios, class_number=class_number), num_parallel_calls=4) 66 | dataset = tf.data.Dataset.zip((imgs_ds, label_ds)) 67 | # slice all data. 70% for training, 30% for validation 68 | training_dataset = dataset.take(int(len(dataset)*0.7)).prefetch(batch_size*10).shuffle(batch_size*10).batch(batch_size) 69 | val_dataset = dataset.skip(int(len(dataset)*0.7)).prefetch(batch_size*10).batch(batch_size) 70 | return training_dataset, val_dataset 71 | -------------------------------------------------------------------------------- /evaluate.py: -------------------------------------------------------------------------------- 1 | import os 2 | import cv2 3 | import argparse 4 | from utils.util import * 5 | from models.yolov3 import yolov3 6 | import multiprocessing as mp 7 | import numpy as np 8 | import time 9 | 10 | 11 | def get_args(): 12 | parser = argparse.ArgumentParser() 13 | parser.add_argument('--data_dir', '-d', type=str, required=True, help='path to training dataset') 14 | parser.add_argument('--shape', type=str, default='(256,256)', help='input shape of network') 15 | parser.add_argument('--batch_size', '-b', type=int, default=32) 16 | parser.add_argument('--score_threshold', type=float, default=0.5) 17 | parser.add_argument('--iou_threshold', type=float, default=0.5) 18 | parser.add_argument('--anchors', '-a', type=str, default='anchors.json', 19 | help='anchors generated from kmean algorithm') 20 | parser.add_argument('--weights', '-w', type=str, default='yolo_qrcode.h5', help='pretrained weight') 21 | args = parser.parse_args() 22 | args.shape = eval(args.shape) 23 | return args 24 | 25 | 26 | def _load_img(name): 27 | img = cv2.imread(name) 28 | img = img.astype(np.float32) / 255.0 29 | return img 30 | 31 | 32 | def _load_label(name): 33 | labelname = name.replace('.jpg', '.txt') 34 | with open(labelname, 'r') as f: 35 | lines = f.readlines() 36 | labels = [] 37 | for line in lines: 38 | line = line.split(',')[:4] # cx,cy,w,h 39 | line = [float(v) for v in line] 40 | line = cxcy2xyxy(line) 41 | labels.append(line) 42 | labels = np.array(labels).astype(np.float32) 43 | return labels 44 | 45 | 46 | def loader(root_dir, batch_size, cpu): 47 | imgnames = os.listdir(root_dir) 48 | imgnames = [name for name in imgnames if name.endswith('.jpg')] 49 | imgnames.sort() 50 | imgnames = imgnames[int(len(imgnames) * 0.7):] # last 30% is validation dataset 51 | imgnames = [os.path.join(root_dir, name) for name in imgnames] 52 | indexes = np.arange(len(imgnames)) 53 | indexes = indexes[:batch_size * (len(indexes) // batch_size)] # drop last 54 | indexes = np.reshape(indexes, (-1, batch_size)) 55 | pool = mp.Pool(cpu) 56 | for i in range(indexes.shape[0]): 57 | index = indexes[i] 58 | _imgnames = [imgnames[idx] for idx in index] 59 | imgs = pool.map(_load_img, _imgnames) 60 | labels = pool.map(_load_label, _imgnames) 61 | imgs = np.array(imgs).astype(np.float32) 62 | yield imgs, labels 63 | 64 | 65 | def _metrics(pred_bboxes, true_boxes, iou_threshold=0.5): 66 | """ 67 | pred_bboxes: np.ndarray. [n,4]. format: normalized | xmin,ymin,xmax,ymax. 68 | true_boxes: list. [m,4]. format: normalized | xmin,ymin,xmax,ymax. 
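    Greedy one-to-one matching: each ground-truth box is matched with the first unused
    prediction whose IOU exceeds iou_threshold (a TP); unmatched ground truths count as FN,
    and unmatched predictions as FP. Returns TP, FP, FN and the mean IOU over the matched pairs.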
69 | """ 70 | TP = 0 # true positive 71 | TN = 0 # true negative 72 | FP = 0 # false positive 73 | FN = 0 # false negative 74 | IOU = 0 75 | used = [False for _ in range(len(pred_bboxes))] # mask indicate the pred box has matched with gt box or not 76 | for i in range(len(true_boxes)): 77 | detected = False 78 | for j in range(len(pred_bboxes)): 79 | _iou = general_iou(true_boxes[i], pred_bboxes[j]) 80 | if _iou > iou_threshold and not used[j]: 81 | TP += 1 82 | used[j] = True 83 | IOU += _iou 84 | detected = True 85 | break 86 | if not detected: 87 | FN += 1 88 | FP += (len(used) - sum(used)) # unmatched pred box. False positive pred. 89 | if TP > 0: 90 | mean_iou = IOU / TP 91 | else: 92 | mean_iou = 0 93 | return TP, FP, FN, mean_iou 94 | 95 | 96 | def run(): 97 | args = get_args() 98 | anchors = load_anchors(args.anchors) 99 | detecter = yolov3(input_shape=args.shape, anchor_number=len(anchors), weight=args.weights) 100 | anchors = gen_anchors([s // 32 for s in args.shape], anchors) 101 | TP, FP, FN = 0, 0, 0 102 | IOU = 0 103 | count = 0 104 | for imgs, gt_labels in loader(args.data_dir, args.batch_size, 3): 105 | print("Evaluating {}/{} sample".format(count, 12000)) 106 | # Forward 107 | outputs = detecter.predict(imgs) # [n,h,w,c] 108 | for i in range(len(outputs)): 109 | scores, classes, bboxes = decode(anchors, np.expand_dims(outputs[i], axis=0)) 110 | pred_scores, pred_bboxes = postprocess(scores, classes, bboxes) 111 | tp, fp, fn, iou = _metrics(pred_bboxes, gt_labels[i]) 112 | TP += tp 113 | FP += fp 114 | FN += fn 115 | IOU += iou 116 | count += len(imgs) 117 | precision = TP / (TP + FP) 118 | recall = TP / (TP + FN) 119 | mean_iou = IOU / count 120 | print("\n") 121 | print("--------------Evaluate Result-----------------") 122 | print("Model: {}".format(args.weights)) 123 | print("score_threshold: {}".format(args.score_threshold)) 124 | print("iou_threshold: {}".format(args.iou_threshold)) 125 | print("Precision: {:.3f} Recall: {:.3f} MeanIOU: {:.3f}".format(precision, recall, mean_iou)) 126 | 127 | 128 | if __name__ == '__main__': 129 | run() 130 | 131 | -------------------------------------------------------------------------------- /gradio_demo.py: -------------------------------------------------------------------------------- 1 | import gradio as gr 2 | from models.yolov3 import yolov3 3 | import numpy as np 4 | import cv2 5 | from utils.anchor_generator import gen_anchors 6 | import utils.util as util 7 | from functools import partial 8 | 9 | shape = (256, 256) 10 | anchors = util.load_anchors('./anchors.json') 11 | model = yolov3((256, 256), anchor_number=len(anchors), weight='yolo_qrcode.h5') 12 | anchors = gen_anchors([s // 32 for s in (256, 256)], anchors) 13 | 14 | 15 | def preprocess(img): 16 | src_img = img.copy() 17 | img = cv2.resize(img, shape) 18 | img = img.astype(np.float32) / 255.0 19 | img = np.expand_dims(img, axis=0) 20 | return src_img, img 21 | 22 | 23 | def draw_roi(img, scores, bboxes, name='qrcode'): 24 | h, w = img.shape[:2] 25 | label_w = 46 26 | label_h = 18 27 | bbox_color = (240, 146, 31) 28 | label_roi_color = np.array([192, 219, 103]) 29 | label_text_color = (255, 255, 255) 30 | for score, bbox in zip(scores, bboxes): 31 | xmin, ymin, xmax, ymax = bbox 32 | xmin = int(xmin * w) 33 | ymin = int(ymin * h) 34 | xmax = int(xmax * w) 35 | ymax = int(ymax * h) 36 | cv2.rectangle(img, (xmin, ymin), (xmax, ymax), bbox_color, 2) 37 | img[ymin - label_h:ymin, xmin:xmin + label_w, :] = label_roi_color 38 | cv2.putText(img, str(name), (xmin, ymin - 8), 
cv2.FONT_HERSHEY_SIMPLEX, 0.4, label_text_color, 1) 39 | return img 40 | 41 | 42 | def detection_info(bboxes, scores): 43 | infos = [] 44 | temp = 'QRCode{}\n confidence: {:.3f}\n xmin:{}, ymin:{}, xmax:{}, ymax:{}\n\n' 45 | for i in range(len(scores)): 46 | bbox = bboxes[i] 47 | score = scores[i] 48 | xmin, ymin, xmax, ymax = bbox 49 | xmin = int(xmin * 256) 50 | ymin = int(ymin * 256) 51 | xmax = int(xmax * 256) 52 | ymax = int(ymax * 256) 53 | infos.append(temp.format(i + 1, score, xmin, ymin, xmax, ymax)) 54 | return ''.join(infos) 55 | 56 | 57 | def detect(image): 58 | src_img, img = preprocess(image) 59 | pred = model.predict(img)[0] 60 | scores, classes, bboxes = util.decode(anchors, pred) 61 | scores, bboxes = util.postprocess(scores, classes, bboxes) 62 | src_img = draw_roi(src_img, scores, bboxes) 63 | return src_img, str(detection_info(bboxes, scores)) 64 | 65 | 66 | input_image = gr.Image() 67 | output_image = gr.Image() 68 | output_text = gr.Textbox() 69 | 70 | demo = gr.Interface( 71 | fn=detect, 72 | inputs=input_image, 73 | outputs=[output_image, output_text], 74 | ) 75 | 76 | demo.launch() 77 | -------------------------------------------------------------------------------- /models/README.md: -------------------------------------------------------------------------------- 1 | ## 网络结构如下 (Network structure is as follows) 2 | ![network structure](../assets/net_structure.png) 3 | Created by [netron](https://netron.app/). -------------------------------------------------------------------------------- /models/loss.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | 3 | 4 | def yolo_loss(y_true, y_pred): 5 | """ 6 | :param y_true: [n, gridw, gridh, anchor_per_grid, channel] 7 | :param y_pred: [n, gridw, gridh, anchor_per_grid, channel] 8 | :return: loss 9 | """ 10 | pred_scores = tf.math.sigmoid(y_pred[..., 0]) 11 | pred_cls = tf.math.softmax(y_pred[..., 1:3], axis=-1) 12 | epsilon = 0.0001 13 | pred_cls = tf.clip_by_value(pred_cls, epsilon, 1 - epsilon) # avoid log(0) in the cross entropy 14 | pred_xy = tf.math.tanh(y_pred[..., 3:5]) 15 | pred_wh = tf.math.tanh(y_pred[..., 5:]) 16 | 17 | true_scores = y_true[..., 0] 18 | true_cls = y_true[..., 1:3] 19 | true_xy = y_true[..., 3:5] 20 | true_wh = y_true[..., 5:] 21 | 22 | bce = tf.keras.losses.BinaryCrossentropy(from_logits=False) 23 | score_loss = bce(true_scores, pred_scores) 24 | 25 | cls_mask = true_scores + 0.005 # positive anchors weigh ~1.0, background anchors a weak 0.005 26 | cce = tf.keras.losses.CategoricalCrossentropy(reduction=tf.keras.losses.Reduction.NONE) 27 | cls_loss = cce(true_cls, pred_cls) * cls_mask 28 | cls_loss = tf.math.reduce_mean(cls_loss) 29 | 30 | se = lambda x, y: tf.reduce_sum(tf.math.square(x - y), axis=-1) 31 | xy_loss = se(true_xy, pred_xy) * true_scores # regression terms only apply to positive anchors 32 | wh_loss = se(true_wh, pred_wh) * true_scores 33 | bbox_loss = xy_loss + wh_loss 34 | bbox_loss = tf.math.reduce_mean(bbox_loss) 35 | 36 | loss = score_loss + 2 * cls_loss + 5 * bbox_loss # weighted sum; the bbox term is weighted highest 37 | loss *= 32 # global scale factor 38 | 39 | return loss 40 | -------------------------------------------------------------------------------- /models/yolov3.py: -------------------------------------------------------------------------------- 1 | import tensorflow.keras as keras 2 | from tensorflow.keras.models import Model 3 | import tensorflow as tf 4 | 5 | def bn_act(x, act='relu'): 6 | acts = {'relu': keras.layers.ReLU, 7 | 'leaky_relu': keras.layers.LeakyReLU, 8 | 'swish': keras.activations.swish} 9 | x = keras.layers.BatchNormalization()(x) 10 | x = acts[act]()(x) 11 | return x 12 | 13 | def head_layer(x, class_number=2, anchor_number=3): 14 | """ 15 | Head layer for prediction. 16 | Reference: https://pjreddie.com/media/files/papers/YOLOv3.pdf 17 | """ 18 | kernel = anchor_number * (1 + class_number + 4)
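    # Per anchor the head predicts 1 + class_number + 4 channels; with class_number=2 that is
    # [objectness, class_0, class_1, tx, ty, tw, th] -- the layout consumed by models/loss.py
    # and utils/util.decode().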
19 | output = keras.layers.Conv2D(kernel, (3,3), padding='SAME')(x) 20 | output = tf.reshape(output, [-1, output.shape[1], output.shape[2], anchor_number, output.shape[3]//anchor_number]) 21 | return output 22 | 23 | def downsize(x): 24 | """ 25 | maxpool to downsize the feature map 26 | """ 27 | return keras.layers.MaxPool2D(padding='same')(x) 28 | 29 | def yolov3(input_shape=(256,256), class_number=2, anchor_number=5, weight=''): 30 | """ 31 | yolov3-like network. Not the real yolov3. 32 | size: 256 -> 128 -> 64 -> 32 -> 16 -> 8 33 | kernel: 32 -> 64 -> 128 -> 256 -> 256 34 | """ 35 | input_layer = keras.layers.Input(shape=input_shape + (3,)) 36 | for _ in range(2): 37 | x = keras.layers.Conv2D(32, (3,3), padding='SAME')(input_layer) # note: each pass convolves input_layer itself, so only the last block stays on the model graph; changing this would break compatibility with the released weights 38 | x = bn_act(x) 39 | # downsize 40 | x = downsize(x) # 128 x 128 41 | 42 | for _ in range(3): 43 | x = keras.layers.Conv2D(64, (3,3), padding='SAME')(x) 44 | x = bn_act(x) 45 | # downsize 46 | x = downsize(x) # 64 x 64 47 | 48 | for _ in range(4): 49 | x = keras.layers.Conv2D(128, (3,3), padding='SAME')(x) 50 | x = bn_act(x) 51 | # downsize 52 | x = downsize(x) # 32 x 32 53 | for _ in range(4): 54 | x = keras.layers.Conv2D(128, (3,3), padding='SAME')(x) 55 | x = bn_act(x) 56 | # downsize 57 | x = downsize(x) # 16 x 16 58 | # low level feature 59 | f1 = x 60 | 61 | for _ in range(4): 62 | x = keras.layers.Conv2D(128, (3,3), padding='SAME')(x) 63 | x = bn_act(x) 64 | 65 | # feature fusion 66 | x = keras.layers.Concatenate()([f1, x]) 67 | 68 | # downsize via stride2 conv 69 | x = keras.layers.Conv2D(256, (3,3), strides=(2,2), padding='SAME')(x) # 8 x 8 70 | 71 | # 1x1 conv to summarize all channels 72 | x = keras.layers.Conv2D(256, (1,1), strides=(1,1))(x) 73 | 74 | output = head_layer(x, class_number=class_number, anchor_number=anchor_number) 75 | model = Model(inputs=input_layer, outputs=output) 76 | if weight != '': 77 | print('Load pretrained weight: {}'.format(weight)) 78 | model.load_weights(weight) 79 | return model 80 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | opencv-python 2 | qrcode 3 | imgaug 4 | tqdm 5 | numpy 6 | tensorflow-gpu 7 | scikit-learn 8 | gradio -------------------------------------------------------------------------------- /test.py: -------------------------------------------------------------------------------- 1 | from models.yolov3 import yolov3 2 | import numpy as np 3 | import cv2 4 | from utils.anchor_generator import gen_anchors 5 | import utils.util as util 6 | import argparse 7 | 8 | 9 | def get_args(): 10 | parser = argparse.ArgumentParser() 11 | parser.add_argument('--input', '-i', type=str, help='test image') 12 | parser.add_argument('--weight', '-w', type=str, help='h5 weight file') 13 | parser.add_argument('--shape', '-s', type=str, default='(256,256)', 14 | help='input shape. 
It should be equal to the training shape') 15 | parser.add_argument('--anchors', '-a', type=str, default='anchors.json', 16 | help='anchors generated by the k-means algorithm') 17 | parser.add_argument('--output', '-o', type=str, default='', help='output image') 18 | args = parser.parse_args() 19 | args.shape = eval(args.shape) 20 | return args 21 | 22 | 23 | def load_test_img(name, shape): 24 | img = cv2.imread(name) 25 | src_img = img.copy() 26 | img = cv2.resize(img, shape) 27 | img = img.astype(np.float32) / 255.0 28 | img = np.expand_dims(img, axis=0) 29 | return src_img, img 30 | 31 | 32 | def draw_roi(img, scores, bboxes, name='qrcode'): 33 | h, w = img.shape[:2] 34 | label_w = 46 35 | label_h = 18 36 | bbox_color = (240, 146, 31) 37 | label_roi_color = np.array([192, 219, 103]) 38 | label_text_color = (255, 255, 255) 39 | for score, bbox in zip(scores, bboxes): 40 | xmin, ymin, xmax, ymax = bbox 41 | xmin = int(xmin * w) 42 | ymin = int(ymin * h) 43 | xmax = int(xmax * w) 44 | ymax = int(ymax * h) 45 | cv2.rectangle(img, (xmin, ymin), (xmax, ymax), bbox_color, 2) 46 | img[ymin - label_h:ymin, xmin:xmin + label_w, :] = label_roi_color 47 | cv2.putText(img, str(name), (xmin, ymin - 8), cv2.FONT_HERSHEY_SIMPLEX, 0.4, label_text_color, 1) 48 | return img 49 | 50 | 51 | def main(): 52 | args = get_args() 53 | anchors = util.load_anchors(args.anchors) 54 | model = yolov3(args.shape, anchor_number=len(anchors), weight=args.weight) 55 | anchors = gen_anchors([s//32 for s in args.shape], anchors) 56 | test_img = args.input 57 | src_img, img = load_test_img(test_img, args.shape) 58 | pred = model.predict(img)[0] 59 | scores, classes, bboxes = util.decode(anchors, pred) 60 | scores, bboxes = util.postprocess(scores, classes, bboxes) 61 | src_img = draw_roi(src_img, scores, bboxes) 62 | cv2.imshow('qrcode_detection', src_img) 63 | cv2.waitKey(0) 64 | if args.output != '': 65 | cv2.imwrite(args.output, src_img) 66 | 67 | 68 | if __name__ == '__main__': 69 | main() 70 | -------------------------------------------------------------------------------- /test_images/1.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/cosimo17/QRCodeDetection/865e5421c44d16db5ceb48e899cecb56823b3db9/test_images/1.jpg -------------------------------------------------------------------------------- /test_images/2.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/cosimo17/QRCodeDetection/865e5421c44d16db5ceb48e899cecb56823b3db9/test_images/2.jpg -------------------------------------------------------------------------------- /test_images/3.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/cosimo17/QRCodeDetection/865e5421c44d16db5ceb48e899cecb56823b3db9/test_images/3.jpg -------------------------------------------------------------------------------- /train.py: -------------------------------------------------------------------------------- 1 | import os 2 | 3 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' 4 | import argparse 5 | import tensorflow.keras as keras 6 | import tensorflow as tf 7 | from data_loader import dataset 8 | from models.yolov3 import yolov3 9 | from models.loss import yolo_loss 10 | from utils.util import load_anchors 11 | 12 | tf.get_logger().setLevel('WARNING') 13 | 14 | 15 | def get_args(): 16 | parser = argparse.ArgumentParser() 17 | parser.add_argument('--data_dir', '-d', type=str, help='path 
to training dataset') 18 | parser.add_argument('--shape', type=str, default='(256,256)', help='input shape of network') 19 | parser.add_argument('--epoch', '-e', type=int, default=40) 20 | parser.add_argument('--batch_size', '-b', type=int, default=32) 21 | parser.add_argument('--anchors', '-a', type=str, default='anchors.json', 22 | help='anchors generated by the k-means algorithm') 23 | parser.add_argument('--weights', '-w', type=str, default='', help='pretrained weight') 24 | parser.add_argument('--output', '-o', type=str, default='yolo_qrcode.h5', help='output weight') 25 | parser.add_argument('--val_interval', '-i', type=int, default=5, help='epochs between validation runs') 26 | parser.add_argument('--learning_rate', '-lr', type=float, default=0.001) 27 | args = parser.parse_args() 28 | args.shape = eval(args.shape) 29 | return args 30 | 31 | 32 | def scheduler(epoch, lr): 33 | if epoch % 5 == 1: # restart from the base learning rate every 5 epochs 34 | return 0.001 35 | else: 36 | return lr * 0.7 # otherwise decay by a factor of 0.7 37 | 38 | 39 | def score_acc(yt, yp): 40 | pred_scores = tf.math.sigmoid(yp[..., 0]) 41 | true_scores = yt[..., 0] 42 | pred_scores = tf.where(pred_scores > 0.5, 1, 0) 43 | acc = tf.reduce_mean( 44 | tf.cast(tf.math.equal(tf.cast(pred_scores, tf.float32), tf.cast(true_scores, tf.float32)), tf.float32)) 45 | return acc 46 | 47 | 48 | def cls_acc(yt, yp): 49 | pred_cls = yp[..., 1:3] 50 | pred_cls = tf.math.sigmoid(pred_cls) 51 | pred_cls = tf.math.argmax(pred_cls, axis=-1) 52 | 53 | true_cls = yt[..., 1:3] 54 | true_cls = tf.math.argmax(true_cls, axis=-1) 55 | acc = tf.reduce_mean(tf.cast(tf.math.equal(true_cls, pred_cls), tf.float32)) # compare the class indices directly 56 | return acc 57 | 58 | 59 | def train(): 60 | args = get_args() 61 | anchors = load_anchors(args.anchors) 62 | model = yolov3(input_shape=args.shape, anchor_number=len(anchors), weight=args.weights) 63 | model.compile(optimizer=keras.optimizers.Adam(args.learning_rate), loss=yolo_loss, 64 | metrics=[score_acc, cls_acc]) 65 | training_ds, val_ds = dataset.create_dataset(args.data_dir, [s // 32 for s in args.shape], anchor_ratios=anchors, 66 | class_number=2, batch_size=args.batch_size) 67 | lr_callback = tf.keras.callbacks.LearningRateScheduler(scheduler) 68 | tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir="./logs") 69 | model.fit(x=training_ds, epochs=args.epoch, validation_data=val_ds, callbacks=[lr_callback, tensorboard_callback], 70 | validation_freq=args.val_interval) 71 | model.save(args.output) 72 | 73 | 74 | if __name__ == '__main__': 75 | train() 76 | -------------------------------------------------------------------------------- /utils/anchor_generator.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | 3 | 4 | ANCHOR = None 5 | def restack_anchors(xs, ys, ws, hs): 6 | xs = np.split(xs, xs.shape[2], axis=2) 7 | ys = np.split(ys, ys.shape[2], axis=2) 8 | ws = np.split(ws, ws.shape[2], axis=2) 9 | hs = np.split(hs, hs.shape[2], axis=2) 10 | anchors = np.concatenate([xs[0], ys[0], ws[0], hs[0]], axis=3) 11 | for i in range(1, len(xs)): 12 | _anchor = np.concatenate([xs[i], ys[i], ws[i], hs[i]], axis=3) 13 | anchors = np.concatenate([anchors, _anchor], axis=2) 14 | return anchors 15 | 16 | 17 | def gen_anchors(grids, anchor_ratios): 18 | """ 19 | :param grids: list. [grid_width, grid_height]. This is the spatial size of the network's final output map 20 | :param anchor_ratios: np.ndarray. 
[[anchor_w, anchor_h]*] 21 | :return: prior anchors 22 | """ 23 | global ANCHOR 24 | if ANCHOR is not None: 25 | return ANCHOR 26 | grid_w, grid_h = grids 27 | ys, xs = np.meshgrid(list(range(grid_w)), list(range(grid_h))) 28 | xs = xs * (1 / grid_w) + (1 / grid_w) * 0.5 29 | ys = ys * (1 / grid_h) + (1 / grid_h) * 0.5 30 | anchor_per_grid = len(anchor_ratios) 31 | ws = np.ones(shape=(grid_w, grid_h, anchor_per_grid, 1), dtype=np.float32) 32 | ws *= np.expand_dims(anchor_ratios[..., 0], 1) 33 | hs = np.ones(shape=(grid_w, grid_h, anchor_per_grid, 1), dtype=np.float32) 34 | hs *= np.expand_dims(anchor_ratios[..., 1], 1) 35 | xs = np.expand_dims(xs, axis=[2, 3]) 36 | xs = np.tile(xs, [1, 1, anchor_per_grid, 1]) 37 | ys = np.expand_dims(ys, axis=[2, 3]) 38 | ys = np.tile(ys, [1, 1, anchor_per_grid, 1]) 39 | anchors = restack_anchors(xs, ys, ws, hs) 40 | ANCHOR = anchors.astype(np.float32) 41 | return anchors.astype(np.float32) 42 | 43 | 44 | def test_generated_anchors(): 45 | """ 46 | Test case for this anchor generator 47 | """ 48 | grid = [4, 4] 49 | anchor_ratios = np.array([[1, 1], [2, 2]]) 50 | generated_anchors = gen_anchors(grid, anchor_ratios) 51 | 52 | anchors = np.zeros(shape=(4, 4, 2, 4), dtype=np.float32) 53 | for i in range(4): 54 | for j in range(4): 55 | anchors[i, j, 0, :] = np.array([i * 0.25 + 0.125, j * 0.25 + 0.125, 1, 1]) 56 | anchors[i, j, 1, :] = np.array([i * 0.25 + 0.125, j * 0.25 + 0.125, 2, 2]) 57 | assert np.allclose(generated_anchors, anchors) 58 | 59 | global ANCHOR 60 | ANCHOR = None 61 | grid = [5, 5] 62 | anchor_ratios = np.array([[1, 1], [2, 2], [0.6, 0.8]]) 63 | generated_anchors = gen_anchors(grid, anchor_ratios) 64 | 65 | anchors = np.zeros(shape=(5, 5, 3, 4), dtype=np.float32) 66 | for i in range(5): 67 | for j in range(5): 68 | anchors[i, j, 0, :] = np.array([i * 0.2 + 0.1, j * 0.2 + 0.1, 1, 1]) 69 | anchors[i, j, 1, :] = np.array([i * 0.2 + 0.1, j * 0.2 + 0.1, 2, 2]) 70 | anchors[i, j, 2, :] = np.array( 71 | [i * 0.2 + 0.1, j * 0.2 + 0.1, 0.6, 0.8]) 72 | assert np.allclose(generated_anchors, anchors) 73 | 74 | 75 | if __name__ == '__main__': 76 | test_generated_anchors() 77 | -------------------------------------------------------------------------------- /utils/kmean.py: -------------------------------------------------------------------------------- 1 | from sklearn.cluster import KMeans 2 | import numpy as np 3 | import argparse 4 | import json 5 | import os 6 | 7 | def get_args(): 8 | parser = argparse.ArgumentParser() 9 | parser.add_argument('--root_dir', type=str, help='path to the dataset') 10 | parser.add_argument('--n_clusters', '-n', type=int, default=6) 11 | parser.add_argument('--output', '-o', type=str, default='anchors.json') 12 | args = parser.parse_args() 13 | return args 14 | 15 | def load_data(root_dir): 16 | dataset = [] 17 | filenames = [name for name in os.listdir(root_dir) if name.endswith('.txt')] 18 | filenames = [os.path.join(root_dir, name) for name in filenames] 19 | for txt in filenames: 20 | with open(txt, 'r') as f: 21 | lines = f.readlines() 22 | for l in lines: 23 | l = l.split(',') 24 | w,h = l[2:4] 25 | w = float(w) 26 | h = float(h) 27 | dataset.append([w,h]) 28 | return dataset 29 | 30 | def mean_iou(shape1, shape2): 31 | w1, h1 = shape1[...,0], shape1[...,1] 32 | w2, h2 = shape2 33 | s1 = w1 * h1 34 | s2 = w2 * h2 35 | iou = np.minimum(s1, s2) / np.maximum(s1, s2) 36 | return iou 37 | 38 | def main(): 39 | args = get_args() 40 | dataset = load_data(args.root_dir) 41 | kmeans = KMeans(n_clusters=args.n_clusters, 
random_state=0, max_iter=500).fit(dataset) 42 | anchors = kmeans.cluster_centers_ 43 | print(anchors) 44 | anchors = anchors.tolist() 45 | anchors = { 46 | 'anchors': anchors 47 | } 48 | print("Save k-means anchors to {}".format(args.output)) 49 | with open(args.output, 'w') as f: 50 | json.dump(anchors, f, indent=2) 51 | 52 | if __name__ == '__main__': 53 | main() 54 | -------------------------------------------------------------------------------- /utils/util.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | from .anchor_generator import gen_anchors 3 | import json 4 | import tensorflow as tf 5 | np.seterr(divide='raise') 6 | 7 | 8 | def load_anchors(anchor_file): 9 | with open(anchor_file, 'r') as f: 10 | anchors = json.load(f)['anchors'] 11 | return np.array(anchors) 12 | 13 | 14 | def general_iou(bbox1, bbox2): 15 | xmin1, ymin1, xmax1, ymax1 = bbox1 16 | xmin2, ymin2, xmax2, ymax2 = bbox2 17 | left = np.max([xmin1, xmin2]) 18 | right = np.min([xmax1, xmax2]) 19 | top = np.max([ymin1, ymin2]) 20 | bottom = np.min([ymax1, ymax2]) 21 | iw = np.max([(right - left), 0]) 22 | ih = np.max([(bottom - top), 0]) 23 | si = iw * ih 24 | s1 = (xmax1 - xmin1) * (ymax1 - ymin1) 25 | s2 = (xmax2 - xmin2) * (ymax2 - ymin2) 26 | _iou = si / (s1 + s2 - si) 27 | return _iou 28 | 29 | 30 | def cxcy2xyxy(bbox): 31 | cx, cy, w, h = bbox 32 | xmin = cx - w / 2 33 | ymin = cy - h / 2 34 | xmax = xmin + w 35 | ymax = ymin + h 36 | return xmin, ymin, xmax, ymax 37 | 38 | 39 | def iou(bbox1, bbox2): 40 | """ 41 | :param bbox1: np.ndarray. [gridw, gridh, anchor_per_grid, 4] 42 | :param bbox2: np.ndarray. [n, 4] 43 | box format: cx, cy, w, h. normalized to 0~1 44 | :return: ious. [n, gridw, gridh, anchor_per_grid, 1] 45 | """ 46 | cx1, cy1, w1, h1 = np.split(bbox1, 4, axis=-1) 47 | xmin1 = cx1 - w1 / 2 48 | ymin1 = cy1 - h1 / 2 49 | xmax1 = xmin1 + w1 50 | ymax1 = ymin1 + h1 51 | 52 | cx2, cy2, w2, h2 = np.split(bbox2, 4, axis=-1) 53 | xmin2 = cx2 - w2 / 2 54 | ymin2 = cy2 - h2 / 2 55 | xmax2 = xmin2 + w2 56 | ymax2 = ymin2 + h2 57 | 58 | ious = np.zeros(shape=(len(cx2), cx1.shape[0], cx1.shape[1], cx1.shape[2], 1), dtype=np.float32) 59 | for i in range(len(cx2)): 60 | left = np.maximum(xmin1, xmin2[i]) 61 | right = np.minimum(xmax1, xmax2[i]) 62 | top = np.maximum(ymin1, ymin2[i]) 63 | bottom = np.minimum(ymax1, ymax2[i]) 64 | iw = np.maximum((right - left), 0) 65 | ih = np.maximum((bottom - top), 0) 66 | s1 = w1 * h1 67 | s2 = w2[i] * h2[i] 68 | _iou = iw * ih / (s1 + s2 - (iw * ih)) 69 | ious[i] = _iou 70 | return ious 71 | 72 | 73 | def postprocess(scores, classes, bboxes, score_threshold=0.5, selected_cls=1, iou_threshold=0.6): 74 | scores = scores.flatten() 75 | classes = np.reshape(classes, [-1, 2]) 76 | bboxes = np.reshape(bboxes, [-1, 4]) 77 | idx = scores >= score_threshold 78 | 79 | scores = scores[idx] 80 | classes = classes[idx] 81 | bboxes = bboxes[idx] 82 | 83 | idx = classes[..., 1] > classes[..., 0] # keep boxes whose qrcode-class score beats the background score 84 | scores = scores[idx] 85 | classes = classes[idx] 86 | bboxes = bboxes[idx] 87 | 88 | idx = tf.image.non_max_suppression( 89 | bboxes, scores, max_output_size=10, iou_threshold=iou_threshold) 90 | scores = scores[idx.numpy()] 91 | classes = classes[idx.numpy()] 92 | bboxes = bboxes[idx.numpy()] 93 | return scores, bboxes 94 | 95 | 96 | def sigmoid(x): 97 | x = np.clip(x, -15.0, 15.0) 98 | return 1 / (1 + np.exp(-x)) 99 | 100 | 
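# decode() inverts encode() below: cx = tx * anchor_w + anchor_cx, cy = ty * anchor_h + anchor_cy,
# w = exp(tw) * anchor_w, h = exp(th) * anchor_h. Because tw and th are passed through tanh first,
# decoded widths and heights are bounded to roughly (1/e, e) times the anchor size.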
output from network. [n, w, h, anchor_per_grid, 7] 105 | """ 106 | scores, cls_conf, bbox = output[..., 0], output[..., 1:3], output[..., 3:] 107 | scores = sigmoid(scores) 108 | cls_conf = sigmoid(cls_conf) 109 | bbox[..., :2] = np.tanh(bbox[..., :2]) # xw 110 | bbox[..., 2:] = np.tanh(bbox[..., 2:]) # wh 111 | tx, ty, tw, th = np.split(bbox, 4, axis=-1) 112 | anchor_cx, anchor_cy, anchor_w, anchor_h = np.split(anchors, 4, axis=-1) 113 | cx = tx * anchor_w + anchor_cx 114 | cy = ty * anchor_h + anchor_cy 115 | w = np.exp(tw) * anchor_w 116 | h = np.exp(th) * anchor_h 117 | xmin = cx - w / 2 118 | ymin = cy - h / 2 119 | xmax = xmin + w 120 | ymax = ymin + h 121 | bboxes = np.concatenate([xmin, ymin, xmax, ymax], axis=-1) 122 | return scores, cls_conf, bboxes 123 | 124 | 125 | def encode(anchor, bbox): 126 | """ 127 | encode bbox coordinate to anchor relative offset 128 | reference: https://pjreddie.com/media/files/papers/YOLOv3.pdf && 129 | https://github.com/tensorflow/models/blob/master/research/object_detection/box_coders/faster_rcnn_box_coder.py 130 | """ 131 | anchor_cx, anchor_cy, anchor_w, anchor_h = anchor 132 | cx, cy, w, h = bbox 133 | tx = (cx - anchor_cx) / anchor_w # -0.5 ~ 0.5 134 | ty = (cy - anchor_cy) / anchor_h # -0.5 ~ 0.5 135 | tw = np.log(w / anchor_w) 136 | th = np.log(h / anchor_h) 137 | return np.array([tx, ty, tw, th]) 138 | 139 | 140 | def bbox2yololabel(bboxs, grids, anchor_ratios, class_number=2): 141 | """ 142 | bboxs: np.ndarray. [n,4] [[cx, cy, w,h]*]. bboxs are normalized to 0~1 143 | """ 144 | channel = len(anchor_ratios) * (1 + class_number + 4) 145 | labels = np.zeros(shape=(grids[0], grids[1], len(anchor_ratios), channel // len(anchor_ratios)), dtype=np.float32) 146 | labels[..., 1] = np.array([1.0]) 147 | anchors = gen_anchors(grids, anchor_ratios) 148 | ious = iou(anchors, bboxs) 149 | for i in range(ious.shape[0]): 150 | index = np.unravel_index(ious[i].argmax(), ious[i].shape)[:-1] 151 | # assign label to the anchor whose iou is the max one 152 | encoded_bbox = encode(anchors[index], bboxs[i]) 153 | scores = np.array([1.0]) 154 | cls_confidence = np.array([0.0, 1.0]) 155 | label = np.concatenate([scores, cls_confidence, encoded_bbox], axis=0) # [score, cls_socre, bbox] 156 | labels[index] = label 157 | return labels.astype(np.float32) 158 | 159 | 160 | def test_iou(): 161 | anchors = [[0.5, 0.5, 1, 1], [1, 1, 2, 2]] 162 | boxs = [[0.5, 0.5, 1, 1], [1, 1, 2, 2], [1.5, 1.5, 1, 1]] 163 | anchors = np.array(anchors) 164 | anchors = np.expand_dims(anchors, axis=[0, 1]) 165 | anchors = np.tile(anchors, [3, 3, 1, 1]) 166 | boxs = np.array(boxs) 167 | _ious = iou(anchors, boxs) 168 | assert np.allclose(_ious[0, :, :, 0, :], [1]) 169 | assert np.allclose(_ious[0, :, :, 1, :], [0.25]) 170 | assert np.allclose(_ious[1, :, :, 0, :], [0.25]) 171 | assert np.allclose(_ious[1, :, :, 1, :], [1]) 172 | assert np.allclose(_ious[2, :, :, 0, :], [0]) 173 | assert np.allclose(_ious[2, :, :, 1, :], [0.25]) 174 | 175 | 176 | if __name__ == '__main__': 177 | test_iou() 178 | --------------------------------------------------------------------------------