├── LICENSE ├── README.md ├── extract-table-visual.py ├── extract-table.py ├── findCounter.py ├── hough_line.py ├── media ├── 0b594eba5b576dce0fe7136c3af11b2d.jpg ├── 0c25686327ca6581752d1941e226d857.png ├── 12bf5b36a020426d75c265c154ad7a73.png ├── 2072aa36234b9401fa098c7576988a8d.png ├── 3fb740a4a8647b36b4804418457e3134.png ├── 43028238b302b8f870aae886b0f4c6d8.png ├── 65d04fccbf578a6f546c670657ae4c8e.jpg ├── 666ffa53885c7be62da3089fc8807649.jpg ├── 813fc6bcc44f63113e40a7d73094d091.png ├── 8f140a65e27f94bcc85f0fd72b0d5ce1.png ├── a9e18c699f4be54860792c2150385fd5.png ├── b7d18fe8c8bce5463d26429bf6981de1.png ├── f1e11863321b5815f2537c48af03f8fc.png ├── f3edeb885dcb7617bd14eee9e587a93a.png └── f9e05cc8e53e6608c7cac688e5b4bd5a.png ├── opencv-Perspective-auto.py ├── opencv-Perspective.py ├── pycococreatortools ├── __init__.py ├── __pycache__ │ ├── __init__.cpython-36.pyc │ └── pycococreatortools.cpython-36.pyc └── pycococreatortools.py ├── results ├── result_0.jpg ├── result_1.jpg ├── result_2.jpg └── rotated.png ├── results_label ├── INTERnet_vzorcni%20primeri_TPNO70_seznam_nalog%20kritje_19042012_objavljeno_0.json ├── Internet%20docs%20on%20Child%20labor_9.json └── internationell-politik-och-relationer-sh-b_0.json ├── rotation-opencv.py ├── table-cell-to-coco.py ├── table-rotation.py └── utils ├── __init__.py ├── convert_dots_to_bbox.py ├── draw_bbox.py ├── iter_all_images.py ├── match_files.py └── random_save_some_files_to_another_folder.py /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2019 weidafeng 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the 
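Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | Project Overview 2 | ================ 3 | 4 | This project is a **showcase and record** of my internship work in July 2019: 5 | 6 | 1. Rotate skewed tables back to horizontal; 7 | 8 | 2. Build a dataset of 5,000 table images in which every cell is annotated, and train cell detection on it. 9 | 10 | The first item is fairly simple, since affine and perspective transforms are mature techniques; the second item is the key one. 11 | 12 | Manual annotation is too slow (a table image has about 30 cells and takes roughly 3 to 5 minutes, so 5,000 images would need about 250 to 420 hours), so I tried to extract the table lines with conventional image processing and annotate automatically. 13 | 14 | Code covered: image rotation (affine transform), table line extraction, conversion to COCO format, visualization, randomly saving files, matching label files to images by file name, and so on. 15 | 16 | Key code: 17 | --------- 18 | 19 | ### Table rotation 20 | 21 | `table-rotation.py` 22 | 23 | Given only the input image path, it computes the rotation angle automatically and applies an affine transform to rotate the whole image (this works for text images as well as table images). 24 | 25 | Process: 1. HoughLines -\> get the rotation angle 2. 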
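warpAffine -\> 26 | affine(rotation) 27 | 28 | ### Extracting cell coordinates from a table image 29 | 30 | `extract-table-visual.py` 31 | 32 | Takes one image, removes the table content with erosion and dilation, and visualizes the remaining table lines. The parameters in this script give a clean visualization, but the left border may in fact not be closed (the gap is invisible to the naked eye). 33 | 34 | `extract-table.py` 35 | 36 | Takes one image and visualizes the table lines. To close the table and obtain the intersections, the parameters are deliberately tuned so that both horizontal and vertical lines come out longer. 37 | 38 | ### Building a COCO-format dataset 39 | 40 | `table-cell-to-coco.py (a multithreading bug was fixed on December 11)` 41 | 42 | table-cell: cell recognition documentation 43 | ========================================== 44 | 45 | Task 1: TableBank dataset (annotated at table level, COCO format) 46 | ------------------------------------------------------------------ 47 | 48 | ### Dataset download 49 | 50 | (Because of copyright restrictions, please request the download link yourself.) 51 | 52 | The dataset has two parts, a Word-document version and a LaTeX-document version. The Word version is messier: file naming is inconsistent and many file names contain Latin or Russian characters, which may cause encoding errors on a Chinese-locale system. The LaTeX version is recommended for table recognition. 53 | 54 | When training with the mmdetection object detection library, only the dataset directory, image size, label_name and number of label classes (2: table and background) in config/xxx.py need to be changed. 55 | 56 | ### Table detection example 57 | 58 | Example detection result (the simplest Faster R-CNN trained for 12 epochs reaches over 99% accuracy): 59 | 60 | ![](media/3fb740a4a8647b36b4804418457e3134.png) 61 | 62 | Task 2: table-cell dataset (annotated at cell level, COCO format) 63 | ------------------------------------------------------------------ 64 | 65 | ### Code: 66 | 67 | table-cell-to-coco.py 68 | 69 | ### Image source: 70 | 71 | Images from the Word version 72 | of the TableBank dataset: the 5,116 images whose names start with a to c (1,000 of them are randomly chosen as the test set and the rest form the training set). 73 | 74 | ### Label creation: 75 | 76 | \`\`\` bash 77 | 78 | Use OpenCV to extract the table and its cells, then convert to COCO format 79 | 80 | \#\#\#\# Cell segmentation steps 81 | 82 | \# 1. Read the image; 83 | 84 | \# 2. Binarize it; 85 | 86 | \# 3. Apply horizontal and vertical dilation and erosion to get the horizontal-line image img_row and the vertical-line image img_col; 87 | 88 | \# 4. Build the dot image of intersections: img_row × img_col = img_dot; 89 | 90 | \# 5. Build the line image: img_row + img_col = img_line (only for inspection, not used later); 91 | 92 | \# 6. Derive closed rectangular cells (top-left and bottom-right coordinates) from the dot image; 93 | 94 | \# 7. Refine these coordinates with hand-designed rules; 95 | 96 | \# 8. Visualize the coordinates, save the visualizations, and manually pick the good samples for the dataset; 97 | 98 | \# 9. 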
Convert to COCO format 99 | 100 | \`\`\` 101 | 102 | ### Examples 103 | 104 | Example for step 5: 105 | 106 | ![](media/f3edeb885dcb7617bd14eee9e587a93a.png) 107 | 108 | Example for step 8: 109 | 110 | ![](media/0b594eba5b576dce0fe7136c3af11b2d.jpg) 111 | 112 | Task 3: skew correction and cell recognition 113 | -------------------------------------------- 114 | 115 | ### Skew correction 116 | 117 | \`\`\` bash 118 | 119 | \# Takes a skewed image and automatically rotates the whole image with an affine transform 120 | 121 | \# Steps: 122 | 123 | \# 1. HoughLines -\> get the rotation angle 124 | 125 | \# 2. warpAffine -\> affine(rotation) 126 | 127 | \`\`\` 128 | 129 | **Code:** 130 | 131 | table-rotation.py 132 | 133 | #### Examples: 134 | 135 | ![](media/12bf5b36a020426d75c265c154ad7a73.png) 136 | 137 | Example 2 (a photo is also rotated accurately to horizontal, without affecting the table content): 138 | 139 | ![](media/0c25686327ca6581752d1941e226d857.png) 140 | 141 | ### Cell recognition 142 | 143 | Train on the table-cell dataset annotated in Task 2 using the mmdetection object detection library. 144 | 145 | The config file needs to be modified, for example these fields in config/cascade_mask_rcnn_r101_fpn_1x.py: 146 | 147 | Dataset path: data_root = '/home/weidafeng/dataset/coco/TableBank/Word/' 148 | 149 | Number of classes: num_classes=2, \# in two places 150 | 151 | Image size: img_scale=(596,842) 152 | 153 | To change the label names: 154 | 155 | - mmdetection/mmdet/core/evaluation/class_names.py 156 | 157 | - mmdetection/mmdet/datasets/coco.py 158 | 159 | - python setup.py install 160 | 161 | I tested with config/cascade_mask_rcnn_r101_fpn_1x.py; with the modified config file, 12 epochs of training reach over 99% accuracy. The config file and model have been uploaded to a cloud drive (link: 162 | https://pan.baidu.com/s/1nfGd7s0AMujJ00pCFAOyrA extraction code: 163 | hupu) and can be downloaded for testing. 164 | 165 | ![](media/b7d18fe8c8bce5463d26429bf6981de1.png) 166 | 167 | Test steps (see the mmdetection documentation for details): 168 | 169 | 1. Replace mmdetection/mmdet/apis/inference.py with the provided inference.py (I mainly added a function that saves predictions as text; the visualization works without the replacement). 170 | 171 | 2. Rebuild: python setup.py install 172 | 173 | 3. 
Run the test script: 174 | 175 | \`\`\` bash 176 | 177 | - bash test.sh \~/test_images/ ../mmdetection/config/mask_xxxx.py 178 | ../mmdetection/workdir/latest.pth 179 | 180 | \`\`\` 181 | 182 | #### Test result examples: 183 | 184 | ![](media/8f140a65e27f94bcc85f0fd72b0d5ce1.png) 185 | 186 | ![](media/2072aa36234b9401fa098c7576988a8d.png) 187 | 188 | Test result examples (**photos, with equally good results**): 189 | 190 | ![](media/666ffa53885c7be62da3089fc8807649.jpg) 191 | 192 | ![](media/65d04fccbf578a6f546c670657ae4c8e.jpg) 193 | 194 | ![](media/43028238b302b8f870aae886b0f4c6d8.png) 195 | 196 | Test examples (a small number of results show missed or false detections): 197 | 198 | ![](media/f9e05cc8e53e6608c7cac688e5b4bd5a.png) 199 | 200 | ![](media/a9e18c699f4be54860792c2150385fd5.png) 201 | 202 | Reference: 203 | ----------- 204 | 205 | ### TableBank: 206 | 207 | 208 | 209 | ### Building a COCO-format dataset: 210 | 211 | 212 | 213 | ### mmdetection 214 | 215 | 216 | 217 | ### The math behind image rotation: 218 | 219 | https://blog.csdn.net/liyuan02/article/details/6750828 220 | 221 | ### Affine and perspective transforms: 222 | 223 | A more intuitive way to name the affine and perspective transforms is "planar transform" and "spatial transform", or "2D coordinate transform" and "3D coordinate transform". 224 | The 2D versus 3D distinction also shows up from another angle: the affine transform's system of equations has 6 unknowns, so solving it requires 3 pairs of corresponding points, and three points determine exactly one plane. 225 | The perspective transform's system has 8 unknowns, so solving it requires 4 pairs of corresponding points, and four points determine a three-dimensional space. 
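The unknown counts above can be written out as a rough worked form. The symbols $a_{ij}$, $b_i$ and $h_{ij}$ are notation assumed here for illustration, not names taken from the repository code. Each corresponding point pair $(x, y) \mapsto (x', y')$ contributes two equations, so 6 unknowns need 3 pairs and 8 unknowns need 4 pairs.

```latex
% Affine transform: 6 unknowns (a_11, a_12, a_21, a_22, b_1, b_2)
\begin{aligned}
x' &= a_{11}x + a_{12}y + b_1 \\
y' &= a_{21}x + a_{22}y + b_2
\end{aligned}

% Perspective transform: 8 unknowns once the scale is fixed with h_33 = 1
\begin{aligned}
x' &= \frac{h_{11}x + h_{12}y + h_{13}}{h_{31}x + h_{32}y + 1} \\
y' &= \frac{h_{21}x + h_{22}y + h_{23}}{h_{31}x + h_{32}y + 1}
\end{aligned}
```

In OpenCV these two cases correspond to `cv2.getAffineTransform`, which takes three point pairs, and `cv2.getPerspectiveTransform`, which takes four.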
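226 | 227 | ### Fourier background: 228 | 229 | https://blog.csdn.net/on2way/article/details/46981825 230 | - Frequency: for an image, this is the gradient of the pixel values, that is, how fast the gray level changes 231 | - Magnitude: loosely, the weight of a frequency, that is, the share that frequency contributes 232 | Before the DFT, the x and y axes of the image are spatial coordinates. The DFT applies a Fourier transform along x 233 | and y to measure how the pixels are distributed over different frequencies in those two directions, 234 | so in the DFT output the x and y axes no longer represent spatial length but frequency. 235 | -------------------------------------------------------------------------------- /extract-table-visual.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*-# 2 | #------------------------------------------------------------------------------- 3 | # Name: extract-table-visual.py 4 | # Author: wdf 5 | # Date: 2019/7/9 6 | # IDE: PyCharm 7 | # Parameters: 8 | # @param: 9 | # @param: 10 | # Return: 11 | # 12 | # Description: 13 | # References: 14 | # https://answers.opencv.org/question/63847/how-to-extract-tables-from-an-image/ (OpenCV Q&A forum example) 15 | # https://blog.csdn.net/yomo127/article/details/52045146 (C++ code) 16 | # https://blog.csdn.net/weixin_34059951/article/details/88151801 (python) 17 | # Takes a flat image and extracts horizontal lines, vertical lines and intersections, then draws the table 18 | # Only works on flat images; rotated tables give poor results 19 | 20 | #### Cell segmentation steps 21 | # 1. Read the image; 22 | # 2. Binarize it; 23 | # 3. Apply horizontal and vertical dilation and erosion to get the horizontal-line image img_row and the vertical-line image img_col; 24 | # 4. Build the dot image of intersections: img_row × img_col = img_dot; 25 | # 5. Build the line image: img_row + img_col = img_line (only for inspection, not used later); 26 | # 6. Shrink each dot cluster to a single pixel; 27 | # 7. 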
Iterate over the dots row by row, crop each cell out of the binary image, and save it to the temp folder. 28 | # --------------------- 29 | # Original article: https://blog.csdn.net/muxiong0308/article/details/80969355 (Python implementation) 30 | # Annotated C++ implementation: https://blog.csdn.net/yomo127/article/details/52045146 (line-by-line comments) 31 | # Usage: if table content is still picked up as borders, adjust the erosion and dilation parameters, for example increase the number of iterations 32 | #------------------------------------------------------------------------------- 33 | import cv2 34 | import numpy as np 35 | 36 | def get_rec(img): 37 | """ 38 | Get the coordinates of the cell corner points 39 | :param img: 40 | :return: 41 | """ 42 | contours, hierarchy = cv2.findContours(img, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE) 43 | contours_poly = [0] * len(contours) 44 | boundRect = [0] * len(contours) 45 | rois = [] 46 | print("contours",contours) 47 | print("len(contours)",len(contours)) 48 | for i in range(len(contours)): 49 | cnt = contours[i] 50 | contours_poly[i] = cv2.approxPolyDP(cnt, 1, True) 51 | boundRect[i] = cv2.boundingRect(contours_poly[i]) 52 | rois.append(np.array(boundRect[i])) 53 | # img = cv2.rectangle(img_bak, (boundRect[i][0], boundRect[i][1]), (boundRect[i][2], boundRect[i][3]), 54 | # (255, 255, 255), 1, 8, 0) 55 | 56 | return rois 57 | 58 | def main(img_path): 59 | image = cv2.imread(img_path, 1) 60 | # Binarize 61 | gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) 62 | binary = cv2.adaptiveThreshold(~gray, 255, # ~ inverts the image; this matters because it makes the binarized image white-on-black 63 | cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 15, -10) 64 | cv2.imshow("binary ", binary) 65 | 66 | rows, cols = binary.shape 67 | scale = 20 # the larger this value, the more lines are detected 68 | 69 | # Detect horizontal lines 70 | kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (cols // scale, 1)) 71 | # getStructuringElement: Returns a structuring element of the specified size and shape for morphological operations. 
72 | # (cols // scale, 1): to pick up horizontal table lines, erosion and dilation use a fairly long horizontal strip as the kernel 73 | eroded = cv2.erode(binary, kernel, iterations=1) 74 | # cv2.imshow("Eroded Image",eroded) 75 | dilatedcol = cv2.dilate(eroded, kernel, iterations=1) 76 | # cv2.imshow("Dilated col", dilatedcol) 77 | 78 | 79 | # Detect vertical lines 80 | kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, rows // scale)) 81 | # Vertical lines are extracted the same way; the only difference is that the kernel is a vertical strip of width 1 and height rows // scale 82 | eroded = cv2.erode(binary, kernel, iterations=1) 83 | dilatedrow = cv2.dilate(eroded, kernel, iterations=1) 84 | # cv2.imshow("Dilated row", dilatedrow) 85 | 86 | 87 | # Mark the intersections 88 | bitwiseAnd = cv2.bitwise_and(dilatedcol, dilatedrow) 89 | cv2.imshow("bitwiseAnd Image", bitwiseAnd) 90 | rois = get_rec(bitwiseAnd) 91 | # print(rois) 92 | 93 | lst = [] 94 | for i, r in enumerate(rois): 95 | print(i,r) 96 | # cv2.imshow("src" + str(i), image[r[3]:r[1], r[2]:r[0]]) 97 | lst.append(list(r)) 98 | 99 | print(lst) 100 | # Mark the table 101 | merge = cv2.add(dilatedcol, dilatedrow) 102 | cv2.imshow("add Image", merge) 103 | # cv2.imwrite("./img/mask.jpg", merge) 104 | cv2.waitKey(0) 105 | cv2.destroyAllWindows() 106 | if __name__ == '__main__': 107 | img_path = './img/4.jpg' 108 | img_path2 = './img/table-6.png' 109 | main(img_path) 110 | # Reference: https://blog.csdn.net/yomo127/article/details/52045146 111 | 112 | -------------------------------------------------------------------------------- /extract-table.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*-# 2 | # ------------------------------------------------------------------------------- 3 | # Name: extract-table.py 4 | # Author: wdf 5 | # Date: 2019/7/9 6 | # IDE: PyCharm 7 | # Parameters: 8 | # @param: 9 | # @param: 10 | # Return: 11 | # 12 | # Description: 13 | # References: 14 | # https://answers.opencv.org/question/63847/how-to-extract-tables-from-an-image/ (OpenCV Q&A forum example) 15 | # https://blog.csdn.net/yomo127/article/details/52045146 (C++ code) 16 | # 
https://blog.csdn.net/weixin_34059951/article/details/88151801 (python) 17 | # Takes a flat image and extracts horizontal lines, vertical lines and intersections, then draws the table 18 | # Only works on flat images; rotated tables give poor results 19 | 20 | #### Cell segmentation steps 21 | # 1. Read the image; 22 | # 2. Binarize it; 23 | # 3. Apply horizontal and vertical dilation and erosion to get the horizontal-line image img_row and the vertical-line image img_col; 24 | # 4. Build the dot image of intersections: img_row × img_col = img_dot; 25 | # 5. Build the line image: img_row + img_col = img_line (only for inspection, not used later); 26 | # 6. Shrink each dot cluster to a single pixel; 27 | # 7. Iterate over the dots row by row, crop each cell out of the binary image, and save it to the temp folder. 28 | # --------------------- 29 | # Original article: https://blog.csdn.net/muxiong0308/article/details/80969355 (Python implementation) 30 | # Annotated C++ implementation: https://blog.csdn.net/yomo127/article/details/52045146 (line-by-line comments) 31 | # Usage: if table content is still picked up as borders, adjust the erosion and dilation parameters, for example increase the number of iterations 32 | # ------------------------------------------------------------------------------- 33 | import cv2 34 | import json 35 | import numpy as np 36 | from pathlib import Path 37 | import progressbar 38 | 39 | 40 | def iter_all_files(folder_dir): 41 | ''' 42 | Iterate over all files in a folder, 43 | filtering out names in other scripts (such as Russian) 44 | Example input: 45 | ROOT_DIR = Path('..') 46 | IMAGE_DIR = ROOT_DIR / Path('img') 47 | iter_all_files(IMAGE_DIR) 48 | 49 | :param folder_dir: input folder path 50 | :return: list of the files in the folder (only .jpg files whose name starts with an ASCII letter) 51 | ''' 52 | capital = [chr(x) for x in range(65,91)] 53 | lowercase = [chr(x) for x in range(97, 123)] 54 | capital.extend(lowercase) 55 | im_files = [f for f in folder_dir.iterdir() if f.suffix == '.jpg' and f.stem[0] in capital] 56 | # im_files.sort(key=lambda f: int(f.stem[1:]),reverse=True) # sort so the order stays stable and data and labels stay matched 57 | # print("length:",len(im_files),"\n im_files:",im_files) 58 | 59 | # Progress bar 60 | # w = progressbar.widgets 61 | # widgets = ['Progress: ', w.Percentage(), ' ', w.Bar('#'), ' ', w.Timer(), 62 | # ' ', w.ETA(), ' ', w.FileTransferSpeed()] 63 | # progress = progressbar.ProgressBar(widgets=widgets) 64 | # for im_file in progress(im_files): 65 | # 66 | # print(im_file) 67 | return im_files 68 | 69 | 70 | def get_rec(img): 71 | """ 72 | Get the coordinates of the cell corner points 73 | :param img: 74 | :return: 75 | """ 76 | # Find contours on the mask image with findContours and check whether the contour shape and size look like a table. 77 | 
contours, hierarchy = cv2.findContours(img, cv2.RETR_LIST, cv2.CHAIN_APPROX_NONE) 78 | contours_poly = [0] * len(contours) 79 | # print("len contours",len(contours)) 80 | boundRect = [0] * len(contours) 81 | rois = [] 82 | rois_list = [] 83 | for i in range(len(contours)): 84 | cnt = contours[i] 85 | # approxPolyDP approximates the region with a polygon; True means the resulting curve is closed. 86 | contours_poly[i] = cv2.approxPolyDP(cnt, 2, True) 87 | # boundingRect converts the region into the upright rectangle (x, y, w, h) that contains the input shape. 88 | boundRect[i] = cv2.boundingRect(contours_poly[i]) 89 | rois.append(np.array(boundRect[i])) 90 | rois_list.append(list(boundRect[i])) 91 | # img = cv2.rectangle(img_bak, (boundRect[i][0], boundRect[i][1]), (boundRect[i][2], boundRect[i][3]), 92 | # (255, 255, 255), 1, 8, 0) 93 | # rois = split_rec(rois) 94 | # print("len rois", len(rois_list)) 95 | return rois_list, rois 96 | 97 | 98 | def get_total_row_cols(x): 99 | ''' 100 | # # Input: the list of intersections; counts how many points each row has 101 | # Output: a dict mapping each row offset (y) to the number of points in that row 102 | # Format 103 | # [58, 174, 1, 1], 104 | # [557, 145, 1, 1], 105 | # [513, 145, 1, 1], 106 | # [471, 145, 1, 1], 107 | # [58, 145, 1, 1]] 108 | :param x: 109 | :return: 110 | ''' 111 | 112 | row = {} 113 | num = 1 114 | for i in range(len(x) - 1): 115 | if x[i][1] == x[i + 1][1]: 116 | num += 1 117 | row[x[i][1]] = num 118 | else: 119 | num = 1 120 | return row 121 | 122 | 123 | def clean_dots(row, err=1): 124 | # Input: a dict and a tolerance; adjacent keys that differ by less than the tolerance are merged (with the default err=1 only identical keys qualify) 125 | # err = 2 # allowed error 126 | ''' 127 | # In this example 452 and 451 are close, so they are merged 128 | # d = {770: 5, 730: 5, 683: 5, 644: 5, 617: 5, 471: 3, 470: 2, 452: 3, 451: 2, 414: 5, 360: 5, 286: 5, 50: 5} 129 | ''' 130 | d = row # input dict {row y-coordinate: number of points in that row} 131 | d_keys = list(d.keys()) 132 | for i in range(len(d_keys) - 1): 133 | # print(d_keys[i],d_keys[i+1]) 134 | if abs(d_keys[i + 1] - d_keys[i]) < err: # the two rows are within the allowed error 135 | # print(d[d_keys[i]] + d[d_keys[i+1]]) # merged point count 136 | d[d_keys[i + 1]] = d[d_keys[i]] + d[d_keys[i + 1]] # merge the two point counts 137 | del d[d_keys[i]] # delete one of the two keys 138 | # print(d) 139 | return d # cleaned dict {row y-coordinate: number of points in that row} 140 | 141 | 142 | def 
get_dots(x, row): 143 | # Get the point coordinates 144 | # Input: 145 | # the point list x, 146 | # the number of points per row 147 | results = [] 148 | # print("coordinate, points in this row") 149 | for key in row: 150 | # print(row[key]) 151 | # print("*"*50) 152 | # print(key, row[key]) 153 | for val in range(row[key]): 154 | # print(key) 155 | yy = key 156 | xx = [val[0] for val in x if val[1] == yy] 157 | result = [[x, yy] for x in xx] 158 | # print(result) 159 | results.append(result) 160 | return results 161 | 162 | 163 | def get_bounding_box(results): 164 | ''' 165 | # Get the two diagonal corners of each bounding box (bottom-right, top-left) 166 | # The hand-designed rules below are what determine how well the cells are extracted 167 | :param results: results = get_dots(row) 168 | :return: list of diagonal corner pairs 169 | ''' 170 | bounding_box = [] 171 | for i in range(len(results) - 1): 172 | col_down = results[i] 173 | col_up = results[i + 1] 174 | # print(col_down) 175 | # print(col_up) 176 | len_down, len_up = len(col_down), len(col_up) 177 | # print(len_down,len_up) 178 | 179 | if len_down == len_up: # both rows have the same number of points: take the diagonal corners directly 180 | # print("both rows have the same number of points: take the diagonal corners directly") 181 | for j in range(len(col_down) - 1): 182 | # print(col_down[j], col_up[j + 1]) 183 | bounding_box.append([col_down[j], col_up[j + 1]]) 184 | elif len_down > len_up: # the lower row has more points 185 | # print("the lower row has more points") 186 | for j in range(len(col_up) - 1): 187 | k = j # k indexes the row with more points 188 | while k < len_down - 1: # walk through the points of the lower row (the row with more points) 189 | if col_down[k + 1][0] == col_up[j + 1][0] : # the end points match 190 | # print(col_down[k], col_up[j + 1]) 191 | bounding_box.append([col_down[j], col_up[j + 1]]) 192 | k += 1 193 | else: # the upper row has more points 194 | # print("the upper row has more points") 195 | for j in range(len(col_down) - 1): 196 | k = j # k indexes the row with more points 197 | while k < len_up - 1: # walk through the points of the upper row (the row with more points) 198 | if col_up[k + 1][0] == col_down[j + 1][0] and col_down[j][0] in col_up[k+1]: # the end points match and the start points match 199 | # print(col_down[j], col_up[k + 1]) 200 | bounding_box.append([col_down[j], col_up[k + 1]]) 201 | k += 1 202 | return bounding_box 203 | 204 | 205 | def draw_bbox(img, bboxs, img_name='None', save=False, show=True): 206 | """ 207 | Visualize the cells 208 | Input: image and coordinate list 209 | 
:param img: 210 | :param bboxs: list of cell coordinates, each entry in the form [bottom-right corner, top-left corner] 211 | :param save: 212 | True: save the image as "./results/<img_name>.jpg" and the boxes as "./results_label/<img_name>.json" 213 | False: do not save, only visualize 214 | :param img_name: required when save is True 215 | :return: 216 | """ 217 | if bboxs: # nothing to draw for an empty list 218 | ''' 219 | The two points passed to cv2.rectangle are the top-left and bottom-right corners of the rectangle, 220 | with the x axis running horizontally and the y axis running vertically. 221 | 222 | x1,y1 ------ -> x 223 | | | 224 | | | 225 | | | 226 | --------x2,y2 227 | ∨ 228 | y 229 | 230 | ''' 231 | for i in range(len(bboxs)): 232 | pt1 = (bboxs[i][1][0], bboxs[i][1][1]) # top-left corner 233 | pt2 = (bboxs[i][0][0], bboxs[i][0][1]) # bottom-right corner 234 | 235 | img = cv2.rectangle(img, 236 | pt1=pt1, 237 | pt2=pt2, 238 | color=(255, 0, 0), 239 | thickness=2, 240 | lineType=1, 241 | shift=0) 242 | if save: 243 | # assert img_name != 'None', "an image name is required to save the result" 244 | result_name = "./results/" + img_name + ".jpg" 245 | cv2.imwrite(filename=result_name, img=img) 246 | output_json = Path('./results_label') / Path(f'{img_name}.json') 247 | with output_json.open('w', encoding='utf-8') as f: 248 | json.dump(bboxs, f) 249 | 250 | if show: # visualize 251 | cv2.imshow("contour", img) 252 | 253 | 254 | def extract_lines(image, scale=20, erode_iters=1, dilate_iters=2, show=True): 255 | # Takes an image and extracts the horizontal and vertical lines 256 | ''' 257 | :param image: image = cv2.imread(img_path,1) 258 | :param scale: scale = 20 # the larger this value, the more lines are detected 259 | :param erode_iters: number of erosion iterations 260 | :param dilate_iters: number of dilation iterations 261 | :param show whether to visualize 262 | :return: dilatedcol, dilatedrow : the horizontal-line image and the vertical-line image 263 | ''' 264 | 265 | # Binarize 266 | gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) 267 | binary = cv2.adaptiveThreshold(~gray, 255, # ~ inverts the image; this matters because it makes the binarized image white-on-black 268 | cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 15, -10) 269 | # cv2.imshow("binary ", binary) 270 | 271 | rows, cols = binary.shape 272 | 273 | # Detect horizontal lines 274 | kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (cols // scale, 1)) 275 | # getStructuringElement: Returns a structuring element of the specified size and shape for morphological operations. 
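276 | # (cols // scale, 1): to pick up horizontal table lines, erosion and dilation use a fairly long horizontal strip as the kernel 277 | eroded = cv2.erode(binary, kernel, iterations=erode_iters) 278 | # cv2.imshow("Eroded Image",eroded) 279 | dilatedcol = cv2.dilate(eroded, kernel, iterations=dilate_iters) # to close the table, the horizontal lines are deliberately made longer (so that intersections, and from them bounding boxes, can be found) 280 | 281 | # Detect vertical lines 282 | kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, rows // scale)) 283 | # Vertical lines are extracted the same way; the only difference is that the kernel is a vertical strip of width 1 and height rows // scale 284 | eroded = cv2.erode(binary, kernel, iterations=erode_iters) 285 | dilatedrow = cv2.dilate(eroded, kernel, iterations=dilate_iters) # to close the table, the lines are deliberately made longer 286 | if show: 287 | print("shape:", rows, cols) 288 | cv2.imshow("Dilated col", dilatedcol) 289 | cv2.imshow("Dilated row", dilatedrow) 290 | # Draw the horizontal and vertical lines together 291 | merge = cv2.add(dilatedcol, dilatedrow) 292 | cv2.imshow("col & row", merge) 293 | return dilatedcol, dilatedrow 294 | 295 | 296 | def get_bit_wise(col, row, show=True): 297 | ''' 298 | Takes the horizontal-line and vertical-line images and returns their intersections 299 | :param col: horizontal-line image (dilatedcol) 300 | :param row: vertical-line image (dilatedrow) 301 | :return: 302 | ''' 303 | # Mark the intersections 304 | bitwiseAnd = cv2.bitwise_and(col, row) 305 | if show: 306 | cv2.imshow("bitwiseAnd Image", bitwiseAnd) 307 | return bitwiseAnd 308 | 309 | 310 | def process_single_image(img_path, show=True, save=False, scale=20, erode_iters=1, dilate_iters=2): 311 | ''' 312 | Takes the path of a single image 313 | 1. Extract the table lines 314 | 2. Get the intersections of the horizontal and vertical lines 315 | 3. Find the rectangular cell coordinates from the intersections 316 | 4. Count how many points each row has 317 | 5. Clean and merge points (some lines are not perfectly level or vertical and are off by a pixel or two) 318 | 6. Convert the points to bounding-box format (top-left and bottom-right corners) 319 | 7. 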
Visualize 320 | :param img_path: not a string but a pathlib.Path (for example pathlib.Path('./img')), which makes it easy to extract the image name and save results 321 | :param scale=20, the larger the value, the more lines are extracted 322 | :param erode_iters=1 323 | :param dilate_iters=2 324 | :param 325 | :return: 326 | ''' 327 | img_name = img_path.stem # used when saving the result 328 | img_path = str(img_path) 329 | image = cv2.imread(img_path, 1) 330 | 331 | dilatedcol, dilatedrow = extract_lines(image, scale=scale, erode_iters=erode_iters, dilate_iters=dilate_iters, 332 | show=show) 333 | 334 | bitwiseAnd = get_bit_wise(col=dilatedcol, row=dilatedrow, show=show) 335 | rois_list, rois = get_rec(bitwiseAnd) 336 | # print(rois_list) 337 | # print(len(rois_list)) 338 | row = get_total_row_cols(x=rois_list) 339 | row = clean_dots(row) 340 | results = get_dots(x=rois_list, row=row) 341 | 342 | bounding_boxs = get_bounding_box(results) 343 | 344 | 345 | # Draw the cells; with save=False this only visualizes 346 | # with save=True and img_name given, the image is saved 347 | draw_bbox(image, bounding_boxs, img_name=img_name, save=save, show=show) 348 | 349 | if show: 350 | print("bounding_boxs:",bounding_boxs) 351 | print("len(bounding_boxs):", len(bounding_boxs)) 352 | cv2.imshow("img", image) 353 | cv2.waitKey(0) 354 | 355 | 356 | if __name__ == '__main__': 357 | ROOT_DIR = Path('./img1') 358 | ROOT_DIR= Path('E:/dataset/TableBank/Word/train2017') 359 | imgs_list = iter_all_files(ROOT_DIR) 360 | 361 | w = progressbar.widgets 362 | widgets = ['Progress: ', w.Percentage(), ' ', w.Bar('#'), ' ', w.Timer(), 363 | ' ', w.ETA(), ' ', w.FileTransferSpeed()] 364 | progress = progressbar.ProgressBar(widgets=widgets) 365 | for img in progress(imgs_list): 366 | print(img) 367 | # process_single_image(img,show=True,save=False) 368 | process_single_image(img, show=False, save=True) 369 | 370 | # Reference: https://blog.csdn.net/yomo127/article/details/52045146 371 | -------------------------------------------------------------------------------- /findCounter.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*-# 2 | 
#------------------------------------------------------------------------------- 3 | # Name: findCounter.py 4 | # Author: wdf 5 | # Date: 2019/7/17 6 | # IDE: PyCharm 7 | # Parameters: 8 | # @param: 9 | # @param: 10 | # Return: 11 | # 12 | # Description: 13 | # Usage: 14 | #------------------------------------------------------------------------------- 15 | import cv2 16 | import numpy as np 17 | 18 | def split_rec(arr): 19 | """ 20 | Split the cells 21 | :param arr: 22 | :return: 23 | """ 24 | # Sort the array 25 | print(arr) 26 | print("*"*50) 27 | arr.sort(key=lambda x: x[0],reverse=True) 28 | # Reverse the array 29 | arr.reverse() 30 | for i in range(len(arr) - 1): 31 | if arr[i+1][0] == arr[i][0]: 32 | arr[i+1][3] = arr[i][1] 33 | arr[i + 1][2] = arr[i][2] 34 | if arr[i+1][0] > arr[i][0]: 35 | arr[i + 1][2] = arr[i][0] 36 | print(arr[i]) 37 | 38 | return arr 39 | 40 | def get_points(img_transverse, img_vertical): 41 | """ 42 | Get the intersections of the horizontal and vertical lines 43 | :param img_transverse: 44 | :param img_vertical: 45 | :return: 46 | """ 47 | img = cv2.bitwise_and(img_transverse, img_vertical) 48 | return img 49 | 50 | def get_vertical_line(binary): 51 | rows, cols = binary.shape 52 | scale = 20 # the larger this value, the more lines are detected 53 | 54 | # Detect vertical lines 55 | kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, rows // scale)) 56 | # For vertical lines the kernel is a vertical strip of width 1 and height rows // scale 57 | eroded = cv2.erode(binary, kernel, iterations=1) 58 | dilatedrow = cv2.dilate(eroded, kernel, iterations=2) 59 | # cv2.imshow("Dilated row", dilatedrow) 60 | return dilatedrow 61 | 62 | def get_transverse_line(binary): 63 | rows, cols = binary.shape 64 | scale = 20 # the larger this value, the more lines are detected 65 | 66 | # Detect horizontal lines 67 | kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (cols // scale, 1)) 68 | # getStructuringElement: Returns a structuring element of the specified size and shape for morphological operations. 
69 | # (cols // scale, 1): to pick up horizontal table lines, erosion and dilation use a fairly long horizontal strip as the kernel 70 | eroded = cv2.erode(binary, kernel, iterations=1) 71 | # cv2.imshow("Eroded Image",eroded) 72 | dilatedcol = cv2.dilate(eroded, kernel, iterations=2) 73 | # cv2.imshow("Dilated col", dilatedcol) 74 | return dilatedcol 75 | 76 | def bin_img(image): 77 | """ 78 | Binarize an image 79 | :param image: the input image object (numpy.ndarray) 80 | :return: the binarized image 81 | """ 82 | # Binarize 83 | gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) 84 | binary = cv2.adaptiveThreshold(~gray, 255, # ~ inverts the image; this matters because it makes the binarized image white-on-black 85 | cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 15, -10) 86 | return binary 87 | 88 | 89 | def get_rec(img): 90 | """ 91 | Get the cells 92 | :param img: 93 | :return: 94 | """ 95 | contours, hierarchy = cv2.findContours(img, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE) 96 | contours_poly = [0] * len(contours) 97 | boundRect = [0] * len(contours) 98 | rois = [] 99 | print("*"*50) 100 | print("contours: \n") 101 | for i in range(len(contours) - 1): 102 | cnt = contours[i] 103 | print(i,cnt) 104 | contours_poly[i] = cv2.approxPolyDP(curve=cnt, epsilon=1, closed=True) 105 | # Approximate a polygonal curve with the specified precision. 106 | ''' 107 | . @param curve Input vector of a 2D point stored in std::vector or Mat 108 | . @param epsilon Parameter specifying the approximation accuracy. This is the maximum distance 109 | . between the original curve and its approximation. 110 | . @param closed If true, the approximated curve is closed (its first and last vertices are 111 | . connected). 
Otherwise, it is not closed.''' 112 | boundRect[i] = cv2.boundingRect(contours_poly[i]) 113 | rois.append(np.array(boundRect[i])) 114 | pt1 = (boundRect[i][0], boundRect[i][1]) # boundingRect returns (x, y, w, h); pt1 is the top-left corner 115 | pt2 = (boundRect[i][0] + boundRect[i][2], boundRect[i][1] + boundRect[i][3]) # bottom-right corner 116 | print(img.shape) 117 | print("pt1:",pt1) 118 | print("pt2:",pt2) 119 | img = cv2.rectangle(img_bak, 120 | pt1=pt1, 121 | pt2=pt2, 122 | color=(0, 0, 255), 123 | thickness=2, 124 | lineType=1, 125 | shift=0) 126 | cv2.imshow("contour",img) 127 | rois = split_rec(rois) 128 | return rois 129 | 130 | if __name__ == "__main__": 131 | image = "./img/table-6.png" 132 | image1 = "./img/9.jpg" 133 | 134 | img_bak = cv2.imread(image) 135 | img = bin_img(img_bak) 136 | 137 | # img_transverse = erode_img(img,(1,2),40) 138 | # img_vertical = erode_img(img, (2,1), 40) 139 | # # img = img_transverse + img_vertical 140 | # img_transverse = dilate_img(img_transverse,(2,2),1) 141 | # img_vertical = dilate_img(img_vertical,(2,2),1) 142 | # 143 | # img = get_points(img_transverse,img_vertical) 144 | 145 | dilatedcol, dilatedrow = get_vertical_line(img), get_transverse_line(img) 146 | img = get_points(dilatedcol, dilatedrow) 147 | 148 | rois = get_rec(img) 149 | print("*"*50) 150 | 151 | print(rois) 152 | for i, r in enumerate(rois): 153 | cv2.imshow(str(i), img_bak[r[3]:r[1], r[2]:r[0]]) 154 | cv2.waitKey(0) 155 | 156 | cv2.destroyAllWindows() 157 | pass 158 | 159 | -------------------------------------------------------------------------------- /hough_line.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*-# 2 | #------------------------------------------------------------------------------- 3 | # Name: hough_line.py 4 | # Author: wdf 5 | # Date: 2019/7/7 6 | # IDE: PyCharm 7 | # Description: 8 | # Usage: 9 | #------------------------------------------------------------------------------- 10 | 11 | import cv2 as cv 12 | import numpy as 
np 13 | import math 14 | #-----------------Hough transform--------------------- 15 | # Precondition: edge detection has been done 16 | # Standard Hough line transform (HoughLines) 17 | # Probabilistic Hough line transform (HoughLinesP) 18 | 19 | 20 | # Return the last element of a list 21 | def takeEnd(elem): 22 | return elem[-1] 23 | 24 | def line_detection(image): 25 | gray = cv.cvtColor(image, cv.COLOR_BGR2GRAY) 26 | edges = cv.Canny(gray, 50, 150, apertureSize=3) # apertureSize already defaults to 3 27 | # cv.imshow("edges", edges) 28 | lines = cv.HoughLines(edges, 1, np.pi/180, 80) 29 | for line in lines: 30 | rho, theta = line[0] # line[0] holds rho (distance from the origin to the line) and theta (the angle, in radians) 31 | a = np.cos(theta) # theta is in radians 32 | b = np.sin(theta) 33 | x0 = a * rho # x = r * cos(theta) 34 | y0 = b * rho # y = r * sin(theta) 35 | x1 = int(x0 + 1000 * (-b)) # x coordinate of the segment start 36 | y1 = int(y0 + 1000 * a) # y coordinate of the segment start 37 | x2 = int(x0 - 1000 * (-b)) # x coordinate of the segment end 38 | y2 = int(y0 - 1000 * a) # y coordinate of the segment end 39 | # Note: the value 1000 sets how long the drawn segment is; a smaller value draws a shorter segment, a larger value a longer one 40 | cv.line(image, (x1, y1), (x2, y2), (0, 0, 255), 2) # point coordinates must be tuples, not lists 41 | # cv.imshow("image-lines", image) 42 | 43 | def line_detect_possible_demo(image): 44 | gray = cv.cvtColor(image, cv.COLOR_BGR2GRAY) 45 | edges = cv.Canny(gray, 100, 150, apertureSize=3) 46 | lines = cv.HoughLinesP(edges, 1, np.pi/180, 100, minLineLength=50, maxLineGap=60) 47 | # minLineLength (minimum segment length; shorter segments are ignored) 48 | # maxLineGap (maximum gap between two segments on the same line; segments closer than this are joined into one line) 49 | print(len(lines)) 50 | lengths = [] 51 | for line in lines: 52 | x1, y1, x2, y2 = line[0] 53 | length = ((x1-x2)**2 + (y1-y2)**2)**0.5 54 | lengths.append([x1, y1, x2, y2, length]) 55 | # print(line, length) 56 | cv.line(image, (x1, y1), (x2, y2), (0, 0, 255), 2) # draw every detected line 57 | # Draw the longest line 58 | lengths.sort(key=takeEnd) 59 | longest_line = lengths[-1] 60 | print(longest_line) 61 | x1, y1, x2, y2, length= longest_line 62 | cv.line(image, (x1, y1), (x2, y2), (0, 0, 0), 2) # draw the line 63 | 64 | # Compute the rotation angle of this line 65 | angle = math.acos((x2-x1)/length) 66 | print(angle) # in radians 67 | angle = angle*(180 /math.pi) 68 | print(angle) # in degrees 69 | 70 | cv.imshow("longest", image) 71 | 
print(lengths) 72 | 73 | 74 | def main(): 75 | img = cv.imread("./img/rot-45.png") 76 | # cv.namedWindow("Show", cv.WINDOW_AUTOSIZE) 77 | # cv.imshow("Show", img) 78 | line_detect_possible_demo(img) 79 | # line_detection(img) 80 | 81 | cv.waitKey(0) 82 | cv.destroyAllWindows() 83 | if __name__ == '__main__': 84 | main() 85 | -------------------------------------------------------------------------------- /media/0b594eba5b576dce0fe7136c3af11b2d.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zoujuny/TableCell/f2c1c5136ab5d54cab83fdbb83657f889983ff85/media/0b594eba5b576dce0fe7136c3af11b2d.jpg -------------------------------------------------------------------------------- /media/0c25686327ca6581752d1941e226d857.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zoujuny/TableCell/f2c1c5136ab5d54cab83fdbb83657f889983ff85/media/0c25686327ca6581752d1941e226d857.png -------------------------------------------------------------------------------- /media/12bf5b36a020426d75c265c154ad7a73.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zoujuny/TableCell/f2c1c5136ab5d54cab83fdbb83657f889983ff85/media/12bf5b36a020426d75c265c154ad7a73.png -------------------------------------------------------------------------------- /media/2072aa36234b9401fa098c7576988a8d.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zoujuny/TableCell/f2c1c5136ab5d54cab83fdbb83657f889983ff85/media/2072aa36234b9401fa098c7576988a8d.png -------------------------------------------------------------------------------- /media/3fb740a4a8647b36b4804418457e3134.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/zoujuny/TableCell/f2c1c5136ab5d54cab83fdbb83657f889983ff85/media/3fb740a4a8647b36b4804418457e3134.png -------------------------------------------------------------------------------- /media/43028238b302b8f870aae886b0f4c6d8.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zoujuny/TableCell/f2c1c5136ab5d54cab83fdbb83657f889983ff85/media/43028238b302b8f870aae886b0f4c6d8.png -------------------------------------------------------------------------------- /media/65d04fccbf578a6f546c670657ae4c8e.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zoujuny/TableCell/f2c1c5136ab5d54cab83fdbb83657f889983ff85/media/65d04fccbf578a6f546c670657ae4c8e.jpg -------------------------------------------------------------------------------- /media/666ffa53885c7be62da3089fc8807649.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zoujuny/TableCell/f2c1c5136ab5d54cab83fdbb83657f889983ff85/media/666ffa53885c7be62da3089fc8807649.jpg -------------------------------------------------------------------------------- /media/813fc6bcc44f63113e40a7d73094d091.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zoujuny/TableCell/f2c1c5136ab5d54cab83fdbb83657f889983ff85/media/813fc6bcc44f63113e40a7d73094d091.png -------------------------------------------------------------------------------- /media/8f140a65e27f94bcc85f0fd72b0d5ce1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zoujuny/TableCell/f2c1c5136ab5d54cab83fdbb83657f889983ff85/media/8f140a65e27f94bcc85f0fd72b0d5ce1.png -------------------------------------------------------------------------------- /media/a9e18c699f4be54860792c2150385fd5.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/zoujuny/TableCell/f2c1c5136ab5d54cab83fdbb83657f889983ff85/media/a9e18c699f4be54860792c2150385fd5.png -------------------------------------------------------------------------------- /media/b7d18fe8c8bce5463d26429bf6981de1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zoujuny/TableCell/f2c1c5136ab5d54cab83fdbb83657f889983ff85/media/b7d18fe8c8bce5463d26429bf6981de1.png -------------------------------------------------------------------------------- /media/f1e11863321b5815f2537c48af03f8fc.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zoujuny/TableCell/f2c1c5136ab5d54cab83fdbb83657f889983ff85/media/f1e11863321b5815f2537c48af03f8fc.png -------------------------------------------------------------------------------- /media/f3edeb885dcb7617bd14eee9e587a93a.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zoujuny/TableCell/f2c1c5136ab5d54cab83fdbb83657f889983ff85/media/f3edeb885dcb7617bd14eee9e587a93a.png -------------------------------------------------------------------------------- /media/f9e05cc8e53e6608c7cac688e5b4bd5a.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zoujuny/TableCell/f2c1c5136ab5d54cab83fdbb83657f889983ff85/media/f9e05cc8e53e6608c7cac688e5b4bd5a.png -------------------------------------------------------------------------------- /opencv-Perspective-auto.py: -------------------------------------------------------------------------------- 1 | 2 | 3 | # ============================================================================= 4 | # 点类 5 | # ============================================================================= 6 | class Point: 7 | def 
__init__(self, point): 8 | # x1, y1, x2, y2 = l # first two numbers are the start point, last two the end point 9 | self.x = point[0] 10 | self.y = point[1] 11 | 12 | def copy(self): 13 | return self 14 | 15 | def toList(self): 16 | # Convert the point to a list 17 | return [int(self.x), int(self.y)] 18 | 19 | def lenth(self): 20 | return 1. * (self.x * self.x + self.y * self.y) ** 0.5 21 | 22 | def measureAngle(self, lastPoint, nextPoint): 23 | # Compute the sharpness at this point; see https://www.cnblogs.com/jsxyhelu/p/5106760.html 24 | vect1 = [self.x - lastPoint.x, self.y - lastPoint.y] 25 | 26 | vect2 = [self.x - nextPoint.x, self.y - nextPoint.y] 27 | 28 | vect3 = [lastPoint.x - nextPoint.x, lastPoint.y - nextPoint.y] 29 | 30 | sin = 1.0 * Point(vect3).lenth() / (Point(vect1).lenth() + Point(vect2).lenth()) 31 | return 1 - sin 32 | 33 | def printf(self): 34 | print((self.x, self.y)) 35 | 36 | 37 | # ============================================================================= 38 | # Contour class 39 | # ============================================================================= 40 | class Contour(Point): 41 | def __init__(self, contour): 42 | self.contour = [] 43 | for p in contour: 44 | self.contour.append(Point(p[0])) 45 | self.length = len(contour) 46 | 47 | def pickLeftPoint(self, currentLocation, setp): 48 | # Wrap around so the left neighbor index stays in range 49 | if currentLocation - setp < 0: 50 | # print(currentLocation-setp+self.length) 51 | return currentLocation - setp + self.length 52 | else: 53 | # print(currentLocation-setp) 54 | return currentLocation - setp 55 | 56 | def pickRightPoint(self, currentLocation, setp): 57 | # Wrap around so the right neighbor index stays in range 58 | if currentLocation + setp > self.length - 1: 59 | # print(currentLocation+setp-self.length) 60 | return currentLocation + setp - self.length # modular wrap; the earlier "+ 1" skipped index 0 61 | else: 62 | # print(currentLocation+setp) 63 | return currentLocation + setp 64 | 65 | def getAngle(self, p, setp): 66 | # print(p) 67 | return self.contour[p].measureAngle(self.contour[self.pickRightPoint(p, setp)], 68 | self.contour[self.pickLeftPoint(p, setp)]) 69 | 70 | 71 | def
sortPoint(rowdata): 72 | x = 0 73 | y = 0 74 | for p in rowdata: 75 | x = p.x + x 76 | y = p.y + y 77 | x = x / 4 78 | y = y / 4 79 | sorteddata = [[0, 0]] * 4 80 | for p in rowdata: 81 | if p.x < x and p.y < y: 82 | sorteddata[0] = p.toList() 83 | if p.x > x and p.y < y: 84 | sorteddata[1] = p.toList() 85 | if p.x > x and p.y > y: 86 | sorteddata[2] = p.toList() 87 | if p.x < x and p.y > y: 88 | sorteddata[3] = p.toList() 89 | return sorteddata 90 | 91 | 92 | def getPoint(contours): 93 | index = 0 94 | contour = contours[1] 95 | j = 0 96 | size = 0 97 | for i in contour: 98 | if i.size > size: 99 | size = i.size 100 | index = j 101 | j = j + 1 102 | maxContour = Contour(contour[index]) 103 | data = [] 104 | datas = [] 105 | for p in range(0, maxContour.length - 1): 106 | y = maxContour.getAngle(p, 5) 107 | datas.append(y) 108 | if 0.1 < y: 109 | data.append(maxContour.contour[p]) 110 | plt.plot(datas) 111 | plt.show() 112 | 113 | 114 | if __name__ == '__main__': 115 | old_img = cv2.imread('1.jpg') 116 | t_points = img_process(old_img) 117 | 118 | -------------------------------------------------------------------------------- /opencv-Perspective.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | ''' 3 | 先用cv2.getPerspectiveTransform(target_points, four_points)得到旋转矩阵 4 | 然后再用cv2.warpPerspective(img, M, (weight, height))进行透视变换。 5 | 6 | 关键是找到四个角点,本代码人工指定角点坐标 7 | ''' 8 | 9 | import cv2 10 | import numpy as np 11 | img= cv2.imread('img/rowRotate.jpg')#读取原始图片 12 | cv2.imshow("raw",img) 13 | target_points =[[278,189],[758,336],[570,1034],[65,900]] 14 | #这里我们通过人工的方式读出四个角点A,B,C,D 15 | height = img.shape[0] 16 | weight = img.shape[1] 17 | four_points= np.array(((0, 0), 18 | (weight - 1, 0), 19 | (weight - 1, height - 1), 20 | (0, height - 1)), 21 | np.float32) 22 | target_points = np.array(target_points, np.float32)#统一格式 23 | M = cv2.getPerspectiveTransform(target_points, four_points) 24 | Rotated= 
cv2.warpPerspective(img, M, (weight, height)) 25 | cv2.imshow("Rotated.jpg",Rotated) 26 | cv2.waitKey(0) 27 | cv2.destroyAllWindows() 28 | 29 | -------------------------------------------------------------------------------- /pycococreatortools/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zoujuny/TableCell/f2c1c5136ab5d54cab83fdbb83657f889983ff85/pycococreatortools/__init__.py -------------------------------------------------------------------------------- /pycococreatortools/__pycache__/__init__.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zoujuny/TableCell/f2c1c5136ab5d54cab83fdbb83657f889983ff85/pycococreatortools/__pycache__/__init__.cpython-36.pyc -------------------------------------------------------------------------------- /pycococreatortools/__pycache__/pycococreatortools.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zoujuny/TableCell/f2c1c5136ab5d54cab83fdbb83657f889983ff85/pycococreatortools/__pycache__/pycococreatortools.cpython-36.pyc -------------------------------------------------------------------------------- /pycococreatortools/pycococreatortools.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | 3 | import os 4 | import re 5 | import datetime 6 | import numpy as np 7 | from itertools import groupby 8 | from skimage import measure 9 | from PIL import Image 10 | from pycocotools import mask 11 | 12 | convert = lambda text: int(text) if text.isdigit() else text.lower() 13 | natrual_key = lambda key: [ convert(c) for c in re.split('([0-9]+)', key) ] 14 | 15 | def resize_binary_mask(array, new_size): 16 | ''' 从[0,1】 resize成【0,255】 17 | :param array: 18 | :param new_size: 19 | :return: 20 | ''' 21 | image = 
Image.fromarray(array.astype(np.uint8)*255) 22 | image = image.resize(new_size) 23 | return np.asarray(image).astype(np.bool_) 24 | 25 | def close_contour(contour): 26 | if not np.array_equal(contour[0], contour[-1]): 27 | contour = np.vstack((contour, contour[0])) 28 | return contour 29 | 30 | def binary_mask_to_rle(binary_mask): 31 | rle = {'counts': [], 'size': list(binary_mask.shape)} 32 | counts = rle.get('counts') 33 | for i, (value, elements) in enumerate(groupby(binary_mask.ravel(order='F'))): 34 | if i == 0 and value == 1: 35 | counts.append(0) 36 | counts.append(len(list(elements))) 37 | 38 | return rle 39 | 40 | def binary_mask_to_polygon(binary_mask, tolerance=0): 41 | """Converts a binary mask to COCO polygon representation 42 | Args: 43 | binary_mask: a 2D binary numpy array where '1's represent the object 44 | tolerance: Maximum distance from original points of polygon to approximated 45 | polygonal chain. If tolerance is 0, the original coordinate array is returned. 46 | """ 47 | polygons = [] 48 | # pad mask to close contours of shapes which start and end at an edge 49 | padded_binary_mask = np.pad(binary_mask, pad_width=1, mode='constant', constant_values=0) 50 | contours = measure.find_contours(padded_binary_mask, 0.5) 51 | contours = np.subtract(contours, 1) 52 | for contour in contours: 53 | contour = close_contour(contour) 54 | contour = measure.approximate_polygon(contour, tolerance) 55 | if len(contour) < 3: 56 | continue 57 | contour = np.flip(contour, axis=1) 58 | segmentation = contour.ravel().tolist() 59 | # after padding and subtracting 1 we may get -0.5 points in our segmentation 60 | segmentation = [0 if i < 0 else i for i in segmentation] 61 | polygons.append(segmentation) 62 | 63 | return polygons 64 | 65 | def create_image_info(image_id, file_name, image_size, 66 | date_captured=datetime.datetime.utcnow().isoformat(' '), 67 | license_id=1, coco_url="", flickr_url=""): 68 | image_info = { 69 | "id": image_id, 70 | "file_name": 
file_name, 71 | "width": image_size[0], 72 | "height": image_size[1], 73 | "date_captured": date_captured, 74 | "license": license_id, 75 | "coco_url": coco_url, 76 | "flickr_url": flickr_url 77 | } 78 | 79 | return image_info 80 | 81 | # 汇总 mask、bounding-box等信息 82 | def mask_create_annotation_info(annotation_id, image_id, area, category_id, image_size=None, bounding_box=None,segmentation= None): 83 | annotation_info = { 84 | "id": annotation_id, 85 | "image_id": image_id, 86 | "category_id": category_id, 87 | "iscrowd": 0, # 0或1,指定为0,表示“单个的对象(不存在多个对象重叠)”.只要是iscrowd=0那么segmentation就是polygon格式 88 | "area": area, # area是area of encoded masks,是标注区域的面积。如果是矩形框,那就是高乘宽; 浮点数,需大于0,因icdar数据没有segmentation,所以本项人为指定为10 89 | "bbox": bounding_box, 90 | "segmentation": segmentation, #polygon格式.这些数按照相邻的顺序两两组成一个点的xy坐标,如果有n个数(必定是偶数),那么就是n/2个点坐标。 91 | # 注意这里,必须是list 包含list,底层的list中必须有至少6个元素,否则coco api会过滤掉这个annotations,也就是说你必须用至少三个点来表达一块。 92 | # 外层的list的长度取决于一个完整的物体是否被分割成了数块,比如一个物体苹果没有任何的遮挡,则外部的List长度就为1 93 | # 按照给出各个坐标的顺序描点(顺时针、逆时针都行),eg: 94 | # gemfield_polygons1 = [[0,0,10,0,10,20,0,10]] # 逆时针 95 | # gemfield_polygons2 = [[0,0,0,10,10,20,10,0]] # 顺时针 96 | # gemfield_polygons3 = [[10,0,0,10,0,0,10,20]] # 注意次序,此时不是四边形,而是两个三角形 97 | 98 | "width": image_size[0], 99 | "height": image_size[1], 100 | } 101 | 102 | return annotation_info 103 | -------------------------------------------------------------------------------- /results/result_0.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zoujuny/TableCell/f2c1c5136ab5d54cab83fdbb83657f889983ff85/results/result_0.jpg -------------------------------------------------------------------------------- /results/result_1.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zoujuny/TableCell/f2c1c5136ab5d54cab83fdbb83657f889983ff85/results/result_1.jpg 
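As a hedged illustration of the annotation layout that `mask_create_annotation_info` assembles, the sketch below builds one COCO-style entry from a pair of cell corners given in `[[right, bottom], [left, top]]` order, which appears to be the order used by the JSON files under `results_label`. The helper name `corners_to_coco` is hypothetical and not part of this repository; it only shows how the `bbox` (`x, y, w, h`), `area`, and polygon `segmentation` fields relate to the two corners.

```python
def corners_to_coco(corners, annotation_id, image_id, category_id=1):
    """Build one COCO-style annotation dict from [[right, bottom], [left, top]].

    Hypothetical helper for illustration only; it is not part of the repo.
    """
    (right, bottom), (left, top) = corners
    x, y = float(left), float(top)
    width = float(right - left)
    height = float(bottom - top)
    # Polygon: the four box corners in order, flattened to [x1, y1, x2, y2, ...].
    polygon = [x, y, x + width, y, x + width, y + height, x, y + height]
    return {
        "id": annotation_id,
        "image_id": image_id,
        "category_id": category_id,
        "iscrowd": 0,                    # 0 means a single object in polygon format
        "area": width * height,          # for a rectangle: width times height
        "bbox": [x, y, width, height],   # COCO order: x, y, w, h
        "segmentation": [polygon],       # list of polygons, each with at least 6 numbers
    }


if __name__ == "__main__":
    ann = corners_to_coco([[486, 737], [471, 712]], annotation_id=1, image_id=1)
    print(ann["bbox"], ann["area"])
```

Keeping the polygon as a list inside a list matches the note in `pycococreatortools.py` that the outer list holds one entry per disjoint part of an object.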
-------------------------------------------------------------------------------- /results/result_2.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zoujuny/TableCell/f2c1c5136ab5d54cab83fdbb83657f889983ff85/results/result_2.jpg -------------------------------------------------------------------------------- /results/rotated.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zoujuny/TableCell/f2c1c5136ab5d54cab83fdbb83657f889983ff85/results/rotated.png -------------------------------------------------------------------------------- /results_label/INTERnet_vzorcni%20primeri_TPNO70_seznam_nalog%20kritje_19042012_objavljeno_0.json: -------------------------------------------------------------------------------- 1 | [[[486, 737], [471, 712]], [[471, 737], [426, 712]], [[426, 737], [363, 712]], [[363, 737], [302, 712]], [[302, 737], [243, 712]], [[243, 737], [194, 712]], [[194, 737], [148, 712]], [[148, 737], [84, 712]], [[486, 712], [426, 684]], [[471, 712], [363, 684]], [[426, 712], [302, 684]], [[363, 712], [243, 684]], [[302, 712], [194, 684]], [[243, 712], [148, 684]], [[194, 712], [84, 684]], [[486, 674], [471, 648]], [[471, 674], [426, 648]], [[426, 674], [363, 648]], [[363, 674], [302, 648]], [[302, 674], [243, 648]], [[243, 674], [194, 648]], [[194, 674], [148, 648]], [[148, 674], [84, 648]], [[486, 648], [426, 620]], [[471, 648], [363, 620]], [[426, 648], [302, 620]], [[363, 648], [243, 620]], [[302, 648], [194, 620]], [[243, 648], [148, 620]], [[194, 648], [84, 620]], [[486, 610], [471, 572]], [[471, 610], [426, 572]], [[426, 610], [363, 572]], [[363, 610], [302, 572]], [[302, 610], [243, 572]], [[243, 610], [194, 572]], [[194, 610], [148, 572]], [[148, 610], [84, 572]], [[486, 572], [426, 532]], [[471, 572], [363, 532]], [[426, 572], [302, 532]], [[363, 572], [243, 532]], [[302, 572], [194, 532]], [[243, 572], [148, 
532]], [[194, 572], [84, 532]], [[486, 522], [471, 484]], [[471, 522], [426, 484]], [[426, 522], [363, 484]], [[363, 522], [302, 484]], [[302, 522], [243, 484]], [[243, 522], [194, 484]], [[194, 522], [148, 484]], [[148, 522], [84, 484]], [[486, 484], [426, 444]], [[471, 484], [363, 444]], [[426, 484], [302, 444]], [[363, 484], [194, 444]], [[302, 484], [148, 444]], [[243, 484], [84, 444]], [[426, 409], [84, 201]], [[484, 201], [375, 176]], [[375, 201], [204, 176]], [[204, 201], [84, 176]], [[484, 176], [375, 150]], [[375, 176], [204, 150]], [[204, 176], [84, 150]], [[484, 150], [375, 125]], [[375, 150], [204, 125]], [[204, 150], [84, 125]], [[484, 125], [204, 96]]] -------------------------------------------------------------------------------- /results_label/Internet%20docs%20on%20Child%20labor_9.json: -------------------------------------------------------------------------------- 1 | [[[506, 612], [89, 446]], [[506, 417], [272, 388]], [[272, 417], [89, 388]], [[506, 388], [272, 359]], [[272, 388], [89, 359]], [[506, 359], [272, 329]], [[272, 359], [89, 329]], [[506, 329], [272, 300]], [[272, 329], [89, 300]], [[506, 300], [272, 248]], [[272, 300], [89, 248]], [[506, 248], [272, 218]], [[272, 248], [89, 218]], [[506, 218], [272, 189]], [[272, 218], [89, 189]], [[506, 189], [272, 160]], [[272, 189], [89, 160]], [[506, 160], [272, 131]], [[272, 160], [89, 131]], [[506, 131], [272, 101]], [[272, 131], [89, 101]], [[506, 101], [272, 72]], [[272, 101], [89, 72]]] -------------------------------------------------------------------------------- /results_label/internationell-politik-och-relationer-sh-b_0.json: -------------------------------------------------------------------------------- 1 | [[[506, 612], [89, 446]], [[506, 417], [272, 388]], [[272, 417], [89, 388]], [[506, 388], [272, 359]], [[272, 388], [89, 359]], [[506, 359], [272, 329]], [[272, 359], [89, 329]], [[506, 329], [272, 300]], [[272, 329], [89, 300]], [[506, 300], [272, 248]], [[272, 300], [89, 248]], 
[[506, 248], [272, 218]], [[272, 248], [89, 218]], [[506, 218], [272, 189]], [[272, 218], [89, 189]], [[506, 189], [272, 160]], [[272, 189], [89, 160]], [[506, 160], [272, 131]], [[272, 160], [89, 131]], [[506, 131], [272, 101]], [[272, 131], [89, 101]], [[506, 101], [272, 72]], [[272, 101], [89, 72]]] -------------------------------------------------------------------------------- /rotation-opencv.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | ''' 3 | 博客1:python+opencv实现基于傅里叶变换的旋转文本校正 4 | https://blog.csdn.net/qq_36387683/article/details/80530709 5 | 6 | 博客2:OpenCV—python 图像矫正(基于傅里叶变换—基于透视变换) 7 | https://blog.csdn.net/wsp_1138886114/article/details/83374333 8 | 9 | 10 | 傅里叶相关知识: 11 | https://blog.csdn.net/on2way/article/details/46981825 12 | 频率:对于图像来说就是指图像颜色值的梯度,即灰度级的变化速度 13 | 幅度:可以简单的理解为是频率的权,即该频率所占的比例 14 | DFT之前的原图像在x y方向上表示空间坐标,DFT是经过x y方向上的傅里叶变换来统计像素在这两个方向上不同频率的分布情况, 15 | 所以DFT得到的图像在x y方向上不再表示空间上的长度,而是频率。 16 | 17 | 仿射变换与透射变换: 18 | 仿射变换和透视变换更直观的叫法可以叫做“平面变换”和“空间变换”或者“二维坐标变换”和“三维坐标变换”. 19 | 从另一个角度也能说明三维变换和二维变换的意思,仿射变换的方程组有6个未知数,所以要求解就需要找到3组映射点, 20 | 三个点刚好确定一个平面. 21 | 透视变换的方程组有8个未知数,所以要求解就需要找到4组映射点,四个点就刚好确定了一个三维空间. 
22 | 23 | 24 | 图像旋转算法 数学原理: 25 | https://blog.csdn.net/liyuan02/article/details/6750828 26 | 27 | 28 | 角度angle可以用np.angle() 29 | ϕ=atan(实部/虚部) 30 | numpy包中自带一个angle函数可以直接根据复数的实部与虚部求出角度(默认出来的角度是弧度)。 31 | ''' 32 | 33 | import cv2 as cv 34 | import numpy as np 35 | import math 36 | from matplotlib import pyplot as plt 37 | 38 | def fourier_demo(): 39 | #1、读取文件,灰度化 40 | img = cv.imread('img/table-3.png') 41 | cv.imshow('original', img) 42 | gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY) 43 | cv.imshow('gray', gray) 44 | 45 | #2、图像延扩 46 | # OpenCV中的DFT采用的是快速算法,这种算法要求图像的尺寸是2的、3和5的倍数是处理速度最快。 47 | # 所以需要用getOptimalDFTSize() 48 | # 找到最合适的尺寸,然后用copyMakeBorder()填充多余的部分。 49 | # 这里是让原图像和扩大的图像左上角对齐。填充的颜色如果是纯色, 50 | # 对变换结果的影响不会很大,后面寻找倾斜线的过程又会完全忽略这一点影响。 51 | h, w = img.shape[:2] 52 | new_h = cv.getOptimalDFTSize(h) 53 | new_w = cv.getOptimalDFTSize(w) 54 | right = new_w - w 55 | bottom = new_h - h 56 | nimg = cv.copyMakeBorder(gray, 0, bottom, 0, right, borderType=cv.BORDER_CONSTANT, value=0) 57 | cv.imshow('optim image', nimg) 58 | 59 | #3、执行傅里叶变换,并得到频域图像 60 | f = np.fft.fft2(nimg) # 将图像从空间域转到频域 61 | fshift = np.fft.fftshift(f) # 将低频分量移动到中心,得到复数形式(实部、虚部) 62 | magnitude = np.log(np.abs(fshift)) # 用abs()得到实数(imag()得到虚部),取对数是为了将数据变换到0-255,相当与实现了归一化。 63 | 64 | # 4、二值化,进行Houge直线检测 65 | # 二值化 66 | magnitude_uint = magnitude.astype(np.uint8) #HougnLinesP()函数要求输入图像必须为8位单通道图像 67 | ret, thresh = cv.threshold(magnitude_uint, thresh=11, maxval=255, type=cv.THRESH_BINARY) 68 | print("ret:",ret) 69 | cv.imshow('thresh', thresh) 70 | print("thresh.dtype:", thresh.dtype) 71 | #霍夫直线变换 72 | lines = cv.HoughLinesP(thresh, 2, np.pi/180, 30, minLineLength=40, maxLineGap=100) 73 | print("len(lines):", len(lines)) 74 | 75 | # 5、创建一个新图像,标注直线,找出偏移弧度 76 | #创建一个新图像,标注直线 77 | lineimg = np.ones(nimg.shape,dtype=np.uint8) 78 | lineimg = lineimg * 255 79 | 80 | piThresh = np.pi/180 81 | pi2 = np.pi/2 82 | print("piThresh:",piThresh) 83 | # 得到三个角度,一个是0度,一个是90度,另一个就是我们需要的倾斜角。 84 | for line in lines: 85 | x1, y1, 
x2, y2 = line[0] 86 | cv.line(lineimg, (x1, y1), (x2, y2), (0, 255, 0), 2) 87 | if x2 - x1 == 0: 88 | continue 89 | else: 90 | theta = (y2 - y1) / (x2 - x1) 91 | if abs(theta) < piThresh or abs(theta - pi2) < piThresh: 92 | continue 93 | else: 94 | print("theta:",theta) 95 | 96 | # 6、计算倾斜角,将弧度转换成角度,并注意误差 97 | angle = math.atan(theta) 98 | print("angle(弧度):",angle) 99 | angle = angle * (180 / np.pi) 100 | print("angle(角度1):",angle) 101 | angle = (angle - 90)/ (w/h) 102 | #由于DFT的特点,只有输出图像是正方形时,检测到的角才是文本真正旋转的角度。 103 | # 但是我们的输入图像不一定是正方形的,所以要根据图像的长宽比改变这个角度。 104 | print("angle(角度2):",angle) 105 | 106 | # 7、校正图片 107 | # 先用getRotationMatrix2D()获得一个仿射变换矩阵,再把这个矩阵输入warpAffine(), 108 | # 做一个单纯的仿射变换,得到校正的结果: 109 | center = (w//2, h//2) 110 | M = cv.getRotationMatrix2D(center, angle, 1.0) 111 | rotated = cv.warpAffine(img, M, (w, h), flags=cv.INTER_CUBIC, borderMode=cv.BORDER_REPLICATE) 112 | cv.imshow('line image', lineimg) 113 | cv.imshow('rotated', rotated) 114 | 115 | if __name__ == '__main__': 116 | 117 | fourier_demo() 118 | cv.waitKey(0) 119 | cv.destroyAllWindows() 120 | -------------------------------------------------------------------------------- /table-cell-to-coco.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*-# 2 | #------------------------------------------------------------------------------- 3 | # Name: table-cell-to-coco.py 4 | # Author: wdf 5 | # Date: 2019/7/20 6 | # IDE: PyCharm 7 | # Parameters: 8 | # @param: 9 | # @param: 10 | # Return: 11 | # 12 | # Description: 把左上角、右下角坐标转换为coco格式 13 | # Usage: 14 | #------------------------------------------------------------------------------- 15 | 16 | import datetime 17 | import json 18 | from pathlib import Path 19 | import re 20 | from PIL import Image 21 | import numpy as np 22 | import progressbar #用于Python的文本进度条库。文本进度条通常用于显示长时间运行的操作的进度,提供正在进行处理的可视化提示。 23 | from multiprocessing import pool 24 | from pycococreatortools import pycococreatortools 25 
| 26 | ROOT_DIR = Path('/media/tristan/Files/dataset/TableBank/table-cell') 27 | IMAGE_DIR = ROOT_DIR / Path('images') 28 | ANNOTATION_DIR = ROOT_DIR / Path('labels') 29 | 30 | INFO = { 31 | "description": "TABLE-CELL Dataset", 32 | "url": "https://github.com/weidafeng", 33 | "version": "0.1.0", 34 | "year": 2019, 35 | "contributor": "DafengWei", 36 | "date_created": datetime.datetime.utcnow().isoformat(' ') # 显示此刻时间,格式:'2019-04-30 02:17:49.040415' 37 | } 38 | 39 | LICENSES = [ 40 | { 41 | "id": 1, 42 | "name": "Attribution-NonCommercial-ShareAlike License", 43 | "url": "http://creativecommons.org/licenses/by-nc-sa/2.0/" 44 | } 45 | ] 46 | 47 | CATEGORIES = [ 48 | { 49 | 'id': 1, 50 | 'name': 'cell', 51 | 'supercategory': 'shape', 52 | }, 53 | { 54 | 'id': 2, 55 | 'name': 'background', 56 | 'supercategory': 'shape', 57 | } 58 | # { 59 | # 'id': 3, 60 | # 'name': 'ignore', 61 | # 'supercategory': 'shape', 62 | # } 63 | ] 64 | 65 | 66 | # 获取 bounding-box, segmentation 信息 67 | def get_info(content): 68 | # 输入content是list格式, 69 | # 获得bounding-box的坐标和内容, 以及segmentation信息:有序的四个点的坐标(bounding-box坐标) 70 | left, top = float(content[1][0]), float(content[1][1]) # 左上角 71 | right, down = float(content[0][0]), float(content[0][1]) # 右下角 72 | 73 | height = down - top 74 | width = right-left 75 | # word = content['word'] # 不考虑 76 | segmentation = [left,top, left + width, top, left + width, top + height, left, top + height] # 浮点形式 77 | return [left, top, width, height], [segmentation] # bounding-box信息, coco格式: x,y,w,h);segmentation为[[1,2,3,4,5,6,7,8]]格式 78 | 79 | def main(): 80 | # coco lable文件(如training2017.json)需要存储的信息 81 | coco_output = { 82 | "info": INFO, 83 | "licenses": LICENSES, 84 | "categories": CATEGORIES, 85 | "images": [], 86 | "annotations": [] 87 | } 88 | 89 | # 初始化id(以后依次加一) 90 | image_id = 1 91 | annotation_id = 1 92 | 93 | # 加载图片信息 94 | im_files = [f for f in IMAGE_DIR.iterdir()] 95 | im_files.sort(key=lambda f: f.stem,reverse=True) # 以文件名排序,防止顺序错乱、数据和标签不对应 96 | # 
print("im-length:",len(im_files),"\n im_files:",im_files) 97 | 98 | # Load the annotation files 99 | an_files = [f for f in ANNOTATION_DIR.iterdir()] 100 | an_files.sort(key=lambda f: f.stem,reverse=True) # sort by file name so images and labels stay paired 101 | # print("an-length:",len(an_files),"\n an_files:",an_files) 102 | 103 | assert len(an_files)==len(im_files), "Number of images does not match number of label files; run diff_two_folder.py to delete the unmatched files" # every image must have a matching label file 104 | 105 | for im_file, an_file in zip(im_files, an_files): 106 | # Write the image info (id, file name, size) in COCO format; ids start at 1 107 | image = Image.open(im_file) 108 | im_info = pycococreatortools.create_image_info( image_id, im_file.name, image.size) # image info 109 | coco_output['images'].append(im_info) # store the image info (id, file name, size) 110 | myPool = pool.Pool(processes=16) # process annotations in parallel 111 | 112 | # Start processing the label data 113 | annotation_info_list = [] # all annotations for the current image 114 | 115 | with open(an_file, 'r') as f: 116 | datas = json.load(f) 117 | for i in range(len(datas)): 118 | data = datas[i] 119 | # print(data) 120 | bounding_box = get_info(data)[0] 121 | segmentation = get_info(data)[1] # the four ordered corner points (bounding-box corners) 122 | # print(bounding_box) 123 | # print(segmentation) 124 | 125 | class_id = 1 # numeric label 126 | 127 | # Log output 128 | print(bounding_box, segmentation) 129 | area = bounding_box[-1] * bounding_box[-2] # area of the current bounding box, width x height 130 | # an_infos = pycococreatortools.mask_create_annotation_info(annotation_id=annotation_id, image_id=image_id, category_id=class_id, area=area, image_size=image.size, bounding_box=bounding_box,segmentation = segmentation) 131 | # annotation_info_list.append(an_infos) 132 | myPool.apply_async(func=pycococreatortools.mask_create_annotation_info, 133 | args= (annotation_id, image_id, area, class_id, image.size, bounding_box, segmentation), # positional order follows the function signature: area comes before category_id 134 | callback=annotation_info_list.append) # the keyword is "callback", not "callbacks" 135 | annotation_id += 1 136 | 137 | myPool.close() 138 | myPool.join() 139 | # All bounding boxes for this image are now collected; store them once per image 140 | for annotation_info in annotation_info_list: 141 | if annotation_info is not
None: 142 | coco_output['annotations'].append(annotation_info) 143 | image_id += 1 144 | 145 | # 保存成json格式 146 | print("保存annotations文件") 147 | output_json = Path(f'table_coco.json') 148 | with output_json.open('w', encoding='utf-8') as f: 149 | json.dump(coco_output, f) 150 | print("Annotations JSON file saved in:", str(output_json)) 151 | 152 | if __name__ == "__main__": 153 | main() 154 | -------------------------------------------------------------------------------- /table-rotation.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*-# 2 | #------------------------------------------------------------------------------- 3 | # Name: table-rotation.py 4 | # Author: wdf 5 | # Date: 2019/7/7 6 | # IDE: PyCharm 7 | # Description: 8 | # 1. HoughLines ——> get the rotation angle 9 | # 2. warpAffine ——> affine(rotation) 10 | # 输入一张倾斜的图像,自动仿射变换、旋转调整整个图像 11 | # 原文:https: // blog.csdn.net / qq_37674858 / article / details / 80708393 12 | # Usage: 13 | # 1. 
input: raw image 14 | #------------------------------------------------------------------------------- 15 | 16 | 17 | import math 18 | import cv2 19 | import numpy as np 20 | 21 | # Use Hough lines to find the angle of the longest line (the rotation angle) 22 | # By default only the longest line is shown 23 | def get_rotation_angle(image, show_longest_line=True, show_all_lines=False): 24 | image = image.copy() # work on a copy because line() draws in place 25 | gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) 26 | edges = cv2.Canny(gray, 100, 150, apertureSize=3) # Canny first, to reduce the work done by the Hough transform 27 | lines = cv2.HoughLinesP(edges, 1, np.pi/180, 100, minLineLength=50, maxLineGap=60) # these parameters matter a lot 28 | # minLineLength: minimum segment length; shorter segments are ignored 29 | # maxLineGap: maximum gap between two segments; segments closer than this are treated as one line 30 | # cv2.HoughLinesP() is a probabilistic line detector. The standard Hough transform is expensive, 31 | # because every edge point votes, and even after Canny the number of points can still be large. 32 | # The probabilistic variant evaluates only a random subset of the points, which amounts to downsampling. 33 | lengths = [] # coordinates and length of every line 34 | for line in lines: 35 | x1, y1, x2, y2 = line[0] 36 | length = ((x1-x2)**2 + (y1-y2)**2)**0.5 # Pythagorean theorem: segment length 37 | lengths.append([x1, y1, x2, y2, length]) 38 | # print(line, length) 39 | if show_all_lines: 40 | cv2.line(image, (x1, y1), (x2, y2), (0, 0, 0), 2) # draw every line (black) 41 | # Draw the longest line 42 | lengths.sort(key=lambda x: x[-1]) 43 | longest_line = lengths[-1] 44 | print("longest_line: ",longest_line) 45 | x1, y1, x2, y2, length= longest_line 46 | if show_longest_line: 47 | cv2.line(image, (x1, y1), (x2, y2), (0, 0, 255), 2) # draw the line (red) 48 | cv2.imshow("longest", image) 49 | # Compute the rotation angle of this line 50 | angle = math.atan((y2-y1)/(x2-x1)) if x2 != x1 else math.pi/2 # vertical line: avoid dividing by zero 51 | print("angle-radian:", angle) # in radians 52 | angle = angle*(180 /math.pi) 53 | print("angle-degree:",angle) # in degrees 54 | return angle 55 | 56 | 57 | # Given an image and a counter-clockwise angle, rotate the whole image (without cropping the rotated result) 58 | def rotate_bound(image, angle): 59 | # Rotation center; defaults to the image center 60 | (h, w) = image.shape[:2] 61 | (cX, cY) = (w // 2, h // 2) 62 | 63 | # Build the rotation matrix for the given angle 64 | # Math background: 65 | # https://blog.csdn.net/liyuan02/article/details/6750828 66 | M = cv2.getRotationMatrix2D((cX, cY), angle, 1.0) # rotation matrix; a scale of 1.0 keeps the original size
67 | # print("RotationMatrix2D:\n", M) 68 | cos = np.abs(M[0, 0]) 69 | sin = np.abs(M[0, 1]) 70 | 71 | # 计算旋转后的图像大小(避免图像裁剪) 72 | nW = int((h * sin) + (w * cos)) 73 | nH = int((h * cos) + (w * sin)) 74 | 75 | # 调整旋转矩阵(避免图像裁剪) 76 | M[0, 2] += (nW / 2) - cX 77 | M[1, 2] += (nH / 2) - cY 78 | print("RotationMatrix2D:\n", M) 79 | 80 | # 执行仿射变换、得到图像 81 | return cv2.warpAffine(image, M, (nW, nH),borderMode=cv2.BORDER_REPLICATE) 82 | # borderMode=cv2.BORDER_REPLICATE 使用边缘值填充 83 | # 或使用borderValue=(255,255,255)) # 使用常数填充边界(0,0,0)表示黑色 84 | 85 | def main(): 86 | img_path = "./img1/IMG_20190723_162452.jpg" 87 | # img_path = "./img/table-1.png" 88 | 89 | img = cv2.imread(img_path) 90 | angle = get_rotation_angle(img, show_longest_line=False, show_all_lines=False) 91 | 92 | imag = rotate_bound(img, angle) # 关键 93 | # cv2.imshow("raw",img) 94 | # cv2.imshow('rotated', imag) 95 | cv2.imwrite('rotated.png', imag) 96 | 97 | cv2.waitKey() 98 | cv2.destroyAllWindows() 99 | 100 | if __name__ == '__main__': 101 | main() -------------------------------------------------------------------------------- /utils/__init__.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*-# 2 | #------------------------------------------------------------------------------- 3 | # Name: __init__.py.py 4 | # Author: wdf 5 | # Date: 2019/7/18 6 | # IDE: PyCharm 7 | # Parameters: 8 | # @param: 9 | # @param: 10 | # Return: 11 | # 12 | # Description: 13 | # Usage: 14 | #------------------------------------------------------------------------------- 15 | 16 | -------------------------------------------------------------------------------- /utils/convert_dots_to_bbox.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*-# 2 | #------------------------------------------------------------------------------- 3 | # Name: convert_dots_to_bbox.py 4 | # Author: wdf 5 | # Date: 2019/7/18 6 | # IDE: PyCharm 7 
# Parameters:
# @param:
# @param:
# Return:
#
# Description:
# Usage:
#-------------------------------------------------------------------------------


def get_total_row_cols(x):
    # Input: list of intersection points; counts how many points lie on each row.
    # Output: dict mapping a row offset (the y coordinate) to the number of points on that row.
    # Rows that contain a single point are never entered into the dict.
    # Input format (the first two values of each entry are the x and y coordinates):
    # [58, 174, 1, 1],
    # [557, 145, 1, 1],
    # [513, 145, 1, 1],
    # [471, 145, 1, 1],
    # [58, 145, 1, 1]]

    row = {}
    num = 1
    for i in range(len(x)-1):
        if x[i][1] == x[i+1][1]:
            num += 1
            row[x[i][1]] = num
        else:
            num = 1
    return row


def get_dots(x, row):
    # Collect the point coordinates, grouped by row.
    # Input:
    #   point list x,
    #   per-row point counts (the dict returned by get_total_row_cols)
    results = []
    # print("coordinate, number of points on this row")
    for key in row:
        # print(key, row[key])
        yy = key
        # All x coordinates that share this row's y coordinate
        xx = [val[0] for val in x if val[1] == yy]
        result = [[px, yy] for px in xx]
        # print(result)
        results.append(result)
    return results


def get_bounding_box(results):
    # Get the two diagonal corners of every bounding box (bottom-right, top-left).
    # Input: results = get_dots(x, row)
    # Output: list of [bottom_right, top_left] point pairs

    bounding_box = []
    for i in range(len(results) - 1):
        col_down = results[i]
        col_up = results[i + 1]
        # print(col_down)
        # print(col_up)
        len_down, len_up = len(col_down), len(col_up)
        # print(len_down,len_up)

        if len_down == len_up:  # both rows have the same number of points: take the diagonal points directly
            for j in range(len(col_down) - 1):
                # print(col_down[j], col_up[j + 1])
                bounding_box.append([col_down[j], col_up[j + 1]])
        elif len_down > len_up:  # the lower row has more points
            for j in range(len(col_up) - 1):
                k = j  # k indexes the row that has more points
                while k < len_down - 1:  # walk over all points of the lower (denser) row
                    if col_down[k + 1][0] == col_up[j + 1][0]:
                        # print(col_down[k], col_up[j + 1])
                        bounding_box.append([col_down[k], col_up[j + 1]])
                    k += 1
        else:  # the upper row has more points
            for j in range(len(col_down) - 1):
                k = j  # k indexes the row that has more points
                while k < len_up - 1:  # walk over all points of the upper (denser) row
                    if col_up[k + 1][0] == col_down[j + 1][0]:
                        # print(col_down[j], col_up[k + 1])
                        bounding_box.append([col_down[j], col_up[k + 1]])
                    k += 1
    return bounding_box


def main():
    x = [[549, 764, 1, 1], [317, 764, 1, 1], [85, 764, 1, 1], [549, 738, 1, 1], [317, 738, 1, 1], [85, 738, 1, 1],
         [549, 712, 1, 1], [317, 712, 1, 1], [85, 712, 1, 1], [549, 687, 1, 1], [317, 687, 1, 1], [85, 687, 1, 1],
         [549, 636, 1, 1], [317, 636, 1, 1], [85, 636, 1, 1], [549, 539, 1, 1], [317, 539, 1, 1], [85, 539, 1, 1],
         [549, 488, 1, 1], [317, 488, 1, 1], [85, 488, 1, 1], [549, 462, 1, 1], [317, 462, 1, 1], [85, 462, 1, 1],
         [549, 343, 1, 1], [85, 343, 1, 1], [549, 317, 1, 1], [317, 317, 1, 1], [85, 317, 1, 1], [549, 279, 1, 1],
         [317, 279, 1, 1], [85, 279, 1, 1], [549, 253, 1, 1], [317, 253, 1, 1], [85, 253, 1, 1], [85, 82, 1, 1],
         [85, 69, 1, 1]]

    row = get_total_row_cols(x)
    results = get_dots(x, row)
    # print(results)

    bounding_boxs = get_bounding_box(results)
    print(bounding_boxs)


if __name__ == '__main__':
    main()
--------------------------------------------------------------------------------
/utils/draw_bbox.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-#
#-------------------------------------------------------------------------------
# Name: draw_bbox.py
# Author: wdf
# Date: 2019/7/18
# IDE: PyCharm
# Parameters:
# @param:
# @param:
# Return:
#
# Description:
# Usage:
#-------------------------------------------------------------------------------
import cv2
import numpy as np


def draw_bbox(img, bboxs):
    """
    Visualize the table cells.
    Input: image, list of box coordinates
    :param img: image to draw on
    :param bboxs: list of [bottom_right, top_left] point pairs, each point as [x, y]
    :return: None (the result is shown in the "contour" window)
    """
    # The two point arguments of cv2.rectangle are the top-left and
    # bottom-right corners of the rectangle; the x axis runs horizontally
    # and the y axis runs vertically.
    #
    #   x1,y1 ------      -> x
    #   |          |
    #   |          |
    #   |          |
    #   --------x2,y2
    #
    #   |
    #   ∨
    #   y
    #
    # A single pass over the boxes is enough; the window is shown once at the end.
    for i in range(len(bboxs)):
        pt1 = (bboxs[i][1][0], bboxs[i][1][1])  # top-left corner
        pt2 = (bboxs[i][0][0], bboxs[i][0][1])  # bottom-right corner

        img = cv2.rectangle(img,
                            pt1=pt1,
                            pt2=pt2,
                            color=(255, 0, 0),
                            thickness=2,
                            lineType=1,
                            shift=0)
    cv2.imshow("contour", img)


def main():
    image = "../img/table.png"
    image1 = "../img/3.jpg"
    bboxs = [[[556, 679], [510, 619]], [[510, 679], [469, 619]], [[469, 679], [426, 619]], [[426, 679], [384, 619]], [[384, 679], [346, 619]], [[346, 679], [55, 619]],
             [[556, 619], [510, 523]], [[510, 619], [469, 523]], [[469, 619], [426, 523]], [[426, 619], [384, 523]], [[384, 619], [346, 523]], [[346, 619], [55, 523]],
             [[556, 523], [510, 450]], [[510, 523], [469, 450]], [[469, 523], [426, 450]], [[426, 523], [384, 450]], [[384, 523], [346, 450]], [[346, 523], [55, 450]],
             [[556, 450], [510, 410]], [[510, 450], [469, 410]], [[469, 450], [426, 410]], [[426, 450], [384, 410]], [[384, 450], [346, 410]], [[346, 450], [55, 410]],
             [[556, 410], [510, 370]], [[510, 410], [469, 370]], [[469, 410], [426, 370]], [[426, 410], [384, 370]], [[384, 410], [346, 370]], [[346, 410], [55, 370]],
             [[555, 307], [514, 256]], [[514, 307], [472, 256]], [[472, 307], [56, 256]],
             [[555, 256], [514, 185]], [[514, 256], [472, 185]], [[472, 256], [56, 185]],
             [[555, 185], [514, 125]], [[514, 185], [472, 125]], [[472, 185], [56, 125]],
             [[555, 125], [514, 100]], [[514, 125], [472, 100]], [[472, 125], [56, 100]]]

    img = cv2.imread(image1, 1)
    cv2.imshow("img", img)
    draw_bbox(img, bboxs)

    cv2.waitKey(0)
    cv2.destroyAllWindows()


if __name__ == '__main__':
    main()
--------------------------------------------------------------------------------
/utils/iter_all_images.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-#
#-------------------------------------------------------------------------------
# Name: iter_all_images.py
# Author: wdf
# Date: 2019/7/18
# IDE: PyCharm
# Parameters:
# @param:
# @param:
# Return:
#
# Description:
# Usage:
#-------------------------------------------------------------------------------
from pathlib import Path
import progressbar


def iter_all_files(folder_dir):
    im_files = [f for f in folder_dir.iterdir()]
    # im_files.sort(key=lambda f: int(f.stem[1:]),reverse=True)  # sort so the order stays stable and images stay matched with their labels
    # print("length:",len(im_files),"\n im_files:",im_files)

    # Progress bar
    w = progressbar.widgets
    widgets = ['Progress: ', w.Percentage(), ' ', w.Bar('#'), ' ', w.Timer(),
               ' ', w.ETA(), ' ', w.FileTransferSpeed()]
    progress = progressbar.ProgressBar(widgets=widgets)
    for im_file in progress(im_files):
        print(im_file)


def main():
    ROOT_DIR = Path('..')
    IMAGE_DIR = ROOT_DIR / Path('img')
    iter_all_files(IMAGE_DIR)


if __name__ == '__main__':
    main()
--------------------------------------------------------------------------------
/utils/match_files.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-#
#-------------------------------------------------------------------------------
# Name: match_files.py
# Author: wdf
# Date: 2019/7/20
# IDE: PyCharm
# Parameters:
# @param:
# @param:
# Return:
#
# Description: For each image name in the image folder, find the label with the same name in the label folder and move the matched labels to the given folder.
# Usage:
#-------------------------------------------------------------------------------

import os
import shutil


# Helper that collects file names (without their suffix)
def getFileNames(rootDir, suffix=".jpg"):
    fileNames = []
    #
    # Use os.walk() to get the directory names, sub-directory names and file names under the root directory
    for dirName, subDirList, fileList in os.walk(rootDir):
        for fname in fileList:
            # Use os.path.splitext() to check the file's suffix
            if os.path.splitext(fname)[1] == suffix:
                # Keep only the stem; splitext removes just the final suffix
                fileNames.append(os.path.splitext(fname)[0])
                # fileNames.append(dirName + '/' + fname)
    return fileNames


def match_labels(images,labels,src_path='E:/dataset/TableBank/table-cell/labels',dst_path='E:/dataset/TableBank/table-cell/labels'):
    '''
    For every image name, find the corresponding label and move it to the given folder
    :param images: list of image names
    :param labels: list of label names
    :param src_path: folder holding the label source files (used to build each source path)
    :param dst_path: destination folder for the labels
    :return: None
    '''
    for image in images:  # iterate over every image name
        if image not in labels:  # no label with this name: skip it rather than let labels.index() raise ValueError
            continue
        index = labels.index(image)  # index of the matching label
        label_name = str(src_path) + "/" + str(labels[index]) + ".json"  # path of the matching label file
        # shutil.copy(label_name, dst_path)  # copy the label to the destination folder
        shutil.move(label_name, dst_path)  # move the label to the destination folder


def main():
    image_root = 'E:/dataset/TableBank/table-cell/test_images'
    label_root = 'E:/dataset/TableBank/table-cell/labels'
    images = getFileNames(image_root,suffix='.jpg')
    labels = getFileNames(label_root,suffix='.json')
    print(images)
    print(labels)
    match_labels(images, labels,src_path=label_root, dst_path='E:/dataset/TableBank/table-cell/test_labels')


if __name__ == '__main__':
    main()
--------------------------------------------------------------------------------
/utils/random_save_some_files_to_another_folder.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-#
#-------------------------------------------------------------------------------
# Name: random_save_some_files_to_another_folder.py
# Author: wdf
# Date: 2019/7/20
# IDE: PyCharm
# Parameters:
# @param:
# @param:
# Return:
#
# Description:
# Usage:
#-------------------------------------------------------------------------------

import random
import os
import shutil


def random_copyfile(srcPath, dstPath, numfiles, move_or_cpoy='move'):
    # Randomly pick `numfiles` entries from srcPath and move or copy them to dstPath.
    # dstPath must already exist; random.sample raises ValueError when
    # numfiles is larger than the number of entries in srcPath.
    name_list=list(os.path.join(srcPath,name) for name in os.listdir(srcPath))
    random_name_list=list(random.sample(name_list,numfiles))
    # if not os.path.exists(dstPath):
    #     os.mkdir(dstPath)

    if move_or_cpoy=='move':
        for oldname in random_name_list:
            shutil.move(oldname,oldname.replace(srcPath, dstPath))
    elif move_or_cpoy=='copy':
        for oldname in random_name_list:
            shutil.copyfile(oldname,oldname.replace(srcPath, dstPath))
    else:
        return -1  # unknown mode


def main():
    srcPath = 'E:/dataset/TableBank/table-cell/images'
    dstPath = 'E:/dataset/TableBank/table-cell/test_images'
    random_copyfile(srcPath, dstPath, 1000)


if __name__ == '__main__':
    main()
--------------------------------------------------------------------------------
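The split script above assumes that the destination folder already exists and that the source folder holds at least as many files as requested. A minimal sketch of a more defensive variant follows. It is not part of the repository: `sample_files` and its `seed` parameter are hypothetical names introduced only for illustration, and it swaps the string-based path rewriting for `pathlib`.

```python
import random
import shutil
from pathlib import Path


def sample_files(src_dir, dst_dir, num_files, mode="move", seed=None):
    """Move or copy `num_files` randomly chosen files from src_dir to dst_dir.

    Hypothetical helper for illustration; not part of the repository.
    Returns the list of destination paths.
    """
    if mode not in ("move", "copy"):
        raise ValueError("mode must be 'move' or 'copy'")
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)

    # Sort first so that a fixed seed selects the same files on every run.
    candidates = sorted(p for p in src_dir.iterdir() if p.is_file())
    if num_files > len(candidates):
        raise ValueError(f"requested {num_files} files, found {len(candidates)}")

    # Create the destination folder if it is missing.
    dst_dir.mkdir(parents=True, exist_ok=True)

    rng = random.Random(seed)
    placed = []
    for src in rng.sample(candidates, num_files):
        dst = dst_dir / src.name
        if mode == "move":
            shutil.move(str(src), str(dst))
        else:
            shutil.copy2(src, dst)
        placed.append(dst)
    return placed
```

Under these assumptions, a call such as `sample_files(srcPath, dstPath, 1000, seed=0)` would play the role of `random_copyfile(srcPath, dstPath, 1000)`, with the seed making the test split reproducible.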