├── LICENSE
├── README.md
├── diagram.diag
├── diagram.svg
├── examples
│   ├── label_map.pbtxt
│   ├── raccoon-197.xml
│   └── raccoon_labels.csv
├── generate_csv.py
├── generate_pbtxt.py
├── generate_tfrecord.py
├── generate_train_eval.py
└── generate_yolo_txt.py
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2017 Dat Tran
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Object detection utility scripts
2 |
3 | This repo contains a few Python scripts that help create the prerequisite files needed to train an object detection model, either with the [TensorFlow Object Detection API](https://github.com/tensorflow/models/tree/master/research/object_detection) or with [darknet](https://github.com/alexeyab/darknet).
4 |
5 | Take a look inside the `examples` directory to get an idea of the types of files these scripts expect as input and generate as output.
6 |
7 | ## Usage
8 |
9 | The repo provides a collection of scripts that are meant to be called from the terminal. To find out which arguments a particular script expects, run it with the `-h` flag to see a help message. For example: `python generate_tfrecord.py -h`. The output files generated by some scripts are used as input files for other scripts; take a look at [the flowchart](#script-usage-flowchart) at the end of this page to see in which order to use them.
10 |
11 | * **generate_csv.py** reads the contents of image annotations stored in [XML files](examples/raccoon-197.xml), created with [labelImg](https://github.com/tzutalin/labelImg), and generates a single [CSV file](examples/raccoon_labels.csv).
12 | * **generate_train_eval.py** reads the CSV file and separates it into train and evaluation datasets, which are also CSV files. There are options to stratify by class and to select which fraction of the input CSV will be directed to the train dataset (the rest going to evaluation).
13 | * **generate_pbtxt.py** reads the previously generated CSV file (or any CSV file that has a column named _"class"_) or a text file containing a single class name per line and no header, and generates a [label map](examples/label_map.pbtxt), one of the files needed to train a detection model using [TensorFlow's Object Detection API](https://github.com/tensorflow/models/tree/master/research/object_detection).
14 | * **generate_tfrecord.py** reads the previously generated CSV and label map files, as well as all the images from a given directory, and generates a TFRecord file, which can then be used to train an object detection model with TensorFlow. The resulting TFRecord file is about the same size as all the original images it contains, combined.
15 | * **generate_yolo_txt.py** reads the CSV file and generates one .txt file for each image mentioned in it, with the same name as the image file. These .txt files contain the object annotations for that image, in the format that [darknet](https://pjreddie.com/darknet/yolo/) uses to train its models.
16 |
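As a rough, self-contained sketch of the coordinate and label-map bookkeeping the scripts above perform (illustrative code, not part of this repo; the helper names are made up), here is how one annotation row maps to a darknet-style line and a label map entry:

```python
def yolo_line(class_id, xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert an absolute pixel box to darknet's normalized
    'class x_center y_center width height' format."""
    xc = (xmin + xmax) / 2 / img_w
    yc = (ymin + ymax) / 2 / img_h
    w = (xmax - xmin) / img_w
    h = (ymax - ymin) / img_h
    return '{} {:.5f} {:.5f} {:.5f} {:.5f}'.format(class_id, xc, yc, w, h)


def label_map_entry(class_id, name):
    """Render one item of a TensorFlow Object Detection API label map."""
    return 'item {{\n  id: {}\n  display_name: "{}"\n}}\n'.format(class_id, name)


# Row taken from examples/raccoon_labels.csv:
# raccoon-197.jpg,1280,720,raccoon,114,35,987,653
print(yolo_line(0, 114, 35, 987, 653, 1280, 720))  # 0 0.43008 0.47778 0.68203 0.85833
print(label_map_entry(1, 'raccoon'))
```

Note that the darknet line uses a zero-based class index (as produced by `names.index` in **generate_yolo_txt.py**), while label map ids start at 1, matching **generate_pbtxt.py**.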
17 | ### Script usage flowchart
18 |
19 | 
20 |
21 | ## Copyright
22 |
23 | Licenses are so complicated. This work began as a fork of [Dat Tran's](http://www.dat-tran.com/) raccoon dataset repository, but then it became its own thing. Anyway, the license is unchanged and is in the repo.
24 |
--------------------------------------------------------------------------------
/diagram.diag:
--------------------------------------------------------------------------------
1 | blockdiag {
2 | orientation=portrait
3 | span_width=32
4 | span_height=32
5 |
6 | node_width = 95 // default value is 128
7 | node_height = 40 // default value is 40
8 |
9 | class dataset[color="#ffcc99",stacked,shape=box]
10 | class third_party[color="#ccf2ff",shape=box]
11 | class python[color="#ffff80",shape=roundedbox]
12 | class message[color="#ffffff",shape=roundedbox]
13 |
14 | group external {
15 | label = "External"
16 | color = "#ff7777"
17 |
18 | images[class=dataset,label="Image\ndataset"]
19 | xml_files[class=dataset,label="XML annotation\nfiles"]
20 | labelImg[class=third_party]
21 | }
22 |
23 | group my_repo {
24 | label = "This Repo"
25 | color = "#77ff77"
26 |
27 | generate_csv.py[class=python,height=30,width=110]
28 | generate_pbtxt.py[class=python,height=30,width=110]
29 | generate_tfrecord.py[class=python,height=30,width=110]
30 | generate_yolo_txt.py[class=python,height=30,width=110]
31 | generate_train_eval.py[class=python,height=30,width=110]
32 |
33 | csv[class=message,width=80,label="CSV"]
34 | pbtxt[class=message,width=80,label="label map"]
35 | tfrecord[class=message,width=80,label="TFRecord"]
36 | yolo_txt[class=message,width=80,label="YOLO\ntext files",stacked]
37 | }
38 |
39 | images -> labelImg -> xml_files -> generate_csv.py -> csv
40 | csv -> generate_pbtxt.py -> pbtxt -> generate_tfrecord.py -> tfrecord
41 | csv -> generate_yolo_txt.py -> yolo_txt
42 | csv -> generate_train_eval.py -> csv
43 | csv -> generate_tfrecord.py
44 | }
45 |
--------------------------------------------------------------------------------
/diagram.svg:
--------------------------------------------------------------------------------
1 |
2 |
3 |
148 |
--------------------------------------------------------------------------------
/examples/label_map.pbtxt:
--------------------------------------------------------------------------------
1 | item {
2 | id: 1
3 | name: 'raccoon'
4 | }
--------------------------------------------------------------------------------
/examples/raccoon-197.xml:
--------------------------------------------------------------------------------
1 | <annotation>
2 | 	<folder>images</folder>
3 | 	<filename>raccoon-197.jpg</filename>
4 | 	<path>/home/user/raccoon/images/raccoon-197.jpg</path>
5 | 	<source>
6 | 		<database>Unknown</database>
7 | 	</source>
8 | 	<size>
9 | 		<width>1280</width>
10 | 		<height>720</height>
11 | 		<depth>3</depth>
12 | 	</size>
13 | 	<segmented>0</segmented>
14 | 	<object>
15 | 		<name>raccoon</name>
16 | 		<pose>Unspecified</pose>
17 | 		<truncated>0</truncated>
18 | 		<difficult>0</difficult>
19 | 		<bndbox>
20 | 			<xmin>114</xmin>
21 | 			<ymin>35</ymin>
22 | 			<xmax>987</xmax>
23 | 			<ymax>653</ymax>
24 | 		</bndbox>
25 | 	</object>
26 | </annotation>
27 | 
--------------------------------------------------------------------------------
/examples/raccoon_labels.csv:
--------------------------------------------------------------------------------
1 | filename,width,height,class,xmin,ymin,xmax,ymax
2 | raccoon-1.jpg,650,417,raccoon,81,88,522,408
3 | raccoon-10.jpg,450,495,raccoon,130,2,446,488
4 | raccoon-100.jpg,960,576,raccoon,548,10,954,520
5 | raccoon-101.jpg,640,426,raccoon,86,53,400,356
6 | raccoon-102.jpg,259,194,raccoon,1,1,118,152
7 | raccoon-103.jpg,480,640,raccoon,92,54,460,545
8 | raccoon-104.jpg,600,304,raccoon,189,41,340,249
9 | raccoon-105.jpg,720,960,raccoon,250,49,714,869
10 | raccoon-106.jpg,269,187,raccoon,31,21,226,146
11 | raccoon-107.jpg,500,622,raccoon,165,51,496,590
12 | raccoon-108.jpg,604,481,raccoon,99,53,402,464
13 | raccoon-109.jpg,192,259,raccoon,9,1,177,252
14 | raccoon-11.jpg,660,432,raccoon,3,1,461,431
15 | raccoon-110.jpg,184,274,raccoon,23,2,166,262
16 | raccoon-111.jpg,768,960,raccoon,41,5,683,917
17 | raccoon-112.jpg,800,574,raccoon,131,174,775,563
18 | raccoon-113.jpg,640,480,raccoon,1,1,384,436
19 | raccoon-114.jpg,625,418,raccoon,242,35,523,264
20 | raccoon-115.jpg,426,640,raccoon,51,130,351,556
21 | raccoon-116.jpg,660,432,raccoon,3,1,436,430
22 | raccoon-117.jpg,640,448,raccoon,100,124,266,324
23 | raccoon-117.jpg,640,448,raccoon,342,101,570,297
24 | raccoon-118.jpg,448,297,raccoon,109,31,307,297
25 | raccoon-119.jpg,400,533,raccoon,16,62,362,353
26 | raccoon-119.jpg,400,533,raccoon,211,359,277,402
27 | raccoon-119.jpg,400,533,raccoon,198,392,280,473
28 | raccoon-12.jpg,259,194,raccoon,28,21,126,181
29 | raccoon-12.jpg,259,194,raccoon,85,33,235,193
30 | raccoon-120.jpg,660,371,raccoon,129,12,510,331
31 | raccoon-121.jpg,600,399,raccoon,55,34,416,377
32 | raccoon-122.jpg,178,283,raccoon,7,7,174,198
33 | raccoon-123.jpg,640,406,raccoon,280,42,550,392
34 | raccoon-124.jpg,259,194,raccoon,17,39,239,147
35 | raccoon-125.jpg,259,195,raccoon,13,6,252,190
36 | raccoon-126.jpg,255,197,raccoon,5,5,246,192
37 | raccoon-127.jpg,253,199,raccoon,125,59,212,165
38 | raccoon-128.jpg,259,194,raccoon,76,87,190,148
39 | raccoon-129.jpg,639,315,raccoon,142,24,442,276
40 | raccoon-13.jpg,660,495,raccoon,55,28,393,313
41 | raccoon-130.jpg,640,426,raccoon,223,62,497,307
42 | raccoon-130.jpg,640,426,raccoon,453,41,640,423
43 | raccoon-131.jpg,259,194,raccoon,1,1,199,184
44 | raccoon-132.jpg,259,194,raccoon,6,2,240,131
45 | raccoon-133.jpg,490,640,raccoon,8,6,476,631
46 | raccoon-134.jpg,225,225,raccoon,125,87,194,169
47 | raccoon-135.jpg,640,426,raccoon,99,8,605,404
48 | raccoon-136.jpg,256,197,raccoon,51,24,198,192
49 | raccoon-137.jpg,320,240,raccoon,71,8,304,233
50 | raccoon-138.jpg,259,194,raccoon,56,54,226,150
51 | raccoon-139.jpg,259,194,raccoon,20,6,177,167
52 | raccoon-14.jpg,900,484,raccoon,163,81,546,438
53 | raccoon-140.jpg,204,247,raccoon,6,17,202,231
54 | raccoon-141.jpg,249,202,raccoon,1,1,154,176
55 | raccoon-142.jpg,1024,768,raccoon,171,162,811,740
56 | raccoon-143.jpg,259,194,raccoon,17,29,238,162
57 | raccoon-144.jpg,570,390,raccoon,117,42,387,390
58 | raccoon-145.jpg,600,450,raccoon,3,36,345,450
59 | raccoon-145.jpg,600,450,raccoon,260,41,569,449
60 | raccoon-146.jpg,275,183,raccoon,4,4,271,180
61 | raccoon-147.jpg,426,640,raccoon,13,1,426,486
62 | raccoon-148.jpg,500,375,raccoon,32,177,174,316
63 | raccoon-148.jpg,500,375,raccoon,309,172,428,315
64 | raccoon-149.jpg,500,375,raccoon,132,50,305,246
65 | raccoon-15.jpg,640,360,raccoon,313,61,614,360
66 | raccoon-150.jpg,275,183,raccoon,80,62,187,169
67 | raccoon-151.jpg,225,225,raccoon,42,94,108,224
68 | raccoon-152.jpg,275,183,raccoon,41,60,223,155
69 | raccoon-153.jpg,700,700,raccoon,10,1,612,700
70 | raccoon-154.jpg,650,419,raccoon,148,56,517,346
71 | raccoon-155.jpg,259,194,raccoon,46,91,143,169
72 | raccoon-156.jpg,201,251,raccoon,27,27,139,243
73 | raccoon-157.jpg,220,229,raccoon,1,1,144,209
74 | raccoon-158.jpg,275,183,raccoon,23,23,262,177
75 | raccoon-159.jpg,226,223,raccoon,14,11,223,221
76 | raccoon-16.jpg,424,640,raccoon,51,178,355,632
77 | raccoon-160.jpg,256,197,raccoon,7,42,162,197
78 | raccoon-161.jpg,500,347,raccoon,209,73,385,186
79 | raccoon-162.jpg,259,194,raccoon,45,34,161,184
80 | raccoon-163.jpg,248,203,raccoon,6,7,240,157
81 | raccoon-164.jpg,274,184,raccoon,10,27,178,184
82 | raccoon-165.jpg,199,253,raccoon,27,11,194,228
83 | raccoon-166.jpg,328,154,raccoon,108,31,208,120
84 | raccoon-167.jpg,259,195,raccoon,1,5,175,195
85 | raccoon-168.jpg,628,314,raccoon,98,88,374,303
86 | raccoon-168.jpg,628,314,raccoon,173,1,471,309
87 | raccoon-169.jpg,615,409,raccoon,194,1,549,409
88 | raccoon-17.jpg,259,194,raccoon,95,60,167,118
89 | raccoon-170.jpg,259,194,raccoon,53,27,254,173
90 | raccoon-171.jpg,224,225,raccoon,108,21,180,115
91 | raccoon-172.jpg,615,346,raccoon,183,53,399,302
92 | raccoon-173.jpg,550,388,raccoon,202,21,515,387
93 | raccoon-174.jpg,960,639,raccoon,125,43,588,527
94 | raccoon-175.jpg,634,381,raccoon,69,89,354,378
95 | raccoon-176.jpg,800,533,raccoon,308,90,611,426
96 | raccoon-176.jpg,800,533,raccoon,103,1,314,189
97 | raccoon-177.jpg,276,183,raccoon,8,18,157,178
98 | raccoon-177.jpg,276,183,raccoon,146,13,263,146
99 | raccoon-178.jpg,275,183,raccoon,59,12,242,180
100 | raccoon-179.jpg,600,450,raccoon,1,176,270,427
101 | raccoon-18.jpg,240,156,raccoon,32,25,201,130
102 | raccoon-180.jpg,600,400,raccoon,119,21,368,399
103 | raccoon-181.jpg,750,422,raccoon,100,1,420,411
104 | raccoon-182.jpg,500,500,raccoon,17,122,279,499
105 | raccoon-183.jpg,2000,1333,raccoon,358,21,1354,1119
106 | raccoon-184.jpg,640,640,raccoon,81,77,567,617
107 | raccoon-185.jpg,275,183,raccoon,25,1,200,181
108 | raccoon-186.jpg,640,428,raccoon,34,40,536,387
109 | raccoon-187.jpg,362,357,raccoon,161,112,292,276
110 | raccoon-188.jpg,460,379,raccoon,26,71,366,334
111 | raccoon-189.jpg,600,450,raccoon,19,2,508,438
112 | raccoon-19.jpg,259,194,raccoon,87,8,182,89
113 | raccoon-190.jpg,259,194,raccoon,78,54,153,135
114 | raccoon-191.jpg,634,445,raccoon,100,89,478,331
115 | raccoon-192.jpg,510,325,raccoon,127,160,298,289
116 | raccoon-193.jpg,634,852,raccoon,23,215,440,831
117 | raccoon-194.jpg,1080,1080,raccoon,1,63,885,1042
118 | raccoon-195.jpg,225,225,raccoon,25,111,197,225
119 | raccoon-196.jpg,233,216,raccoon,83,87,211,211
120 | raccoon-197.jpg,1280,720,raccoon,114,35,987,653
121 | raccoon-198.jpg,259,194,raccoon,57,21,158,184
122 | raccoon-198.jpg,259,194,raccoon,112,32,199,158
123 | raccoon-199.jpg,640,428,raccoon,28,64,530,402
124 | raccoon-2.jpg,800,573,raccoon,60,51,462,499
125 | raccoon-20.jpg,720,540,raccoon,2,29,720,503
126 | raccoon-200.jpg,261,193,raccoon,107,10,249,166
127 | raccoon-21.jpg,290,174,raccoon,59,2,216,171
128 | raccoon-22.jpg,640,360,raccoon,252,76,466,335
129 | raccoon-23.jpg,259,194,raccoon,108,1,258,194
130 | raccoon-24.jpg,268,188,raccoon,77,48,179,156
131 | raccoon-24.jpg,268,188,raccoon,139,77,202,145
132 | raccoon-25.jpg,634,641,raccoon,31,82,325,641
133 | raccoon-26.jpg,306,374,raccoon,114,5,306,337
134 | raccoon-27.jpg,602,401,raccoon,14,38,592,373
135 | raccoon-28.jpg,602,452,raccoon,93,80,601,452
136 | raccoon-29.jpg,275,183,raccoon,70,6,219,179
137 | raccoon-3.jpg,720,480,raccoon,1,1,720,476
138 | raccoon-30.jpg,266,190,raccoon,78,25,182,177
139 | raccoon-31.jpg,236,214,raccoon,82,21,187,197
140 | raccoon-31.jpg,236,214,raccoon,11,55,80,145
141 | raccoon-32.jpg,625,415,raccoon,88,92,473,328
142 | raccoon-33.jpg,602,843,raccoon,89,12,593,843
143 | raccoon-34.jpg,259,194,raccoon,1,2,227,194
144 | raccoon-35.jpg,275,183,raccoon,1,1,164,183
145 | raccoon-36.jpg,640,428,raccoon,113,27,468,428
146 | raccoon-37.jpg,520,593,raccoon,13,1,500,592
147 | raccoon-38.jpg,259,194,raccoon,7,17,257,180
148 | raccoon-39.jpg,250,172,raccoon,54,12,250,166
149 | raccoon-4.jpg,275,183,raccoon,21,11,200,183
150 | raccoon-40.jpg,480,360,raccoon,164,53,349,275
151 | raccoon-41.jpg,700,500,raccoon,211,78,530,468
152 | raccoon-42.jpg,577,1024,raccoon,121,206,410,767
153 | raccoon-43.jpg,480,360,raccoon,1,65,239,316
154 | raccoon-44.jpg,300,168,raccoon,45,14,247,165
155 | raccoon-45.jpg,620,372,raccoon,140,6,454,370
156 | raccoon-46.jpg,576,318,raccoon,145,2,423,318
157 | raccoon-47.jpg,262,193,raccoon,34,4,233,193
158 | raccoon-48.jpg,261,193,raccoon,43,28,240,176
159 | raccoon-49.jpg,640,395,raccoon,162,36,611,395
160 | raccoon-5.jpg,270,187,raccoon,3,3,260,179
161 | raccoon-50.jpg,275,183,raccoon,36,2,174,172
162 | raccoon-51.jpg,800,599,raccoon,315,105,772,540
163 | raccoon-52.jpg,800,533,raccoon,105,10,502,501
164 | raccoon-53.jpg,259,194,raccoon,71,45,197,171
165 | raccoon-54.jpg,602,339,raccoon,78,5,517,333
166 | raccoon-55.jpg,634,417,raccoon,6,49,250,320
167 | raccoon-55.jpg,634,417,raccoon,274,27,563,410
168 | raccoon-56.jpg,240,210,raccoon,20,6,224,201
169 | raccoon-57.jpg,640,425,raccoon,82,6,638,423
170 | raccoon-58.jpg,224,225,raccoon,2,1,199,221
171 | raccoon-59.jpg,600,600,raccoon,1,2,449,432
172 | raccoon-6.jpg,480,360,raccoon,1,44,307,316
173 | raccoon-60.jpg,273,185,raccoon,58,33,197,127
174 | raccoon-61.jpg,274,184,raccoon,94,63,195,148
175 | raccoon-61.jpg,274,184,raccoon,142,39,213,108
176 | raccoon-62.jpg,640,407,raccoon,73,19,632,407
177 | raccoon-63.jpg,600,400,raccoon,74,107,280,290
178 | raccoon-63.jpg,600,400,raccoon,227,93,403,298
179 | raccoon-64.jpg,259,194,raccoon,1,1,247,194
180 | raccoon-65.jpg,480,360,raccoon,123,27,338,284
181 | raccoon-66.jpg,860,484,raccoon,220,37,697,440
182 | raccoon-67.jpg,272,185,raccoon,18,17,224,168
183 | raccoon-68.jpg,640,423,raccoon,1,24,517,423
184 | raccoon-69.jpg,205,246,raccoon,12,11,188,240
185 | raccoon-7.jpg,410,308,raccoon,92,79,271,264
186 | raccoon-70.jpg,500,375,raccoon,60,4,421,369
187 | raccoon-71.jpg,640,426,raccoon,129,51,628,373
188 | raccoon-72.jpg,560,420,raccoon,219,195,446,375
189 | raccoon-72.jpg,560,420,raccoon,98,34,284,336
190 | raccoon-73.jpg,284,177,raccoon,56,16,274,166
191 | raccoon-74.jpg,800,533,raccoon,141,6,472,505
192 | raccoon-75.jpg,640,640,raccoon,1,1,640,459
193 | raccoon-76.jpg,225,225,raccoon,14,1,212,132
194 | raccoon-77.jpg,640,360,raccoon,161,1,627,330
195 | raccoon-78.jpg,223,226,raccoon,28,15,221,216
196 | raccoon-79.jpg,640,425,raccoon,120,1,568,425
197 | raccoon-8.jpg,259,194,raccoon,16,11,236,175
198 | raccoon-80.jpg,225,225,raccoon,21,27,177,182
199 | raccoon-81.jpg,600,450,raccoon,4,54,574,410
200 | raccoon-82.jpg,750,500,raccoon,6,1,632,500
201 | raccoon-83.jpg,660,371,raccoon,104,3,509,369
202 | raccoon-84.jpg,303,166,raccoon,31,6,197,163
203 | raccoon-85.jpg,620,465,raccoon,236,87,598,429
204 | raccoon-86.jpg,600,401,raccoon,129,34,475,401
205 | raccoon-87.jpg,256,197,raccoon,1,3,206,191
206 | raccoon-88.jpg,640,480,raccoon,116,41,526,436
207 | raccoon-89.jpg,259,194,raccoon,18,6,225,176
208 | raccoon-9.jpg,347,510,raccoon,10,7,347,471
209 | raccoon-90.jpg,640,426,raccoon,44,90,577,426
210 | raccoon-91.jpg,236,314,raccoon,22,14,216,308
211 | raccoon-92.jpg,960,640,raccoon,37,32,729,543
212 | raccoon-93.jpg,251,201,raccoon,66,29,233,190
213 | raccoon-94.jpg,700,467,raccoon,155,10,543,445
214 | raccoon-95.jpg,320,400,raccoon,50,45,272,289
215 | raccoon-96.jpg,230,219,raccoon,28,25,203,175
216 | raccoon-97.jpg,500,393,raccoon,1,32,343,307
217 | raccoon-98.jpg,480,360,raccoon,108,31,351,308
218 | raccoon-99.jpg,252,228,raccoon,15,40,132,226
219 |
--------------------------------------------------------------------------------
/generate_csv.py:
--------------------------------------------------------------------------------
1 | import json
2 | import os
3 | import glob
4 | import pandas as pd
5 | import argparse
6 | import xml.etree.ElementTree as ET
7 | from tqdm import tqdm
8 |
9 |
10 | def __list_to_csv(annotations, output_file):
11 | column_name = ['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']
12 | xml_df = pd.DataFrame(annotations, columns=column_name)
13 | xml_df.to_csv(output_file, index=None)
14 |
15 |
16 | def xml_to_csv(xml_dir, output_file):
17 | """Reads all XML files, generated by labelImg, from a directory and generates a single CSV file"""
18 | annotations = []
19 | for xml_file in tqdm(glob.glob(xml_dir + '/*.xml')):
20 | tree = ET.parse(xml_file)
21 | root = tree.getroot()
22 | for member in root.findall('object'):
23 | value = (root.find('filename').text,
24 | int(root.find('size')[0].text),
25 | int(root.find('size')[1].text),
26 | member[0].text,
27 | int(member[4][0].text),
28 | int(member[4][1].text),
29 | int(member[4][2].text),
30 | int(member[4][3].text))
31 | annotations.append(value)
32 |
33 | __list_to_csv(annotations, output_file)
34 |
35 |
36 | def json_to_csv(input_json, output_file):
37 | """Reads a JSON file, generated by the VGG Image Annotator, and generates a single CSV file"""
38 | with open(input_json) as f:
39 | images = json.load(f)
40 |
41 | annotations = []
42 |
43 | for entry in images:
44 | filename = images[entry]['filename']
45 | for region in images[entry]['regions']:
46 | c = region['region_attributes']['class']
47 | xmin = region['shape_attributes']['x']
48 | ymin = region['shape_attributes']['y']
49 | xmax = xmin + region['shape_attributes']['width']
50 | ymax = ymin + region['shape_attributes']['height']
51 | width = 0   # VIA JSON does not store the image dimensions
52 | height = 0
53 |
54 | value = (filename, width, height, c, xmin, ymin, xmax, ymax)
55 | annotations.append(value)
56 |
57 | __list_to_csv(annotations, output_file)
58 |
59 |
60 | if __name__ == "__main__":
61 | parser = argparse.ArgumentParser(
62 | description=
63 | 'Reads image annotations (labelImg XML files from a directory, or a single VIA JSON file) and generates a single CSV file',
64 | formatter_class=argparse.RawDescriptionHelpFormatter)
65 | parser.add_argument('type',
66 | metavar='type',
67 | default='xml',
68 | choices=['xml', 'json'],
69 | help='"xml" for labelImg XML or "json" for VIA JSON')
70 | parser.add_argument(
71 | 'input',
72 | metavar='input',
73 | type=str,
74 | help='Directory containing the XML files generated by labelImg or path to a single VIA JSON')
75 | parser.add_argument('output_csv',
76 | metavar='output_csv',
77 | type=str,
78 | help='Path where the CSV output will be created')
79 |
80 | args = parser.parse_args()
81 |
82 | if args.type == 'xml':
83 | xml_to_csv(args.input, args.output_csv)
84 | elif args.type == 'json':
85 | json_to_csv(args.input, args.output_csv)
86 |
--------------------------------------------------------------------------------
/generate_pbtxt.py:
--------------------------------------------------------------------------------
1 | import pandas as pd
2 | import argparse
3 |
4 |
5 | def pbtxt_from_classlist(l, pbtxt_path):
6 | pbtxt_text = ''
7 |
8 | for i, c in enumerate(l):
9 | pbtxt_text += 'item {\n id: ' + str(i + 1) + '\n display_name: "' + str(c) + '"\n}\n\n'
10 |
11 | with open(pbtxt_path, "w+") as pbtxt_file:
12 | pbtxt_file.write(pbtxt_text)
13 |
14 |
15 | def pbtxt_from_csv(csv_path, pbtxt_path):
16 | class_list = list(pd.read_csv(csv_path)['class'].unique())
17 | class_list.sort()
18 |
19 | pbtxt_from_classlist(class_list, pbtxt_path)
20 |
21 |
22 | def pbtxt_from_txt(txt_path, pbtxt_path):
23 | # read txt into a list, splitting by newlines
24 | data = [l.rstrip('\n').strip() for l in open(txt_path, 'r', encoding='utf-8-sig')]
25 |
26 | data = [l for l in data if len(l) > 0]
27 |
28 | pbtxt_from_classlist(data, pbtxt_path)
29 |
30 |
31 | if __name__ == "__main__":
32 | parser = argparse.ArgumentParser(
33 | description=
34 | 'Reads a CSV file with a \'class\' column, or a text file with one class name per line, and generates a label map (.pbtxt) file'
35 | )
36 | parser.add_argument(
37 | 'input_type',
38 | choices=['csv', 'txt'],
39 | help='type of input file (csv with at least one \'class\' column, or txt with one class name per line)'
40 | )
41 | parser.add_argument('input_file',
42 | metavar='input_file',
43 | type=str,
44 | help='Path to the input txt or csv file')
45 | parser.add_argument('output_file',
46 | metavar='output_file',
47 | type=str,
48 | help='Path where the .pbtxt output will be created')
49 |
50 | args = parser.parse_args()
51 |
52 | if args.input_type == 'csv':
53 | pbtxt_from_csv(args.input_file, args.output_file)
54 | elif args.input_type == 'txt':
55 | pbtxt_from_txt(args.input_file, args.output_file)
56 |
--------------------------------------------------------------------------------
/generate_tfrecord.py:
--------------------------------------------------------------------------------
1 | from __future__ import division
2 | from __future__ import print_function
3 | from __future__ import absolute_import
4 |
5 | import os
6 | import io
7 | import pandas as pd
8 | import tensorflow as tf
9 | import argparse
10 |
11 | from PIL import Image
12 | from tqdm import tqdm
13 | from object_detection.utils import dataset_util
14 | from collections import namedtuple, OrderedDict
15 |
16 |
17 | def __split(df, group):
18 | data = namedtuple('data', ['filename', 'object'])
19 | gb = df.groupby(group)
20 | return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]
21 |
22 |
23 | def create_tf_example(group, path, class_dict):
24 | with tf.io.gfile.GFile(os.path.join(path, '{}'.format(group.filename)), 'rb') as fid:
25 | encoded_jpg = fid.read()
26 | encoded_jpg_io = io.BytesIO(encoded_jpg)
27 | image = Image.open(encoded_jpg_io)
28 | width, height = image.size
29 |
30 | filename = group.filename.encode('utf8')
31 | image_format = b'jpg'
32 | xmins = []
33 | xmaxs = []
34 | ymins = []
35 | ymaxs = []
36 | classes_text = []
37 | classes = []
38 |
39 | for index, row in group.object.iterrows():
40 | if set(['xmin_rel', 'xmax_rel', 'ymin_rel', 'ymax_rel']).issubset(set(row.index)):
41 | xmin = row['xmin_rel']
42 | xmax = row['xmax_rel']
43 | ymin = row['ymin_rel']
44 | ymax = row['ymax_rel']
45 |
46 | elif set(['xmin', 'xmax', 'ymin', 'ymax']).issubset(set(row.index)):
47 | xmin = row['xmin'] / width
48 | xmax = row['xmax'] / width
49 | ymin = row['ymin'] / height
50 | ymax = row['ymax'] / height
51 |
52 | xmins.append(xmin)
53 | xmaxs.append(xmax)
54 | ymins.append(ymin)
55 | ymaxs.append(ymax)
56 | classes_text.append(str(row['class']).encode('utf8'))
57 | classes.append(class_dict[str(row['class'])])
58 |
59 | tf_example = tf.train.Example(features=tf.train.Features(
60 | feature={
61 | 'image/height': dataset_util.int64_feature(height),
62 | 'image/width': dataset_util.int64_feature(width),
63 | 'image/filename': dataset_util.bytes_feature(filename),
64 | 'image/source_id': dataset_util.bytes_feature(filename),
65 | 'image/encoded': dataset_util.bytes_feature(encoded_jpg),
66 | 'image/format': dataset_util.bytes_feature(image_format),
67 | 'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
68 | 'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
69 | 'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
70 | 'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
71 | 'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
72 | 'image/object/class/label': dataset_util.int64_list_feature(classes), }))
73 | return tf_example
74 |
75 |
76 | def class_dict_from_pbtxt(pbtxt_path):
77 | # open file, strip \n, trim lines and keep only
78 | # lines beginning with id or display_name
79 |
80 | with open(pbtxt_path, 'r', encoding='utf-8-sig') as f:
81 | data = f.readlines()
82 |
83 | name_key = None
84 | if any('display_name:' in s for s in data):
85 | name_key = 'display_name:'
86 | elif any('name:' in s for s in data):
87 | name_key = 'name:'
88 |
89 | if name_key is None:
90 | raise ValueError(
91 | "label map does not contain class names: no 'display_name' or 'name' keys were found in the file"
92 | )
93 |
94 | data = [l.rstrip('\n').strip() for l in data if 'id:' in l or name_key in l]
95 |
96 | ids = [int(l.replace('id:', '')) for l in data if l.startswith('id')]
97 | names = [
98 | l.replace(name_key, '').replace('"', '').replace("'", '').strip() for l in data
99 | if l.startswith(name_key)]
100 |
101 | # join ids and display_names into a single dictionary
102 | class_dict = {}
103 | for i in range(len(ids)):
104 | class_dict[names[i]] = ids[i]
105 |
106 | return class_dict
107 |
108 |
109 | if __name__ == '__main__':
110 | parser = argparse.ArgumentParser(
111 | description='Create a TFRecord file for use with the TensorFlow Object Detection API.',
112 | formatter_class=argparse.RawDescriptionHelpFormatter)
113 | parser.add_argument('csv_input', metavar='csv_input', type=str, help='Path to the CSV input')
114 | parser.add_argument('pbtxt_input',
115 | metavar='pbtxt_input',
116 | type=str,
117 | help='Path to a pbtxt file containing class ids and display names')
118 | parser.add_argument('image_dir',
119 | metavar='image_dir',
120 | type=str,
121 | help='Path to the directory containing all images')
122 | parser.add_argument('output_path',
123 | metavar='output_path',
124 | type=str,
125 | help='Path to output TFRecord')
126 |
127 | args = parser.parse_args()
128 |
129 | class_dict = class_dict_from_pbtxt(args.pbtxt_input)
130 |
131 | writer = tf.compat.v1.python_io.TFRecordWriter(args.output_path)
132 | path = os.path.join(args.image_dir)
133 | examples = pd.read_csv(args.csv_input)
134 | grouped = __split(examples, 'filename')
135 |
136 | for group in tqdm(grouped, desc='groups'):
137 | tf_example = create_tf_example(group, path, class_dict)
138 | writer.write(tf_example.SerializeToString())
139 |
140 | writer.close()
141 | output_path = os.path.join(os.getcwd(), args.output_path)
142 | print('Successfully created the TFRecords: {}'.format(output_path))
143 |
--------------------------------------------------------------------------------
/generate_train_eval.py:
--------------------------------------------------------------------------------
1 | import os
2 | import argparse
3 | import pandas as pd
4 | from sklearn.model_selection import train_test_split
5 |
6 | if __name__ == "__main__":
7 | parser = argparse.ArgumentParser(
8 | description='Separates a CSV file into training and validation sets',
9 | formatter_class=argparse.RawDescriptionHelpFormatter)
10 | parser.add_argument('input_csv',
11 | metavar='input_csv',
12 | type=str,
13 | help='Path to the input CSV file')
14 | parser.add_argument(
15 | '-f',
16 | metavar='train_frac',
17 | type=float,
18 | default=.75,
19 | help='fraction of the dataset that will be separated for training (default .75)')
20 | parser.add_argument('-s',
21 | metavar='stratify',
23 | type=lambda x: str(x).lower() in ('true', '1', 'yes'),  # argparse's type=bool treats any non-empty string as True
23 | default=True,
24 | help='Stratify by class instead of whole dataset (default True)')
25 | parser.add_argument(
26 | '-o',
27 | metavar='output_dir',
28 | type=str,
29 | default=None,
30 | help='Directory to output train and evaluation datasets (default input_csv directory)')
31 |
32 | args = parser.parse_args()
33 |
34 | if args.f < 0 or args.f > 1:
35 | raise ValueError('train_frac must be between 0 and 1')
36 |
37 | # output_dir defaults to the input_csv directory when -o is not given
38 | if args.o is None:
39 | output_dir, _ = os.path.split(args.input_csv)
40 | else:
41 | output_dir = args.o
42 |
43 | df = pd.read_csv(args.input_csv)
44 |
45 | # get 'class' column for stratification
46 | strat = df['class'] if args.s else None
47 |
48 | train_df, validation_df = train_test_split(df, test_size=None, train_size=args.f, stratify=strat)
49 |
50 | # output files have the same name as the input file, with a suffix appended;
51 | # basename strips the input directory so it is not duplicated in the joined path
52 | base_name = os.path.splitext(os.path.basename(args.input_csv))[0]
53 | train_csv_path = os.path.join(output_dir, base_name + '_train.csv')
54 | eval_csv_path = os.path.join(output_dir, base_name + '_eval.csv')
54 |
55 | train_df.to_csv(train_csv_path, index=False)
56 | validation_df.to_csv(eval_csv_path, index=False)
57 |
--------------------------------------------------------------------------------
/generate_yolo_txt.py:
--------------------------------------------------------------------------------
1 | import pandas as pd
2 | import argparse
3 | from collections import namedtuple
4 | from tqdm import tqdm
5 | import os
6 |
7 |
8 | def __split(df, group):
9 | data = namedtuple('data', ['filename', 'object'])
10 | gb = df.groupby(group)
11 | return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]
12 |
13 |
14 | def yolo_txt_from_csv(input_csv, input_names, output_dir):
15 | with open(input_names, "r") as file:
16 | names = [n.strip() for n in file.read().splitlines() if n.strip()]
17 | df = pd.read_csv(input_csv)
18 |
19 | grouped = __split(df, 'filename')
20 |
21 | for group in tqdm(grouped, desc='groups'):
22 | filename = group.filename
23 | xs = []
24 | ys = []
25 | widths = []
26 | heights = []
27 | classes = []
28 |
29 | for _, row in group.object.iterrows():
30 | if not set(['class', 'width', 'height', 'xmin', 'xmax', 'ymin', 'ymax']).issubset(
31 | set(row.index)):
32 | continue  # skip rows that lack the required columns
33 |
34 | img_width = row['width']
35 | img_height = row['height']
36 |
37 | xmin = row['xmin']
38 | ymin = row['ymin']
39 | xmax = row['xmax']
40 | ymax = row['ymax']
41 |
42 | xs.append(round((xmin + xmax) / 2 / img_width, 5))  # darknet expects the box center, normalized
43 | ys.append(round((ymin + ymax) / 2 / img_height, 5))
44 | widths.append(round((xmax - xmin) / img_width, 5))
45 | heights.append(round((ymax - ymin) / img_height, 5))
46 | classes.append(row['class'])
47 |
48 | txt_filename = os.path.splitext(filename)[0] + '.txt'
49 |
50 | with open(os.path.join(output_dir, txt_filename), 'w+') as f:
51 | for i in range(len(classes)):
52 | f.write('{} {} {} {} {}\n'.format(names.index(classes[i]),
53 | xs[i],
54 | ys[i],
55 | widths[i],
56 | heights[i]))
57 |
58 |
59 | if __name__ == "__main__":
60 | parser = argparse.ArgumentParser(
61 | description=
62 | 'Reads the contents of a CSV file, containing object annotations and their corresponding images\' dimensions, and generates TXT files for use with darknet and YOLOv3'
63 | )
64 | parser.add_argument('input_csv',
65 | metavar='input_csv',
66 | type=str,
67 | help='Path to the input CSV file')
68 | parser.add_argument(
69 | 'input_names',
70 | metavar='input_names',
71 | type=str,
72 | help='Path to the input .names file used by darknet, containing names of object classes')
73 | parser.add_argument(
74 | 'output_dir',
75 | metavar='output_dir',
76 | type=str,
77 | help='Directory where the .txt output files will be created, one for each image contained in the CSV file'
78 | )
79 |
80 | args = parser.parse_args()
81 |
82 | yolo_txt_from_csv(args.input_csv, args.input_names, args.output_dir)
83 |
--------------------------------------------------------------------------------