├── .gitignore
├── LICENSE
├── README.md
├── data
│   └── placeholder.txt
├── images
│   ├── blouse.jpg
│   ├── dress.jpg
│   ├── outwear.jpg
│   ├── skirt.jpg
│   └── trousers.jpg
├── src
│   ├── data_gen
│   │   ├── data_generator.py
│   │   ├── data_process.py
│   │   ├── dataset.py
│   │   ├── kpAnno.py
│   │   ├── ohem.py
│   │   └── utils.py
│   ├── eval
│   │   ├── eval_callback.py
│   │   ├── evaluation.py
│   │   └── post_process.py
│   ├── top
│   │   ├── demo.py
│   │   ├── test.py
│   │   └── train.py
│   └── unet
│       ├── fashion_net.py
│       ├── refinenet.py
│       ├── refinenet_mask_v3.py
│       └── resnet101.py
├── submission
│   └── placeholder.txt
└── trained_models
    └── placeholder.txt

/.gitignore:
--------------------------------------------------------------------------------
 1 | .idea
 2 | *.pyc
 3 | *.pkl
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2018 VictorLi
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # AiFashion
 2 | 
 3 | - Author: VictorLi, yuanyuan.li85@gmail.com
 4 | - Code for the FashionAI Global Challenge—Key Points Detection of Apparel
 5 | [2018 TianChi](https://tianchi.aliyun.com/competition/introduction.htm?spm=5176.100068.5678.1.4ccc289bCzDJXu&raceId=231648&_lang=en_US)
 6 | - Ranked 45/2322 in the 1st round of the competition, score 0.61
 7 | - Ranked 46 in the 2nd round, score 0.477
 8 | 
 9 | ## Images with detected keypoints
10 | ### Dress
11 | ![Dress](./images/dress.jpg)
12 | ### Blouse
13 | ![Blouse](./images/blouse.jpg)
14 | ### Outwear
15 | ![Outwear](./images/outwear.jpg)
16 | ### Skirt
17 | ![Skirt](./images/skirt.jpg)
18 | ### Trousers
19 | ![Trousers](./images/trousers.jpg)
20 | 
21 | 
22 | ## Basic idea
23 | - The key idea comes from the paper [Cascaded Pyramid Network for Multi-Person Pose Estimation](https://arxiv.org/abs/1711.07319): a two-stage network made of a global net and a refine net, both U-Net-like, trained to predict heatmaps of the clothing keypoints. The backbone network is ResNet-101.
24 | - To overcome the negative impact of mixing categories, an `input_mask` is introduced to zero out invalid keypoints. For example, skirt has 4 valid keypoints: `waistband_left`, `waistband_right`, `hemline_left` and `hemline_right`. In `input_mask`, only those 4 valid channels are set to 1.0, while the other 20 channels are set to 0.0.
25 | - Online hard example mining: at the last stage of the refine net, only the top-k channel losses are counted, and the easy channels (small loss) are ignored.
26 | 
27 | ## Dependency
28 | - Keras 2.0
29 | - TensorFlow
30 | - OpenCV/NumPy/Pandas
31 | - Pretrained model weights for ResNet-101
32 | 
33 | ## Folder Structure
34 | - `data`: folder to store training and testing images and annotations
35 | - `trained_models`: folder to store trained models and logs
36 | - `submission`: folder to store generated submissions for evaluation.
37 | - `src`: folder for all source code.
38 |   - `src/data_gen`: code for the data generator, including data augmentation and pre-processing
39 |   - `src/eval`: code for evaluation, including inference and post-processing.
40 |   - `src/unet`: code for the CNN model definition, including train, fine-tune, loss and optimizer definitions.
41 |   - `src/top`: top-level code for train, test and demo.
42 | 
43 | ## How to train network
44 | - Download the dataset from the competition webpage and put it under `data`.
45 |   `data/train`: data used for training. `data/test`: data used for testing.
46 | - Download the [resnet101](https://gist.github.com/flyyufelix/65018873f8cb2bbe95f429c474aa1294) model and save it as `data/resnet101_weights_tf.h5`.
47 |   Note: all the models here use the channels_last dim order.
48 | - Train the all-in-one network from scratch
49 | ```
50 | python train.py --category all --epochs 30 --network v11 --batchSize 3 --gpuID 2
51 | ```
52 | - The trained model and log will be put under `trained_models/all/xxxx`, e.g. `trained_models/all/2018_05_23_15_18_07/`
53 | - The evaluation will run after each epoch and details are saved to `val.log`
54 | - Resume training from a specific model.
55 | ```
56 | python train.py --gpuID 2 --category all --epochs 30 --network v11 --batchSize 3 --resume True --resumeModel /path/to/model/start/with --initEpoch 6
57 | ```
58 | 
59 | ## How to test and generate submission
60 | - Run test and generate the submission.
61 |   The command below searches `val.log` under `modelpath` for the best-scoring model and uses it to generate the submission:
62 | ```
63 | python test.py --gpuID 2 --modelpath ../../trained_models/all/xxx --outpath ../../submission/2018_04_19/ --augment True
64 | ```
65 |   The submission will be saved as `submission.csv`
66 | 
67 | ## How to run demo
68 | - Download the pre-trained weights from [BaiduDisk](https://pan.baidu.com/s/1t7fB5wnRfW1Vny0gw7xUDQ) (password `1ae2`) or [GoogleDrive](https://drive.google.com/open?id=1VY-AO2F1XMQLBjEZjy6CrOSIPWWaHUGr)
69 | - Save it somewhere, e.g. `trained_models/all/fashion_ai_keypoint_weights_epoch28.hdf5`
70 | - Or use your own trained model.
71 | - Run the demo; the garment with keypoints marked will be displayed.
72 | ``` 73 | python demo.py --gpuID 2 --modelfile ../../trained_models/all/fashion_ai_keypoint_weights_epoch28.hdf5 74 | ``` 75 | 76 | ## Reference 77 | - Resnet 101 Keras : https://github.com/statech/resnet 78 | -------------------------------------------------------------------------------- /data/placeholder.txt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/yuanyuanli85/FashionAI_KeyPoint_Detection_Challenge_Keras/0b3bd8cdee32e05619300e5466578644974279df/data/placeholder.txt -------------------------------------------------------------------------------- /images/blouse.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/yuanyuanli85/FashionAI_KeyPoint_Detection_Challenge_Keras/0b3bd8cdee32e05619300e5466578644974279df/images/blouse.jpg -------------------------------------------------------------------------------- /images/dress.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/yuanyuanli85/FashionAI_KeyPoint_Detection_Challenge_Keras/0b3bd8cdee32e05619300e5466578644974279df/images/dress.jpg -------------------------------------------------------------------------------- /images/outwear.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/yuanyuanli85/FashionAI_KeyPoint_Detection_Challenge_Keras/0b3bd8cdee32e05619300e5466578644974279df/images/outwear.jpg -------------------------------------------------------------------------------- /images/skirt.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/yuanyuanli85/FashionAI_KeyPoint_Detection_Challenge_Keras/0b3bd8cdee32e05619300e5466578644974279df/images/skirt.jpg -------------------------------------------------------------------------------- /images/trousers.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/yuanyuanli85/FashionAI_KeyPoint_Detection_Challenge_Keras/0b3bd8cdee32e05619300e5466578644974279df/images/trousers.jpg -------------------------------------------------------------------------------- /src/data_gen/data_generator.py: -------------------------------------------------------------------------------- 1 | 2 | import os 3 | import cv2 4 | import pandas as pd 5 | import numpy as np 6 | import random 7 | 8 | from kpAnno import KpAnno 9 | from dataset import getKpNum, getKpKeys, getFlipMapID, generate_input_mask 10 | from utils import make_gaussian, load_annotation_from_df 11 | from data_process import pad_image, resize_image, normalize_image, rotate_image, \ 12 | rotate_image_float, rotate_mask, crop_image 13 | from ohem import generate_topk_mask_ohem 14 | 15 | class DataGenerator(object): 16 | 17 | def __init__(self, category, annfile): 18 | self.category = category 19 | self.annfile = annfile 20 | self._initialize() 21 | 22 | def get_dim_order(self): 23 | # default tensorflow dim order 24 | return "channels_last" 25 | 26 | def get_dataset_size(self): 27 | return len(self.annDataFrame) 28 | 29 | def generator_with_mask_ohem(self, graph, kerasModel, batchSize=16, inputSize=(512, 512), flipFlag=False, cropFlag=False, 30 | shuffle=True, rotateFlag=True, nStackNum=1): 31 | 32 | ''' 33 | Input: batch_size * Height (512) * Width (512) * Channel (3) 34 | Input: batch_size * 256 * 256 * Channel (N+1). Mask for each category. 
1.0 for valid parts in category. 0.0 for invalid parts 35 | Output: batch_size * Height/2 (256) * Width/2 (256) * Channel (N+1) 36 | ''' 37 | xdf = self.annDataFrame 38 | 39 | targetHeight, targetWidth = inputSize 40 | 41 | # train_input: npfloat, height, width, channels 42 | # train_gthmap: npfloat, N heatmap + 1 background heatmap, 43 | train_input = np.zeros((batchSize, targetHeight, targetWidth, 3), dtype=np.float) 44 | train_mask = np.zeros((batchSize, targetHeight / 2, targetWidth / 2, getKpNum(self.category) ), dtype=np.float) 45 | train_gthmap = np.zeros((batchSize, targetHeight / 2, targetWidth / 2, getKpNum(self.category) ), dtype=np.float) 46 | train_ohem_mask = np.zeros((batchSize, targetHeight / 2, targetWidth / 2, getKpNum(self.category) ), dtype=np.float) 47 | train_ohem_gthmap = np.zeros((batchSize, targetHeight / 2, targetWidth / 2, getKpNum(self.category) ), dtype=np.float) 48 | 49 | ## generator need to be infinite loop 50 | while 1: 51 | # random shuffle at first 52 | if shuffle: 53 | xdf = xdf.sample(frac=1) 54 | count = 0 55 | for _index, _row in xdf.iterrows(): 56 | xindex = count % batchSize 57 | xinput, xhmap = self._prcoess_img(_row, inputSize, rotateFlag, flipFlag, cropFlag, nobgFlag=True) 58 | xmask = generate_input_mask(_row['image_category'], 59 | (targetHeight, targetWidth, getKpNum(self.category))) 60 | 61 | xohem_mask, xohem_gthmap = generate_topk_mask_ohem([xinput, xmask], xhmap, kerasModel, graph, 62 | 8, _row['image_category'], dynamicFlag=False) 63 | 64 | train_input[xindex, :, :, :] = xinput 65 | train_mask[xindex, :, :, :] = xmask 66 | train_gthmap[xindex, :, :, :] = xhmap 67 | train_ohem_mask[xindex, :, :, :] = xohem_mask 68 | train_ohem_gthmap[xindex, :, :, :] = xohem_gthmap 69 | 70 | # if refinenet enable, refinenet has two outputs, globalnet and refinenet 71 | if xindex == 0 and count != 0: 72 | gthamplst = list() 73 | for i in range(nStackNum): 74 | gthamplst.append(train_gthmap) 75 | 76 | # last stack will use ohem gthmap 77 | gthamplst.append(train_ohem_gthmap) 78 | 79 | yield [train_input, train_mask, train_ohem_mask], gthamplst 80 | 81 | count += 1 82 | 83 | def _initialize(self): 84 | self._load_anno() 85 | 86 | def _load_anno(self): 87 | ''' 88 | Load annotations from train.csv 89 | ''' 90 | # Todo: check if category legal 91 | self.train_img_path = "../../data/train" 92 | 93 | # read into dataframe 94 | xpd = pd.read_csv(self.annfile) 95 | xpd = load_annotation_from_df(xpd, self.category) 96 | self.annDataFrame = xpd 97 | 98 | def _prcoess_img(self, dfrow, inputSize, rotateFlag, flipFlag, cropFlag, nobgFlag): 99 | 100 | mlist = dfrow[getKpKeys(self.category)] 101 | imgName, kpStr = mlist[0], mlist[1:] 102 | 103 | # read kp annotation from csv file 104 | kpAnnlst = list() 105 | for _kpstr in kpStr: 106 | _kpAn = KpAnno.readFromStr(_kpstr) 107 | kpAnnlst.append(_kpAn) 108 | 109 | assert (len(kpAnnlst) == getKpNum(self.category)), str(len(kpAnnlst))+" is not the same as "+str(getKpNum(self.category)) 110 | 111 | 112 | xcvmat = cv2.imread(os.path.join(self.train_img_path, imgName)) 113 | if xcvmat is None: 114 | return None, None 115 | 116 | #flip as first operation. 
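        # Augmentation order below: flip -> pad to square -> build ground-truth
        # heatmaps -> rotate image and heatmaps together -> resize to the network
        # input/output size. Flipping must come first so that left/right keypoint
        # channels can be swapped (via getFlipMapID in flip_annlst) while the
        # annotations are still in image coordinates.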
117 | # flip image 118 | if random.choice([0, 1]) and flipFlag: 119 | xcvmat, kpAnnlst = self.flip_image(xcvmat, kpAnnlst) 120 | 121 | #if cropFlag: 122 | # xcvmat, kpAnnlst = crop_image(xcvmat, kpAnnlst, 0.8, 0.95) 123 | 124 | # pad image to 512x512 125 | paddedImg, kpAnnlst = pad_image(xcvmat, kpAnnlst, inputSize[0], inputSize[1]) 126 | 127 | assert (len(kpAnnlst) == getKpNum(self.category)), str(len(kpAnnlst)) + " is not the same as " + str( 128 | getKpNum(self.category)) 129 | 130 | # output ground truth heatmap is 256x256 131 | trainGtHmap = self.__generate_hmap(paddedImg, kpAnnlst) 132 | 133 | if random.choice([0,1]) and rotateFlag: 134 | rAngle = np.random.randint(-1*40, 40) 135 | rotatedImage, _ = rotate_image(paddedImg, list(), rAngle) 136 | rotatedGtHmap = rotate_mask(trainGtHmap, rAngle) 137 | else: 138 | rotatedImage = paddedImg 139 | rotatedGtHmap = trainGtHmap 140 | 141 | # resize image 142 | resizedImg = cv2.resize(rotatedImage, inputSize) 143 | resizedGtHmap = cv2.resize(rotatedGtHmap, (inputSize[0]//2, inputSize[1]//2)) 144 | 145 | return normalize_image(resizedImg), resizedGtHmap 146 | 147 | 148 | def __generate_hmap(self, cvmat, kpAnnolst): 149 | # kpnum + background 150 | gthmp = np.zeros((cvmat.shape[0], cvmat.shape[1], getKpNum(self.category)), dtype=np.float) 151 | 152 | for i, _kpAnn in enumerate(kpAnnolst): 153 | if _kpAnn.visibility == -1: 154 | continue 155 | 156 | radius = 100 157 | gaussMask = make_gaussian(radius, radius, 20, None) 158 | 159 | # avoid out of boundary 160 | top_x, top_y = max(0, _kpAnn.x - radius/2), max(0, _kpAnn.y - radius/2) 161 | bottom_x, bottom_y = min(cvmat.shape[1], _kpAnn.x + radius/2), min(cvmat.shape[0], _kpAnn.y + radius/2) 162 | 163 | top_x_offset = top_x - (_kpAnn.x - radius/2) 164 | top_y_offset = top_y - (_kpAnn.y - radius/2) 165 | 166 | gthmp[ top_y:bottom_y, top_x:bottom_x, i] = gaussMask[top_y_offset:top_y_offset + bottom_y-top_y, 167 | top_x_offset:top_x_offset + bottom_x-top_x] 168 | 169 | return gthmp 170 | 171 | def flip_image(self, orgimg, orgKpAnolst): 172 | flipImg = cv2.flip(orgimg, flipCode=1) 173 | flipannlst = self.flip_annlst(orgKpAnolst, orgimg.shape) 174 | return flipImg, flipannlst 175 | 176 | 177 | def flip_annlst(self, kpannlst, imgshape): 178 | height, width, channels = imgshape 179 | 180 | # flip first 181 | flipAnnlst = list() 182 | for _kp in kpannlst: 183 | flip_x = width - _kp.x 184 | flipAnnlst.append(KpAnno(flip_x, _kp.y, _kp.visibility)) 185 | 186 | # exchange location of flip keypoints, left->right 187 | outAnnlst = flipAnnlst[:] 188 | for i, _kp in enumerate(flipAnnlst): 189 | mapId = getFlipMapID('all', i) 190 | outAnnlst[mapId] = _kp 191 | 192 | return outAnnlst 193 | 194 | 195 | 196 | 197 | -------------------------------------------------------------------------------- /src/data_gen/data_process.py: -------------------------------------------------------------------------------- 1 | import pandas as pd 2 | import numpy as np 3 | import cv2 4 | import os 5 | from kpAnno import KpAnno 6 | 7 | def normalize_image(cvmat): 8 | assert (cvmat.dtype == np.uint8) , " only support normalize np.uint8 to float -0.5 ~ 0.5'" 9 | cvmat = cvmat.astype(np.float) 10 | cvmat = (cvmat - 128.0) / 256.0 11 | return cvmat 12 | 13 | def resize_image(cvmat, targetWidth, targetHeight): 14 | 15 | assert (cvmat.dtype == np.uint8) , " only support normalize np.uint8 in resize_image'" 16 | 17 | # get scale 18 | srcHeight, srcWidth, channles = cvmat.shape 19 | minScale = min( targetHeight*1.0/srcHeight, 
targetWidth*1.0/srcWidth) 20 | 21 | # resize 22 | resizedMat = cv2.resize(cvmat, None, fx=minScale, fy=minScale) 23 | reHeight, reWidth, channles = resizedMat.shape 24 | 25 | # pad to targetWidth or targetHeight 26 | outmat = np.zeros((targetHeight, targetWidth, 3), dtype=cvmat.dtype) + 128 27 | 28 | if targetHeight == reHeight and targetWidth == reWidth: 29 | outmat = resizedMat 30 | elif targetWidth != reWidth and targetHeight == reHeight: 31 | # add pad to width 32 | outmat[:, 0:reWidth, :] = resizedMat 33 | elif targetHeight != reHeight and targetWidth == reWidth: 34 | # add padding to height 35 | outmat[0:reHeight, :, :] = resizedMat 36 | else: 37 | assert(0), "after resize either width or height same as target width or target height" 38 | return (outmat, minScale) 39 | 40 | def pad_image(cvmat, kpAnno, targetWidth, targetHeight): 41 | ''' 42 | 43 | :param cvmat: input mat 44 | :param targetWidth: width to pad 45 | :param targetHeight: height to pad 46 | :return: 47 | ''' 48 | assert (cvmat.dtype == np.uint8) , " only support normalize np.uint8 in pad_image'" + str(cvmat.dtype) 49 | 50 | srcHeight, srcWidth, channles = cvmat.shape 51 | outmat = np.zeros((targetHeight, targetWidth, 3), dtype=cvmat.dtype) + 128 52 | 53 | if targetHeight == srcHeight and targetWidth == srcWidth: 54 | outmat = cvmat 55 | outkpAnno = kpAnno 56 | elif targetWidth != srcWidth and targetHeight == srcHeight: 57 | # add pad to width 58 | outmat[:, 0:srcWidth, :] = cvmat 59 | outkpAnno = kpAnno 60 | elif targetHeight != srcHeight and targetWidth == srcWidth: 61 | # add padding to height 62 | outmat[0:srcHeight, :, :] = cvmat 63 | outkpAnno = kpAnno 64 | else: 65 | # resize at first, then pad 66 | outmat, scale = resize_image(cvmat, targetWidth, targetHeight) 67 | outkpAnno = list() 68 | for _kpAnno in kpAnno: 69 | _nkp = KpAnno.applyScale(_kpAnno, scale) 70 | outkpAnno.append(_nkp) 71 | return (outmat, outkpAnno) 72 | 73 | 74 | def pad_image_inference(cvmat, targetWidth, targetHeight): 75 | ''' 76 | 77 | :param cvmat: input mat 78 | :param targetWidth: width to pad 79 | :param targetHeight: height to pad 80 | :return: 81 | ''' 82 | assert (cvmat.dtype == np.uint8), " only support normalize np.uint8 in pad_image'" + str(cvmat.dtype) 83 | 84 | srcHeight, srcWidth, channles = cvmat.shape 85 | outmat = np.zeros((targetHeight, targetWidth, 3), dtype=cvmat.dtype) + 128 86 | 87 | if targetHeight == srcHeight and targetWidth == srcWidth: 88 | outmat = cvmat 89 | scale = 1.0 90 | elif targetWidth > srcWidth and targetHeight == srcHeight: 91 | # add pad to width 92 | outmat[:, 0:srcWidth, :] = cvmat 93 | scale = 1.0 94 | elif targetHeight > srcHeight and targetWidth == srcWidth: 95 | # add padding to height 96 | outmat[0:srcHeight, :, :] = cvmat 97 | scale = 1.0 98 | else: 99 | # resize at first, then pad 100 | outmat, scale = resize_image(cvmat, targetWidth, targetHeight) 101 | 102 | return (outmat, scale) 103 | 104 | def rotate_image(cvmat, kpAnnLst, rotateAngle): 105 | 106 | assert (cvmat.dtype == np.uint8) , " only support normalize np.uint8 in rotate_image'" 107 | 108 | ##Make sure cvmat is square? 
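    # The output canvas is enlarged so the rotated image is not clipped: for a
    # rotation by angle theta, the bounding box is w*|cos|+h*|sin| wide and
    # w*|sin|+h*|cos| high, and the translation terms added to rotateMatrix
    # recenter the image on that canvas. newH/newW below are swapped relative to
    # these formulas, which is harmless only because the inputs here are square
    # (padded 512x512) images, hence the comment above.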
109 | height, width, channel = cvmat.shape 110 | 111 | center = ( width//2, height//2) 112 | rotateMatrix = cv2.getRotationMatrix2D(center, rotateAngle, 1.0) 113 | 114 | cos, sin = np.abs(rotateMatrix[0,0]), np.abs(rotateMatrix[0, 1]) 115 | newH = int((height*sin)+(width*cos)) 116 | newW = int((height*cos)+(width*sin)) 117 | 118 | rotateMatrix[0,2] += (newW/2) - center[0] #x 119 | rotateMatrix[1,2] += (newH/2) - center[1] #y 120 | 121 | # rotate image 122 | outMat = cv2.warpAffine(cvmat, rotateMatrix, (newH, newW), borderValue=(128, 128, 128)) 123 | 124 | # rotate annotations 125 | nKpLst = list() 126 | for _kp in kpAnnLst: 127 | _newkp = KpAnno.applyRotate(_kp, rotateMatrix) 128 | nKpLst.append(_newkp) 129 | 130 | return (outMat, nKpLst) 131 | 132 | 133 | def rotate_image_with_invrmat(cvmat, rotateAngle): 134 | 135 | assert (cvmat.dtype == np.uint8) , " only support normalize np.uint in rotate_image_with_invrmat'" 136 | 137 | ##Make sure cvmat is square? 138 | height, width, channel = cvmat.shape 139 | 140 | center = ( width//2, height//2) 141 | rotateMatrix = cv2.getRotationMatrix2D(center, rotateAngle, 1.0) 142 | 143 | cos, sin = np.abs(rotateMatrix[0,0]), np.abs(rotateMatrix[0, 1]) 144 | newH = int((height*sin)+(width*cos)) 145 | newW = int((height*cos)+(width*sin)) 146 | 147 | rotateMatrix[0,2] += (newW/2) - center[0] #x 148 | rotateMatrix[1,2] += (newH/2) - center[1] #y 149 | 150 | # rotate image 151 | outMat = cv2.warpAffine(cvmat, rotateMatrix, (newH, newW), borderValue=(128, 128, 128)) 152 | 153 | # generate inv rotate matrix 154 | invRotateMatrix = cv2.invertAffineTransform(rotateMatrix) 155 | 156 | return (outMat, invRotateMatrix, (width, height)) 157 | 158 | def rotate_mask(mask, rotateAngle): 159 | 160 | outmask = rotate_image_float(mask, rotateAngle) 161 | 162 | return outmask 163 | 164 | def rotate_image_float(cvmat, rotateAngle, borderValue=(0.0, 0.0, 0.0)): 165 | 166 | assert (cvmat.dtype == np.float) , " only support normalize np.float in rotate_image_float'" 167 | 168 | ##Make sure cvmat is square? 
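    # Same enlarged-canvas rotation (and the same square-input assumption) as
    # rotate_image above, but for float arrays such as heatmaps and masks.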
169 | height, width, channels = cvmat.shape 170 | 171 | center = ( width//2, height//2) 172 | rotateMatrix = cv2.getRotationMatrix2D(center, rotateAngle, 1.0) 173 | 174 | cos, sin = np.abs(rotateMatrix[0,0]), np.abs(rotateMatrix[0, 1]) 175 | newH = int((height*sin)+(width*cos)) 176 | newW = int((height*cos)+(width*sin)) 177 | 178 | rotateMatrix[0,2] += (newW/2) - center[0] #x 179 | rotateMatrix[1,2] += (newH/2) - center[1] #y 180 | 181 | # rotate image 182 | outMat = cv2.warpAffine(cvmat, rotateMatrix, (newH, newW), borderValue=borderValue) 183 | 184 | return outMat 185 | 186 | 187 | def crop_image(cvmat, kpAnnLst, lowLimitRatio, upLimitRatio): 188 | import random 189 | 190 | assert(lowLimitRatio < 1.0), 'lowLimitRatio should be less than 1.0' 191 | assert(upLimitRatio < 1.0), 'upLimitRatio should be less than 1.0' 192 | 193 | height, width, channels = cvmat.shape 194 | 195 | cropHeight = random.randrange(int(lowLimitRatio*height), int(upLimitRatio*height)) 196 | cropWidth = random.randrange(int(lowLimitRatio*width), int(upLimitRatio*width)) 197 | 198 | top_x = random.randrange(0, width - cropWidth) 199 | top_y = random.randrange(0, height - cropHeight) 200 | 201 | # apply offset for keypoints 202 | nKpLst = list() 203 | for _kp in kpAnnLst: 204 | if _kp.visibility == -1: 205 | _newkp = _kp 206 | else: 207 | _newkp = KpAnno.applyOffset(_kp, (top_x, top_y)) 208 | if _newkp.x <=0 or _newkp.y <=0: 209 | # negative location, return original image 210 | return cvmat, kpAnnLst 211 | if _newkp.x >= cropWidth or _newkp.y >= cropHeight: 212 | # keypoints are cropped out 213 | return cvmat, kpAnnLst 214 | nKpLst.append(_newkp) 215 | 216 | return cvmat[top_y:top_y+cropHeight, top_x:top_x+cropWidth], nKpLst 217 | 218 | if __name__ == "__main__": 219 | pass -------------------------------------------------------------------------------- /src/data_gen/dataset.py: -------------------------------------------------------------------------------- 1 | 2 | 3 | def getKpNum(category): 4 | # remove one column 'image_id' 5 | return len(getKpKeys(category)) - 1 6 | 7 | TROUSERS_PART_KYES=['waistband_left', 'waistband_right', 'crotch', 'bottom_left_in', 'bottom_left_out', 'bottom_right_in', 'bottom_right_out'] 8 | TROUSERS_PART_FLIP_KYES=['waistband_right', 'waistband_left', 'crotch', 'bottom_right_in', 'bottom_right_out', 'bottom_left_in', 'bottom_left_out'] 9 | 10 | SKIRT_PART_KEYS=['waistband_left', 'waistband_right', 'hemline_left', 'hemline_right'] 11 | SKIRT_PART_FLIP_KEYS=['waistband_right', 'waistband_left', 'hemline_right', 'hemline_left'] 12 | 13 | 14 | DRESS_PART_KEYS= ['neckline_left', 'neckline_right', 'shoulder_left', 'shoulder_right', 'center_front', 15 | 'armpit_left', 'armpit_right', 'waistline_left', 'waistline_right', 'cuff_left_in', 16 | 'cuff_left_out', 'cuff_right_in', 'cuff_right_out', 'hemline_left', 'hemline_right'] 17 | DRESS_PART_FLIP_KEYS=['neckline_right', 'neckline_left', 'shoulder_right', 'shoulder_left', 'center_front', 18 | 'armpit_right', 'armpit_left', 'waistline_right', 'waistline_left', 'cuff_right_in', 19 | 'cuff_right_out', 'cuff_left_in', 'cuff_left_out', 'hemline_right', 'hemline_left'] 20 | 21 | BLOUSE_PART_KEYS=['neckline_left', 'neckline_right', 'shoulder_left', 'shoulder_right', 22 | 'center_front', 'armpit_left', 'armpit_right', 'top_hem_left', 'top_hem_right', 23 | 'cuff_left_in', 'cuff_left_out', 'cuff_right_in', 'cuff_right_out'] 24 | 25 | BLOUSE_PART_FLIP_KEYS=['neckline_right', 'neckline_left', 'shoulder_right', 'shoulder_left', 26 | 'center_front', 
'armpit_right', 'armpit_left', 'top_hem_right', 'top_hem_left', 27 | 'cuff_right_in', 'cuff_right_out', 'cuff_left_in', 'cuff_left_out'] 28 | 29 | OUTWEAR_PART_KEYS=['neckline_left', 'neckline_right', 'shoulder_left', 'shoulder_right', 30 | 'armpit_left', 'armpit_right', 'waistline_left', 'waistline_right', 'cuff_left_in', 31 | 'cuff_left_out', 'cuff_right_in', 'cuff_right_out', 'top_hem_left', 'top_hem_right'] 32 | 33 | OUTWEAR_PART_FLIP_KEYS = ['neckline_right', 'neckline_left', 'shoulder_right', 'shoulder_left', 34 | 'armpit_right', 'armpit_left', 'waistline_right', 'waistline_left', 'cuff_right_in', 35 | 'cuff_right_out', 'cuff_left_in', 'cuff_left_out', 'top_hem_right', 'top_hem_left'] 36 | 37 | ALL_PART_KEYS = ['neckline_left', 'neckline_right', 'center_front', 'shoulder_left', 'shoulder_right', 38 | 'armpit_left', 'armpit_right', 'waistline_left', 'waistline_right', 'cuff_left_in', 'cuff_left_out', 39 | 'cuff_right_in', 'cuff_right_out', 'top_hem_left', 'top_hem_right', 'waistband_left', 'waistband_right', 40 | 'hemline_left', 'hemline_right', 'crotch', 'bottom_left_in', 'bottom_left_out', 41 | 'bottom_right_in', 'bottom_right_out'] 42 | 43 | ALL_PART_FLIP_KEYS = [ 'neckline_right', 'neckline_left', 'center_front', 'shoulder_right', 'shoulder_left', 44 | 'armpit_right', 'armpit_left', 'waistline_right', 'waistline_left', 'cuff_right_in', 'cuff_right_out', 45 | 'cuff_left_in', 'cuff_left_out', 'top_hem_right', 'top_hem_left', 'waistband_right','waistband_left', 46 | 'hemline_right', 'hemline_left', 'crotch', 'bottom_right_in', 'bottom_right_out', 47 | 'bottom_left_in', 'bottom_left_out'] 48 | 49 | def getFlipKeys(category): 50 | if category == 'skirt': 51 | keys, mapkeys = SKIRT_PART_KEYS, SKIRT_PART_FLIP_KEYS 52 | elif category == 'dress': 53 | keys, mapkeys = DRESS_PART_KEYS, DRESS_PART_FLIP_KEYS 54 | elif category == 'trousers': 55 | keys, mapkeys = TROUSERS_PART_KYES, TROUSERS_PART_FLIP_KYES 56 | elif category == 'blouse': 57 | keys, mapkeys = BLOUSE_PART_KEYS, BLOUSE_PART_FLIP_KEYS 58 | elif category == 'outwear': 59 | keys, mapkeys = OUTWEAR_PART_KEYS, OUTWEAR_PART_FLIP_KEYS 60 | elif category == 'all': 61 | keys, mapkeys = ALL_PART_KEYS, ALL_PART_FLIP_KEYS 62 | else: 63 | assert (0), category + " not supported" 64 | 65 | xdict = dict() 66 | for i in range(len(keys)): 67 | xdict[keys[i]] = mapkeys[i] 68 | return keys, xdict 69 | 70 | def getFlipMapID(category, partid): 71 | keys, mapDict = getFlipKeys(category) 72 | mapKey = mapDict[keys[partid]] 73 | mapID = keys.index(mapKey) 74 | return mapID 75 | 76 | def getKpKeys(category): 77 | ''' 78 | 79 | :param category: 80 | :return: get the keypoint keys in annotation csv 81 | ''' 82 | SKIRT_KP_KEYS = ['image_id', 'waistband_left', 'waistband_right', 'hemline_left', 'hemline_right'] 83 | DRESS_KP_KEYS = ['image_id', 'neckline_left', 'neckline_right', 'shoulder_left', 'shoulder_right', 'center_front', 84 | 'armpit_left', 'armpit_right' , 'waistline_left' , 'waistline_right', 'cuff_left_in', 85 | 'cuff_left_out', 'cuff_right_in', 'cuff_right_out', 'hemline_left', 'hemline_right'] 86 | TROUSERS_KP_KEYS=['image_id', 'waistband_left', 'waistband_right', 'crotch', 'bottom_left_in', 87 | 'bottom_left_out', 'bottom_right_in', 'bottom_right_out'] 88 | BLOUSE_KP_KEYS = [ 'image_id', 'neckline_left', 'neckline_right', 'shoulder_left', 'shoulder_right', 89 | 'center_front', 'armpit_left', 'armpit_right', 'top_hem_left', 'top_hem_right', 90 | 'cuff_left_in', 'cuff_left_out', 'cuff_right_in', 'cuff_right_out'] 91 | OUTWEAR_KP_KEYS= ['image_id', 
'neckline_left', 'neckline_right', 'shoulder_left', 'shoulder_right', 92 | 'armpit_left', 'armpit_right', 'waistline_left', 'waistline_right', 'cuff_left_in', 93 | 'cuff_left_out', 'cuff_right_in', 'cuff_right_out', 'top_hem_left', 'top_hem_right'] 94 | 95 | ALL_KP_KESY = ['image_id','neckline_left', 'neckline_right', 'center_front', 'shoulder_left', 'shoulder_right', 96 | 'armpit_left', 'armpit_right', 'waistline_left', 'waistline_right', 'cuff_left_in', 'cuff_left_out', 'cuff_right_in', 97 | 'cuff_right_out', 'top_hem_left', 'top_hem_right', 'waistband_left', 'waistband_right', 'hemline_left', 'hemline_right' , 98 | 'crotch', 'bottom_left_in' , 'bottom_left_out', 'bottom_right_in' ,'bottom_right_out'] 99 | 100 | if category == 'skirt': 101 | return SKIRT_KP_KEYS 102 | elif category == 'dress': 103 | return DRESS_KP_KEYS 104 | elif category == 'trousers': 105 | return TROUSERS_KP_KEYS 106 | elif category == 'blouse': 107 | return BLOUSE_KP_KEYS 108 | elif category == 'outwear': 109 | return OUTWEAR_KP_KEYS 110 | elif category == 'all': 111 | return ALL_KP_KESY 112 | else: 113 | assert(0), category + ' not supported' 114 | 115 | 116 | def fill_dataframe(kplst, category, dfrow): 117 | keys = getKpKeys(category)[1:] 118 | 119 | # fill category 120 | dfrow['image_category'] = category 121 | 122 | assert (len(keys) == len(kplst)), str(len(kplst)) + ' must be the same as ' + str(len(keys)) 123 | for i, _key in enumerate(keys): 124 | kpann = kplst[i] 125 | outstr = str(int(kpann.x))+"_"+str(int(kpann.y))+"_"+str(1) 126 | dfrow[_key] = outstr 127 | 128 | 129 | def get_kp_index_from_allkeys(kpname): 130 | ALL_KP_KEYS = ['neckline_left', 'neckline_right', 'center_front', 'shoulder_left', 'shoulder_right', 131 | 'armpit_left', 'armpit_right', 'waistline_left', 'waistline_right', 'cuff_left_in', 'cuff_left_out', 132 | 'cuff_right_in', 'cuff_right_out', 'top_hem_left', 'top_hem_right', 'waistband_left', 'waistband_right', 133 | 'hemline_left', 'hemline_right', 'crotch', 'bottom_left_in', 'bottom_left_out', 'bottom_right_in', 'bottom_right_out'] 134 | 135 | return ALL_KP_KEYS.index(kpname) 136 | 137 | 138 | def generate_input_mask(image_category, shape, nobgFlag=True): 139 | import numpy as np 140 | # 0.0 for invalid key points for each category 141 | # 1.0 for valid key points for each category 142 | h, w, c = shape 143 | mask = np.zeros((h // 2, w // 2, c), dtype=np.float) 144 | 145 | for key in getKpKeys(image_category)[1:]: 146 | index = get_kp_index_from_allkeys(key) 147 | mask[:, :, index] = 1.0 148 | 149 | # for last channel, background 150 | if nobgFlag: mask[:, :, -1] = 0.0 151 | else: mask[:, :, -1] = 1.0 152 | 153 | return mask -------------------------------------------------------------------------------- /src/data_gen/kpAnno.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | 3 | 4 | class KpAnno(object): 5 | ''' 6 | Convert string to x, y, visibility 7 | ''' 8 | def __init__(self, x, y, visibility): 9 | self.x = int(x) 10 | self.y = int(y) 11 | self.visibility = visibility 12 | 13 | @classmethod 14 | def readFromStr(cls, xstr): 15 | xarray = xstr.split('_') 16 | x = int(xarray[0]) 17 | y = int(xarray[1]) 18 | visibility = int(xarray[2]) 19 | return cls(x,y, visibility) 20 | 21 | @classmethod 22 | def applyScale(cls, kpAnno, scale): 23 | x = int(kpAnno.x*scale) 24 | y = int(kpAnno.y*scale) 25 | v = kpAnno.visibility 26 | return cls(x, y, v) 27 | 28 | @classmethod 29 | def applyRotate(cls, kpAnno, rotateMatrix): 30 | vector = 
[kpAnno.x, kpAnno.y, 1] 31 | rotatedV = np.dot(rotateMatrix, vector) 32 | return cls( int(rotatedV[0]), int(rotatedV[1]), kpAnno.visibility) 33 | 34 | @classmethod 35 | def applyOffset(cls, kpAnno, offset): 36 | x = kpAnno.x - offset[0] 37 | y = kpAnno.y - offset[1] 38 | v = kpAnno.visibility 39 | return cls(x, y, v) 40 | 41 | @staticmethod 42 | def calcDistance(kpA, kpB): 43 | distance = (kpA.x - kpB.x)**2 + (kpA.y - kpB.y)**2 44 | return np.sqrt(distance) 45 | -------------------------------------------------------------------------------- /src/data_gen/ohem.py: -------------------------------------------------------------------------------- 1 | 2 | import sys 3 | sys.path.insert(0, "../unet/") 4 | 5 | from keras.models import * 6 | from keras.layers import * 7 | from utils import np_euclidean_l2 8 | from dataset import getKpNum 9 | 10 | def generate_topk_mask_ohem(input_data, gthmap, keras_model, graph, topK, image_category, dynamicFlag=False): 11 | ''' 12 | :param input_data: input 13 | :param gthmap: ground truth 14 | :param keras_model: keras model 15 | :param graph: tf grpah to WA thread issue 16 | :param topK: number of kp selected 17 | :return: 18 | ''' 19 | 20 | # do inference, and calculate loss of each channel 21 | mimg, mmask = input_data 22 | ximg = mimg[np.newaxis,:,:,:] 23 | xmask = mmask[np.newaxis,:,:,:] 24 | 25 | if len(keras_model.input_layers) == 3: 26 | # use original mask as ohem_mask 27 | inputs = [ximg, xmask, xmask] 28 | else: 29 | inputs = [ximg, xmask] 30 | 31 | with graph.as_default(): 32 | keras_output = keras_model.predict(inputs) 33 | 34 | # heatmap of last stage 35 | outhmap = keras_output[-1] 36 | 37 | channel_num = gthmap.shape[-1] 38 | 39 | # calculate loss 40 | mloss = list() 41 | for i in range(channel_num): 42 | _dtmap = outhmap[0, :, :, i] 43 | _gtmap = gthmap[:, :, i] 44 | loss = np_euclidean_l2(_dtmap, _gtmap) 45 | mloss.append(loss) 46 | 47 | # refill input_mask, set topk as 1.0 and fill 0.0 for rest 48 | # fixme: topk may different b/w category 49 | if dynamicFlag: 50 | topK = getKpNum(image_category)//2 51 | 52 | ohem_mask = adjsut_mask(mloss, mmask, topK) 53 | 54 | ohem_gthmap = ohem_mask * gthmap 55 | 56 | return ohem_mask, ohem_gthmap 57 | 58 | def adjsut_mask(loss, input_mask, topk): 59 | # pick topk loss from losses 60 | # fill topk with 1.0 and fill the rest as 0.0 61 | assert (len(loss) == input_mask.shape[-1]), \ 62 | "shape should be same" + str(len(loss)) + " vs " + str(input_mask.shape) 63 | 64 | outmask = np.zeros(input_mask.shape, dtype=np.float) 65 | 66 | topk_index = sorted(range(len(loss)), key=lambda i:loss[i])[-topk:] 67 | 68 | for i in range(len(loss)): 69 | if i in topk_index: 70 | outmask[:,:,i] = 1.0 71 | 72 | return outmask 73 | -------------------------------------------------------------------------------- /src/data_gen/utils.py: -------------------------------------------------------------------------------- 1 | 2 | import numpy as np 3 | import pandas as pd 4 | import os 5 | 6 | def make_gaussian(width, height, sigma=3, center=None): 7 | ''' 8 | generate 2d guassion heatmap 9 | :return: 10 | ''' 11 | 12 | x = np.arange(0, width, 1, float) 13 | y = np.arange(0, height, 1, float)[:, np.newaxis] 14 | 15 | if center is None: 16 | x0 = width // 2 17 | y0 = height // 2 18 | else: 19 | x0 = center[0] 20 | y0 = center[1] 21 | 22 | return np.exp( -4*np.log(2)*((x-x0)**2 + (y-y0)**2)/sigma**2) 23 | 24 | 25 | def split_csv_train_val(allcsv, traincsv, valcsv, ratio=0.8): 26 | xdf = pd.read_csv(allcsv) 27 | # random shuffle 28 | 
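    # (pandas: sample(frac=1) returns all rows in random order, i.e. a full shuffle)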
    xdf = xdf.sample(frac=1)
29 | 
30 |     # random sampling
31 |     msk = np.random.rand(len(xdf)) < ratio
32 |     trainDf = xdf[msk]
33 |     valDf = xdf[~msk]
34 |     print "total", len(xdf), "split into train ", len(trainDf), ' val', len(valDf)
35 | 
36 |     # save to file
37 |     trainDf.to_csv(traincsv, index=False)
38 |     valDf.to_csv(valcsv, index=False)
39 | 
40 | 
41 | def np_euclidean_l2(x, y):
42 |     assert (x.shape == y.shape), "shape mismatched " + str(x.shape) + " : " + str(y.shape)
43 |     loss = np.sum((x - y)**2)
44 |     loss = np.sqrt(loss)
45 |     return loss
46 | 
47 | 
48 | def load_annotation_from_df(df, category):
49 |     if category == 'all':
50 |         return df
51 |     else:
52 |         return df[df['image_category'] == category]
53 | 
54 | 
--------------------------------------------------------------------------------
/src/eval/eval_callback.py:
--------------------------------------------------------------------------------
 1 | 
 2 | import keras
 3 | import os
 4 | import datetime
 5 | from evaluation import Evaluation
 6 | from time import time
 7 | class NormalizedErrorCallBack(keras.callbacks.Callback):
 8 | 
 9 |     def __init__(self, foldpath, category, multiOut=False, resumeFolder=None):
10 |         self.parentFoldPath = foldpath
11 |         self.category = category
12 | 
13 |         if resumeFolder is None:
14 |             self.foldPath = os.path.join(self.parentFoldPath, self.category, datetime.datetime.now().strftime('%Y_%m_%d_%H_%M_%S'))
15 |             if not os.path.exists(self.foldPath):
16 |                 os.makedirs(self.foldPath)
17 |         else:
18 |             self.foldPath = resumeFolder
19 | 
20 |         self.valLog = os.path.join(self.foldPath, 'val.log')
21 |         self.multiOut = multiOut
22 | 
23 |     def get_folder_path(self):
24 |         return self.foldPath
25 | 
26 |     def on_epoch_end(self, epoch, logs=None):
27 |         modelName = os.path.join(self.foldPath, self.category+"_weights_"+str(epoch)+".hdf5")
28 |         keras.models.save_model(self.model, modelName)
29 |         print "Saving model to ", modelName
30 | 
31 |         print "Running evaluation ........."
32 | 
33 |         xEval = Evaluation(self.category, None)
34 |         xEval.init_from_model(self.model)
35 | 
36 |         start = time()
37 |         neScore, categoryDict = xEval.eval(self.multiOut, details=True)
38 |         end = time()
39 |         print "Evaluation Done", str(neScore), " cost ", end - start, " seconds!"
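        # categoryDict maps each image_category to its list of per-keypoint
        # normalized errors; the mean of each list is the per-category score
        # printed and logged below.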
40 | 41 | for key in categoryDict.keys(): 42 | scores = categoryDict[key] 43 | print key, ' score ', sum(scores)/len(scores) 44 | 45 | with open(self.valLog , 'a+') as xfile: 46 | xfile.write(modelName + ", Socre "+ str(neScore)+"\n") 47 | for key in categoryDict.keys(): 48 | scores = categoryDict[key] 49 | xfile.write(key + ": " + str(sum(scores)/len(scores)) + "\n") 50 | 51 | xfile.close() -------------------------------------------------------------------------------- /src/eval/evaluation.py: -------------------------------------------------------------------------------- 1 | 2 | import sys 3 | sys.path.insert(0, "../data_gen/") 4 | sys.path.insert(0, "../unet/") 5 | 6 | import pandas as pd 7 | from dataset import getKpKeys, getKpNum, getFlipMapID, get_kp_index_from_allkeys, generate_input_mask 8 | from kpAnno import KpAnno 9 | from post_process import post_process_heatmap 10 | from keras.models import load_model 11 | import os 12 | from refinenet_mask_v3 import euclidean_loss 13 | import numpy as np 14 | import cv2 15 | from resnet101 import Scale 16 | from utils import load_annotation_from_df 17 | from collections import defaultdict 18 | import copy 19 | from data_process import pad_image_inference 20 | 21 | class Evaluation(object): 22 | def __init__(self, category, modelFile): 23 | self.category = category 24 | self.train_img_path = "../../data/train" 25 | if modelFile is not None: 26 | self._initialize(modelFile) 27 | 28 | def init_from_model(self, model): 29 | self._load_anno() 30 | self.net = model 31 | 32 | def eval(self, multiOut=False, details=False, flip=True): 33 | xdf = self.annDataFrame 34 | scores = list() 35 | xdict = dict() 36 | xcategoryDict = defaultdict(list) 37 | for _index, _row in xdf.iterrows(): 38 | imgId = _row['image_id'] 39 | category = _row['image_category'] 40 | imgFile = os.path.join(self.train_img_path, imgId) 41 | gtKpAnno = self._get_groundtruth_kpAnno(_row) 42 | if flip: 43 | predKpAnno = self.predict_kp_with_flip(imgFile, category) 44 | else: 45 | predKpAnno = self.predict_kp(imgFile, category, multiOut) 46 | neScore = Evaluation.calc_ne_score(category, predKpAnno, gtKpAnno) 47 | scores.extend(neScore) 48 | if details: 49 | xcategoryDict[category].extend(neScore) 50 | if details: 51 | return sum(scores)/len(scores), xcategoryDict 52 | else: 53 | return sum(scores)/len(scores) 54 | 55 | def _initialize(self, modelFile): 56 | self._load_anno() 57 | self._initialize_network(modelFile) 58 | 59 | def _initialize_network(self, modelFile): 60 | self.net = load_model(modelFile, custom_objects={'euclidean_loss': euclidean_loss, 'Scale': Scale}) 61 | 62 | def _load_anno(self): 63 | ''' 64 | Load annotations from train.csv 65 | ''' 66 | self.annfile = os.path.join("../../data/train/Annotations", "val_split.csv") 67 | 68 | # read into dataframe 69 | xpd = pd.read_csv(self.annfile) 70 | xpd = load_annotation_from_df(xpd, self.category) 71 | self.annDataFrame = xpd 72 | 73 | 74 | def _get_groundtruth_kpAnno(self, dfrow): 75 | mlist = dfrow[getKpKeys(self.category)] 76 | imgName, kpStr = mlist[0], mlist[1:] 77 | # read kp annotation from csv file 78 | kpAnnlst = [KpAnno.readFromStr(_kpstr) for _kpstr in kpStr] 79 | return kpAnnlst 80 | 81 | def _net_inference_with_mask(self, imgFile, imgCategory): 82 | import cv2 83 | from data_process import normalize_image, pad_image_inference 84 | assert (len(self.net.input_layers) > 1), "input layer need to more than 1" 85 | 86 | # load image and preprocess 87 | img = cv2.imread(imgFile) 88 | 89 | img, scale = 
pad_image_inference(img, 512, 512) 90 | img = normalize_image(img) 91 | input_img = img[np.newaxis, :, :, :] 92 | 93 | input_mask = generate_input_mask(imgCategory, (512, 512, getKpNum(self.category)) ) 94 | input_mask = input_mask[np.newaxis, :, :, :] 95 | 96 | # inference 97 | heatmap = self.net.predict([input_img, input_mask, input_mask]) 98 | 99 | return (heatmap, scale) 100 | 101 | def _heatmap_sum(self, heatmaplst): 102 | outheatmap = np.copy(heatmaplst[0]) 103 | for i in range(1, len(heatmaplst), 1): 104 | outheatmap += heatmaplst[i] 105 | return outheatmap 106 | 107 | def predict_kp(self, imgFile, imgCategory, multiOutput=False): 108 | 109 | xnetout, scale = self._net_inference_with_mask(imgFile, imgCategory) 110 | 111 | if multiOutput: 112 | #todo: fixme, it is tricky that the previous stage has beeter performance than last stage's output. 113 | #todo: here, we are using multiple stage's output sum. 114 | heatmap = self._heatmap_sum(xnetout) 115 | else: 116 | heatmap = xnetout 117 | 118 | detectedKps = post_process_heatmap(heatmap, kpConfidenceTh=0.2) 119 | 120 | # scale to padded resolution 256X256 -> 512X512 121 | scaleTo512 = 2.0 122 | 123 | # apply scale to original resolution 124 | detectedKps = [KpAnno(_kp.x*scaleTo512/scale , _kp.y*scaleTo512/scale, _kp.visibility) for _kp in detectedKps] 125 | 126 | return detectedKps 127 | 128 | 129 | def predict_kp_with_flip(self, imgFile, imgCategory): 130 | # inference with flip and original image 131 | heatmap, scale = self._net_inference_flip(imgFile, imgCategory) 132 | 133 | detectedKps = post_process_heatmap(heatmap, kpConfidenceTh=0.2) 134 | 135 | # scale to padded resolution 256X256 -> 512X512 136 | scaleTo512 = 2.0 137 | 138 | # apply scale to original resolution 139 | detectedKps = [KpAnno(_kp.x * scaleTo512 / scale, _kp.y * scaleTo512 / scale, _kp.visibility) for _kp in 140 | detectedKps] 141 | 142 | return detectedKps 143 | 144 | def _net_inference_flip(self, imgFile, imgCategory): 145 | import cv2 146 | from data_process import normalize_image, pad_image_inference 147 | assert (len(self.net.input_layers) > 1), "input layer need to more than 1" 148 | 149 | batch_size =2 150 | 151 | input_img = np.zeros(shape=(batch_size, 512, 512, 3), dtype=np.float) 152 | input_mask = np.zeros(shape=(batch_size, 256, 256, getKpNum(self.category)), dtype=np.float) 153 | 154 | # load image and preprocess 155 | orgimage = cv2.imread(imgFile) 156 | 157 | padimg, scale = pad_image_inference(orgimage, 512, 512) 158 | flipimg = cv2.flip(padimg, flipCode=1) 159 | 160 | input_img[0,:,:,:] = normalize_image(padimg) 161 | input_img[1,:,:,:] = normalize_image(flipimg) 162 | 163 | mask = generate_input_mask(imgCategory, (512, 512, getKpNum(self.category))) 164 | input_mask[0,:,:,:] = mask 165 | input_mask[1,:,:,:] = mask 166 | 167 | # inference 168 | if len(self.net.input_layers) == 2: 169 | heatmap = self.net.predict([input_img, input_mask]) 170 | elif len(self.net.input_layers) == 3: 171 | heatmap = self.net.predict([input_img, input_mask, input_mask]) 172 | else: 173 | assert (0), str(len(self.net.input_layers)) + " should be 2 or 3 " 174 | 175 | # sum heatmap 176 | avgheatmap = self._heatmap_sum(heatmap) 177 | 178 | orgheatmap = avgheatmap[0,:,:,:] 179 | 180 | # convert to same sequency with original heatmap 181 | flipheatmap = avgheatmap[1,:,:,:] 182 | flipheatmap = self._flip_out_heatmap(flipheatmap) 183 | 184 | # average original and flip heatmap 185 | outheatmap = flipheatmap + orgheatmap 186 | outheatmap = outheatmap[np.newaxis, :, :, :] 187 | 
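        # Note: the original and flipped heatmaps are summed rather than averaged;
        # peak locations stay meaningful, but the confidence values are inflated
        # relative to the 0.2 threshold applied later in post_process_heatmap.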
188 | return (outheatmap, scale) 189 | 190 | def predict_kp_with_rotate(self, imgFile, imgCategory): 191 | # inference with rotated image 192 | rotateheatmap = self._net_inference_rotate(imgFile, imgCategory) 193 | rotateheatmap = rotateheatmap[np.newaxis, :, :, :] 194 | 195 | # original image and flip image 196 | orgflipmap, scale = self._net_inference_flip(imgFile, imgCategory) 197 | mflipmap = cv2.resize(orgflipmap[0,:,:,:], None, fx=2.0/scale, fy=2.0/scale) 198 | 199 | # add mflipmap and rotateheatmap 200 | avgheatmap = mflipmap[np.newaxis, :, :, :] 201 | 202 | b, h, w , c = rotateheatmap.shape 203 | avgheatmap[:, 0:h, 0:w,:] += rotateheatmap 204 | 205 | # generate key point locations 206 | detectedKps = post_process_heatmap(avgheatmap, kpConfidenceTh=0.2) 207 | 208 | return detectedKps 209 | 210 | def _net_inference_rotate(self, imgFile, imgCategory): 211 | from data_process import normalize_image, pad_image_inference, rotate_image_with_invrmat 212 | 213 | # load image and preprocess 214 | orgimage = cv2.imread(imgFile) 215 | 216 | anglelst = [-20, -10, 10, 20] 217 | 218 | input_img = np.zeros(shape=(len(anglelst), 512, 512, 3), dtype=np.float) 219 | input_mask = np.zeros(shape=(len(anglelst), 256, 256, getKpNum(self.category)), dtype=np.float) 220 | 221 | mlist = list() 222 | for i, angle in enumerate(anglelst): 223 | rotateimg, invRotMatrix, orgImgSize = rotate_image_with_invrmat(orgimage, angle) 224 | padimg, scale = pad_image_inference(rotateimg, 512, 512) 225 | _img = normalize_image(padimg) 226 | input_img[i, :, :, :] = _img 227 | mlist.append((scale, invRotMatrix)) 228 | 229 | mask = generate_input_mask(imgCategory, (512, 512, getKpNum(self.category))) 230 | for i, angle in enumerate(anglelst): 231 | input_mask[i, :,:,:] = mask 232 | 233 | # inference 234 | heatmap = self.net.predict([input_img, input_mask, input_mask]) 235 | heatmap = self._heatmap_sum(heatmap) 236 | 237 | # rotate back to original resolution 238 | sumheatmap = np.zeros(shape=(orgimage.shape[0], orgimage.shape[1], getKpNum(self.category)), dtype=np.float) 239 | for i, item in enumerate(mlist): 240 | _heatmap = heatmap[i, :, :, :] 241 | _scale, _invRotMatrix = item 242 | _heatmap = cv2.resize(_heatmap, None, fx=2.0 / _scale, fy=2.0 / _scale) 243 | _invheatmap = cv2.warpAffine(_heatmap, _invRotMatrix, (orgimage.shape[1], orgimage.shape[0])) 244 | sumheatmap += _invheatmap 245 | 246 | return sumheatmap 247 | 248 | def _flip_out_heatmap(self, flipout): 249 | outmap = np.zeros(flipout.shape, dtype=np.float) 250 | for i in range(flipout.shape[-1]): 251 | flipid = getFlipMapID(self.category, i) 252 | mask = np.copy(flipout[:, :, i]) 253 | outmap[:, :, flipid] = cv2.flip(mask, flipCode=1) 254 | return outmap 255 | 256 | 257 | @staticmethod 258 | def get_normized_distance(category, gtKp): 259 | ''' 260 | 261 | :param category: 262 | :param gtKp: 263 | :return: if ground truth's two points do not exist, return a big number 1e6 264 | ''' 265 | 266 | if category in ['skirt' ,'trousers']: 267 | ##waistband left and right 268 | waistband_left_index = get_kp_index_from_allkeys('waistband_left') 269 | waistband_right_index = get_kp_index_from_allkeys('waistband_right') 270 | 271 | if gtKp[waistband_left_index].visibility != -1 and gtKp[waistband_right_index].visibility != -1: 272 | distance = KpAnno.calcDistance(gtKp[waistband_left_index], gtKp[waistband_right_index]) 273 | else: 274 | distance = 1e6 275 | return distance 276 | elif category in ['blouse', 'dress', 'outwear']: 277 | armpit_left_index = 
get_kp_index_from_allkeys('armpit_left') 278 | armpit_right_index = get_kp_index_from_allkeys('armpit_right') 279 | ##armpit_left armpit_right' 280 | if gtKp[armpit_left_index].visibility != -1 and gtKp[armpit_right_index].visibility != -1: 281 | distance = KpAnno.calcDistance(gtKp[armpit_left_index], gtKp[armpit_right_index]) 282 | else: 283 | distance = 1e6 284 | return distance 285 | else: 286 | assert (0), category + " not implemented in _get_normized_distance" 287 | 288 | 289 | @staticmethod 290 | def calc_ne_score(category, dtKp, gtKp): 291 | 292 | assert (len(dtKp) == len(gtKp)), "predicted keypoint number should be the same as ground truth keypoints" + \ 293 | str(dtKp) + " vs " + str(gtKp) 294 | 295 | # calculate normalized error as score 296 | normalizedDistance = Evaluation.get_normized_distance(category, gtKp) 297 | 298 | mlist = list() 299 | for i in range(len(gtKp)): 300 | if gtKp[i].visibility == 1: 301 | dk = KpAnno.calcDistance(dtKp[i], gtKp[i]) 302 | mlist.append( dk/normalizedDistance) 303 | 304 | return mlist 305 | -------------------------------------------------------------------------------- /src/eval/post_process.py: -------------------------------------------------------------------------------- 1 | import cv2 2 | import numpy as np 3 | from scipy.ndimage import gaussian_filter, maximum_filter 4 | from keras.layers import * 5 | from kpAnno import KpAnno 6 | 7 | def post_process_heatmap(heatMap, kpConfidenceTh=0.2): 8 | kplst = list() 9 | for i in range(heatMap.shape[-1]): 10 | # ignore last channel, background channel 11 | _map = heatMap[0, :, :, i] 12 | _map = gaussian_filter(_map, sigma=0.5) 13 | _nmsPeaks = non_max_supression(_map, windowSize=3, threshold=1e-6) 14 | 15 | y, x = np.where(_nmsPeaks == _nmsPeaks.max()) 16 | confidence = np.amax(_nmsPeaks) 17 | if confidence > kpConfidenceTh: 18 | kplst.append(KpAnno(x[0], y[0], 1)) 19 | else: 20 | kplst.append(KpAnno(x[0], y[0], -1)) 21 | return kplst 22 | 23 | def non_max_supression(plain, windowSize=3, threshold=1e-6): 24 | # clear value less than threshold 25 | under_th_indices = plain < threshold 26 | plain[under_th_indices] = 0 27 | return plain* (plain == maximum_filter(plain, footprint=np.ones((windowSize, windowSize)))) 28 | -------------------------------------------------------------------------------- /src/top/demo.py: -------------------------------------------------------------------------------- 1 | import sys 2 | sys.path.insert(0, "../data_gen/") 3 | sys.path.insert(0, "../eval/") 4 | sys.path.insert(0, "../unet/") 5 | 6 | import argparse 7 | import os 8 | import pandas as pd 9 | import cv2 10 | from evaluation import Evaluation 11 | from dataset import getKpKeys, get_kp_index_from_allkeys 12 | 13 | def visualize_keypoint(imageName, category, dtkp): 14 | cvmat = cv2.imread(imageName) 15 | for key in getKpKeys(category)[1:]: 16 | index = get_kp_index_from_allkeys(key) 17 | _kp = dtkp[index] 18 | cv2.circle(cvmat, center=(_kp.x, _kp.y), radius=7, color=(1.0, 0.0, 0.0), thickness=2) 19 | cv2.imshow('demo', cvmat) 20 | cv2.waitKey() 21 | 22 | def demo(modelfile): 23 | 24 | # load network 25 | xEval = Evaluation('all', modelfile) 26 | 27 | # load images and run prediction 28 | testfile = os.path.join("../../data/test/", 'test.csv') 29 | xdf = pd.read_csv(testfile) 30 | xdf = xdf.sample(frac=1.0) 31 | 32 | for _index, _row in xdf.iterrows(): 33 | _image_id = _row['image_id'] 34 | _category = _row['image_category'] 35 | imageName = os.path.join("../../data/test", _image_id) 36 | print _image_id, _category 
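        # predict_kp_with_rotate ensembles heatmaps from the original image, its
        # horizontal flip, and four rotated copies (+/-10 and +/-20 degrees; see
        # evaluation.py), so demo inference is slower but typically more accurate
        # than plain predict_kp.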
37 | dtkp = xEval.predict_kp_with_rotate(imageName, _category) 38 | visualize_keypoint(imageName, _category, dtkp) 39 | 40 | 41 | if __name__ == "__main__": 42 | parser = argparse.ArgumentParser() 43 | parser.add_argument("--gpuID", default=0, type=int, help='gpu id') 44 | parser.add_argument("--modelfile", help="file of model") 45 | 46 | args = parser.parse_args() 47 | 48 | print args 49 | 50 | os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID" 51 | os.environ["CUDA_VISIBLE_DEVICES"] = str(args.gpuID) 52 | 53 | demo(args.modelfile) -------------------------------------------------------------------------------- /src/top/test.py: -------------------------------------------------------------------------------- 1 | import sys 2 | sys.path.insert(0, "../data_gen/") 3 | sys.path.insert(0, "../eval/") 4 | sys.path.insert(0, "../unet/") 5 | 6 | import argparse 7 | import os 8 | from fashion_net import FashionNet 9 | from dataset import getKpNum, getKpKeys 10 | import pandas as pd 11 | from evaluation import Evaluation 12 | import pickle 13 | import numpy as np 14 | 15 | 16 | def get_best_single_model(valfile): 17 | ''' 18 | :param valfile: the log file with validation score for each snapshot 19 | :return: model file and score 20 | ''' 21 | 22 | def get_key(item): 23 | return item[1] 24 | 25 | with open(valfile) as xval: 26 | lines = xval.readlines() 27 | 28 | xlist = list() 29 | for linenum, xline in enumerate(lines): 30 | if 'hdf5' in xline and 'Socre' in xline: 31 | modelname = xline.strip().split(',')[0] 32 | overallscore = xline.strip().split(',')[1] 33 | xlist.append((modelname, overallscore)) 34 | 35 | bestmodel = sorted(xlist, key=get_key)[0] 36 | 37 | return bestmodel 38 | 39 | 40 | def fill_dataframe(kplst, keys, dfrow, image_category): 41 | # fill category 42 | 43 | dfrow['image_category'] = image_category 44 | 45 | assert (len(keys) == len(kplst)), str(len(kplst)) + ' must be the same as ' + str(len(keys)) 46 | for i, _key in enumerate(keys): 47 | kpann = kplst[i] 48 | outstr = str(int(kpann.x))+"_"+str(int(kpann.y))+"_"+str(1) 49 | dfrow[_key] = outstr 50 | 51 | def get_kp_from_dict(mdict, image_category, image_id): 52 | if image_category in mdict.keys(): 53 | xdict = mdict[image_category] 54 | else: 55 | xdict = mdict['all'] 56 | return xdict[image_id] 57 | 58 | def submission(pklpath): 59 | xdf = pd.read_csv("../../data/train/Annotations/train.csv") 60 | trainKeys = xdf.keys() 61 | 62 | testdf = pd.read_csv("../../data/test/test.csv") 63 | print len(testdf), " samples in test.csv" 64 | 65 | mdict = dict() 66 | for xfile in os.listdir(pklpath): 67 | if xfile.endswith('.pkl'): 68 | category = xfile.strip().split('.')[0] 69 | pkl = open(os.path.join(pklpath, xfile)) 70 | mdict[category] = pickle.load(pkl) 71 | 72 | print testdf.keys() 73 | print mdict.keys() 74 | 75 | submissionDf = pd.DataFrame(columns=trainKeys, index=np.arange(testdf.shape[0])) 76 | submissionDf = submissionDf.fillna(value='-1_-1_-1') 77 | submissionDf['image_id'] = testdf['image_id'] 78 | submissionDf['image_category'] = testdf['image_category'] 79 | 80 | for _index, _row in submissionDf.iterrows(): 81 | image_id = _row['image_id'] 82 | image_category = _row['image_category'] 83 | kplst = get_kp_from_dict(mdict, image_category, image_id) 84 | fill_dataframe(kplst, getKpKeys('all')[1:], _row, image_category) 85 | 86 | 87 | print len(submissionDf), "save to ", os.path.join(pklpath, 'submission.csv') 88 | submissionDf.to_csv( os.path.join(pklpath, 'submission.csv'), index=False ) 89 | 90 | 91 | def 
load_image_names(annfile, category): 92 | # read into dataframe 93 | xdf = pd.read_csv(annfile) 94 | xdf = xdf[xdf['image_category'] == category] 95 | return xdf 96 | 97 | def main_test(savepath, modelpath, augmentFlag): 98 | 99 | valfile = os.path.join(modelpath, 'val.log') 100 | bestmodels = get_best_single_model(valfile) 101 | 102 | print bestmodels, augmentFlag 103 | 104 | xEval = Evaluation('all', bestmodels[0]) 105 | 106 | # load images and run prediction 107 | testfile = os.path.join("../../data/test/", 'test.csv') 108 | 109 | for category in ['skirt', 'blouse', 'trousers', 'outwear', 'dress']: 110 | xdict = dict() 111 | xdf = load_image_names(testfile, category) 112 | print len(xdf), " images to process ", category 113 | 114 | count = 0 115 | for _index, _row in xdf.iterrows(): 116 | count += 1 117 | if count%1000 == 0: 118 | print count, "images have been processed" 119 | 120 | _image_id = _row['image_id'] 121 | imageName = os.path.join("../../data/test", _image_id) 122 | if augmentFlag: 123 | dtkp = xEval.predict_kp_with_rotate(imageName, _row['image_category']) 124 | else: 125 | dtkp = xEval.predict_kp(imageName, _row['image_category'], multiOutput=True) 126 | xdict[_image_id] = dtkp 127 | 128 | savefile = os.path.join(savepath, category+'.pkl') 129 | with open(savefile, 'wb') as xfile: 130 | pickle.dump(xdict, xfile) 131 | 132 | print "prediction save to ", savefile 133 | 134 | 135 | if __name__ == "__main__": 136 | parser = argparse.ArgumentParser() 137 | parser.add_argument("--gpuID", default=0, type=int, help='gpu id') 138 | parser.add_argument("--modelpath", help="path of trained model") 139 | parser.add_argument("--outpath", help="path to save predicted keypoints") 140 | parser.add_argument("--augment", default=False, type=bool, help="augment or not") 141 | 142 | args = parser.parse_args() 143 | 144 | print args 145 | 146 | os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID" 147 | os.environ["CUDA_VISIBLE_DEVICES"] = str(args.gpuID) 148 | 149 | main_test(args.outpath, args.modelpath, args.augment) 150 | submission(args.outpath) -------------------------------------------------------------------------------- /src/top/train.py: -------------------------------------------------------------------------------- 1 | import sys 2 | sys.path.insert(0, "../data_gen/") 3 | sys.path.insert(0, "../unet/") 4 | 5 | import argparse 6 | import os 7 | from fashion_net import FashionNet 8 | from dataset import getKpNum 9 | import tensorflow as tf 10 | from keras import backend as k 11 | 12 | if __name__ == "__main__": 13 | parser = argparse.ArgumentParser() 14 | parser.add_argument("--gpuID", default=0, type=int, help='gpu id') 15 | parser.add_argument("--category", help="specify cloth category") 16 | parser.add_argument("--network", help="specify network arch'") 17 | parser.add_argument("--batchSize", default=8, type=int, help='batch size for training') 18 | parser.add_argument("--epochs", default=20, type=int, help="number of traning epochs") 19 | parser.add_argument("--resume", default=False, type=bool, help="resume training or not") 20 | parser.add_argument("--lrdecay", default=False, type=bool, help="lr decay or not") 21 | parser.add_argument("--resumeModel", help="start point to retrain") 22 | parser.add_argument("--initEpoch", type=int, help="epoch to resume") 23 | 24 | 25 | args = parser.parse_args() 26 | 27 | os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID" 28 | os.environ["CUDA_VISIBLE_DEVICES"] = str(args.gpuID) 29 | 30 | 31 | # TensorFlow wizardry 32 | config = tf.ConfigProto() 33 | 34 
| # Don't pre-allocate memory; allocate as-needed 35 | config.gpu_options.allow_growth = True 36 | 37 | # Allow the process to use up to the full GPU memory if needed (fraction = 1.0) 38 | config.gpu_options.per_process_gpu_memory_fraction = 1.0 39 | 40 | # Create a session with the above options specified. 41 | k.tensorflow_backend.set_session(tf.Session(config=config)) 42 | 43 | if not args.resume: 44 | xnet = FashionNet(512, 512, getKpNum(args.category)) 45 | xnet.build_model(modelName=args.network, show=True) 46 | xnet.train(args.category, epochs=args.epochs, batchSize=args.batchSize, lrschedule=args.lrdecay) 47 | else: 48 | xnet = FashionNet(512, 512, getKpNum(args.category)) 49 | xnet.resume_train(args.category, args.resumeModel, args.network, args.initEpoch, 50 | epochs=args.epochs, batchSize=args.batchSize) -------------------------------------------------------------------------------- /src/unet/fashion_net.py: -------------------------------------------------------------------------------- 1 | 2 | import sys 3 | sys.path.insert(0, "../data_gen/") 4 | sys.path.insert(0, "../eval/") 5 | 6 | from data_generator import DataGenerator 7 | from keras.callbacks import ModelCheckpoint, CSVLogger 8 | from keras.models import load_model 9 | from data_process import pad_image, normalize_image 10 | import os 11 | import cv2 12 | import numpy as np 13 | import datetime 14 | from eval_callback import NormalizedErrorCallBack 15 | from refinenet_mask_v3 import Res101RefineNetMaskV3, euclidean_loss 16 | from resnet101 import Scale 17 | import tensorflow as tf 18 | 19 | class FashionNet(object): 20 | 21 | def __init__(self, inputHeight, inputWidth, nClasses): 22 | self.inputWidth = inputWidth 23 | self.inputHeight = inputHeight 24 | self.nClass = nClasses 25 | 26 | def build_model(self, modelName='v2', show=False): 27 | self.modelName = modelName # kept for checkpoint/log naming; the architecture itself is fixed below 28 | self.model = Res101RefineNetMaskV3(self.nClass, self.inputHeight, self.inputWidth, nStackNum=2) 29 | self.nStackNum = 2 30 | 31 | # show model summary and layer names 32 | if show: 33 | self.model.summary() 34 | for layer in self.model.layers: 35 | print layer.name, layer.trainable 36 | 37 | def train(self, category, batchSize=8, epochs=20, lrschedule=False): 38 | trainDt = DataGenerator(category, os.path.join("../../data/train/Annotations", "train_split.csv")) 39 | trainGen = trainDt.generator_with_mask_ohem(graph=tf.get_default_graph(), kerasModel=self.model, 40 | batchSize=batchSize, inputSize=(self.inputHeight, self.inputWidth), 41 | nStackNum=self.nStackNum, flipFlag=False, cropFlag=False) 42 | 43 | normalizedErrorCallBack = NormalizedErrorCallBack("../../trained_models/", category, True) 44 | 45 | csvlogger = CSVLogger(os.path.join(normalizedErrorCallBack.get_folder_path(), 46 | "csv_train_"+self.modelName+"_"+str(datetime.datetime.now().strftime('%H:%M'))+".csv")) 47 | 48 | xcallbacks = [normalizedErrorCallBack, csvlogger] 49 | 50 | self.model.fit_generator(generator=trainGen, steps_per_epoch=trainDt.get_dataset_size()//batchSize, 51 | epochs=epochs, callbacks=xcallbacks) 52 | 53 | def load_model(self, netWeightFile): 54 | self.model = load_model(netWeightFile, custom_objects={'euclidean_loss': euclidean_loss, 'Scale': Scale}) 55 | 56 | def resume_train(self, category, pretrainModel, modelName, initEpoch, batchSize=8, epochs=20): 57 | self.modelName = modelName 58 | self.load_model(pretrainModel) 59 | refineNetflag = True 60 | self.nStackNum = 2 61 | 62 | modelPath = os.path.dirname(pretrainModel) 63 | 64 | trainDt = DataGenerator(category, 
os.path.join("../../data/train/Annotations", "train_split.csv")) 65 | trainGen = trainDt.generator_with_mask_ohem(graph=tf.get_default_graph(), kerasModel=self.model, 66 | batchSize=batchSize, inputSize=(self.inputHeight, self.inputWidth), 67 | nStackNum=self.nStackNum, flipFlag=False, cropFlag=False) 68 | 69 | 70 | normalizedErrorCallBack = NormalizedErrorCallBack("../../trained_models/", category, refineNetflag, resumeFolder=modelPath) 71 | 72 | csvlogger = CSVLogger(os.path.join(normalizedErrorCallBack.get_folder_path(), 73 | "csv_train_" + self.modelName + "_" + str( 74 | datetime.datetime.now().strftime('%H:%M')) + ".csv")) 75 | 76 | self.model.fit_generator(initial_epoch=initEpoch, generator=trainGen, steps_per_epoch=trainDt.get_dataset_size() // batchSize, 77 | epochs=epochs, callbacks=[normalizedErrorCallBack, csvlogger]) 78 | 79 | 80 | def predict_image(self, imgfile): 81 | # load image and preprocess 82 | img = cv2.imread(imgfile) 83 | img, _ = pad_image(img, list(), 512, 512) 84 | img = normalize_image(img) 85 | input = img[np.newaxis,:,:,:] 86 | # inference 87 | heatmap = self.model.predict(input) 88 | return heatmap 89 | 90 | 91 | def predict(self, input): 92 | # inference 93 | heatmap = self.model.predict(input) 94 | return heatmap -------------------------------------------------------------------------------- /src/unet/refinenet.py: -------------------------------------------------------------------------------- 1 | from keras.models import * 2 | from keras.layers import * 3 | from keras.optimizers import Adam, SGD 4 | from keras import backend as K 5 | from keras.applications.resnet50 import ResNet50 6 | 7 | IMAGE_ORDERING = 'channels_last' 8 | 9 | def Res101RefineNetDilated(n_classes, inputHeight, inputWidth): 10 | model = build_network_resnet101(inputHeight, inputWidth, n_classes, dilated=True) 11 | return model 12 | 13 | def Res101RefineNetStacked(n_classes, inputHeight, inputWidth, nStackNum): 14 | model = build_network_resnet101_stack(inputHeight, inputWidth, n_classes, nStackNum) 15 | return model 16 | 17 | def euclidean_loss(x, y): 18 | return K.sqrt(K.sum(K.square(x - y))) 19 | 20 | 21 | def create_global_net(lowlevelFeatures, n_classes): 22 | lf2x, lf4x, lf8x, lf16x = lowlevelFeatures 23 | 24 | o = lf16x 25 | 26 | o = (Conv2D(256, (3, 3), activation='relu', padding='same', name='up16x_conv', data_format=IMAGE_ORDERING))(o) 27 | o = (BatchNormalization())(o) 28 | 29 | o = (Conv2DTranspose(256, kernel_size=(3, 3), strides=(2, 2), name='upsample_16x', activation='relu', padding='same', 30 | data_format=IMAGE_ORDERING))(o) 31 | o = (concatenate([o, lf8x], axis=-1)) 32 | o = (Conv2D(128, (3, 3), activation='relu', padding='same', name='up8x_conv', data_format=IMAGE_ORDERING))(o) 33 | o = (BatchNormalization())(o) 34 | fup8x = o 35 | 36 | o = (Conv2DTranspose(128, kernel_size=(3, 3), strides=(2, 2), name='upsample_8x', padding='same', activation='relu', 37 | data_format=IMAGE_ORDERING))(o) 38 | o = (concatenate([o, lf4x], axis=-1)) 39 | o = (Conv2D(64, (3, 3), activation='relu', padding='same', name='up4x_conv', data_format=IMAGE_ORDERING))(o) 40 | o = (BatchNormalization())(o) 41 | fup4x = o 42 | 43 | o = (Conv2DTranspose(64, kernel_size=(3, 3), strides=(2, 2), name='upsample_4x', padding='same', activation='relu', 44 | data_format=IMAGE_ORDERING))(o) 45 | o = (concatenate([o, lf2x], axis=-1)) 46 | o = (Conv2D(64, (3, 3), activation='relu', padding='same', name='up2x_conv', data_format=IMAGE_ORDERING))(o) 47 | o = (BatchNormalization())(o) 48 | fup2x = o 49 | 50 | 
out2x = Conv2D(n_classes, (1, 1), activation='linear', padding='same', name='out2x', data_format=IMAGE_ORDERING)(fup2x) 51 | out4x = Conv2D(n_classes, (1, 1), activation='linear', padding='same', name='out4x', data_format=IMAGE_ORDERING)(fup4x) 52 | out8x = Conv2D(n_classes, (1, 1), activation='linear', padding='same', name='out8x', data_format=IMAGE_ORDERING)(fup8x) 53 | 54 | x4x = UpSampling2D((2, 2), data_format=IMAGE_ORDERING)(out8x) 55 | eadd4x = Add(name='global4x')([x4x, out4x]) 56 | 57 | x2x = UpSampling2D((2, 2), data_format=IMAGE_ORDERING)(eadd4x) 58 | eadd2x = Add(name='global2x')([x2x, out2x]) 59 | 60 | return (fup8x, eadd4x, eadd2x) 61 | 62 | def create_refine_net(inputFeatures, n_classes): 63 | f8x, f4x, f2x = inputFeatures 64 | 65 | # 2 Conv2DTranspose f8x -> fup8x 66 | fup8x = (Conv2DTranspose(128, kernel_size=(3, 3), strides=(2, 2), name='refine8x_deconv_1', padding='same', activation='relu', 67 | data_format=IMAGE_ORDERING))(f8x) 68 | fup8x = (BatchNormalization())(fup8x) 69 | 70 | fup8x = (Conv2DTranspose(128, kernel_size=(3, 3), strides=(2, 2), name='refine8x_deconv_2', padding='same', activation='relu', 71 | data_format=IMAGE_ORDERING))(fup8x) 72 | fup8x = (BatchNormalization())(fup8x) 73 | 74 | # 1 Conv2DTranspose f4x -> fup4x 75 | fup4x = (Conv2DTranspose(128, kernel_size=(3, 3), strides=(2, 2), name='refine4x_deconv', padding='same', activation='relu', 76 | data_format=IMAGE_ORDERING))(f4x) 77 | 78 | fup4x = (BatchNormalization())(fup4x) 79 | 80 | # 1 conv f2x -> fup2x 81 | fup2x = (Conv2D(128, (3, 3), activation='relu', padding='same', name='refine2x_conv', data_format=IMAGE_ORDERING))(f2x) 82 | fup2x = (BatchNormalization())(fup2x) 83 | 84 | # concat f2x, fup8x, fup4x 85 | fconcat = (concatenate([fup8x, fup4x, fup2x], axis=-1, name='refine_concat')) 86 | 87 | # 1x1 to map to required feature map 88 | out2x = Conv2D(n_classes, (1, 1), activation='linear', padding='same', name='refine2x', data_format=IMAGE_ORDERING)(fconcat) 89 | 90 | return out2x 91 | 92 | 93 | def create_refine_net_bottleneck(inputFeatures, n_classes): 94 | f8x, f4x, f2x = inputFeatures 95 | 96 | # 2 Conv2DTranspose f8x -> fup8x 97 | fup8x = (Conv2D(256, kernel_size=(1, 1), name='refine8x_1', padding='same', activation='relu', data_format=IMAGE_ORDERING))(f8x) 98 | fup8x = (BatchNormalization())(fup8x) 99 | 100 | fup8x = (Conv2D(128, kernel_size=(1, 1), name='refine8x_2', padding='same', activation='relu', data_format=IMAGE_ORDERING))(fup8x) 101 | fup8x = (BatchNormalization())(fup8x) 102 | 103 | fup8x = UpSampling2D((4, 4), data_format=IMAGE_ORDERING)(fup8x) 104 | 105 | 106 | # 1 Conv2DTranspose f4x -> fup4x 107 | fup4x = (Conv2D(128, kernel_size=(1, 1), name='refine4x', padding='same', activation='relu', data_format=IMAGE_ORDERING))(f4x) 108 | fup4x = (BatchNormalization())(fup4x) 109 | fup4x = UpSampling2D((2, 2), data_format=IMAGE_ORDERING)(fup4x) 110 | 111 | 112 | # 1 conv f2x -> fup2x 113 | fup2x = (Conv2D(128, (1, 1), activation='relu', padding='same', name='refine2x_conv', data_format=IMAGE_ORDERING))(f2x) 114 | fup2x = (BatchNormalization())(fup2x) 115 | 116 | # concat f2x, fup8x, fup4x 117 | fconcat = (concatenate([fup8x, fup4x, fup2x], axis=-1, name='refine_concat')) 118 | 119 | # 1x1 to map to required feature map 120 | out2x = Conv2D(n_classes, (1, 1), activation='linear', padding='same', name='refine2x', data_format=IMAGE_ORDERING)(fconcat) 121 | 122 | return out2x 123 | 124 | 125 | def create_stack_refinenet(inputFeatures, n_classes, layerName): 126 | f8x, f4x, f2x = inputFeatures 
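# Design note: each stacked refine module squeezes the incoming 8x, 4x and 2x feature maps with 1x1 convolutions, upsamples the two coarser branches back to 2x resolution (by x4 and x2 respectively), concatenates all three, and maps the result to n_classes heatmaps; the pre-upsampling tensors out8x and out4x are returned alongside out2x so the caller can feed them into the next module, which is how build_network_resnet101_stack chains nStack of them.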
127 | 128 | # 2 Conv2DTranspose f8x -> fup8x 129 | fup8x = (Conv2D(256, kernel_size=(1, 1), name=layerName+'_refine8x_1', padding='same', activation='relu'))(f8x) 130 | fup8x = (BatchNormalization())(fup8x) 131 | 132 | fup8x = (Conv2D(128, kernel_size=(1, 1), name=layerName+'refine8x_2', padding='same', activation='relu'))(fup8x) 133 | fup8x = (BatchNormalization())(fup8x) 134 | 135 | out8x = fup8x 136 | fup8x = UpSampling2D((4, 4), data_format=IMAGE_ORDERING)(fup8x) 137 | 138 | # 1 Conv2DTranspose f4x -> fup4x 139 | fup4x = (Conv2D(128, kernel_size=(1, 1), name=layerName+'refine4x', padding='same', activation='relu'))(f4x) 140 | fup4x = (BatchNormalization())(fup4x) 141 | out4x = fup4x 142 | fup4x = UpSampling2D((2, 2), data_format=IMAGE_ORDERING)(fup4x) 143 | 144 | # 1 conv f2x -> fup2x 145 | fup2x = (Conv2D(128, (1, 1), activation='relu', padding='same', name=layerName+'refine2x_conv'))(f2x) 146 | fup2x = (BatchNormalization())(fup2x) 147 | 148 | # concat f2x, fup8x, fup4x 149 | fconcat = (concatenate([fup8x, fup4x, fup2x], axis=-1, name=layerName+'refine_concat')) 150 | 151 | # 1x1 to map to required feature map 152 | out2x = Conv2D(n_classes, (1, 1), activation='linear', padding='same', name=layerName+'refine2x')(fconcat) 153 | 154 | return out8x, out4x, out2x 155 | 156 | 157 | def create_global_net_dilated(lowlevelFeatures, n_classes): 158 | lf2x, lf4x, lf8x, lf16x = lowlevelFeatures 159 | 160 | o = lf16x 161 | 162 | o = (Conv2D(256, (3, 3), dilation_rate=(2, 2), activation='relu', padding='same', name='up16x_conv', data_format=IMAGE_ORDERING))(o) 163 | o = (BatchNormalization())(o) 164 | 165 | o = (Conv2DTranspose(256, kernel_size=(3, 3), strides=(2, 2), name='upsample_16x', activation='relu', padding='same', 166 | data_format=IMAGE_ORDERING))(o) 167 | o = (concatenate([o, lf8x], axis=-1)) 168 | o = (Conv2D(128, (3, 3), dilation_rate=(2, 2), activation='relu', padding='same', name='up8x_conv', data_format=IMAGE_ORDERING))(o) 169 | o = (BatchNormalization())(o) 170 | fup8x = o 171 | 172 | o = (Conv2DTranspose(128, kernel_size=(3, 3), strides=(2, 2), name='upsample_8x', padding='same', activation='relu', 173 | data_format=IMAGE_ORDERING))(o) 174 | o = (concatenate([o, lf4x], axis=-1)) 175 | o = (Conv2D(64, (3, 3), dilation_rate=(2, 2), activation='relu', padding='same', name='up4x_conv', data_format=IMAGE_ORDERING))(o) 176 | o = (BatchNormalization())(o) 177 | fup4x = o 178 | 179 | o = (Conv2DTranspose(64, kernel_size=(3, 3), strides=(2, 2), name='upsample_4x', padding='same', activation='relu', 180 | data_format=IMAGE_ORDERING))(o) 181 | o = (concatenate([o, lf2x], axis=-1)) 182 | o = (Conv2D(64, (3, 3), dilation_rate=(2, 2), activation='relu', padding='same', name='up2x_conv', data_format=IMAGE_ORDERING))(o) 183 | o = (BatchNormalization())(o) 184 | fup2x = o 185 | 186 | out2x = Conv2D(n_classes, (1, 1), activation='linear', padding='same', name='out2x', data_format=IMAGE_ORDERING)(fup2x) 187 | out4x = Conv2D(n_classes, (1, 1), activation='linear', padding='same', name='out4x', data_format=IMAGE_ORDERING)(fup4x) 188 | out8x = Conv2D(n_classes, (1, 1), activation='linear', padding='same', name='out8x', data_format=IMAGE_ORDERING)(fup8x) 189 | 190 | x4x = UpSampling2D((2, 2), data_format=IMAGE_ORDERING)(out8x) 191 | eadd4x = Add(name='global4x')([x4x, out4x]) 192 | 193 | x2x = UpSampling2D((2, 2), data_format=IMAGE_ORDERING)(eadd4x) 194 | eadd2x = Add(name='global2x')([x2x, out2x]) 195 | 196 | return (fup8x, eadd4x, eadd2x) 197 | 198 | 199 | def build_network_resnet101(inputHeight, 
inputWidth, n_classes, frozenlayers=True, dilated=False): # frozenlayers is currently unused 200 | input, lf2x, lf4x, lf8x, lf16x = load_backbone_res101net(inputHeight, inputWidth) 201 | 202 | # global net 8x, 4x, and 2x 203 | if dilated: 204 | g8x, g4x, g2x = create_global_net_dilated((lf2x, lf4x, lf8x, lf16x), n_classes) 205 | else: 206 | g8x, g4x, g2x = create_global_net((lf2x, lf4x, lf8x, lf16x), n_classes) 207 | 208 | # refine net, only 2x as output 209 | refine2x = create_refine_net_bottleneck((g8x, g4x, g2x), n_classes) 210 | 211 | model = Model(inputs=input, outputs=[g2x, refine2x]) 212 | 213 | adam = Adam(lr=1e-4) 214 | model.compile(optimizer=adam, loss=euclidean_loss, metrics=["accuracy"]) 215 | 216 | return model 217 | 218 | 219 | def build_network_resnet101_stack(inputHeight, inputWidth, n_classes, nStack): 220 | # backbone network 221 | input, lf2x, lf4x, lf8x, lf16x = load_backbone_res101net(inputHeight, inputWidth) 222 | 223 | # global net 224 | g8x, g4x, g2x = create_global_net_dilated((lf2x, lf4x, lf8x, lf16x), n_classes) 225 | 226 | s8x, s4x, s2x = g8x, g4x, g2x 227 | 228 | outputs = [g2x] 229 | for i in range(nStack): 230 | s8x, s4x, s2x = create_stack_refinenet((s8x, s4x, s2x), n_classes, 'stack_'+str(i)) 231 | outputs.append(s2x) 232 | 233 | model = Model(inputs=input, outputs=outputs) 234 | 235 | adam = Adam(lr=1e-4) 236 | model.compile(optimizer=adam, loss=euclidean_loss, metrics=["accuracy"]) 237 | return model 238 | 239 | 240 | def load_backbone_res101net(inputHeight, inputWidth): 241 | from resnet101 import ResNet101 242 | xresnet = ResNet101(weights='imagenet', include_top=False, input_shape=(inputHeight, inputWidth, 3)) 243 | 244 | xresnet.load_weights("../../data/resnet101_weights_tf.h5", by_name=True) 245 | 246 | lf16x = xresnet.get_layer('res4b22_relu').output 247 | lf8x = xresnet.get_layer('res3b2_relu').output 248 | lf4x = xresnet.get_layer('res2c_relu').output 249 | lf2x = xresnet.get_layer('conv1_relu').output 250 | 251 | # pad lf4x (127x127 for a 512 input) by one pixel on the bottom/right so it aligns with the 128x128 upsampled features 252 | lf4xp = ZeroPadding2D(padding=((0, 1), (0, 1)))(lf4x) 253 | 254 | return (xresnet.input, lf2x, lf4xp, lf8x, lf16x) -------------------------------------------------------------------------------- /src/unet/refinenet_mask_v3.py: -------------------------------------------------------------------------------- 1 | 2 | from refinenet import load_backbone_res101net, create_global_net_dilated, create_stack_refinenet 3 | from keras.models import * 4 | from keras.layers import * 5 | from keras.optimizers import Adam, SGD 6 | from keras import backend as K 7 | import keras 8 | 9 | def Res101RefineNetMaskV3(n_classes, inputHeight, inputWidth, nStackNum): 10 | model = build_resnet101_stack_mask_v3(inputHeight, inputWidth, n_classes, nStackNum) 11 | return model 12 | 13 | def euclidean_loss(x, y): 14 | return K.sqrt(K.sum(K.square(x - y))) # square root of the summed squared error over the whole batch 15 | 16 | def apply_mask_to_output(output, mask): # zero out heatmap channels of keypoints that are invalid for the category 17 | output_with_mask = keras.layers.multiply([output, mask]) 18 | return output_with_mask 19 | 20 | def build_resnet101_stack_mask_v3(inputHeight, inputWidth, n_classes, nStack): 21 | 22 | input_mask = Input(shape=(inputHeight//2, inputWidth//2, n_classes), name='mask') # heatmaps are predicted at half the input resolution 23 | input_ohem_mask = Input(shape=(inputHeight//2, inputWidth//2, n_classes), name='ohem_mask') 24 | 25 | # backbone network 26 | input_image, lf2x, lf4x, lf8x, lf16x = load_backbone_res101net(inputHeight, inputWidth) 27 | 28 | # global net 29 | g8x, g4x, g2x = create_global_net_dilated((lf2x, lf4x, lf8x, lf16x), n_classes) 30 | 31 | s8x, s4x, s2x = g8x, g4x, g2x 32 | 33 | 
g2x_mask = apply_mask_to_output(g2x, input_mask) 34 | 35 | outputs = [g2x_mask] 36 | for i in range(nStack): 37 | s8x, s4x, s2x = create_stack_refinenet((s8x, s4x, s2x), n_classes, 'stack_'+str(i)) 38 | if i == (nStack-1): # last stack: apply the OHEM mask so only the hard (top-loss) keypoints contribute 39 | s2x_mask = apply_mask_to_output(s2x, input_ohem_mask) 40 | else: 41 | s2x_mask = apply_mask_to_output(s2x, input_mask) 42 | outputs.append(s2x_mask) 43 | 44 | model = Model(inputs=[input_image, input_mask, input_ohem_mask], outputs=outputs) 45 | 46 | adam = Adam(lr=1e-4) 47 | model.compile(optimizer=adam, loss=euclidean_loss, metrics=["accuracy"]) 48 | return model -------------------------------------------------------------------------------- /src/unet/resnet101.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ResNet-101 model for Keras. 3 | 4 | # Reference: 5 | 6 | - [Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385) 7 | 8 | Slightly modified from Felix Yu's (https://github.com/flyyufelix) implementation 9 | of ResNet-101, to keep the API consistent with the pre-trained models in 10 | `keras.applications`. The original implementation can be found here: 11 | https://gist.github.com/flyyufelix/65018873f8cb2bbe95f429c474aa1294#file-resnet-101_keras-py 12 | 13 | Implementation is based on Keras 2.0 14 | """ 15 | from keras.layers import ( 16 | Input, Dense, Conv2D, MaxPooling2D, AveragePooling2D, ZeroPadding2D, 17 | Flatten, Activation, GlobalAveragePooling2D, GlobalMaxPooling2D, add) 18 | from keras.layers.normalization import BatchNormalization 19 | from keras.models import Model 20 | from keras import initializers 21 | from keras.engine import Layer, InputSpec 22 | from keras.engine.topology import get_source_inputs 23 | from keras import backend as K 24 | from keras.applications.imagenet_utils import _obtain_input_shape 25 | from keras.utils.data_utils import get_file 26 | 27 | import warnings 28 | import sys 29 | sys.setrecursionlimit(3000) 30 | 31 | 32 | WEIGHTS_PATH_TH = 'https://dl.dropboxusercontent.com/s/rrp56zm347fbrdn/resnet101_weights_th.h5?dl=0' 33 | WEIGHTS_PATH_TF = 'https://dl.dropboxusercontent.com/s/a21lyqwgf88nz9b/resnet101_weights_tf.h5?dl=0' 34 | MD5_HASH_TH = '3d2e9a49d05192ce6e22200324b7defe' 35 | MD5_HASH_TF = '867a922efc475e9966d0f3f7b884dc15' 36 | 37 | 38 | class Scale(Layer): 39 | '''Learns a set of weights and biases used for scaling the input data. 40 | The output is simply an element-wise multiplication of the input 41 | with a learned scale, plus a learned offset: 42 | 43 | out = in * gamma + beta, 44 | 45 | where 'gamma' and 'beta' are the learned weights and biases. 46 | 47 | # Arguments 48 | axis: integer, axis along which to normalize in mode 0. For instance, 49 | if your input tensor has shape (samples, channels, rows, cols), 50 | set axis to 1 to normalize per feature map (channels axis). 51 | momentum: not used in the computation; kept (and serialized in 52 | get_config) only for compatibility with the original 53 | implementation. 54 | weights: Initialization weights. 55 | List of 2 Numpy arrays, with shapes: 56 | `[(input_shape,), (input_shape,)]` 57 | beta_init: name of initialization function for shift parameter 58 | (see [initializers](../initializers.md)), or alternatively, 59 | Theano/TensorFlow function to use for weights initialization. 60 | This parameter is only relevant if you don't pass a `weights` 61 | argument. 
62 | gamma_init: name of initialization function for scale parameter (see 63 | [initializers](../initializers.md)), or alternatively, 64 | Theano/TensorFlow function to use for weights initialization. 65 | This parameter is only relevant if you don't pass a `weights` 66 | argument. 67 | 68 | Note: this layer is applied right after each BatchNormalization in the 69 | ResNet blocks below, mirroring the separate BatchNorm + Scale pair of 70 | the original Caffe ResNet-101 release. 71 | 72 | ''' 73 | def __init__(self, 74 | weights=None, 75 | axis=-1, 76 | momentum=0.9, 77 | beta_init='zero', 78 | gamma_init='one', 79 | **kwargs): 80 | self.momentum = momentum 81 | self.axis = axis 82 | self.beta_init = initializers.get(beta_init) 83 | self.gamma_init = initializers.get(gamma_init) 84 | self.initial_weights = weights 85 | super(Scale, self).__init__(**kwargs) 86 | 87 | def build(self, input_shape): 88 | self.input_spec = [InputSpec(shape=input_shape)] 89 | shape = (int(input_shape[self.axis]),) 90 | 91 | self.gamma = K.variable( 92 | self.gamma_init(shape), 93 | name='{}_gamma'.format(self.name)) 94 | self.beta = K.variable( 95 | self.beta_init(shape), 96 | name='{}_beta'.format(self.name)) 97 | self.trainable_weights = [self.gamma, self.beta] 98 | 99 | if self.initial_weights is not None: 100 | self.set_weights(self.initial_weights) 101 | del self.initial_weights 102 | 103 | def call(self, x, mask=None): 104 | input_shape = self.input_spec[0].shape 105 | broadcast_shape = [1] * len(input_shape) 106 | broadcast_shape[self.axis] = input_shape[self.axis] 107 | 108 | out = K.reshape( 109 | self.gamma, 110 | broadcast_shape) * x + K.reshape(self.beta, broadcast_shape) 111 | return out 112 | 113 | def get_config(self): 114 | config = {"momentum": self.momentum, "axis": self.axis} 115 | base_config = super(Scale, self).get_config() 116 | return dict(list(base_config.items()) + list(config.items())) 117 | 118 | 119 | def identity_block(input_tensor, kernel_size, filters, stage, block): 120 | '''The identity_block is the block that has no conv layer at shortcut 121 | # Arguments 122 | input_tensor: input tensor 123 | kernel_size: default 3, the kernel size of the middle conv layer at the 124 | main path 125 | filters: list of integers, the nb_filters of the 3 conv layers at the main path 126 | stage: integer, current stage label, used for generating layer names 127 | block: 'a','b'..., current block label, used for generating layer names 128 | ''' 129 | eps = 1.1e-5 130 | if K.image_data_format() == 'channels_last': 131 | bn_axis = 3 132 | else: 133 | bn_axis = 1 134 | nb_filter1, nb_filter2, nb_filter3 = filters 135 | conv_name_base = 'res' + str(stage) + block + '_branch' 136 | bn_name_base = 'bn' + str(stage) + block + '_branch' 137 | scale_name_base = 'scale' + str(stage) + block + '_branch' 138 | 139 | x = Conv2D(nb_filter1, (1, 1), name=conv_name_base + '2a', 140 | use_bias=False)(input_tensor) 141 | x = BatchNormalization(epsilon=eps, axis=bn_axis, 142 | name=bn_name_base + '2a')(x) 143 | x = Scale(axis=bn_axis, name=scale_name_base + '2a')(x) 144 | x = Activation('relu', name=conv_name_base + '2a_relu')(x) 145 | 146 | x = ZeroPadding2D((1, 1), name=conv_name_base + '2b_zeropadding')(x) 147 | x = Conv2D(nb_filter2, (kernel_size, kernel_size), 148 | name=conv_name_base + '2b', use_bias=False)(x) 149 | x = BatchNormalization(epsilon=eps, axis=bn_axis, 150 | name=bn_name_base + '2b')(x) 151 | x = Scale(axis=bn_axis, name=scale_name_base 
+ '2b')(x) 152 | x = Activation('relu', name=conv_name_base + '2b_relu')(x) 153 | 154 | x = Conv2D(nb_filter3, (1, 1), name=conv_name_base + '2c', 155 | use_bias=False)(x) 156 | x = BatchNormalization(epsilon=eps, axis=bn_axis, 157 | name=bn_name_base + '2c')(x) 158 | x = Scale(axis=bn_axis, name=scale_name_base + '2c')(x) 159 | 160 | x = add([x, input_tensor], name='res' + str(stage) + block) 161 | x = Activation('relu', name='res' + str(stage) + block + '_relu')(x) 162 | return x 163 | 164 | 165 | def conv_block(input_tensor, 166 | kernel_size, 167 | filters, 168 | stage, 169 | block, 170 | strides=(2, 2)): 171 | '''conv_block is the block that has a conv layer at shortcut 172 | # Arguments 173 | input_tensor: input tensor 174 | kernel_size: default 3, the kernel size of the middle conv layer at the 175 | main path 176 | filters: list of integers, the nb_filters of the 3 conv layers at the main path 177 | stage: integer, current stage label, used for generating layer names 178 | block: 'a','b'..., current block label, used for generating layer names 179 | Note that from stage 3 on, the first conv layer of the main path uses 180 | strides=(2,2), and the shortcut does as well 181 | ''' 182 | eps = 1.1e-5 183 | if K.image_data_format() == 'channels_last': 184 | bn_axis = 3 185 | else: 186 | bn_axis = 1 187 | nb_filter1, nb_filter2, nb_filter3 = filters 188 | conv_name_base = 'res' + str(stage) + block + '_branch' 189 | bn_name_base = 'bn' + str(stage) + block + '_branch' 190 | scale_name_base = 'scale' + str(stage) + block + '_branch' 191 | 192 | x = Conv2D(nb_filter1, (1, 1), strides=strides, 193 | name=conv_name_base + '2a', use_bias=False)(input_tensor) 194 | x = BatchNormalization(epsilon=eps, axis=bn_axis, 195 | name=bn_name_base + '2a')(x) 196 | x = Scale(axis=bn_axis, name=scale_name_base + '2a')(x) 197 | x = Activation('relu', name=conv_name_base + '2a_relu')(x) 198 | 199 | x = ZeroPadding2D((1, 1), name=conv_name_base + '2b_zeropadding')(x) 200 | x = Conv2D(nb_filter2, (kernel_size, kernel_size), 201 | name=conv_name_base + '2b', use_bias=False)(x) 202 | x = BatchNormalization(epsilon=eps, axis=bn_axis, 203 | name=bn_name_base + '2b')(x) 204 | x = Scale(axis=bn_axis, name=scale_name_base + '2b')(x) 205 | x = Activation('relu', name=conv_name_base + '2b_relu')(x) 206 | 207 | x = Conv2D(nb_filter3, (1, 1), 208 | name=conv_name_base + '2c', use_bias=False)(x) 209 | x = BatchNormalization(epsilon=eps, axis=bn_axis, 210 | name=bn_name_base + '2c')(x) 211 | x = Scale(axis=bn_axis, name=scale_name_base + '2c')(x) 212 | 213 | shortcut = Conv2D(nb_filter3, (1, 1), strides=strides, 214 | name=conv_name_base + '1', use_bias=False)(input_tensor) 215 | shortcut = BatchNormalization(epsilon=eps, axis=bn_axis, 216 | name=bn_name_base + '1')(shortcut) 217 | shortcut = Scale(axis=bn_axis, name=scale_name_base + '1')(shortcut) 218 | 219 | x = add([x, shortcut], name='res' + str(stage) + block) 220 | x = Activation('relu', name='res' + str(stage) + block + '_relu')(x) 221 | return x 222 | 223 | 224 | def ResNet101(include_top=True, 225 | weights='imagenet', 226 | input_tensor=None, 227 | input_shape=None, 228 | pooling=None, 229 | classes=1000): 230 | """Instantiates the ResNet-101 architecture. 231 | 232 | Optionally loads weights pre-trained on ImageNet. Note that when using 233 | TensorFlow, for best performance you should set 234 | `image_data_format='channels_last'` in your Keras config at 235 | ~/.keras/keras.json. 
236 | 237 | The model and the weights are compatible with both TensorFlow and Theano. 238 | The data format convention used by the model is the one specified in your 239 | Keras config file. 240 | 241 | Parameters 242 | ---------- 243 | include_top: whether to include the fully-connected layer at the top of 244 | the network. 245 | weights: one of `None` (random initialization) or 'imagenet' 246 | (pre-training on ImageNet). 247 | input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) 248 | to use as image input for the model. 249 | input_shape: optional shape tuple, only to be specified if 250 | `include_top` is False (otherwise the input shape has to be 251 | `(224, 224, 3)` (with `channels_last` data format) or 252 | `(3, 224, 224)` (with `channels_first` data format)). It should have 253 | exactly 3 input channels, and width and height should be no 254 | smaller than 197. 255 | E.g. `(200, 200, 3)` would be one valid value. 256 | pooling: Optional pooling mode for feature extraction when 257 | `include_top` is `False`. 258 | - `None` means that the output of the model will be the 4D tensor 259 | output of the last convolutional layer. 260 | - `avg` means that global average pooling will be applied to the 261 | output of the last convolutional layer, and thus the output of 262 | the model will be a 2D tensor. 263 | - `max` means that global max pooling will be applied. 264 | classes: optional number of classes to classify images into, only to be 265 | specified if `include_top` is True, and if no `weights` argument is 266 | specified. 267 | 268 | Returns 269 | ------- 270 | A Keras model instance. 271 | 272 | Raises 273 | ------ 274 | ValueError: in case of invalid argument for `weights`, or invalid input 275 | shape. 276 | """ 277 | if weights not in {'imagenet', None}: 278 | raise ValueError('The `weights` argument should be either ' 279 | '`None` (random initialization) or `imagenet` ' 280 | '(pre-training on ImageNet).') 281 | 282 | if weights == 'imagenet' and include_top and classes != 1000: 283 | raise ValueError('If using `weights` as imagenet with `include_top`' 284 | ' as true, `classes` should be 1000') 285 | 286 | # Determine proper input shape 287 | input_shape = _obtain_input_shape(input_shape, 288 | default_size=224, 289 | min_size=197, 290 | data_format=K.image_data_format(), 291 | require_flatten=include_top, 292 | weights=weights) 293 | 294 | if input_tensor is None: 295 | img_input = Input(shape=input_shape, name='data') 296 | else: 297 | if not K.is_keras_tensor(input_tensor): 298 | img_input = Input( 299 | tensor=input_tensor, shape=input_shape, name='data') 300 | else: 301 | img_input = input_tensor 302 | if K.image_data_format() == 'channels_last': 303 | bn_axis = 3 304 | else: 305 | bn_axis = 1 306 | eps = 1.1e-5 307 | 308 | x = ZeroPadding2D((3, 3), name='conv1_zeropadding')(img_input) 309 | x = Conv2D(64, (7, 7), strides=(2, 2), name='conv1', use_bias=False)(x) 310 | x = BatchNormalization(epsilon=eps, axis=bn_axis, name='bn_conv1')(x) 311 | x = Scale(axis=bn_axis, name='scale_conv1')(x) 312 | x = Activation('relu', name='conv1_relu')(x) 313 | x = MaxPooling2D((3, 3), strides=(2, 2), name='pool1')(x) 314 | 315 | x = conv_block(x, 3, [64, 64, 256], stage=2, block='a', strides=(1, 1)) 316 | x = identity_block(x, 3, [64, 64, 256], stage=2, block='b') 317 | x = identity_block(x, 3, [64, 64, 256], stage=2, block='c') 318 | 319 | x = conv_block(x, 3, [128, 128, 512], stage=3, block='a') 320 | for i in range(1, 3): 321 | x = identity_block(x, 3, [128, 
128, 512], stage=3, block='b' + str(i)) 322 | 323 | x = conv_block(x, 3, [256, 256, 1024], stage=4, block='a') 324 | for i in range(1, 23): 325 | x = identity_block(x, 3, [256, 256, 1024], stage=4, block='b' + str(i)) 326 | 327 | x = conv_block(x, 3, [512, 512, 2048], stage=5, block='a') 328 | x = identity_block(x, 3, [512, 512, 2048], stage=5, block='b') 329 | x = identity_block(x, 3, [512, 512, 2048], stage=5, block='c') 330 | 331 | x = AveragePooling2D((7, 7), name='avg_pool')(x) 332 | 333 | if include_top: 334 | x = Flatten()(x) 335 | x = Dense(classes, activation='softmax', name='mmfc1000')(x) 336 | else: 337 | if pooling == 'avg': 338 | x = GlobalAveragePooling2D()(x) 339 | elif pooling == 'max': 340 | x = GlobalMaxPooling2D()(x) 341 | 342 | # Ensure that the model takes into account 343 | # any potential predecessors of `input_tensor`. 344 | if input_tensor is not None: 345 | inputs = get_source_inputs(input_tensor) 346 | else: 347 | inputs = img_input 348 | # Create model. 349 | model = Model(inputs, x, name='resnet101') 350 | 351 | ''' 352 | # load weights 353 | if weights == 'imagenet': 354 | filename = 'resnet101_weights_{}.h5'.format(K.image_dim_ordering()) 355 | if K.backend() == 'theano': 356 | path = WEIGHTS_PATH_TH 357 | md5_hash = MD5_HASH_TH 358 | else: 359 | path = WEIGHTS_PATH_TF 360 | md5_hash = MD5_HASH_TF 361 | weights_path = get_file( 362 | fname=filename, 363 | origin=path, 364 | cache_subdir='models', 365 | md5_hash=md5_hash, 366 | hash_algorithm='md5') 367 | model.load_weights(weights_path, by_name=True) 368 | 369 | if K.image_data_format() == 'channels_first' and K.backend() == 'tensorflow': 370 | warnings.warn('You are using the TensorFlow backend, yet you ' 371 | 'are using the Theano ' 372 | 'image data format convention ' 373 | '(`image_data_format="channels_first"`). ' 374 | 'For best performance, set ' 375 | '`image_data_format="channels_last"` in ' 376 | 'your Keras config ' 377 | 'at ~/.keras/keras.json.') 378 | ''' 379 | return model 380 | -------------------------------------------------------------------------------- /submission/placeholder.txt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/yuanyuanli85/FashionAI_KeyPoint_Detection_Challenge_Keras/0b3bd8cdee32e05619300e5466578644974279df/submission/placeholder.txt -------------------------------------------------------------------------------- /trained_models/placeholder.txt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/yuanyuanli85/FashionAI_KeyPoint_Detection_Challenge_Keras/0b3bd8cdee32e05619300e5466578644974279df/trained_models/placeholder.txt --------------------------------------------------------------------------------