├── .gitignore
├── README.md
├── hand-estimator-using-caffemodel
│   ├── README.md
│   ├── handpose.py
│   ├── model
│   │   └── pose_deploy.prototxt
│   └── test_out
│       ├── README.md
│       ├── hand_out.jpg
│       └── shake_out.jpg
├── pose-estimator-tensorflow
│   ├── README.md
│   ├── openpose_skeleton_sequence_drawer.py
│   ├── openpose_tf.py
│   └── test_out
│       ├── README.md
│       ├── skeleton_sequence.jpg
│       └── skeleton_sequence_horizontal.jpg
├── pose-estimator-using-caffemodel
│   ├── README.md
│   ├── estimator_multi_people.py
│   ├── estimator_single_person.py
│   ├── model
│   │   ├── body_25
│   │   │   ├── keypoints_pose_25.png
│   │   │   └── pose_deploy.prototxt
│   │   ├── coco
│   │   │   ├── keypoints_pose_18.png
│   │   │   ├── pose_deploy_linevec.prototxt
│   │   │   └── pose_deploy_linevec_faster_4_stages.prototxt
│   │   └── mpi
│   │       ├── pose_deploy_linevec.prototxt
│   │       └── pose_deploy_linevec_faster_4_stages.prototxt
│   └── test_out
│       └── README.md
└── test_out
    ├── CSBDGS_out.jpg
    ├── Friends_out.jpg
    ├── Output-BODY_25.jpg
    ├── Output-COCO.jpg
    ├── Output-MPI.jpg
    ├── TBBT_out.jpg
    ├── WLWZ_out.jpg
    └── webcam_out.gif
/.gitignore:
--------------------------------------------------------------------------------
1 | # Byte-compiled / optimized / DLL files
2 | __pycache__/
3 | *.py[cod]
4 | *$py.class
5 | 
6 | # C extensions
7 | *.so
8 | 
9 | # Distribution / packaging
10 | .Python
11 | build/
12 | develop-eggs/
13 | dist/
14 | downloads/
15 | eggs/
16 | .eggs/
17 | lib/
18 | lib64/
19 | parts/
20 | sdist/
21 | var/
22 | wheels/
23 | *.egg-info/
24 | .installed.cfg
25 | *.egg
26 | MANIFEST
27 | 
28 | # PyInstaller
29 | # Usually these files are written by a python script from a template
30 | # before PyInstaller builds the exe, so as to inject date/other infos into it.
31 | *.manifest
32 | *.spec
33 | 
34 | # Installer logs
35 | pip-log.txt
36 | pip-delete-this-directory.txt
37 | 
38 | # Unit test / coverage reports
39 | htmlcov/
40 | .tox/
41 | .nox/
42 | .coverage
43 | .coverage.*
44 | .cache
45 | nosetests.xml
46 | coverage.xml
47 | *.cover
48 | .hypothesis/
49 | .pytest_cache/
50 | 
51 | # Translations
52 | *.mo
53 | *.pot
54 | 
55 | # Django stuff:
56 | *.log
57 | local_settings.py
58 | db.sqlite3
59 | 
60 | # Flask stuff:
61 | instance/
62 | .webassets-cache
63 | 
64 | # Scrapy stuff:
65 | .scrapy
66 | 
67 | # Sphinx documentation
68 | docs/_build/
69 | 
70 | # PyBuilder
71 | target/
72 | 
73 | # Jupyter Notebook
74 | .ipynb_checkpoints
75 | 
76 | # IPython
77 | profile_default/
78 | ipython_config.py
79 | 
80 | # pyenv
81 | .python-version
82 | 
83 | # celery beat schedule file
84 | celerybeat-schedule
85 | 
86 | # SageMath parsed files
87 | *.sage.py
88 | 
89 | # Environments
90 | .env
91 | .venv
92 | env/
93 | venv/
94 | ENV/
95 | env.bak/
96 | venv.bak/
97 | 
98 | # Spyder project settings
99 | .spyderproject
100 | .spyproject
101 | 
102 | # Rope project settings
103 | .ropeproject
104 | 
105 | # mkdocs documentation
106 | /site
107 | 
108 | # mypy
109 | .mypy_cache/
110 | .dmypy.json
111 | dmypy.json
112 | 
113 | # Pyre type checker
114 | .pyre/
115 | 
116 | test/1.jpg
117 | test/MrFang.mp4
118 | test/mySweet.jpg
119 | *.mp4
120 | *.caffemodel
121 | *.pb
122 | .idea
123 | __pycache__
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # OpenPose Rebuilt-Python
2 | Rebuilding the CMU-OpenPose pose estimator using Python with OpenCV and TensorFlow.
3 | (The code comments are partly written in Chinese.)
4 | 
5 | -------
6 | ## Pretrained-model Downloading
7 | In this work, I used both the **caffemodel** and the **tensorflow-graph-model**; you can download them [here](https://pan.baidu.com/s/1XT8pHtNP1FQs3BPHgD5f-A), then place each pretrained model in its corresponding directory.
8 | #### *Examples:*
9 | - place `caffe_models\pose\body_25\pose_iter_584000.caffemodel` into `pose-estimator-using-caffemodel\model\body_25\`
10 | - place `caffe_models\hand\pose_iter_102000.caffemodel` into `hand-estimator-using-caffemodel\model\`
11 | - place `openpose graph model coco\graph_opt.pb` into `pose-estimator-tensorflow\graph_model_coco\`
12 | 
13 | -------
14 | ## Requirements:
15 | - OpenCV > 3.4.1
16 | - TensorFlow > 1.2.0
17 | - imutils
18 | 
19 | -------
20 | ## Usage:
21 | See the README.md in each sub-folder.
22 | 
23 | -------
24 | ## BODY_25 vs. COCO vs. MPI
25 | - The BODY_25 model is ***faster and more accurate***, and it includes foot keypoints.
26 | - The COCO model requires less GPU memory (it fits into 2 GB GPUs with the default settings) and runs ***faster in CPU-only mode***.
27 | - The MPI model is only meant for people who require the MPI keypoint structure. It is also slower than BODY_25 and far less accurate. (A minimal loading sketch for all three variants appears below.)
28 | #### *Output Format*
29 | **BODY_25 on the left, COCO in the middle, MPI on the right.**
30 | 
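The three variants load identically through OpenCV's DNN module; only the prototxt, the weight file, and the keypoint count differ. A minimal sketch (not code from this repo), using the prototxt paths from the tree above. Only the BODY_25 weight name is given in this README, so the COCO/MPI `.caffemodel` names below are assumptions to check against your download:

```python
import cv2 as cv

# prototxt paths from pose-estimator-using-caffemodel/model/;
# the COCO and MPI weight file names are assumed, only the BODY_25 one is named above
MODELS = {
    'body_25': ('model/body_25/pose_deploy.prototxt', 'model/body_25/pose_iter_584000.caffemodel', 25),
    'coco': ('model/coco/pose_deploy_linevec.prototxt', 'model/coco/pose_iter_440000.caffemodel', 18),
    'mpi': ('model/mpi/pose_deploy_linevec.prototxt', 'model/mpi/pose_iter_160000.caffemodel', 15),
}

proto_file, weights_file, n_points = MODELS['body_25']
net = cv.dnn.readNetFromCaffe(proto_file, caffeModel=weights_file)  # same call the scripts use
```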
31 | 
32 | **See more Output Format details [here](https://github.com/CMU-Perceptual-Computing-Lab/openpose/blob/master/doc/output.md)**; the **Hand Output Format** is documented there as well.
33 | 
34 | -------
35 | ## Results Showing
36 | #### *Image test*
37 | *(result images omitted from this dump; see `test_out/`: Output-BODY_25.jpg, Output-COCO.jpg, Output-MPI.jpg, TBBT_out.jpg, Friends_out.jpg, CSBDGS_out.jpg, WLWZ_out.jpg)*

#### *VideoStream test*
*(result clip omitted; see `test_out/webcam_out.gif`)*

#### *Webcam-skeleton-drawer test*
Script in the `pose-estimator-tensorflow` folder.
*(result images omitted; see `pose-estimator-tensorflow/test_out/skeleton_sequence.jpg` and `skeleton_sequence_horizontal.jpg`)*
--------------------------------------------------------------------------------
/hand-estimator-using-caffemodel/README.md:
--------------------------------------------------------------------------------
1 | ## Usage Examples:
2 | Put the test file (image or video) in the same directory as `handpose.py`.
3 | 
4 | - `python3 handpose.py --image=test.jpg`
5 | - `python3 handpose.py --video=test.mp4`
6 | - if no argument is provided, the webcam is used.
7 | 
8 | ## Note
9 | - see more details [here](https://github.com/CMU-Perceptual-Computing-Lab/openpose/blob/master/doc/output.md#hand-output-format)
10 | - see the original paper [here](https://arxiv.org/abs/1704.07809)
11 | - the given code can detect ***only one hand*** at a time (see the cropping sketch below).
12 | 
13 | ## Result showing
14 | *(result images omitted; see `test_out/hand_out.jpg` and `test_out/shake_out.jpg`)*
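Since the network regresses exactly 22 single-hand heatmaps, an image with several hands has to be handled by cropping one region per hand and running the estimator on each crop separately. A minimal sketch of that idea; `net` is assumed to be loaded as in `handpose.py` below, and the hand boxes are hypothetical (in practice they would come from a hand detector or manual ROIs):

```python
import cv2 as cv

def estimate_hand(net, crop, threshold=0.2, n_points=22):
    """Run the single-hand estimator on one cropped hand region."""
    h, w = crop.shape[:2]
    blob = cv.dnn.blobFromImage(crop, 1.0 / 255, (368, 368), 0, swapRB=False, crop=False)
    net.setInput(blob)
    out = net.forward()  # (1, 22, H, W) keypoint heatmaps
    points = []
    for i in range(n_points):
        _, conf, _, loc = cv.minMaxLoc(out[0, i, :, :])
        # map heatmap coordinates back into the crop
        pt = (int(w * loc[0] / out.shape[3]), int(h * loc[1] / out.shape[2]))
        points.append(pt if conf > threshold else None)
    return points

frame = cv.imread('shake.jpg')  # hypothetical test image
hand_boxes = [(60, 120, 200, 200), (340, 110, 200, 200)]  # hypothetical (x, y, w, h) ROIs
for (x, y, bw, bh) in hand_boxes:
    pts = estimate_hand(net, frame[y:y + bh, x:x + bw])
```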
20 | -------------------------------------------------------------------------------- /hand-estimator-using-caffemodel/handpose.py: -------------------------------------------------------------------------------- 1 | import cv2 as cv 2 | import os 3 | import argparse 4 | import sys 5 | from imutils.video import FPS 6 | from pathlib import Path 7 | 8 | 9 | FilePath = Path.cwd() 10 | outFilePath = Path(FilePath / "test_out/") 11 | threshold = 0.2 12 | input_width, input_height = 368, 368 13 | 14 | proto_file = str(FilePath/'model/pose_deploy.prototxt') 15 | weights_file = str(FilePath/"model/pose_iter_102000.caffemodel") 16 | nPoints = 22 17 | POSE_PAIRS = [[0, 1], [1, 2], [2, 3], [3, 4], [0, 5], [5, 6], [6, 7], [7, 8], [0, 9], 18 | [9, 10], [10, 11], [11, 12], [0, 13], [13, 14], [14, 15], [15, 16], [0, 17], 19 | [17, 18], [18, 19], [19, 20]] 20 | 21 | # Usage example: python openpose.py --video=test.mp4 22 | parser = argparse.ArgumentParser(description='OpenPose for hand skeleton in python') 23 | parser.add_argument('--image', help='Path to image file.') 24 | parser.add_argument('--video', help='Path to video file.') 25 | args = parser.parse_args() 26 | 27 | 28 | def choose_run_mode(): 29 | global outFilePath 30 | if args.image: 31 | # Open the image file 32 | if not os.path.isfile(args.image): 33 | print("Input image file ", args.image, " doesn't exist") 34 | sys.exit(1) 35 | cap = cv.VideoCapture(args.image) 36 | outFilePath = str(outFilePath / (args.image[:-4] + '_out.jpg')) 37 | elif args.video: 38 | # Open the video file 39 | if not os.path.isfile(args.video): 40 | print("Input video file ", args.video, " doesn't exist") 41 | sys.exit(1) 42 | cap = cv.VideoCapture(args.video) 43 | outFilePath = str(outFilePath/(args.video[:-4]+'_out.mp4')) 44 | else: 45 | # Webcam input 46 | cap = cv.VideoCapture(0) 47 | outFilePath = str(outFilePath / 'webcam_out.mp4') 48 | return cap 49 | 50 | 51 | def load_pretrain_model(): 52 | # 加载预训练后的POSE的caffe模型 53 | net = cv.dnn.readNetFromCaffe(proto_file, caffeModel=weights_file) 54 | print('POSE caffe model loaded successfully') 55 | # 调用GPU模块,但目前opencv仅支持intel GPU 56 | # tested with Intel GPUs only, or it will automatically switch to CPU 57 | net.setPreferableBackend(cv.dnn.DNN_BACKEND_OPENCV) 58 | net.setPreferableTarget(cv.dnn.DNN_TARGET_OPENCL) 59 | return net 60 | 61 | 62 | def show_fps(): 63 | if not args.image: 64 | fps.update() 65 | fps.stop() 66 | fps_label = "FPS: {:.2f}".format(fps.fps()) 67 | cv.putText(frame, fps_label, (0, origin_h - 25), cv.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 255), 2) 68 | 69 | 70 | if __name__ == "__main__": 71 | cap = choose_run_mode() 72 | net = load_pretrain_model() 73 | fps = FPS().start() 74 | vid_writer = cv.VideoWriter(outFilePath, cv.VideoWriter_fourcc(*'mp4v'), 30, 75 | (round(cap.get(cv.CAP_PROP_FRAME_WIDTH)), 76 | round(cap.get(cv.CAP_PROP_FRAME_HEIGHT)))) 77 | while cv.waitKey(1) < 0: 78 | hasFrame, frame = cap.read() 79 | if not hasFrame: 80 | print("Output file is stored as ", outFilePath) 81 | cv.waitKey(3000) 82 | break 83 | 84 | origin_h, origin_w = frame.shape[:2] 85 | blob = cv.dnn.blobFromImage(frame, 1.0 / 255, (input_width, input_height), 0, swapRB=False, crop=False) 86 | net.setInput(blob) 87 | detections = net.forward() 88 | H = detections.shape[2] 89 | W = detections.shape[3] 90 | 91 | # 存储关节点 92 | points = [] 93 | 94 | for i in range(nPoints): 95 | probility_map = detections[0, i, :, :] 96 | # 97 | min_value, confidence, min_loc, point = cv.minMaxLoc(probility_map) 98 | # 99 | x = int(origin_w * 
--------------------------------------------------------------------------------
/hand-estimator-using-caffemodel/model/pose_deploy.prototxt:
--------------------------------------------------------------------------------
1 | input: "image" 2 | input_dim: 1 # Original: 2 3 | input_dim: 3 # It crashes if not left to 3 4 | input_dim: 1 # Original: 368 5 | input_dim: 1 # Original: 368 6 | layer { 7 | name: "conv1_1" 8 | type: "Convolution" 9 | bottom: "image" 10 | top: "conv1_1" 11 | param { 12 | lr_mult: 1.0 13 | decay_mult: 1 14 | } 15 | param { 16 | lr_mult: 2.0 17 | decay_mult: 0 18 | } 19 | convolution_param { 20 | num_output: 64 21 | pad: 1 22 | kernel_size: 3 23 | weight_filler { 24 | type: "xavier" 25 | } 26 | bias_filler { 27 | type: "constant" 28 | } 29 | dilation: 1 30 | } 31 | } 32 | layer { 33 | name: "relu1_1" 34 | type: "ReLU" 35 | bottom: "conv1_1" 36 | top: "conv1_1" 37 | } 38 | layer { 39 | name: "conv1_2" 40 | type: "Convolution" 41 | bottom: "conv1_1" 42 | top: "conv1_2" 43 | param { 44 | lr_mult: 1.0 45 | decay_mult: 1 46 | } 47 | param { 48 | lr_mult: 2.0 49 | decay_mult: 0 50 | } 51 | convolution_param { 52 | num_output: 64 53 | pad: 1 54 | kernel_size: 3 55 | weight_filler { 56 | type: "xavier" 57 | } 58 | bias_filler { 59 | type: "constant" 60 | } 61 | dilation: 1 62 | } 63 | } 64 | layer { 65 | name: "relu1_2" 66 | type: "ReLU" 67 | bottom: "conv1_2" 68 | top: "conv1_2" 69 | } 70 | layer { 71 | name: "pool1_stage1" 72 | type: "Pooling" 73 | bottom: "conv1_2" 74 | top: "pool1_stage1" 75 | pooling_param { 76 | pool: MAX 77 | kernel_size: 2 78 | stride: 2 79 | } 80 | } 81 | layer { 82 | name: "conv2_1" 83 | type: "Convolution" 84 | bottom: "pool1_stage1" 85 | top: "conv2_1" 86 | param { 87 | lr_mult: 1.0 88 | decay_mult: 1 89 | } 90 | param { 91 | lr_mult: 2.0 92 | decay_mult: 0 93 | } 94 | convolution_param { 95 | num_output: 128 96 | pad: 1 97 | kernel_size: 3 98 | weight_filler { 99 | type: "xavier" 100 | } 101 | bias_filler { 102 | type: "constant" 103 | } 104 | dilation: 1 105 | } 106 | } 107 | layer { 108 | name: "relu2_1" 109 | type: "ReLU" 110 | bottom: "conv2_1" 111 | top: "conv2_1" 112 | } 113 | layer { 114 | name: "conv2_2" 115 | type: "Convolution" 116 | bottom: "conv2_1" 117 | top: "conv2_2" 118 | param { 119 | lr_mult: 1.0 120 | decay_mult: 1 121 | } 122 | param { 123 | lr_mult: 2.0 124 | decay_mult: 0 125 | } 126 | convolution_param { 127 | num_output: 128 128 | pad: 1 129 | kernel_size: 3 130 
| weight_filler { 131 | type: "xavier" 132 | } 133 | bias_filler { 134 | type: "constant" 135 | } 136 | dilation: 1 137 | } 138 | } 139 | layer { 140 | name: "relu2_2" 141 | type: "ReLU" 142 | bottom: "conv2_2" 143 | top: "conv2_2" 144 | } 145 | layer { 146 | name: "pool2_stage1" 147 | type: "Pooling" 148 | bottom: "conv2_2" 149 | top: "pool2_stage1" 150 | pooling_param { 151 | pool: MAX 152 | kernel_size: 2 153 | stride: 2 154 | } 155 | } 156 | layer { 157 | name: "conv3_1" 158 | type: "Convolution" 159 | bottom: "pool2_stage1" 160 | top: "conv3_1" 161 | param { 162 | lr_mult: 1.0 163 | decay_mult: 1 164 | } 165 | param { 166 | lr_mult: 2.0 167 | decay_mult: 0 168 | } 169 | convolution_param { 170 | num_output: 256 171 | pad: 1 172 | kernel_size: 3 173 | weight_filler { 174 | type: "xavier" 175 | } 176 | bias_filler { 177 | type: "constant" 178 | } 179 | dilation: 1 180 | } 181 | } 182 | layer { 183 | name: "relu3_1" 184 | type: "ReLU" 185 | bottom: "conv3_1" 186 | top: "conv3_1" 187 | } 188 | layer { 189 | name: "conv3_2" 190 | type: "Convolution" 191 | bottom: "conv3_1" 192 | top: "conv3_2" 193 | param { 194 | lr_mult: 1.0 195 | decay_mult: 1 196 | } 197 | param { 198 | lr_mult: 2.0 199 | decay_mult: 0 200 | } 201 | convolution_param { 202 | num_output: 256 203 | pad: 1 204 | kernel_size: 3 205 | weight_filler { 206 | type: "xavier" 207 | } 208 | bias_filler { 209 | type: "constant" 210 | } 211 | dilation: 1 212 | } 213 | } 214 | layer { 215 | name: "relu3_2" 216 | type: "ReLU" 217 | bottom: "conv3_2" 218 | top: "conv3_2" 219 | } 220 | layer { 221 | name: "conv3_3" 222 | type: "Convolution" 223 | bottom: "conv3_2" 224 | top: "conv3_3" 225 | param { 226 | lr_mult: 1.0 227 | decay_mult: 1 228 | } 229 | param { 230 | lr_mult: 2.0 231 | decay_mult: 0 232 | } 233 | convolution_param { 234 | num_output: 256 235 | pad: 1 236 | kernel_size: 3 237 | weight_filler { 238 | type: "xavier" 239 | } 240 | bias_filler { 241 | type: "constant" 242 | } 243 | dilation: 1 244 | } 245 | } 246 | layer { 247 | name: "relu3_3" 248 | type: "ReLU" 249 | bottom: "conv3_3" 250 | top: "conv3_3" 251 | } 252 | layer { 253 | name: "conv3_4" 254 | type: "Convolution" 255 | bottom: "conv3_3" 256 | top: "conv3_4" 257 | param { 258 | lr_mult: 1.0 259 | decay_mult: 1 260 | } 261 | param { 262 | lr_mult: 2.0 263 | decay_mult: 0 264 | } 265 | convolution_param { 266 | num_output: 256 267 | pad: 1 268 | kernel_size: 3 269 | weight_filler { 270 | type: "xavier" 271 | } 272 | bias_filler { 273 | type: "constant" 274 | } 275 | dilation: 1 276 | } 277 | } 278 | layer { 279 | name: "relu3_4" 280 | type: "ReLU" 281 | bottom: "conv3_4" 282 | top: "conv3_4" 283 | } 284 | layer { 285 | name: "pool3_stage1" 286 | type: "Pooling" 287 | bottom: "conv3_4" 288 | top: "pool3_stage1" 289 | pooling_param { 290 | pool: MAX 291 | kernel_size: 2 292 | stride: 2 293 | } 294 | } 295 | layer { 296 | name: "conv4_1" 297 | type: "Convolution" 298 | bottom: "pool3_stage1" 299 | top: "conv4_1" 300 | param { 301 | lr_mult: 1.0 302 | decay_mult: 1 303 | } 304 | param { 305 | lr_mult: 2.0 306 | decay_mult: 0 307 | } 308 | convolution_param { 309 | num_output: 512 310 | pad: 1 311 | kernel_size: 3 312 | weight_filler { 313 | type: "xavier" 314 | } 315 | bias_filler { 316 | type: "constant" 317 | } 318 | dilation: 1 319 | } 320 | } 321 | layer { 322 | name: "relu4_1" 323 | type: "ReLU" 324 | bottom: "conv4_1" 325 | top: "conv4_1" 326 | } 327 | layer { 328 | name: "conv4_2" 329 | type: "Convolution" 330 | bottom: "conv4_1" 331 | top: "conv4_2" 332 | param { 
333 | lr_mult: 1.0 334 | decay_mult: 1 335 | } 336 | param { 337 | lr_mult: 2.0 338 | decay_mult: 0 339 | } 340 | convolution_param { 341 | num_output: 512 342 | pad: 1 343 | kernel_size: 3 344 | weight_filler { 345 | type: "xavier" 346 | } 347 | bias_filler { 348 | type: "constant" 349 | } 350 | dilation: 1 351 | } 352 | } 353 | layer { 354 | name: "relu4_2" 355 | type: "ReLU" 356 | bottom: "conv4_2" 357 | top: "conv4_2" 358 | } 359 | layer { 360 | name: "conv4_3" 361 | type: "Convolution" 362 | bottom: "conv4_2" 363 | top: "conv4_3" 364 | param { 365 | lr_mult: 1.0 366 | decay_mult: 1 367 | } 368 | param { 369 | lr_mult: 2.0 370 | decay_mult: 0 371 | } 372 | convolution_param { 373 | num_output: 512 374 | pad: 1 375 | kernel_size: 3 376 | weight_filler { 377 | type: "xavier" 378 | } 379 | bias_filler { 380 | type: "constant" 381 | } 382 | dilation: 1 383 | } 384 | } 385 | layer { 386 | name: "relu4_3" 387 | type: "ReLU" 388 | bottom: "conv4_3" 389 | top: "conv4_3" 390 | } 391 | layer { 392 | name: "conv4_4" 393 | type: "Convolution" 394 | bottom: "conv4_3" 395 | top: "conv4_4" 396 | param { 397 | lr_mult: 1.0 398 | decay_mult: 1 399 | } 400 | param { 401 | lr_mult: 2.0 402 | decay_mult: 0 403 | } 404 | convolution_param { 405 | num_output: 512 406 | pad: 1 407 | kernel_size: 3 408 | weight_filler { 409 | type: "xavier" 410 | } 411 | bias_filler { 412 | type: "constant" 413 | } 414 | dilation: 1 415 | } 416 | } 417 | layer { 418 | name: "relu4_4" 419 | type: "ReLU" 420 | bottom: "conv4_4" 421 | top: "conv4_4" 422 | } 423 | layer { 424 | name: "conv5_1" 425 | type: "Convolution" 426 | bottom: "conv4_4" 427 | top: "conv5_1" 428 | param { 429 | lr_mult: 1.0 430 | decay_mult: 1 431 | } 432 | param { 433 | lr_mult: 2.0 434 | decay_mult: 0 435 | } 436 | convolution_param { 437 | num_output: 512 438 | pad: 1 439 | kernel_size: 3 440 | weight_filler { 441 | type: "xavier" 442 | } 443 | bias_filler { 444 | type: "constant" 445 | } 446 | dilation: 1 447 | } 448 | } 449 | layer { 450 | name: "relu5_1" 451 | type: "ReLU" 452 | bottom: "conv5_1" 453 | top: "conv5_1" 454 | } 455 | layer { 456 | name: "conv5_2" 457 | type: "Convolution" 458 | bottom: "conv5_1" 459 | top: "conv5_2" 460 | param { 461 | lr_mult: 1.0 462 | decay_mult: 1 463 | } 464 | param { 465 | lr_mult: 2.0 466 | decay_mult: 0 467 | } 468 | convolution_param { 469 | num_output: 512 470 | pad: 1 471 | kernel_size: 3 472 | weight_filler { 473 | type: "xavier" 474 | } 475 | bias_filler { 476 | type: "constant" 477 | } 478 | dilation: 1 479 | } 480 | } 481 | layer { 482 | name: "relu5_2" 483 | type: "ReLU" 484 | bottom: "conv5_2" 485 | top: "conv5_2" 486 | } 487 | layer { 488 | name: "conv5_3_CPM" 489 | type: "Convolution" 490 | bottom: "conv5_2" 491 | top: "conv5_3_CPM" 492 | param { 493 | lr_mult: 1.0 494 | decay_mult: 1 495 | } 496 | param { 497 | lr_mult: 2.0 498 | decay_mult: 0 499 | } 500 | convolution_param { 501 | num_output: 128 502 | pad: 1 503 | kernel_size: 3 504 | weight_filler { 505 | type: "gaussian" 506 | std: 0.01 507 | } 508 | bias_filler { 509 | type: "constant" 510 | } 511 | dilation: 1 512 | } 513 | } 514 | layer { 515 | name: "relu5_4_stage1_3" 516 | type: "ReLU" 517 | bottom: "conv5_3_CPM" 518 | top: "conv5_3_CPM" 519 | } 520 | layer { 521 | name: "conv6_1_CPM" 522 | type: "Convolution" 523 | bottom: "conv5_3_CPM" 524 | top: "conv6_1_CPM" 525 | param { 526 | lr_mult: 1.0 527 | decay_mult: 1 528 | } 529 | param { 530 | lr_mult: 2.0 531 | decay_mult: 0 532 | } 533 | convolution_param { 534 | num_output: 512 535 | pad: 0 
536 | kernel_size: 1 537 | weight_filler { 538 | type: "gaussian" 539 | std: 0.01 540 | } 541 | bias_filler { 542 | type: "constant" 543 | } 544 | dilation: 1 545 | } 546 | } 547 | layer { 548 | name: "relu6_4_stage1_1" 549 | type: "ReLU" 550 | bottom: "conv6_1_CPM" 551 | top: "conv6_1_CPM" 552 | } 553 | layer { 554 | name: "conv6_2_CPM" 555 | type: "Convolution" 556 | bottom: "conv6_1_CPM" 557 | top: "conv6_2_CPM" 558 | param { 559 | lr_mult: 1.0 560 | decay_mult: 1 561 | } 562 | param { 563 | lr_mult: 2.0 564 | decay_mult: 0 565 | } 566 | convolution_param { 567 | num_output: 22 568 | pad: 0 569 | kernel_size: 1 570 | weight_filler { 571 | type: "gaussian" 572 | std: 0.01 573 | } 574 | bias_filler { 575 | type: "constant" 576 | } 577 | dilation: 1 578 | } 579 | } 580 | layer { 581 | name: "concat_stage2" 582 | type: "Concat" 583 | bottom: "conv6_2_CPM" 584 | bottom: "conv5_3_CPM" 585 | top: "concat_stage2" 586 | concat_param { 587 | axis: 1 588 | } 589 | } 590 | layer { 591 | name: "Mconv1_stage2" 592 | type: "Convolution" 593 | bottom: "concat_stage2" 594 | top: "Mconv1_stage2" 595 | param { 596 | lr_mult: 4.0 597 | decay_mult: 1 598 | } 599 | param { 600 | lr_mult: 8.0 601 | decay_mult: 0 602 | } 603 | convolution_param { 604 | num_output: 128 605 | pad: 3 606 | kernel_size: 7 607 | weight_filler { 608 | type: "gaussian" 609 | std: 0.01 610 | } 611 | bias_filler { 612 | type: "constant" 613 | } 614 | dilation: 1 615 | } 616 | } 617 | layer { 618 | name: "Mrelu1_2_stage2_1" 619 | type: "ReLU" 620 | bottom: "Mconv1_stage2" 621 | top: "Mconv1_stage2" 622 | } 623 | layer { 624 | name: "Mconv2_stage2" 625 | type: "Convolution" 626 | bottom: "Mconv1_stage2" 627 | top: "Mconv2_stage2" 628 | param { 629 | lr_mult: 4.0 630 | decay_mult: 1 631 | } 632 | param { 633 | lr_mult: 8.0 634 | decay_mult: 0 635 | } 636 | convolution_param { 637 | num_output: 128 638 | pad: 3 639 | kernel_size: 7 640 | weight_filler { 641 | type: "gaussian" 642 | std: 0.01 643 | } 644 | bias_filler { 645 | type: "constant" 646 | } 647 | dilation: 1 648 | } 649 | } 650 | layer { 651 | name: "Mrelu1_3_stage2_2" 652 | type: "ReLU" 653 | bottom: "Mconv2_stage2" 654 | top: "Mconv2_stage2" 655 | } 656 | layer { 657 | name: "Mconv3_stage2" 658 | type: "Convolution" 659 | bottom: "Mconv2_stage2" 660 | top: "Mconv3_stage2" 661 | param { 662 | lr_mult: 4.0 663 | decay_mult: 1 664 | } 665 | param { 666 | lr_mult: 8.0 667 | decay_mult: 0 668 | } 669 | convolution_param { 670 | num_output: 128 671 | pad: 3 672 | kernel_size: 7 673 | weight_filler { 674 | type: "gaussian" 675 | std: 0.01 676 | } 677 | bias_filler { 678 | type: "constant" 679 | } 680 | dilation: 1 681 | } 682 | } 683 | layer { 684 | name: "Mrelu1_4_stage2_3" 685 | type: "ReLU" 686 | bottom: "Mconv3_stage2" 687 | top: "Mconv3_stage2" 688 | } 689 | layer { 690 | name: "Mconv4_stage2" 691 | type: "Convolution" 692 | bottom: "Mconv3_stage2" 693 | top: "Mconv4_stage2" 694 | param { 695 | lr_mult: 4.0 696 | decay_mult: 1 697 | } 698 | param { 699 | lr_mult: 8.0 700 | decay_mult: 0 701 | } 702 | convolution_param { 703 | num_output: 128 704 | pad: 3 705 | kernel_size: 7 706 | weight_filler { 707 | type: "gaussian" 708 | std: 0.01 709 | } 710 | bias_filler { 711 | type: "constant" 712 | } 713 | dilation: 1 714 | } 715 | } 716 | layer { 717 | name: "Mrelu1_5_stage2_4" 718 | type: "ReLU" 719 | bottom: "Mconv4_stage2" 720 | top: "Mconv4_stage2" 721 | } 722 | layer { 723 | name: "Mconv5_stage2" 724 | type: "Convolution" 725 | bottom: "Mconv4_stage2" 726 | top: "Mconv5_stage2" 727 
| param { 728 | lr_mult: 4.0 729 | decay_mult: 1 730 | } 731 | param { 732 | lr_mult: 8.0 733 | decay_mult: 0 734 | } 735 | convolution_param { 736 | num_output: 128 737 | pad: 3 738 | kernel_size: 7 739 | weight_filler { 740 | type: "gaussian" 741 | std: 0.01 742 | } 743 | bias_filler { 744 | type: "constant" 745 | } 746 | dilation: 1 747 | } 748 | } 749 | layer { 750 | name: "Mrelu1_6_stage2_5" 751 | type: "ReLU" 752 | bottom: "Mconv5_stage2" 753 | top: "Mconv5_stage2" 754 | } 755 | layer { 756 | name: "Mconv6_stage2" 757 | type: "Convolution" 758 | bottom: "Mconv5_stage2" 759 | top: "Mconv6_stage2" 760 | param { 761 | lr_mult: 4.0 762 | decay_mult: 1 763 | } 764 | param { 765 | lr_mult: 8.0 766 | decay_mult: 0 767 | } 768 | convolution_param { 769 | num_output: 128 770 | pad: 0 771 | kernel_size: 1 772 | weight_filler { 773 | type: "gaussian" 774 | std: 0.01 775 | } 776 | bias_filler { 777 | type: "constant" 778 | } 779 | dilation: 1 780 | } 781 | } 782 | layer { 783 | name: "Mrelu1_7_stage2_6" 784 | type: "ReLU" 785 | bottom: "Mconv6_stage2" 786 | top: "Mconv6_stage2" 787 | } 788 | layer { 789 | name: "Mconv7_stage2" 790 | type: "Convolution" 791 | bottom: "Mconv6_stage2" 792 | top: "Mconv7_stage2" 793 | param { 794 | lr_mult: 4.0 795 | decay_mult: 1 796 | } 797 | param { 798 | lr_mult: 8.0 799 | decay_mult: 0 800 | } 801 | convolution_param { 802 | num_output: 22 803 | pad: 0 804 | kernel_size: 1 805 | weight_filler { 806 | type: "gaussian" 807 | std: 0.01 808 | } 809 | bias_filler { 810 | type: "constant" 811 | } 812 | dilation: 1 813 | } 814 | } 815 | layer { 816 | name: "concat_stage3" 817 | type: "Concat" 818 | bottom: "Mconv7_stage2" 819 | bottom: "conv5_3_CPM" 820 | top: "concat_stage3" 821 | concat_param { 822 | axis: 1 823 | } 824 | } 825 | layer { 826 | name: "Mconv1_stage3" 827 | type: "Convolution" 828 | bottom: "concat_stage3" 829 | top: "Mconv1_stage3" 830 | param { 831 | lr_mult: 4.0 832 | decay_mult: 1 833 | } 834 | param { 835 | lr_mult: 8.0 836 | decay_mult: 0 837 | } 838 | convolution_param { 839 | num_output: 128 840 | pad: 3 841 | kernel_size: 7 842 | weight_filler { 843 | type: "gaussian" 844 | std: 0.01 845 | } 846 | bias_filler { 847 | type: "constant" 848 | } 849 | dilation: 1 850 | } 851 | } 852 | layer { 853 | name: "Mrelu1_2_stage3_1" 854 | type: "ReLU" 855 | bottom: "Mconv1_stage3" 856 | top: "Mconv1_stage3" 857 | } 858 | layer { 859 | name: "Mconv2_stage3" 860 | type: "Convolution" 861 | bottom: "Mconv1_stage3" 862 | top: "Mconv2_stage3" 863 | param { 864 | lr_mult: 4.0 865 | decay_mult: 1 866 | } 867 | param { 868 | lr_mult: 8.0 869 | decay_mult: 0 870 | } 871 | convolution_param { 872 | num_output: 128 873 | pad: 3 874 | kernel_size: 7 875 | weight_filler { 876 | type: "gaussian" 877 | std: 0.01 878 | } 879 | bias_filler { 880 | type: "constant" 881 | } 882 | dilation: 1 883 | } 884 | } 885 | layer { 886 | name: "Mrelu1_3_stage3_2" 887 | type: "ReLU" 888 | bottom: "Mconv2_stage3" 889 | top: "Mconv2_stage3" 890 | } 891 | layer { 892 | name: "Mconv3_stage3" 893 | type: "Convolution" 894 | bottom: "Mconv2_stage3" 895 | top: "Mconv3_stage3" 896 | param { 897 | lr_mult: 4.0 898 | decay_mult: 1 899 | } 900 | param { 901 | lr_mult: 8.0 902 | decay_mult: 0 903 | } 904 | convolution_param { 905 | num_output: 128 906 | pad: 3 907 | kernel_size: 7 908 | weight_filler { 909 | type: "gaussian" 910 | std: 0.01 911 | } 912 | bias_filler { 913 | type: "constant" 914 | } 915 | dilation: 1 916 | } 917 | } 918 | layer { 919 | name: "Mrelu1_4_stage3_3" 920 | type: "ReLU" 
921 | bottom: "Mconv3_stage3" 922 | top: "Mconv3_stage3" 923 | } 924 | layer { 925 | name: "Mconv4_stage3" 926 | type: "Convolution" 927 | bottom: "Mconv3_stage3" 928 | top: "Mconv4_stage3" 929 | param { 930 | lr_mult: 4.0 931 | decay_mult: 1 932 | } 933 | param { 934 | lr_mult: 8.0 935 | decay_mult: 0 936 | } 937 | convolution_param { 938 | num_output: 128 939 | pad: 3 940 | kernel_size: 7 941 | weight_filler { 942 | type: "gaussian" 943 | std: 0.01 944 | } 945 | bias_filler { 946 | type: "constant" 947 | } 948 | dilation: 1 949 | } 950 | } 951 | layer { 952 | name: "Mrelu1_5_stage3_4" 953 | type: "ReLU" 954 | bottom: "Mconv4_stage3" 955 | top: "Mconv4_stage3" 956 | } 957 | layer { 958 | name: "Mconv5_stage3" 959 | type: "Convolution" 960 | bottom: "Mconv4_stage3" 961 | top: "Mconv5_stage3" 962 | param { 963 | lr_mult: 4.0 964 | decay_mult: 1 965 | } 966 | param { 967 | lr_mult: 8.0 968 | decay_mult: 0 969 | } 970 | convolution_param { 971 | num_output: 128 972 | pad: 3 973 | kernel_size: 7 974 | weight_filler { 975 | type: "gaussian" 976 | std: 0.01 977 | } 978 | bias_filler { 979 | type: "constant" 980 | } 981 | dilation: 1 982 | } 983 | } 984 | layer { 985 | name: "Mrelu1_6_stage3_5" 986 | type: "ReLU" 987 | bottom: "Mconv5_stage3" 988 | top: "Mconv5_stage3" 989 | } 990 | layer { 991 | name: "Mconv6_stage3" 992 | type: "Convolution" 993 | bottom: "Mconv5_stage3" 994 | top: "Mconv6_stage3" 995 | param { 996 | lr_mult: 4.0 997 | decay_mult: 1 998 | } 999 | param { 1000 | lr_mult: 8.0 1001 | decay_mult: 0 1002 | } 1003 | convolution_param { 1004 | num_output: 128 1005 | pad: 0 1006 | kernel_size: 1 1007 | weight_filler { 1008 | type: "gaussian" 1009 | std: 0.01 1010 | } 1011 | bias_filler { 1012 | type: "constant" 1013 | } 1014 | dilation: 1 1015 | } 1016 | } 1017 | layer { 1018 | name: "Mrelu1_7_stage3_6" 1019 | type: "ReLU" 1020 | bottom: "Mconv6_stage3" 1021 | top: "Mconv6_stage3" 1022 | } 1023 | layer { 1024 | name: "Mconv7_stage3" 1025 | type: "Convolution" 1026 | bottom: "Mconv6_stage3" 1027 | top: "Mconv7_stage3" 1028 | param { 1029 | lr_mult: 4.0 1030 | decay_mult: 1 1031 | } 1032 | param { 1033 | lr_mult: 8.0 1034 | decay_mult: 0 1035 | } 1036 | convolution_param { 1037 | num_output: 22 1038 | pad: 0 1039 | kernel_size: 1 1040 | weight_filler { 1041 | type: "gaussian" 1042 | std: 0.01 1043 | } 1044 | bias_filler { 1045 | type: "constant" 1046 | } 1047 | dilation: 1 1048 | } 1049 | } 1050 | layer { 1051 | name: "concat_stage4" 1052 | type: "Concat" 1053 | bottom: "Mconv7_stage3" 1054 | bottom: "conv5_3_CPM" 1055 | top: "concat_stage4" 1056 | concat_param { 1057 | axis: 1 1058 | } 1059 | } 1060 | layer { 1061 | name: "Mconv1_stage4" 1062 | type: "Convolution" 1063 | bottom: "concat_stage4" 1064 | top: "Mconv1_stage4" 1065 | param { 1066 | lr_mult: 4.0 1067 | decay_mult: 1 1068 | } 1069 | param { 1070 | lr_mult: 8.0 1071 | decay_mult: 0 1072 | } 1073 | convolution_param { 1074 | num_output: 128 1075 | pad: 3 1076 | kernel_size: 7 1077 | weight_filler { 1078 | type: "gaussian" 1079 | std: 0.01 1080 | } 1081 | bias_filler { 1082 | type: "constant" 1083 | } 1084 | dilation: 1 1085 | } 1086 | } 1087 | layer { 1088 | name: "Mrelu1_2_stage4_1" 1089 | type: "ReLU" 1090 | bottom: "Mconv1_stage4" 1091 | top: "Mconv1_stage4" 1092 | } 1093 | layer { 1094 | name: "Mconv2_stage4" 1095 | type: "Convolution" 1096 | bottom: "Mconv1_stage4" 1097 | top: "Mconv2_stage4" 1098 | param { 1099 | lr_mult: 4.0 1100 | decay_mult: 1 1101 | } 1102 | param { 1103 | lr_mult: 8.0 1104 | decay_mult: 0 1105 | } 
1106 | convolution_param { 1107 | num_output: 128 1108 | pad: 3 1109 | kernel_size: 7 1110 | weight_filler { 1111 | type: "gaussian" 1112 | std: 0.01 1113 | } 1114 | bias_filler { 1115 | type: "constant" 1116 | } 1117 | dilation: 1 1118 | } 1119 | } 1120 | layer { 1121 | name: "Mrelu1_3_stage4_2" 1122 | type: "ReLU" 1123 | bottom: "Mconv2_stage4" 1124 | top: "Mconv2_stage4" 1125 | } 1126 | layer { 1127 | name: "Mconv3_stage4" 1128 | type: "Convolution" 1129 | bottom: "Mconv2_stage4" 1130 | top: "Mconv3_stage4" 1131 | param { 1132 | lr_mult: 4.0 1133 | decay_mult: 1 1134 | } 1135 | param { 1136 | lr_mult: 8.0 1137 | decay_mult: 0 1138 | } 1139 | convolution_param { 1140 | num_output: 128 1141 | pad: 3 1142 | kernel_size: 7 1143 | weight_filler { 1144 | type: "gaussian" 1145 | std: 0.01 1146 | } 1147 | bias_filler { 1148 | type: "constant" 1149 | } 1150 | dilation: 1 1151 | } 1152 | } 1153 | layer { 1154 | name: "Mrelu1_4_stage4_3" 1155 | type: "ReLU" 1156 | bottom: "Mconv3_stage4" 1157 | top: "Mconv3_stage4" 1158 | } 1159 | layer { 1160 | name: "Mconv4_stage4" 1161 | type: "Convolution" 1162 | bottom: "Mconv3_stage4" 1163 | top: "Mconv4_stage4" 1164 | param { 1165 | lr_mult: 4.0 1166 | decay_mult: 1 1167 | } 1168 | param { 1169 | lr_mult: 8.0 1170 | decay_mult: 0 1171 | } 1172 | convolution_param { 1173 | num_output: 128 1174 | pad: 3 1175 | kernel_size: 7 1176 | weight_filler { 1177 | type: "gaussian" 1178 | std: 0.01 1179 | } 1180 | bias_filler { 1181 | type: "constant" 1182 | } 1183 | dilation: 1 1184 | } 1185 | } 1186 | layer { 1187 | name: "Mrelu1_5_stage4_4" 1188 | type: "ReLU" 1189 | bottom: "Mconv4_stage4" 1190 | top: "Mconv4_stage4" 1191 | } 1192 | layer { 1193 | name: "Mconv5_stage4" 1194 | type: "Convolution" 1195 | bottom: "Mconv4_stage4" 1196 | top: "Mconv5_stage4" 1197 | param { 1198 | lr_mult: 4.0 1199 | decay_mult: 1 1200 | } 1201 | param { 1202 | lr_mult: 8.0 1203 | decay_mult: 0 1204 | } 1205 | convolution_param { 1206 | num_output: 128 1207 | pad: 3 1208 | kernel_size: 7 1209 | weight_filler { 1210 | type: "gaussian" 1211 | std: 0.01 1212 | } 1213 | bias_filler { 1214 | type: "constant" 1215 | } 1216 | dilation: 1 1217 | } 1218 | } 1219 | layer { 1220 | name: "Mrelu1_6_stage4_5" 1221 | type: "ReLU" 1222 | bottom: "Mconv5_stage4" 1223 | top: "Mconv5_stage4" 1224 | } 1225 | layer { 1226 | name: "Mconv6_stage4" 1227 | type: "Convolution" 1228 | bottom: "Mconv5_stage4" 1229 | top: "Mconv6_stage4" 1230 | param { 1231 | lr_mult: 4.0 1232 | decay_mult: 1 1233 | } 1234 | param { 1235 | lr_mult: 8.0 1236 | decay_mult: 0 1237 | } 1238 | convolution_param { 1239 | num_output: 128 1240 | pad: 0 1241 | kernel_size: 1 1242 | weight_filler { 1243 | type: "gaussian" 1244 | std: 0.01 1245 | } 1246 | bias_filler { 1247 | type: "constant" 1248 | } 1249 | dilation: 1 1250 | } 1251 | } 1252 | layer { 1253 | name: "Mrelu1_7_stage4_6" 1254 | type: "ReLU" 1255 | bottom: "Mconv6_stage4" 1256 | top: "Mconv6_stage4" 1257 | } 1258 | layer { 1259 | name: "Mconv7_stage4" 1260 | type: "Convolution" 1261 | bottom: "Mconv6_stage4" 1262 | top: "Mconv7_stage4" 1263 | param { 1264 | lr_mult: 4.0 1265 | decay_mult: 1 1266 | } 1267 | param { 1268 | lr_mult: 8.0 1269 | decay_mult: 0 1270 | } 1271 | convolution_param { 1272 | num_output: 22 1273 | pad: 0 1274 | kernel_size: 1 1275 | weight_filler { 1276 | type: "gaussian" 1277 | std: 0.01 1278 | } 1279 | bias_filler { 1280 | type: "constant" 1281 | } 1282 | dilation: 1 1283 | } 1284 | } 1285 | layer { 1286 | name: "concat_stage5" 1287 | type: "Concat" 1288 | 
bottom: "Mconv7_stage4" 1289 | bottom: "conv5_3_CPM" 1290 | top: "concat_stage5" 1291 | concat_param { 1292 | axis: 1 1293 | } 1294 | } 1295 | layer { 1296 | name: "Mconv1_stage5" 1297 | type: "Convolution" 1298 | bottom: "concat_stage5" 1299 | top: "Mconv1_stage5" 1300 | param { 1301 | lr_mult: 4.0 1302 | decay_mult: 1 1303 | } 1304 | param { 1305 | lr_mult: 8.0 1306 | decay_mult: 0 1307 | } 1308 | convolution_param { 1309 | num_output: 128 1310 | pad: 3 1311 | kernel_size: 7 1312 | weight_filler { 1313 | type: "gaussian" 1314 | std: 0.01 1315 | } 1316 | bias_filler { 1317 | type: "constant" 1318 | } 1319 | dilation: 1 1320 | } 1321 | } 1322 | layer { 1323 | name: "Mrelu1_2_stage5_1" 1324 | type: "ReLU" 1325 | bottom: "Mconv1_stage5" 1326 | top: "Mconv1_stage5" 1327 | } 1328 | layer { 1329 | name: "Mconv2_stage5" 1330 | type: "Convolution" 1331 | bottom: "Mconv1_stage5" 1332 | top: "Mconv2_stage5" 1333 | param { 1334 | lr_mult: 4.0 1335 | decay_mult: 1 1336 | } 1337 | param { 1338 | lr_mult: 8.0 1339 | decay_mult: 0 1340 | } 1341 | convolution_param { 1342 | num_output: 128 1343 | pad: 3 1344 | kernel_size: 7 1345 | weight_filler { 1346 | type: "gaussian" 1347 | std: 0.01 1348 | } 1349 | bias_filler { 1350 | type: "constant" 1351 | } 1352 | dilation: 1 1353 | } 1354 | } 1355 | layer { 1356 | name: "Mrelu1_3_stage5_2" 1357 | type: "ReLU" 1358 | bottom: "Mconv2_stage5" 1359 | top: "Mconv2_stage5" 1360 | } 1361 | layer { 1362 | name: "Mconv3_stage5" 1363 | type: "Convolution" 1364 | bottom: "Mconv2_stage5" 1365 | top: "Mconv3_stage5" 1366 | param { 1367 | lr_mult: 4.0 1368 | decay_mult: 1 1369 | } 1370 | param { 1371 | lr_mult: 8.0 1372 | decay_mult: 0 1373 | } 1374 | convolution_param { 1375 | num_output: 128 1376 | pad: 3 1377 | kernel_size: 7 1378 | weight_filler { 1379 | type: "gaussian" 1380 | std: 0.01 1381 | } 1382 | bias_filler { 1383 | type: "constant" 1384 | } 1385 | dilation: 1 1386 | } 1387 | } 1388 | layer { 1389 | name: "Mrelu1_4_stage5_3" 1390 | type: "ReLU" 1391 | bottom: "Mconv3_stage5" 1392 | top: "Mconv3_stage5" 1393 | } 1394 | layer { 1395 | name: "Mconv4_stage5" 1396 | type: "Convolution" 1397 | bottom: "Mconv3_stage5" 1398 | top: "Mconv4_stage5" 1399 | param { 1400 | lr_mult: 4.0 1401 | decay_mult: 1 1402 | } 1403 | param { 1404 | lr_mult: 8.0 1405 | decay_mult: 0 1406 | } 1407 | convolution_param { 1408 | num_output: 128 1409 | pad: 3 1410 | kernel_size: 7 1411 | weight_filler { 1412 | type: "gaussian" 1413 | std: 0.01 1414 | } 1415 | bias_filler { 1416 | type: "constant" 1417 | } 1418 | dilation: 1 1419 | } 1420 | } 1421 | layer { 1422 | name: "Mrelu1_5_stage5_4" 1423 | type: "ReLU" 1424 | bottom: "Mconv4_stage5" 1425 | top: "Mconv4_stage5" 1426 | } 1427 | layer { 1428 | name: "Mconv5_stage5" 1429 | type: "Convolution" 1430 | bottom: "Mconv4_stage5" 1431 | top: "Mconv5_stage5" 1432 | param { 1433 | lr_mult: 4.0 1434 | decay_mult: 1 1435 | } 1436 | param { 1437 | lr_mult: 8.0 1438 | decay_mult: 0 1439 | } 1440 | convolution_param { 1441 | num_output: 128 1442 | pad: 3 1443 | kernel_size: 7 1444 | weight_filler { 1445 | type: "gaussian" 1446 | std: 0.01 1447 | } 1448 | bias_filler { 1449 | type: "constant" 1450 | } 1451 | dilation: 1 1452 | } 1453 | } 1454 | layer { 1455 | name: "Mrelu1_6_stage5_5" 1456 | type: "ReLU" 1457 | bottom: "Mconv5_stage5" 1458 | top: "Mconv5_stage5" 1459 | } 1460 | layer { 1461 | name: "Mconv6_stage5" 1462 | type: "Convolution" 1463 | bottom: "Mconv5_stage5" 1464 | top: "Mconv6_stage5" 1465 | param { 1466 | lr_mult: 4.0 1467 | decay_mult: 1 
1468 | } 1469 | param { 1470 | lr_mult: 8.0 1471 | decay_mult: 0 1472 | } 1473 | convolution_param { 1474 | num_output: 128 1475 | pad: 0 1476 | kernel_size: 1 1477 | weight_filler { 1478 | type: "gaussian" 1479 | std: 0.01 1480 | } 1481 | bias_filler { 1482 | type: "constant" 1483 | } 1484 | dilation: 1 1485 | } 1486 | } 1487 | layer { 1488 | name: "Mrelu1_7_stage5_6" 1489 | type: "ReLU" 1490 | bottom: "Mconv6_stage5" 1491 | top: "Mconv6_stage5" 1492 | } 1493 | layer { 1494 | name: "Mconv7_stage5" 1495 | type: "Convolution" 1496 | bottom: "Mconv6_stage5" 1497 | top: "Mconv7_stage5" 1498 | param { 1499 | lr_mult: 4.0 1500 | decay_mult: 1 1501 | } 1502 | param { 1503 | lr_mult: 8.0 1504 | decay_mult: 0 1505 | } 1506 | convolution_param { 1507 | num_output: 22 1508 | pad: 0 1509 | kernel_size: 1 1510 | weight_filler { 1511 | type: "gaussian" 1512 | std: 0.01 1513 | } 1514 | bias_filler { 1515 | type: "constant" 1516 | } 1517 | dilation: 1 1518 | } 1519 | } 1520 | layer { 1521 | name: "concat_stage6" 1522 | type: "Concat" 1523 | bottom: "Mconv7_stage5" 1524 | bottom: "conv5_3_CPM" 1525 | top: "concat_stage6" 1526 | concat_param { 1527 | axis: 1 1528 | } 1529 | } 1530 | layer { 1531 | name: "Mconv1_stage6" 1532 | type: "Convolution" 1533 | bottom: "concat_stage6" 1534 | top: "Mconv1_stage6" 1535 | param { 1536 | lr_mult: 4.0 1537 | decay_mult: 1 1538 | } 1539 | param { 1540 | lr_mult: 8.0 1541 | decay_mult: 0 1542 | } 1543 | convolution_param { 1544 | num_output: 128 1545 | pad: 3 1546 | kernel_size: 7 1547 | weight_filler { 1548 | type: "gaussian" 1549 | std: 0.01 1550 | } 1551 | bias_filler { 1552 | type: "constant" 1553 | } 1554 | dilation: 1 1555 | } 1556 | } 1557 | layer { 1558 | name: "Mrelu1_2_stage6_1" 1559 | type: "ReLU" 1560 | bottom: "Mconv1_stage6" 1561 | top: "Mconv1_stage6" 1562 | } 1563 | layer { 1564 | name: "Mconv2_stage6" 1565 | type: "Convolution" 1566 | bottom: "Mconv1_stage6" 1567 | top: "Mconv2_stage6" 1568 | param { 1569 | lr_mult: 4.0 1570 | decay_mult: 1 1571 | } 1572 | param { 1573 | lr_mult: 8.0 1574 | decay_mult: 0 1575 | } 1576 | convolution_param { 1577 | num_output: 128 1578 | pad: 3 1579 | kernel_size: 7 1580 | weight_filler { 1581 | type: "gaussian" 1582 | std: 0.01 1583 | } 1584 | bias_filler { 1585 | type: "constant" 1586 | } 1587 | dilation: 1 1588 | } 1589 | } 1590 | layer { 1591 | name: "Mrelu1_3_stage6_2" 1592 | type: "ReLU" 1593 | bottom: "Mconv2_stage6" 1594 | top: "Mconv2_stage6" 1595 | } 1596 | layer { 1597 | name: "Mconv3_stage6" 1598 | type: "Convolution" 1599 | bottom: "Mconv2_stage6" 1600 | top: "Mconv3_stage6" 1601 | param { 1602 | lr_mult: 4.0 1603 | decay_mult: 1 1604 | } 1605 | param { 1606 | lr_mult: 8.0 1607 | decay_mult: 0 1608 | } 1609 | convolution_param { 1610 | num_output: 128 1611 | pad: 3 1612 | kernel_size: 7 1613 | weight_filler { 1614 | type: "gaussian" 1615 | std: 0.01 1616 | } 1617 | bias_filler { 1618 | type: "constant" 1619 | } 1620 | dilation: 1 1621 | } 1622 | } 1623 | layer { 1624 | name: "Mrelu1_4_stage6_3" 1625 | type: "ReLU" 1626 | bottom: "Mconv3_stage6" 1627 | top: "Mconv3_stage6" 1628 | } 1629 | layer { 1630 | name: "Mconv4_stage6" 1631 | type: "Convolution" 1632 | bottom: "Mconv3_stage6" 1633 | top: "Mconv4_stage6" 1634 | param { 1635 | lr_mult: 4.0 1636 | decay_mult: 1 1637 | } 1638 | param { 1639 | lr_mult: 8.0 1640 | decay_mult: 0 1641 | } 1642 | convolution_param { 1643 | num_output: 128 1644 | pad: 3 1645 | kernel_size: 7 1646 | weight_filler { 1647 | type: "gaussian" 1648 | std: 0.01 1649 | } 1650 | bias_filler 
{ 1651 | type: "constant" 1652 | } 1653 | dilation: 1 1654 | } 1655 | } 1656 | layer { 1657 | name: "Mrelu1_5_stage6_4" 1658 | type: "ReLU" 1659 | bottom: "Mconv4_stage6" 1660 | top: "Mconv4_stage6" 1661 | } 1662 | layer { 1663 | name: "Mconv5_stage6" 1664 | type: "Convolution" 1665 | bottom: "Mconv4_stage6" 1666 | top: "Mconv5_stage6" 1667 | param { 1668 | lr_mult: 4.0 1669 | decay_mult: 1 1670 | } 1671 | param { 1672 | lr_mult: 8.0 1673 | decay_mult: 0 1674 | } 1675 | convolution_param { 1676 | num_output: 128 1677 | pad: 3 1678 | kernel_size: 7 1679 | weight_filler { 1680 | type: "gaussian" 1681 | std: 0.01 1682 | } 1683 | bias_filler { 1684 | type: "constant" 1685 | } 1686 | dilation: 1 1687 | } 1688 | } 1689 | layer { 1690 | name: "Mrelu1_6_stage6_5" 1691 | type: "ReLU" 1692 | bottom: "Mconv5_stage6" 1693 | top: "Mconv5_stage6" 1694 | } 1695 | layer { 1696 | name: "Mconv6_stage6" 1697 | type: "Convolution" 1698 | bottom: "Mconv5_stage6" 1699 | top: "Mconv6_stage6" 1700 | param { 1701 | lr_mult: 4.0 1702 | decay_mult: 1 1703 | } 1704 | param { 1705 | lr_mult: 8.0 1706 | decay_mult: 0 1707 | } 1708 | convolution_param { 1709 | num_output: 128 1710 | pad: 0 1711 | kernel_size: 1 1712 | weight_filler { 1713 | type: "gaussian" 1714 | std: 0.01 1715 | } 1716 | bias_filler { 1717 | type: "constant" 1718 | } 1719 | dilation: 1 1720 | } 1721 | } 1722 | layer { 1723 | name: "Mrelu1_7_stage6_6" 1724 | type: "ReLU" 1725 | bottom: "Mconv6_stage6" 1726 | top: "Mconv6_stage6" 1727 | } 1728 | layer { 1729 | name: "Mconv7_stage6" 1730 | type: "Convolution" 1731 | bottom: "Mconv6_stage6" 1732 | # top: "Mconv7_stage6" 1733 | top: "net_output" 1734 | param { 1735 | lr_mult: 4.0 1736 | decay_mult: 1 1737 | } 1738 | param { 1739 | lr_mult: 8.0 1740 | decay_mult: 0 1741 | } 1742 | convolution_param { 1743 | num_output: 22 1744 | pad: 0 1745 | kernel_size: 1 1746 | weight_filler { 1747 | type: "gaussian" 1748 | std: 0.01 1749 | } 1750 | bias_filler { 1751 | type: "constant" 1752 | } 1753 | dilation: 1 1754 | } 1755 | } 1756 | 1757 |
--------------------------------------------------------------------------------
/hand-estimator-using-caffemodel/test_out/README.md:
--------------------------------------------------------------------------------
1 | This directory is for saving the processed outputs.
2 | 
--------------------------------------------------------------------------------
/hand-estimator-using-caffemodel/test_out/hand_out.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LZQthePlane/OpenPose-Rebuilt-Python/d7c927ec2d3dd724a8822bb58c3c6a936450eb14/hand-estimator-using-caffemodel/test_out/hand_out.jpg
--------------------------------------------------------------------------------
/hand-estimator-using-caffemodel/test_out/shake_out.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LZQthePlane/OpenPose-Rebuilt-Python/d7c927ec2d3dd724a8822bb58c3c6a936450eb14/hand-estimator-using-caffemodel/test_out/shake_out.jpg
--------------------------------------------------------------------------------
/pose-estimator-tensorflow/README.md:
--------------------------------------------------------------------------------
1 | 
2 | ## Usage Examples:
3 | ***for `openpose_tf.py`***
4 | 1. Create a folder named `graph_model_coco`, and place the pretrained model there;
5 | 2. Put the test file (image or video) in the same directory as `openpose_tf.py`.
6 | 3. Command line input:
7 | - `python openpose_tf.py --image=test.jpg`
8 | - `python openpose_tf.py --video=test.mp4`
9 | - if no argument is provided, the webcam is used.
10 | 
11 | ***for `openpose_skeleton_sequence_drawer.py`***
12 | 1. Same as above;
13 | 2. Command line input:
14 | - `python openpose_skeleton_sequence_drawer.py`
15 | 
16 | ## Pretrained models intro
17 | - **graph_opt.pb**: trained with the VGG net, the same as the caffemodel provided by CMU; ***more accurate but slower*** (see the loading sketch below)
18 | - **graph_opt_mobile.pb**: trained with MobileNet, much smaller than the original VGG; ***faster but less accurate***
19 | 
20 | ## Acknowledgement
21 | Thanks to [ildoonet](https://github.com/ildoonet) and his awesome work [tf-pose-estimation](https://github.com/ildoonet/tf-pose-estimation), where the graph weight files were collected.
22 | 
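Both graphs are loaded the same way; only the path differs. A minimal TF1-style sketch mirroring how `TfPoseEstimator` in the script below imports the frozen graph (the tensor names are taken from that class):

```python
import tensorflow as tf

graph_path = 'graph_model_coco/graph_opt.pb'  # or the MobileNet variant, graph_opt_mobile.pb
with tf.gfile.GFile(graph_path, 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

graph = tf.get_default_graph()
tf.import_graph_def(graph_def, name='TfPoseEstimator')
sess = tf.Session(graph=graph)

image_tensor = graph.get_tensor_by_name('TfPoseEstimator/image:0')
output_tensor = graph.get_tensor_by_name('TfPoseEstimator/Openpose/concat_stage7:0')  # 19 heatmaps + 38 PAFs
```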
--------------------------------------------------------------------------------
/pose-estimator-tensorflow/openpose_skeleton_sequence_drawer.py:
--------------------------------------------------------------------------------
1 | import cv2 as cv
2 | import numpy as np
3 | import os
4 | import math
5 | import argparse
6 | import sys
7 | import itertools
8 | import tensorflow as tf
9 | from enum import Enum
10 | from collections import namedtuple
11 | from scipy.ndimage import maximum_filter, gaussian_filter
12 | from imutils.video import FPS
13 | from pathlib import Path
14 | import time
15 | 
16 | 
17 | class Human:
18 |     """
19 |     body_parts: list of BodyPart
20 |     """
21 |     __slots__ = ('body_parts', 'pairs', 'uidx_list')
22 | 
23 |     def __init__(self, pairs):
24 |         self.pairs = []
25 |         self.uidx_list = set()
26 |         self.body_parts = {}
27 |         for pair in pairs:
28 |             self.add_pair(pair)
29 | 
30 |     @staticmethod
31 |     def _get_uidx(part_idx, idx):
32 |         return '%d-%d' % (part_idx, idx)
33 | 
34 |     def add_pair(self, pair):
35 |         self.pairs.append(pair)
36 |         self.body_parts[pair.part_idx1] = BodyPart(Human._get_uidx(pair.part_idx1, pair.idx1),
37 |                                                    pair.part_idx1,
38 |                                                    pair.coord1[0], pair.coord1[1], pair.score)
39 |         self.body_parts[pair.part_idx2] = BodyPart(Human._get_uidx(pair.part_idx2, pair.idx2),
40 |                                                    pair.part_idx2,
41 |                                                    pair.coord2[0], pair.coord2[1], pair.score)
42 |         self.uidx_list.add(Human._get_uidx(pair.part_idx1, pair.idx1))
43 |         self.uidx_list.add(Human._get_uidx(pair.part_idx2, pair.idx2))
44 | 
45 |     def is_connected(self, other):
46 |         return len(self.uidx_list & other.uidx_list) > 0
47 | 
48 |     def merge(self, other):
49 |         for pair in other.pairs:
50 |             self.add_pair(pair)
51 | 
52 |     def part_count(self):
53 |         return len(self.body_parts.keys())
54 | 
55 |     def get_max_score(self):
56 |         return max([x.score for _, x in self.body_parts.items()])
57 | 
58 |     def __str__(self):
59 |         return ' '.join([str(x) for x in self.body_parts.values()])
60 | 
61 | 
62 | class BodyPart:
63 |     """
64 |     part_idx : part index (e.g. 0 for nose)
65 |     x, y: coordinate of body part
66 |     score : confidence score
67 |     """
68 |     __slots__ = ('uidx', 'part_idx', 'x', 'y', 'score')
69 | 
70 |     def __init__(self, uidx, part_idx, x, y, score):
71 |         self.uidx = uidx
72 |         self.part_idx = part_idx
73 |         self.x, self.y = x, y
74 |         self.score = score
75 | 
76 | 
77 | class TfPoseEstimator:
78 |     ENSEMBLE = 'addup'  # average, addup
79 | 
80 |     def __init__(self, graph_path, target_size=(368, 368)):
81 |         self.target_size = target_size
82 | 
83 |         # load graph
84 |         with tf.gfile.GFile(graph_path, 'rb') as f:
85 |             graph_def = tf.GraphDef()
86 |             graph_def.ParseFromString(f.read())
87 | 
88 |         self.graph = tf.get_default_graph()
89 |         tf.import_graph_def(graph_def, name='TfPoseEstimator')
90 |         self.persistent_sess = tf.Session(graph=self.graph)
91 | 
92 |         self.tensor_image = self.graph.get_tensor_by_name('TfPoseEstimator/image:0')
93 |         self.tensor_output = self.graph.get_tensor_by_name('TfPoseEstimator/Openpose/concat_stage7:0')
94 |         self.heatMat = self.pafMat = None
95 | 
96 |     @staticmethod
97 |     def draw_pose_bbox(npimg, humans, imgcopy=False):
98 |         if imgcopy:
99 |             npimg = np.copy(npimg)
100 |         image_h, image_w = npimg.shape[:2]
101 |         joints, bboxes, xcenter = [], [], []
102 |         for human in humans:
103 |             xs, ys, centers = [], [], {}
104 |             # Draw all detected keypoints onto the image
105 |             for i in range(CocoPart.Background.value):
106 |                 if i not in human.body_parts.keys():
107 |                     continue
108 | 
109 |                 body_part = human.body_parts[i]
110 |                 center = (int(body_part.x * image_w + 0.5), int(body_part.y * image_h + 0.5))
111 | 
112 |                 centers[i] = center
113 |                 xs.append(center[0])
114 |                 ys.append(center[1])
115 |                 # Draw the keypoint
116 |                 cv.circle(npimg, center, 3, CocoColors[i], thickness=3, lineType=8, shift=0)
117 |             # Connect the keypoints of the same person, part by part
118 |             for pair_order, pair in enumerate(CocoPairsRender):
119 |                 if pair[0] not in human.body_parts.keys() or pair[1] not in human.body_parts.keys():
120 |                     continue
121 |                 cv.line(npimg, centers[pair[0]], centers[pair[1]], CocoColors[pair_order], 3, cv.LINE_AA)
122 |             # Build an ROI from each person's keypoints
123 |             xmin = float(min(xs) / image_w)
124 |             ymin = float(min(ys) / image_h)
125 |             xmax = float(max(xs) / image_w)
126 |             ymax = float(max(ys) / image_h)
127 |             bboxes.append([xmin, ymin, xmax, ymax, 0.9999])
128 |             joints.append(centers)
129 |             if 1 in centers:
130 |                 xcenter.append(centers[1][0])
131 | 
132 |             # draw bounding_boxes
133 |             # x_start, x_end= int(xmin*image_w) - 20, int(xmax*image_w) + 20
134 |             # y_start, y_end =int(ymin*image_h) - 15, int(ymax*image_h) + 15
135 |             # cv.rectangle(npimg, (x_start, y_start), (x_end, y_end), [0, 250, 0], 3)
136 |         return npimg, joints, bboxes, xcenter
137 | 
138 |     @staticmethod
139 |     def draw_pose_sequence(frame, humans, cnt):
140 |         # global back_ground
141 |         image_h, image_w = frame.shape[:2]
142 |         for human in humans:
143 |             xs, ys, centers = [], [], {}
144 |             # Draw all detected keypoints onto the image
145 |             joints_num = CocoPart.Background.value
146 |             for i in range(joints_num):
147 |                 if i not in human.body_parts.keys():
148 |                     continue
149 | 
150 |                 body_part = human.body_parts[i]
151 |                 center = (int(body_part.x * image_w + 0.5),
152 |                           int(body_part.y * image_h + 0.5))
153 | 
154 |                 pos_move = cnt * back_ground_w / 500
155 |                 center = (int(center[0] + pos_move / 2),
156 |                           int(center[1] + back_ground_h / 1.5 - pos_move))
157 | 
158 |                 centers[i] = center
159 |                 xs.append(center[0])
160 |                 ys.append(center[1])
161 |                 # Draw the keypoint
162 |                 cv.circle(back_ground, centers[i], 3, CocoColors[i], thickness=3, lineType=8, shift=0)
163 | 
164 |             # Connect the keypoints of the same person, part by part
165 |             for pair_order, pair in enumerate(CocoPairsRender):
166 |                 if pair[0] not in human.body_parts.keys() or pair[1] not in human.body_parts.keys():
167 |                     continue
168 |                 cv.line(back_ground, centers[pair[0]], centers[pair[1]], CocoColors[pair_order], 3, cv.LINE_AA)
169 | 
170 |     @staticmethod
171 |     def draw_pose_sequence_reverse(frame, humans, cnt):
172 |         image_h, image_w = frame.shape[:2]
173 |         for human in humans:
174 |             xs, ys, centers = [], [], {}
175 |             # Draw all detected keypoints onto the image
176 |             joints_num = CocoPart.Background.value
177 |             for i in range(joints_num):
178 |                 if i not in human.body_parts.keys():
179 |                     continue
180 | 
181 |                 body_part = human.body_parts[i]
182 |                 center = (int(body_part.x * image_w + 0.5),
183 |                           int(body_part.y * image_h + 0.5))
184 | 
185 |                 pos_move = cnt * back_ground_w / 500
186 |                 center = (int(center[0] + back_ground_w/1.5 - pos_move/2),
187 |                           int(center[1] + pos_move))
188 | 
189 |                 centers[i] = center
190 |                 xs.append(center[0])
191 |                 ys.append(center[1])
192 |                 # Draw the keypoint
193 |                 cv.circle(back_ground, centers[i], 3, CocoColors[i], thickness=3, lineType=8, shift=0)
194 | 
195 |             # Connect the keypoints of the same person, part by part
196 |             for pair_order, pair in enumerate(CocoPairsRender):
197 |                 if pair[0] not in human.body_parts.keys() or pair[1] not in human.body_parts.keys():
198 |                     continue
199 |                 cv.line(back_ground, centers[pair[0]], centers[pair[1]], CocoColors[pair_order], 3, cv.LINE_AA)
200 | 
201 |     @staticmethod
202 |     def draw_pose_sequence_horizontally(frame, humans, cnt):
203 |         image_h, image_w = frame.shape[:2]
204 |         for human in humans:
205 |             xs, ys, centers = [], [], {}
206 |             # Draw all detected keypoints onto the image
207 |             joints_num = CocoPart.Background.value
208 |             for i in range(joints_num):
209 |                 if i not in human.body_parts.keys():
210 |                     continue
211 | 
212 |                 body_part = human.body_parts[i]
213 |                 center = (int(body_part.x * image_w + 0.5),
214 |                           int(body_part.y * image_h + 0.5))
215 | 
216 |                 pos_move = cnt * back_ground_w / 200
217 |                 center = (int(center[0] + pos_move), center[1])
218 | 
219 |                 centers[i] = center
220 |                 xs.append(center[0])
221 |                 ys.append(center[1])
222 |                 # Draw the keypoint
223 |                 cv.circle(back_ground, centers[i], 3, CocoColors[i], thickness=3, lineType=8, shift=0)
224 | 
225 |             # Connect the keypoints of the same person, part by part
226 |             for pair_order, pair in enumerate(CocoPairsRender):
227 |                 if pair[0] not in human.body_parts.keys() or pair[1] not in human.body_parts.keys():
228 |                     continue
229 |                 cv.line(back_ground, centers[pair[0]], centers[pair[1]], CocoColors[pair_order], 3, cv.LINE_AA)
230 | 
231 |     def inference(self, npimg):
232 |         if npimg is None:
233 |             raise Exception('The image does not exist.')
234 | 
235 |         rois = []
236 |         infos = []
237 |         # _get_scaled_img
238 |         if npimg.shape[:2] != (self.target_size[1], self.target_size[0]):
239 |             # resize
240 |             npimg = cv.resize(npimg, self.target_size)
241 |         rois.extend([npimg])
242 |         infos.extend([(0.0, 0.0, 1.0, 1.0)])
243 | 
244 |         output = self.persistent_sess.run(self.tensor_output, feed_dict={self.tensor_image: rois})
245 | 
246 |         heat_mats = output[:, :, :, :19]
247 |         paf_mats = output[:, :, :, 19:]
248 | 
249 |         output_h, output_w = output.shape[1:3]
250 |         max_ratio_w = max_ratio_h = 10000.0
251 |         for info in infos:
252 |             max_ratio_w = min(max_ratio_w, info[2])
253 |             max_ratio_h = min(max_ratio_h, info[3])
254 |         mat_w, mat_h = int(output_w / max_ratio_w), int(output_h / max_ratio_h)
255 | 
256 |         resized_heat_mat = np.zeros((mat_h, mat_w, 19), dtype=np.float32)
257 |         resized_paf_mat = np.zeros((mat_h, mat_w, 38), dtype=np.float32)
258 |         resized_cnt_mat = np.zeros((mat_h, mat_w, 1), dtype=np.float32)
259 |         resized_cnt_mat += 1e-12
260 | 
261 |         for heatMat, pafMat, info in zip(heat_mats, paf_mats, infos):
262 |             w, h = int(info[2] * mat_w), int(info[3] * mat_h)
263 |             heatMat = cv.resize(heatMat, (w, h))
264 |             pafMat = cv.resize(pafMat, (w, h))
265 |             x, y = int(info[0] * mat_w), int(info[1] * mat_h)
266 |             # add up
267 |             resized_heat_mat[max(0, y):y + h, max(0, x):x + w, :] = np.maximum(
268 |                 resized_heat_mat[max(0, y):y + h, max(0, x):x + w, :], heatMat[max(0, -y):, max(0, -x):, :])
269 |             resized_paf_mat[max(0, y):y + h, max(0, x):x + w, :] += pafMat[max(0, -y):, max(0, -x):, :]
270 |             resized_cnt_mat[max(0, y):y + h, max(0, x):x + w, :] += 1
271 | 
272 |         self.heatMat = resized_heat_mat
273 |         self.pafMat = resized_paf_mat / (np.log(resized_cnt_mat) + 1)
274 | 
275 |         humans = PoseEstimator.estimate(self.heatMat, self.pafMat)
276 |         return humans
277 | 
278 | 
279 | class PoseEstimator:
280 |     heatmap_supress = False
281 |     heatmap_gaussian = True
282 |     adaptive_threshold = False
283 | 
284 |     NMS_Threshold = 0.15
285 |     Local_PAF_Threshold = 0.2
286 |     PAF_Count_Threshold = 5
287 |     Part_Count_Threshold = 4
288 |     Part_Score_Threshold = 4.5
289 | 
290 |     PartPair = namedtuple('PartPair', ['score', 'part_idx1', 'part_idx2', 'idx1', 'idx2',
291 |                                        'coord1', 'coord2', 'score1', 'score2'], verbose=False)
292 | 
293 |     @staticmethod
294 |     def non_max_suppression(plain, window_size=3, threshold=NMS_Threshold):
295 |         under_threshold_indices = plain < threshold
296 |         plain[under_threshold_indices] = 0
297 |         return plain * (plain == maximum_filter(plain, footprint=np.ones((window_size, window_size))))
298 | 
299 |     @staticmethod
300 |     def estimate(heat_mat, paf_mat):
301 |         if heat_mat.shape[2] == 19:
302 |             heat_mat = np.rollaxis(heat_mat, 2, 0)
303 |         if paf_mat.shape[2] == 38:
304 |             paf_mat = np.rollaxis(paf_mat, 2, 0)
305 | 
306 |         if PoseEstimator.heatmap_supress:
307 |             heat_mat = heat_mat - heat_mat.min(axis=1).min(axis=1).reshape(19, 1, 1)
308 |             heat_mat = heat_mat - heat_mat.min(axis=2).reshape(19, heat_mat.shape[1], 1)
309 | 
310 |         if PoseEstimator.heatmap_gaussian:
311 |             heat_mat = gaussian_filter(heat_mat, sigma=0.5)
312 | 
313 |         if PoseEstimator.adaptive_threshold:
314 |             _NMS_Threshold = max(np.average(heat_mat) * 4.0, PoseEstimator.NMS_Threshold)
315 |             _NMS_Threshold = min(_NMS_Threshold, 0.3)
316 |         else:
317 |             _NMS_Threshold = PoseEstimator.NMS_Threshold
318 | 
319 |         # extract interesting coordinates using NMS.
320 |         coords = []  # [[coords in plane1], [....], ...]
321 |         for plain in heat_mat[:-1]:
322 |             nms = PoseEstimator.non_max_suppression(plain, 5, _NMS_Threshold)
323 |             coords.append(np.where(nms >= _NMS_Threshold))
324 | 
325 |         # score pairs
326 |         pairs_by_conn = list()
327 |         for (part_idx1, part_idx2), (paf_x_idx, paf_y_idx) in zip(CocoPairs, CocoPairsNetwork):
328 |             pairs = PoseEstimator.score_pairs(
329 |                 part_idx1, part_idx2,
330 |                 coords[part_idx1], coords[part_idx2],
331 |                 paf_mat[paf_x_idx], paf_mat[paf_y_idx],
332 |                 heatmap=heat_mat,
333 |                 rescale=(1.0 / heat_mat.shape[2], 1.0 / heat_mat.shape[1])
334 |             )
335 | 
336 |             pairs_by_conn.extend(pairs)
337 | 
338 |         # merge pairs to human
339 |         # pairs_by_conn is sorted by CocoPairs (part importance) and score between parts.
340 |         humans = [Human([pair]) for pair in pairs_by_conn]
341 |         while True:
342 |             merge_items = None
343 |             for k1, k2 in itertools.combinations(humans, 2):
344 |                 if k1 == k2:
345 |                     continue
346 |                 if k1.is_connected(k2):
347 |                     merge_items = (k1, k2)
348 |                     break
349 | 
350 |             if merge_items is not None:
351 |                 merge_items[0].merge(merge_items[1])
352 |                 humans.remove(merge_items[1])
353 |             else:
354 |                 break
355 | 
356 |         # reject by subset count
357 |         humans = [human for human in humans if human.part_count() >= PoseEstimator.PAF_Count_Threshold]
358 |         # reject by subset max score
359 |         humans = [human for human in humans if human.get_max_score() >= PoseEstimator.Part_Score_Threshold]
360 |         return humans
361 | 
362 |     @staticmethod
363 |     def score_pairs(part_idx1, part_idx2, coord_list1, coord_list2, paf_mat_x, paf_mat_y, heatmap, rescale=(1.0, 1.0)):
364 |         connection_temp = []
365 | 
366 |         cnt = 0
367 |         for idx1, (y1, x1) in enumerate(zip(coord_list1[0], coord_list1[1])):
368 |             for idx2, (y2, x2) in enumerate(zip(coord_list2[0], coord_list2[1])):
369 |                 score, count = PoseEstimator.get_score(x1, y1, x2, y2, paf_mat_x, paf_mat_y)
370 |                 cnt += 1
371 |                 if count < PoseEstimator.PAF_Count_Threshold or score <= 0.0:
372 |                     continue
373 |                 connection_temp.append(PoseEstimator.PartPair(
374 |                     score=score,
375 |                     part_idx1=part_idx1, part_idx2=part_idx2,
376 |                     idx1=idx1, idx2=idx2,
377 |                     coord1=(x1 * rescale[0], y1 * rescale[1]),
378 |                     coord2=(x2 * rescale[0], y2 * rescale[1]),
379 |                     score1=heatmap[part_idx1][y1][x1],
380 |                     score2=heatmap[part_idx2][y2][x2],
381 |                 ))
382 | 
383 |         connection = []
384 |         used_idx1, used_idx2 = set(), set()
385 |         for candidate in sorted(connection_temp, key=lambda x: x.score, reverse=True):
386 |             # greedy one-to-one matching: skip joints that are already connected
387 |             if candidate.idx1 in used_idx1 or candidate.idx2 in used_idx2:
388 |                 continue
389 |             connection.append(candidate)
390 |             used_idx1.add(candidate.idx1)
391 |             used_idx2.add(candidate.idx2)
392 | 
393 |         return connection
394 | 
395 |     @staticmethod
396 |     def get_score(x1, y1, x2, y2, paf_mat_x, paf_mat_y):
397 |         __num_inter = 10
398 |         __num_inter_f = float(__num_inter)
399 |         dx, dy = x2 - x1, y2 - y1
400 |         normVec = math.sqrt(dx ** 2 + dy ** 2)
401 | 
402 |         if normVec < 1e-4:
403 |             return 0.0, 0
404 | 
405 |         vx, vy = dx / normVec, dy / normVec
406 | 
407 |         xs = np.arange(x1, x2, dx / __num_inter_f) if x1 != x2 else np.full((__num_inter,), x1)
408 |         ys = np.arange(y1, y2, dy / __num_inter_f) if y1 != y2 else np.full((__num_inter,), y1)
409 |         xs = (xs + 0.5).astype(np.int32)  # int32: map coordinates can exceed the int8 range
410 |         ys = (ys + 0.5).astype(np.int32)
411 | 
412 |         # without vectorization
413 |         pafXs = np.zeros(__num_inter)
414 |         pafYs = np.zeros(__num_inter)
415 |         for idx, (mx, my) in enumerate(zip(xs, ys)):
416 |             pafXs[idx] = paf_mat_x[my][mx]
417 |             pafYs[idx] = paf_mat_y[my][mx]
418 | 
419 |         local_scores = pafXs * vx + pafYs * vy
420 |         thidxs = local_scores > PoseEstimator.Local_PAF_Threshold
421 | 
422 |         return sum(local_scores * thidxs), sum(thidxs)
423 | 
424 | 
425 | class CocoPart(Enum):
426 |     Nose = 0
427 |     Neck = 1
428 |     RShoulder = 2
429 |     RElbow = 3
430 |     RWrist = 4
431 |     LShoulder = 5
432 |     LElbow = 6
433 |     LWrist = 7
434 |     RHip = 8
435 |     RKnee = 9
436 |     RAnkle = 10
437 |     LHip = 11
438 |     LKnee = 12
439 |     LAnkle = 13
440 |     REye = 14
441 |     LEye = 15
442 |     REar = 16
443 |     LEar = 17
444 |     Background = 18
445 | 
446 | 
447 | FilePath = Path.cwd()
448 | outFilePath = Path(FilePath / "test_out/")
449 | # two input options: height 368 * width 368 (aspect ratio not kept); height 368 * scaled width (aspect ratio kept)
450 | input_width, input_height = 490, 368
451 | 
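# The following is a minimal, self-contained sketch of the line-integral scoring
# that get_score() above implements: sample points along a candidate limb, read the
# PAF vector at each sample, and keep only the components aligned with the limb
# direction. The name `paf_line_score` and its defaults are illustrative only and
# are not used anywhere else in this module.
def paf_line_score(p1, p2, paf_x, paf_y, n_samples=10, thresh=0.2):
    """Return (summed aligned score, aligned sample count) for the limb p1 -> p2."""
    (x1, y1), (x2, y2) = p1, p2
    norm = np.hypot(x2 - x1, y2 - y1)
    if norm < 1e-4:
        return 0.0, 0
    ux, uy = (x2 - x1) / norm, (y2 - y1) / norm  # unit vector along the limb
    xs = np.round(np.linspace(x1, x2, n_samples)).astype(np.int32)
    ys = np.round(np.linspace(y1, y2, n_samples)).astype(np.int32)
    dots = paf_x[ys, xs] * ux + paf_y[ys, xs] * uy  # projection onto the limb
    aligned = dots > thresh
    return float((dots * aligned).sum()), int(aligned.sum())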
452 | # Usage example: python openpose_skeleton_sequence_drawer.py --video=test.mp4
453 | parser = argparse.ArgumentParser(description='OpenPose for pose skeleton in python')
454 | parser.add_argument('--image', help='Path to image file.')
455 | parser.add_argument('--video', help='Path to video file.')
456 | args = parser.parse_args()
457 | 
458 | skeleton_estimator = None
459 | nPoints = 18
460 | CocoPairs = [(1, 2), (1, 5), (2, 3), (3, 4), (5, 6), (6, 7), (1, 8),
461 |              (8, 9), (9, 10), (1, 11), (11, 12), (12, 13), (1, 0), (0, 14),
462 |              (14, 16), (0, 15), (15, 17), (2, 16), (5, 17)]  # = 19
463 | CocoPairsRender = CocoPairs[:-2]
464 | CocoPairsNetwork = [(12, 13), (20, 21), (14, 15), (16, 17), (22, 23), (24, 25), (0, 1),
465 |                     (2, 3), (4, 5), (6, 7), (8, 9), (10, 11), (28, 29), (30, 31),
466 |                     (34, 35), (32, 33), (36, 37), (18, 19), (26, 27)]  # = 19
467 | 
468 | # CocoColors = [[0, 100, 255], [0, 100, 255], [0, 255, 255], [0, 100, 255], [0, 255, 255], [0, 100, 255],
469 | #               [0, 255, 0], [255, 200, 100], [255, 0, 255], [0, 255, 0], [255, 200, 100], [255, 0, 255],
470 | #               [0, 0, 255], [255, 0, 0], [200, 200, 0], [255, 0, 0], [200, 200, 0], [0, 0, 0]]
471 | CocoColors = [[255, 0, 0], [255, 85, 0], [255, 170, 0], [255, 255, 0], [170, 255, 0], [85, 255, 0], [0, 255, 0],
472 |               [0, 255, 85], [0, 255, 170], [0, 255, 255], [0, 170, 255], [0, 85, 255], [0, 0, 255], [85, 0, 255],
473 |               [170, 0, 255], [255, 0, 255], [255, 0, 170], [255, 0, 85]]
474 | 
475 | 
476 | def choose_run_mode():
477 |     global outFilePath
478 |     if args.image:
479 |         # Open the image file
480 |         if not os.path.isfile(args.image):
481 |             print("Input image file ", args.image, " doesn't exist")
482 |             sys.exit(1)
483 |         cap = cv.VideoCapture(args.image)
484 |         outFilePath = str(outFilePath / (args.image[:-4] + '_out.jpg'))
485 |     elif args.video:
486 |         # Open the video file
487 |         if not os.path.isfile(args.video):
488 |             print("Input video file ", args.video, " doesn't exist")
489 |             sys.exit(1)
490 |         cap = cv.VideoCapture(args.video)
491 |         outFilePath = str(outFilePath / (args.video[:-4] + '_out.mp4'))
492 |     else:
493 |         # Webcam input
494 |         cap = cv.VideoCapture(0)
495 |         outFilePath = str(outFilePath / 'webcam_out.mp4')
496 |     return cap
497 | 
498 | 
499 | def load_pretrain_model():
500 |     global skeleton_estimator
501 |     skeleton_estimator = TfPoseEstimator(
502 |         get_graph_path('VGG_origin'), target_size=(input_width, input_height))
503 | 
504 | 
505 | def get_graph_path(model_name):
506 |     dyn_graph_path = {
507 |         'VGG_origin': str(FilePath / "graph_model_coco/graph_opt.pb"),
508 |         'mobilenet': str(FilePath / "graph_model_coco/graph_opt_mobile.pb")
509 |     }
510 |     graph_path = dyn_graph_path[model_name]
511 |     if not os.path.isfile(graph_path):
512 |         raise Exception('Graph file doesn\'t exist, path=%s' % graph_path)
513 |     return graph_path
514 | 
515 | 
516 | if __name__ == "__main__":
517 |     cap = choose_run_mode()
518 |     load_pretrain_model()
519 |     fps = FPS().start()
520 |     vid_writer = cv.VideoWriter(outFilePath, cv.VideoWriter_fourcc(*'mp4v'), 15,
521 |                                 (round(cap.get(cv.CAP_PROP_FRAME_WIDTH)),
522 |                                  round(cap.get(cv.CAP_PROP_FRAME_HEIGHT))))
523 | 
524 |     # white canvas for drawing the skeleton sequence; the canvas size can be changed
525 |     back_ground_h, back_ground_w = 1500, 2400
526 |     back_ground = np.ones((back_ground_h, back_ground_w), dtype=np.uint8)
527 |     back_ground = cv.cvtColor(back_ground, cv.COLOR_GRAY2BGR)
528 |     back_ground[:, :, :] = 255  # white background
529 |     # sample the skeleton once every `interval` frames
530 |     interval = 12
531 | 
532 |     # parameters for online FPS calculation
533 |     start_time = time.time()
534 |     fps_interval = 1  # recompute the frame rate every second
535 |     fps_count = 0
536 |     realtime_fps = 'Starting'
537 | 
538 |     frame_count = 0
539 |     while cv.waitKey(1) < 0:
540 |         has_frame, frame = cap.read()
541 |         if not has_frame:
542 |             cv.waitKey(3000)
543 |             break
544 |         # keep an unannotated copy of the frame for the video writer
545 |         img_copy = np.copy(frame)
546 |         frame_count += 1
547 |         fps_count += 1
548 |         humans = skeleton_estimator.inference(frame)
549 |         # frame_show = frame
550 |         frame_show = TfPoseEstimator.draw_pose_bbox(frame, humans)[0]
551 | 
552 |         if frame_count % interval == 0 and frame_count <= 180:
553 |             TfPoseEstimator.draw_pose_sequence_horizontally(frame, humans, frame_count)
554 | 
555 |         # real-time FPS display
556 |         fps_show = 'FPS:{0:.4}'.format(realtime_fps)
557 |         cv.putText(frame, fps_show, (5, 15), cv.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 255), 2)
558 |         if (time.time() - start_time) > fps_interval:
559 |             # frames counted during this interval; with a 1-second interval this equals the FPS
560 |             realtime_fps = fps_count / (time.time() - start_time)
561 |             fps_count = 0  # reset the frame counter
562 |             start_time = time.time()
563 | 
564 |         winName = 'Pose Skeleton from OpenPose'
565 |         cv.imshow(winName, frame_show)
566 | 
567 |         # Write the frame with the detection boxes
568 |         if args.image:
569 |             cv.imwrite(outFilePath, frame_show)
570 |         else:
571 |             # vid_writer.write(frame_show)
572 |             vid_writer.write(img_copy)
573 | 
574 |     if not args.image:
575 |         fps.stop()
576 |         vid_writer.release()
577 |     cap.release()
578 |     cv.imwrite('skeleton_sequence.jpg', back_ground)
579 |     cv.destroyAllWindows()
580 | 
--------------------------------------------------------------------------------
/pose-estimator-tensorflow/openpose_tf.py:
--------------------------------------------------------------------------------
1 | import cv2 as cv
2 | import numpy as np
3 | import os
4 | import math
5 | import argparse
6 | import sys
7 | import itertools
8 | import tensorflow as tf
9 | from enum import Enum
10 | from collections import namedtuple
11 | from scipy.ndimage import maximum_filter, gaussian_filter
12 | from imutils.video import FPS
13 | from pathlib import Path
14 | 
15 | 
16 | class Human:
17 |     """
18 |     body_parts: list of BodyPart
19 |     """
20 |     __slots__ = ('body_parts', 'pairs', 'uidx_list')
21 | 
22 |     def __init__(self, pairs):
23 |         self.pairs = []
24 |         self.uidx_list = set()
25 |         self.body_parts = {}
26 |         for pair in pairs:
27 |             self.add_pair(pair)
28 | 
29 |     @staticmethod
30 |     def _get_uidx(part_idx, idx):
31 |         return '%d-%d' % (part_idx, idx)
32 | 
33 |     def add_pair(self, pair):
34 |         self.pairs.append(pair)
35 |         self.body_parts[pair.part_idx1] = BodyPart(Human._get_uidx(pair.part_idx1, pair.idx1),
36 |                                                    pair.part_idx1,
37 |                                                    pair.coord1[0], pair.coord1[1], pair.score)
38 |         self.body_parts[pair.part_idx2] = BodyPart(Human._get_uidx(pair.part_idx2, pair.idx2),
39 |                                                    pair.part_idx2,
40 |                                                    pair.coord2[0], pair.coord2[1], pair.score)
41 |         self.uidx_list.add(Human._get_uidx(pair.part_idx1, pair.idx1))
42 |         self.uidx_list.add(Human._get_uidx(pair.part_idx2, pair.idx2))
43 | 
44 |     def is_connected(self, other):
45 |         return len(self.uidx_list & other.uidx_list) > 0
46 | 
47 |     def merge(self, other):
48 |         for pair in other.pairs:
49 |             self.add_pair(pair)
50 | 
51 |     def part_count(self):
52 |         return len(self.body_parts.keys())
53 | 
54 |     def get_max_score(self):
55 |         return max([x.score for _, x in self.body_parts.items()])
56 | 
57 |     def __str__(self):
58 |         return ' '.join([str(x) for x in self.body_parts.values()])
59 | 
60 | 
61 | class BodyPart:
62 |     """
63 |     part_idx : part index (e.g. 0 for nose)
64 |     x, y : coordinate of body part
65 |     score : confidence score
66 |     """
67 |     __slots__ = ('uidx', 'part_idx', 'x', 'y', 'score')
68 | 
69 |     def __init__(self, uidx, part_idx, x, y, score):
70 |         self.uidx = uidx
71 |         self.part_idx = part_idx
72 |         self.x, self.y = x, y
73 |         self.score = score
74 | 
75 | 
76 | class TfPoseEstimator:
77 |     ENSEMBLE = 'addup'  # average, addup
78 | 
79 |     def __init__(self, graph_path, target_size=(320, 240)):
80 |         self.target_size = target_size
81 | 
82 |         # load graph
83 |         with tf.gfile.GFile(graph_path, 'rb') as f:
84 |             graph_def = tf.GraphDef()
85 |             graph_def.ParseFromString(f.read())
86 | 
87 |         self.graph = tf.get_default_graph()
88 |         tf.import_graph_def(graph_def, name='TfPoseEstimator')
89 |         self.persistent_sess = tf.Session(graph=self.graph)
90 | 
91 |         self.tensor_image = self.graph.get_tensor_by_name('TfPoseEstimator/image:0')
92 |         self.tensor_output = self.graph.get_tensor_by_name('TfPoseEstimator/Openpose/concat_stage7:0')
93 |         self.heatMat = self.pafMat = None
94 | 
95 |     @staticmethod
96 |     def draw_pose_bbox(npimg, humans, imgcopy=False):
97 |         if imgcopy:
98 |             npimg = np.copy(npimg)
99 |         image_h, image_w = npimg.shape[:2]
100 |         joints, bboxes, xcenter = [], [], []
101 |         for human in humans:
102 |             xs, ys, centers = [], [], {}
103 |             # draw all detected joints onto the image
104 |             for i in range(CocoPart.Background.value):
105 |                 if i not in human.body_parts.keys():
106 |                     continue
107 | 
108 |                 body_part = human.body_parts[i]
109 |                 center = (int(body_part.x * image_w + 0.5), int(body_part.y * image_h + 0.5))
110 | 
111 |                 centers[i] = center
112 |                 xs.append(center[0])
113 |                 ys.append(center[1])
114 |                 # draw the joint
115 |                 cv.circle(npimg, center, 3, CocoColors[i], thickness=3, lineType=8, shift=0)
116 |             # connect the joints that belong to the same person, limb by limb
117 |             for pair_order, pair in enumerate(CocoPairsRender):
118 |                 if pair[0] not in human.body_parts.keys() or pair[1] not in human.body_parts.keys():
119 |                     continue
120 |                 cv.line(npimg, centers[pair[0]], centers[pair[1]], CocoColors[pair_order], 3, cv.LINE_AA)
121 |             # build an ROI (bounding box) from each person's joints
122 |             xmin = float(min(xs) / image_w)
123 |             ymin = float(min(ys) / image_h)
124 |             xmax = float(max(xs) / image_w)
125 |             ymax = float(max(ys) / image_h)
126 |             bboxes.append([xmin, ymin, xmax, ymax, 0.9999])
127 |             joints.append(centers)
128 |             if 1 in centers:
129 |                 xcenter.append(centers[1][0])
130 | 
131 |             # draw bounding_boxes
132 |             # x_start, x_end = int(xmin*image_w) - 20, int(xmax*image_w) + 20
133 |             # y_start, y_end = int(ymin*image_h) - 15, int(ymax*image_h) + 15
134 |             # cv.rectangle(npimg, (x_start, y_start), (x_end, y_end), [0, 250, 0], 3)
135 |         return npimg, joints, bboxes, xcenter
136 | 
137 |     def inference(self, npimg):
138 |         if npimg is None:
139 |             raise Exception('The image does not exist.')
140 | 
141 |         rois = []
142 |         infos = []
143 |         # _get_scaled_img
144 |         if npimg.shape[:2] != (self.target_size[1], self.target_size[0]):
145 |             # resize
146 |             npimg = cv.resize(npimg, self.target_size)
147 |         rois.extend([npimg])
148 |         infos.extend([(0.0, 0.0, 1.0, 1.0)])
149 | 
150 |         output = self.persistent_sess.run(self.tensor_output, feed_dict={self.tensor_image: rois})
151 | 
152 |         heat_mats = output[:, :, :, :19]
153 |         paf_mats = output[:, :, :, 19:]
154 | 
155 |         output_h, output_w = output.shape[1:3]
156 |         max_ratio_w = max_ratio_h = 10000.0
157 |         for info in infos:
158 |             max_ratio_w = min(max_ratio_w, info[2])
159 |             max_ratio_h = min(max_ratio_h, info[3])
160 |         mat_w, mat_h = int(output_w / max_ratio_w), int(output_h / max_ratio_h)
161 | 
162 |         resized_heat_mat = np.zeros((mat_h, mat_w, 19), dtype=np.float32)
163 |         resized_paf_mat = np.zeros((mat_h, mat_w, 38), dtype=np.float32)
164 |         resized_cnt_mat = np.zeros((mat_h, mat_w, 1), dtype=np.float32)
165 |         resized_cnt_mat += 1e-12
166 | 
167 |         for heatMat, pafMat, info in zip(heat_mats, paf_mats, infos):
168 |             w, h = int(info[2] * mat_w), int(info[3] * mat_h)
169 |             heatMat = cv.resize(heatMat, (w, h))
170 |             pafMat = cv.resize(pafMat, (w, h))
171 |             x, y = int(info[0] * mat_w), int(info[1] * mat_h)
172 |             # add up
173 |             resized_heat_mat[max(0, y):y + h, max(0, x):x + w, :] = np.maximum(
174 |                 resized_heat_mat[max(0, y):y + h, max(0, x):x + w, :], heatMat[max(0, -y):, max(0, -x):, :])
175 |             resized_paf_mat[max(0, y):y + h, max(0, x):x + w, :] += pafMat[max(0, -y):, max(0, -x):, :]
176 |             resized_cnt_mat[max(0, y):y + h, max(0, x):x + w, :] += 1
177 | 
178 |         self.heatMat = resized_heat_mat
179 |         self.pafMat = resized_paf_mat / (np.log(resized_cnt_mat) + 1)
180 | 
181 |         humans = PoseEstimator.estimate(self.heatMat, self.pafMat)
182 |         return humans
183 | 
184 | 
185 | class PoseEstimator:
186 |     heatmap_suppress = False
187 |     heatmap_gaussian = True
188 |     adaptive_threshold = False
189 | 
190 |     NMS_Threshold = 0.15
191 |     Local_PAF_Threshold = 0.2
192 |     PAF_Count_Threshold = 5
193 |     Part_Count_Threshold = 4
194 |     Part_Score_Threshold = 4.5
195 | 
196 |     PartPair = namedtuple('PartPair', ['score', 'part_idx1', 'part_idx2', 'idx1', 'idx2',
197 |                                        'coord1', 'coord2', 'score1', 'score2'])
198 | 
199 |     @staticmethod
200 |     def non_max_suppression(plain, window_size=3, threshold=NMS_Threshold):
201 |         under_threshold_indices = plain < threshold
202 |         plain[under_threshold_indices] = 0
203 |         return plain * (plain == maximum_filter(plain, footprint=np.ones((window_size, window_size))))
204 | 
205 |     @staticmethod
206 |     def estimate(heat_mat, paf_mat):
207 |         if heat_mat.shape[2] == 19:
208 |             heat_mat = np.rollaxis(heat_mat, 2, 0)
209 |         if paf_mat.shape[2] == 38:
210 |             paf_mat = np.rollaxis(paf_mat, 2, 0)
211 | 
212 |         if PoseEstimator.heatmap_suppress:
213 |             heat_mat = heat_mat - heat_mat.min(axis=1).min(axis=1).reshape(19, 1, 1)
214 |             heat_mat = heat_mat - heat_mat.min(axis=2).reshape(19, heat_mat.shape[1], 1)
215 | 
216 |         if PoseEstimator.heatmap_gaussian:
217 |             heat_mat = gaussian_filter(heat_mat, sigma=0.5)
218 | 
219 |         if PoseEstimator.adaptive_threshold:
220 |             _NMS_Threshold = max(np.average(heat_mat) * 4.0, PoseEstimator.NMS_Threshold)
221 |             _NMS_Threshold = min(_NMS_Threshold, 0.3)
222 |         else:
223 |             _NMS_Threshold = PoseEstimator.NMS_Threshold
224 | 
225 |         # extract interesting coordinates using NMS.
226 |         coords = []  # [[coords in plane1], [....], ...]
227 |         for plain in heat_mat[:-1]:
228 |             nms = PoseEstimator.non_max_suppression(plain, 5, _NMS_Threshold)
229 |             coords.append(np.where(nms >= _NMS_Threshold))
230 | 
231 |         # score pairs
232 |         pairs_by_conn = list()
233 |         for (part_idx1, part_idx2), (paf_x_idx, paf_y_idx) in zip(CocoPairs, CocoPairsNetwork):
234 |             pairs = PoseEstimator.score_pairs(
235 |                 part_idx1, part_idx2,
236 |                 coords[part_idx1], coords[part_idx2],
237 |                 paf_mat[paf_x_idx], paf_mat[paf_y_idx],
238 |                 heatmap=heat_mat,
239 |                 rescale=(1.0 / heat_mat.shape[2], 1.0 / heat_mat.shape[1])
240 |             )
241 | 
242 |             pairs_by_conn.extend(pairs)
243 | 
244 |         # merge pairs to human
245 |         # pairs_by_conn is sorted by CocoPairs (part importance) and score between parts.
246 |         humans = [Human([pair]) for pair in pairs_by_conn]
247 |         while True:
248 |             merge_items = None
249 |             for k1, k2 in itertools.combinations(humans, 2):
250 |                 if k1 == k2:
251 |                     continue
252 |                 if k1.is_connected(k2):
253 |                     merge_items = (k1, k2)
254 |                     break
255 | 
256 |             if merge_items is not None:
257 |                 merge_items[0].merge(merge_items[1])
258 |                 humans.remove(merge_items[1])
259 |             else:
260 |                 break
261 | 
262 |         # reject by subset count
263 |         humans = [human for human in humans if human.part_count() >= PoseEstimator.PAF_Count_Threshold]
264 |         # reject by subset max score
265 |         humans = [human for human in humans if human.get_max_score() >= PoseEstimator.Part_Score_Threshold]
266 |         return humans
267 | 
268 |     @staticmethod
269 |     def score_pairs(part_idx1, part_idx2, coord_list1, coord_list2, paf_mat_x, paf_mat_y, heatmap, rescale=(1.0, 1.0)):
270 |         connection_temp = []
271 | 
272 |         cnt = 0
273 |         for idx1, (y1, x1) in enumerate(zip(coord_list1[0], coord_list1[1])):
274 |             for idx2, (y2, x2) in enumerate(zip(coord_list2[0], coord_list2[1])):
275 |                 score, count = PoseEstimator.get_score(x1, y1, x2, y2, paf_mat_x, paf_mat_y)
276 |                 cnt += 1
277 |                 if count < PoseEstimator.PAF_Count_Threshold or score <= 0.0:
278 |                     continue
279 |                 connection_temp.append(PoseEstimator.PartPair(
280 |                     score=score,
281 |                     part_idx1=part_idx1, part_idx2=part_idx2,
282 |                     idx1=idx1, idx2=idx2,
283 |                     coord1=(x1 * rescale[0], y1 * rescale[1]),
284 |                     coord2=(x2 * rescale[0], y2 * rescale[1]),
285 |                     score1=heatmap[part_idx1][y1][x1],
286 |                     score2=heatmap[part_idx2][y2][x2],
287 |                 ))
288 | 
289 |         connection = []
290 |         used_idx1, used_idx2 = set(), set()
291 |         for candidate in sorted(connection_temp, key=lambda x: x.score, reverse=True):
292 |             # greedy one-to-one matching: skip joints that are already connected
293 |             if candidate.idx1 in used_idx1 or candidate.idx2 in used_idx2:
294 |                 continue
295 |             connection.append(candidate)
296 |             used_idx1.add(candidate.idx1)
297 |             used_idx2.add(candidate.idx2)
298 | 
299 |         return connection
300 | 
301 |     @staticmethod
302 |     def get_score(x1, y1, x2, y2, paf_mat_x, paf_mat_y):
303 |         __num_inter = 10
304 |         __num_inter_f = float(__num_inter)
305 |         dx, dy = x2 - x1, y2 - y1
306 |         normVec = math.sqrt(dx ** 2 + dy ** 2)
307 | 
308 |         if normVec < 1e-4:
309 |             return 0.0, 0
310 | 
311 |         vx, vy = dx / normVec, dy / normVec
312 | 
313 |         xs = np.arange(x1, x2, dx / __num_inter_f) if x1 != x2 else np.full((__num_inter,), x1)
314 |         ys = np.arange(y1, y2, dy / __num_inter_f) if y1 != y2 else np.full((__num_inter,), y1)
315 |         xs = (xs + 0.5).astype(np.int32)  # int32: map coordinates can exceed the int8 range
316 |         ys = (ys + 0.5).astype(np.int32)
317 | 
318 |         # without vectorization
319 |         pafXs = np.zeros(__num_inter)
320 |         pafYs = np.zeros(__num_inter)
321 |         for idx, (mx, my) in enumerate(zip(xs, ys)):
322 |             pafXs[idx] = paf_mat_x[my][mx]
323 |             pafYs[idx] = paf_mat_y[my][mx]
324 | 
325 |         local_scores = pafXs * vx + pafYs * vy
326 |         thidxs = local_scores > PoseEstimator.Local_PAF_Threshold
327 | 
328 |         return sum(local_scores * thidxs), sum(thidxs)
329 | 
330 | 
331 | class CocoPart(Enum):
332 |     Nose = 0
333 |     Neck = 1
334 |     RShoulder = 2
335 |     RElbow = 3
336 |     RWrist = 4
337 |     LShoulder = 5
338 |     LElbow = 6
339 |     LWrist = 7
340 |     RHip = 8
341 |     RKnee = 9
342 |     RAnkle = 10
343 |     LHip = 11
344 |     LKnee = 12
345 |     LAnkle = 13
346 |     REye = 14
347 |     LEye = 15
348 |     REar = 16
349 |     LEar = 17
350 |     Background = 18
351 | 
352 | 
353 | FilePath = Path.cwd()
354 | outFilePath = Path(FilePath / "test_out/")
355 | threshold = 0.1
356 | # winWidth, winHeight = 800, 600
357 | input_width, input_height = 432, 368
358 | 
359 | # Usage example: python openpose_tf.py --video=test.mp4
360 | parser = argparse.ArgumentParser(description='OpenPose for pose skeleton in python')
361 | parser.add_argument('--image', help='Path to image file.')
362 | parser.add_argument('--video', help='Path to video file.')
363 | args = parser.parse_args()
364 | 
365 | skeleton_estimator = None
366 | nPoints = 18
367 | CocoPairs = [(1, 2), (1, 5), (2, 3), (3, 4), (5, 6), (6, 7), (1, 8), (8, 9), (9, 10), (1, 11),
368 |              (11, 12), (12, 13), (1, 0), (0, 14), (14, 16), (0, 15), (15, 17), (2, 16), (5, 17)]  # = 19
369 | CocoPairsRender = CocoPairs[:-2]
370 | CocoPairsNetwork = [(12, 13), (20, 21), (14, 15), (16, 17), (22, 23), (24, 25), (0, 1), (2, 3), (4, 5),
371 |                     (6, 7), (8, 9), (10, 11), (28, 29), (30, 31), (34, 35), (32, 33), (36, 37), (18, 19), (26, 27)]  # = 19
372 | CocoColors = [[255, 0, 0], [255, 85, 0], [255, 170, 0], [255, 255, 0], [170, 255, 0], [85, 255, 0], [0, 255, 0],
373 |               [0, 255, 85], [0, 255, 170], [0, 255, 255], [0, 170, 255], [0, 85, 255], [0, 0, 255], [85, 0, 255],
374 |               [170, 0, 255], [255, 0, 255], [255, 0, 170], [255, 0, 85]]
375 | 
376 | 
377 | def choose_run_mode():
378 |     global outFilePath
379 |     if args.image:
380 |         # Open the image file
381 |         if not os.path.isfile(args.image):
382 |             print("Input image file ", args.image, " doesn't exist")
383 |             sys.exit(1)
384 |         cap = cv.VideoCapture(args.image)
385 |         outFilePath = str(outFilePath / (args.image[:-4] + '_out.jpg'))
386 |     elif args.video:
387 |         # Open the video file
388 |         if not os.path.isfile(args.video):
389 |             print("Input video file ", args.video, " doesn't exist")
390 |             sys.exit(1)
391 |         cap = cv.VideoCapture(args.video)
392 |         outFilePath = str(outFilePath / (args.video[:-4] + '_out.mp4'))
393 |     else:
394 |         # Webcam input
395 |         cap = cv.VideoCapture(0)
396 |         outFilePath = str(outFilePath / 'webcam_out.mp4')
397 |     return cap
398 | 
399 | 
400 | def load_pretrain_model():
401 |     global skeleton_estimator
402 |     skeleton_estimator = TfPoseEstimator(
403 |         get_graph_path('VGG_origin'), target_size=(input_width, input_height))
404 | 
405 | 
406 | def get_graph_path(model_name):
407 |     dyn_graph_path = {
408 |         'VGG_origin': str(FilePath / "graph_model_coco/graph_opt.pb"),
409 |         'mobilenet': str(FilePath / "graph_model_coco/graph_opt_mobile.pb")
410 |     }
411 |     graph_path = dyn_graph_path[model_name]
412 |     if not os.path.isfile(graph_path):
413 |         raise Exception('Graph file doesn\'t exist, path=%s' % graph_path)
414 |     return graph_path
415 | 
416 | 
417 | def show_fps(frame):
418 |     if not args.image:
419 |         fps.update()
420 |         fps.stop()
421 |         fps_label = "FPS: {:.2f}".format(fps.fps())
422 |         cv.putText(frame, fps_label, (5, 15), cv.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 255), 2)
423 | 
424 | 
425 | if __name__ == "__main__":
426 |     cap = choose_run_mode()
427 |     load_pretrain_model()
428 |     fps = FPS().start()
429 |     vid_writer = cv.VideoWriter(outFilePath, cv.VideoWriter_fourcc(*'mp4v'), 30,
430 |                                 (round(cap.get(cv.CAP_PROP_FRAME_WIDTH)),
431 |                                  round(cap.get(cv.CAP_PROP_FRAME_HEIGHT))))
432 | 
433 |     while cv.waitKey(1) < 0:
434 |         has_frame, frame = cap.read()
435 |         # frame = cv.resize(frame, (winWidth, winHeight))
436 |         if not has_frame:
437 |             cv.waitKey(3000)
438 |             break
439 |         humans = skeleton_estimator.inference(frame)
440 |         frame_show = TfPoseEstimator.draw_pose_bbox(frame, humans)[0]
441 |         # show real-time FPS
442 |         show_fps(frame_show)
443 |         winName = 'Pose Skeleton from OpenPose'
444 |         # cv.namedWindow(winName, cv.WINDOW_NORMAL)
445 |         cv.imshow(winName, frame_show)
446 | 
447 |         # Write the frame with the detection boxes
448 |         if args.image:
449 |             cv.imwrite(outFilePath, frame_show)
450 |         else:
451 |             vid_writer.write(frame_show)
452 | 
453 |     if not args.image:
454 |         vid_writer.release()
455 |     cap.release()
456 |     cv.destroyAllWindows()
457 | 
--------------------------------------------------------------------------------
/pose-estimator-tensorflow/test_out/README.md:
--------------------------------------------------------------------------------
1 | This directory is for saving the processed outputs.
2 | 
--------------------------------------------------------------------------------
/pose-estimator-tensorflow/test_out/skeleton_sequence.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LZQthePlane/OpenPose-Rebuilt-Python/d7c927ec2d3dd724a8822bb58c3c6a936450eb14/pose-estimator-tensorflow/test_out/skeleton_sequence.jpg
--------------------------------------------------------------------------------
/pose-estimator-tensorflow/test_out/skeleton_sequence_horizontal.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LZQthePlane/OpenPose-Rebuilt-Python/d7c927ec2d3dd724a8822bb58c3c6a936450eb14/pose-estimator-tensorflow/test_out/skeleton_sequence_horizontal.jpg
--------------------------------------------------------------------------------
/pose-estimator-using-caffemodel/README.md:
--------------------------------------------------------------------------------
1 | 
2 | ### Usage Examples :
3 | Put the test file (image or video) in the same directory.
4 | 
5 | - `python3 estimator_multi_people.py --image=test.jpg`
6 | - `python3 estimator_multi_people.py --video=test.mp4`
7 | - If no argument is provided, the webcam is used.
8 | 
9 | ### Note
10 | You can refer to [this article](https://www.learnopencv.com/multi-person-pose-estimation-in-opencv-using-openpose/) in English, or [this one](https://blog.csdn.net/qq_27158179/article/details/82717821) in Chinese.
11 | 
--------------------------------------------------------------------------------
/pose-estimator-using-caffemodel/estimator_multi_people.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 | import cv2 as cv
4 | import numpy as np
5 | from pathlib import Path
6 | import argparse
7 | from imutils.video import FPS
8 | 
9 | 
10 | # reference: https://blog.csdn.net/qq_27158179/article/details/82717821
11 | 
12 | # parameter initialization
13 | FilePath = Path.cwd()
14 | out_file_path = Path(FilePath / "test_out/")
15 | 
16 | joints_list_with_id = []  # all joints in the frame, grouped by joint type: [[(x, y, conf, id), ...], [(x, y, conf, id), ...]]
17 | joints_list = np.zeros((0, 3))  # all joints in the frame, stacked: [[x, y, conf], ...]
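# The two lists above and the counter below do the bookkeeping for multi-person
# assembly: joints_list stacks every detected joint as (x, y, conf) and is indexed
# by a global joint id, while joints_list_with_id groups the same joints by joint
# type so get_valid_pairs() can fetch all candidates for one end of a limb at once.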
18 | joint_id = 0  # running id over all joints in the frame
19 | 
20 | # the COCO skeleton format is used here
21 | proto_file = str(FilePath / "model/coco/pose_deploy_linevec_faster_4_stages.prototxt")
22 | weights_file = str(FilePath / "model/coco/pose_iter_440000.caffemodel")
23 | 
24 | coco_num = 18
25 | joints_mapping = ['Nose', 'Neck', 'R-Sho', 'R-Elb', 'R-Wr', 'L-Sho', 'L-Elb', 'L-Wr',
26 |                   'R-Hip', 'R-Knee', 'R-Ank', 'L-Hip', 'L-Knee', 'L-Ank', 'R-Eye', 'L-Eye',
27 |                   'R-Ear', 'L-Ear', 'Background']
28 | POSE_PAIRS = [[1, 2], [1, 5], [2, 3], [3, 4], [5, 6], [6, 7],
29 |               [1, 8], [8, 9], [9, 10], [1, 11], [11, 12], [12, 13],
30 |               [1, 0], [0, 14], [14, 16], [0, 15], [15, 17], [2, 17], [5, 16]]
31 | # index of PAFs corresponding to the POSE_PAIRS
32 | # e.g. for POSE_PAIR (1,2), the PAFs are located at indices (31,32) of the output; similarly, (1,5) -> (39,40) and so on.
33 | MAP_INDEX = [[31, 32], [39, 40], [33, 34], [35, 36], [41, 42], [43, 44],
34 |              [19, 20], [21, 22], [23, 24], [25, 26], [27, 28], [29, 30],
35 |              [47, 48], [49, 50], [53, 54], [51, 52], [55, 56], [37, 38], [45, 46]]
36 | 
37 | COLORS = [[0, 100, 255], [0, 100, 255], [0, 255, 255], [0, 100, 255], [0, 255, 255], [0, 100, 255],
38 |           [0, 255, 0], [255, 200, 100], [255, 0, 255], [0, 255, 0], [255, 200, 100], [255, 0, 255],
39 |           [0, 0, 255], [255, 0, 0], [200, 200, 0], [255, 0, 0], [200, 200, 0], [0, 0, 0]]
40 | 
41 | 
42 | def choose_run_mode():
43 |     """
44 |     Choose the input source: image / video / webcam
45 |     """
46 |     global out_file_path
47 |     if args.image:
48 |         # Open the image file
49 |         if not os.path.isfile(args.image):
50 |             print("Input image file ", args.image, " doesn't exist")
51 |             sys.exit(1)
52 |         cap = cv.VideoCapture(args.image)
53 |         out_file_path = str(out_file_path / (args.image[:-4] + '_out.jpg'))
54 |     elif args.video:
55 |         # Open the video file
56 |         if not os.path.isfile(args.video):
57 |             print("Input video file ", args.video, " doesn't exist")
58 |             sys.exit(1)
59 |         cap = cv.VideoCapture(args.video)
60 |         out_file_path = str(out_file_path / (args.video[:-4] + '_out.mp4'))
61 |     else:
62 |         # Webcam input
63 |         cap = cv.VideoCapture(0)
64 |         out_file_path = str(out_file_path / 'webcam_out.mp4')
65 |     return cap
66 | 
67 | 
68 | def get_joints(prob_map, threshold=0.15):
69 |     """
70 |     :param prob_map: confidence map of one joint type over the image
71 |     :param threshold: filter out the low-confidence regions of the map
72 |     :return: all joints of this type, each in the form (x, y, conf)
73 |     """
74 |     # with a low threshold, extract every pixel region that may contain this joint
75 |     map_smooth = cv.GaussianBlur(prob_map, (3, 3), 0, 0)
76 |     # build a mask of those regions
77 |     map_mask = np.uint8(map_smooth > threshold)
78 |     joints = []
79 |     # find the contours around the mask
80 |     _, contours, _ = cv.findContours(map_mask, cv.RETR_TREE, cv.CHAIN_APPROX_SIMPLE)
81 |     # for each contour (one person's candidate region), take the highest-confidence point as the joint location
82 |     for cnt in contours:
83 |         blob_mask = np.zeros(map_mask.shape)
84 |         blob_mask = cv.fillConvexPoly(blob_mask, cnt, 1)
85 |         masked_prob_map = map_smooth * blob_mask
86 |         _, max_val, _, max_loc = cv.minMaxLoc(masked_prob_map)
87 |         joints.append(max_loc + (prob_map[max_loc[1], max_loc[0]],))
88 |     # return the collected joints
89 |     return joints
90 | 
91 | 
92 | def get_valid_pairs(output):
93 |     """
94 |     Find valid / invalid connections between the different joints of all persons present
95 |     """
96 |     valid_pairs = []
97 |     invalid_pairs = []
98 |     n_interp_samples = 10  # number of interpolation samples
99 |     paf_threshold = 0.2
100 |     conf_threshold = 0.5
101 |     # loop for every POSE_PAIR
102 |     for k in range(len(MAP_INDEX)):
103 |         # a->b constitutes a limb
104 |         paf_a = output[0, MAP_INDEX[k][0], :, :]
105 |         # print(paf_a.shape)
106 |         paf_b = output[0, MAP_INDEX[k][1], :, :]
107 |         paf_a = cv.resize(paf_a, (frameWidth, frameHeight))
108 |         paf_b = cv.resize(paf_b, (frameWidth, frameHeight))
109 | 
110 |         # Find the joints for the first and second limb
111 |         # cand_a: candidates for one joint of the pair; cand_b: candidates for the joint it connects to
112 |         cand_a = joints_list_with_id[POSE_PAIRS[k][0]]
113 |         cand_b = joints_list_with_id[POSE_PAIRS[k][1]]
114 |         # with perfect detection of every joint in the frame, n_a == n_b == len(persons)
115 |         n_a = len(cand_a)
116 |         n_b = len(cand_b)
117 | 
118 |         # If joints for the joint-pair are detected,
119 |         # check every joint in cand_a against every joint in cand_b
120 |         if n_a != 0 and n_b != 0:
121 |             valid_pair = np.zeros((0, 3))
122 |             for i in range(n_a):
123 |                 max_j = -1
124 |                 max_score = -1
125 |                 found = False
126 |                 for j in range(n_b):
127 |                     # Calculate the distance vector between the two joints
128 |                     distance_ij = np.subtract(cand_b[j][:2], cand_a[i][:2])
129 |                     # L2 norm, i.e. the distance between the two points
130 |                     norm = np.linalg.norm(distance_ij)
131 |                     if norm:
132 |                         # if the distance is non-zero, scale to a unit vector
133 |                         distance_ij = distance_ij / norm
134 |                     else:
135 |                         continue
136 | 
137 |                     # Find p(u): sample an array of n_interp_samples points on the line connecting the two joints
138 |                     interp_coord = list(zip(np.linspace(cand_a[i][0], cand_b[j][0], num=n_interp_samples),
139 |                                             np.linspace(cand_a[i][1], cand_b[j][1], num=n_interp_samples)))
140 |                     # Find the PAF values at a set of interpolated points between the joints
141 |                     paf_interp = []
142 |                     for m in range(len(interp_coord)):
143 |                         paf_interp.append([paf_a[int(round(interp_coord[m][1])), int(round(interp_coord[m][0]))],
144 |                                            paf_b[int(round(interp_coord[m][1])), int(round(interp_coord[m][0]))]])
145 |                     # Find E
146 |                     paf_scores = np.dot(paf_interp, distance_ij)
147 |                     avg_paf_score = sum(paf_scores) / len(paf_scores)
148 | 
149 |                     # Check if the connection is valid
150 |                     # If the fraction of interpolated vectors aligned with the PAF is higher than the threshold -> valid pair
151 |                     if (len(np.where(paf_scores > paf_threshold)[0]) / n_interp_samples) > conf_threshold:
152 |                         if avg_paf_score > max_score:
153 |                             # enough sampled points agree with the PAF: keep the best-scoring candidate j
154 |                             max_j = j
155 |                             max_score = avg_paf_score
156 |                             found = True
157 |                 # Append the connection to the list
158 |                 if found:
159 |                     valid_pair = np.append(valid_pair, [[cand_a[i][3], cand_b[max_j][3], max_score]], axis=0)
160 | 
161 |             # Append the detected connections to the global list
162 |             valid_pairs.append(valid_pair)
163 |         # If no joints are detected
164 |         else:
165 |             # print("No Connection : k = {}".format(k))
166 |             invalid_pairs.append(k)
167 |             valid_pairs.append([])
168 |     return valid_pairs, invalid_pairs
169 | 
170 | 
171 | def get_personwise_joints(valid_pairs, invalid_pairs):
172 |     """
173 |     This function creates a list of joints belonging to each person.
174 |     For each detected valid pair, it assigns the joint(s) to a person.
175 |     """
176 |     # the last number in each row is the overall score
177 |     personwise_joints = -1 * np.ones((0, 19))
178 |     # print(personwise_joints.shape)
179 | 
180 |     for k in range(len(MAP_INDEX)):
181 |         if k not in invalid_pairs:
182 |             part_as = valid_pairs[k][:, 0]
183 |             part_bs = valid_pairs[k][:, 1]
184 |             index_a, index_b = np.array(POSE_PAIRS[k])
185 | 
186 |             for i in range(len(valid_pairs[k])):
187 |                 found = False
188 |                 person_idx = -1
189 |                 for j in range(len(personwise_joints)):
190 |                     if personwise_joints[j][index_a] == part_as[i]:
191 |                         person_idx = j
192 |                         found = True
193 |                         break
194 | 
195 |                 if found:
196 |                     personwise_joints[person_idx][index_b] = part_bs[i]
197 |                     personwise_joints[person_idx][-1] += joints_list[part_bs[i].astype(int), 2] + valid_pairs[k][i][2]
198 | 
199 |                 # if partA is not found in any existing subset, create a new subset
200 |                 elif not found and k < 17:
201 |                     row = -1 * np.ones(19)
202 |                     row[index_a] = part_as[i]
203 |                     row[index_b] = part_bs[i]
204 |                     # add the joint scores for the two joints and the paf_score
205 |                     row[-1] = sum(joints_list[valid_pairs[k][i, :2].astype(int), 2]) + valid_pairs[k][i][2]
206 |                     personwise_joints = np.vstack([personwise_joints, row])
207 |     return personwise_joints
208 | 
209 | 
210 | def load_pretrained_model():
211 |     """
212 |     Load the pretrained pose Caffe model
213 |     """
214 |     net = cv.dnn.readNetFromCaffe(proto_file, weights_file)
215 |     # use the OpenCL GPU target; OpenCV currently supports Intel GPUs only
216 |     # tested with Intel GPUs only, or it will automatically switch to CPU
217 |     net.setPreferableBackend(cv.dnn.DNN_BACKEND_OPENCV)
218 |     net.setPreferableTarget(cv.dnn.DNN_TARGET_OPENCL)
219 |     return net
220 | 
221 | 
222 | def show_joints(frame):
223 |     for i in range(coco_num):
224 |         for j in range(len(joints_list_with_id[i])):
225 |             cv.circle(frame, joints_list_with_id[i][j][0:2], 5, COLORS[i], -1, cv.LINE_AA)
226 |     cv.imshow("joints", frame)
227 | 
228 | 
229 | if __name__ == '__main__':
230 |     # Usage example: python estimator_multi_people.py --video=test.mp4
231 |     parser = argparse.ArgumentParser(description='OpenPose for pose skeleton in python')
232 |     parser.add_argument('--image', help='Path to image file.')
233 |     parser.add_argument('--video', help='Path to video file.')
234 |     args = parser.parse_args()
235 | 
236 |     cap = choose_run_mode()
237 |     net = load_pretrained_model()
238 |     fps = FPS().start()
239 |     vid_writer = cv.VideoWriter(out_file_path, cv.VideoWriter_fourcc(*'mp4v'), 30,
240 |                                 (round(cap.get(cv.CAP_PROP_FRAME_WIDTH)),
241 |                                  round(cap.get(cv.CAP_PROP_FRAME_HEIGHT))))
242 | 
243 |     while cv.waitKey(1) < 0:
244 |         hasFrame, frame = cap.read()
245 |         if not hasFrame:
246 |             print("Output file is stored as ", out_file_path)
247 |             cv.waitKey(3000)
248 |             break
249 | 
250 |         frameHeight, frameWidth = frame.shape[:2]
251 |         joints_list_with_id, joints_list, joint_id = [], np.zeros((0, 3)), 0  # reset the joint buffers for this frame
252 |         # Fix the input Height and get the width according to the Aspect Ratio
253 |         # two input options: height 368 * width 368 (aspect ratio not kept); height 368 * scaled width (aspect ratio kept)
254 |         inHeight = 368
255 |         inWidth = int((inHeight / frameHeight) * frameWidth)  # the aspect ratio is kept here
256 | 
257 |         # first, normalize the pixel values to (0, 1); then specify the image size; the mean to subtract is (0, 0, 0)
258 |         # the blobFromImage parameters differ between algorithms and pretrained models; see the OpenCV GitHub samples:
259 |         # https://github.com/opencv/opencv/tree/master/samples/dnn
260 |         inpBlob = cv.dnn.blobFromImage(frame, 1.0 / 255, (inWidth, inHeight), (0, 0, 0), swapRB=False, crop=False)
261 |         net.setInput(inpBlob)
262 |         # the first 19 maps of output[0, i, :, :] are the confidence maps of the 19 joint types (one of them background)
263 |         # the following 38 maps are the PAF matrices, one (x, y) pair for each connected joint pair
264 |         output = net.forward()
265 | 
266 |         # find the coordinates of every candidate joint in the image
267 |         for part in range(coco_num):
268 |             # search each joint type in turn
269 |             joint_prob_map = output[0, part, :, :]
270 |             joint_prob_map = cv.resize(joint_prob_map, (frame.shape[1], frame.shape[0]))
271 |             # get_joints returns this joint for every person in the image; len(joints) == len(persons)
272 |             joints = get_joints(joint_prob_map)
273 | 
274 |             joints_with_id = []
275 |             for i in range(len(joints)):
276 |                 # stack the joints vertically (without id)
277 |                 joints_list = np.vstack([joints_list, joints[i]])
278 |                 # the same joints, with ids attached
279 |                 joints_with_id.append(joints[i] + (joint_id,))
280 |                 joint_id += 1
281 |             joints_list_with_id.append(joints_with_id)
282 | 
283 |         # # mark all joint locations (no joint connections)
284 |         # show_joints(frame.copy())
285 | 
286 |         # draw the multi-person skeletons (connect the joints)
287 |         valid_pairs, invalid_pairs = get_valid_pairs(output)
288 |         personwise_joints = get_personwise_joints(valid_pairs, invalid_pairs)
289 | 
290 |         for i in range(17):
291 |             for n in range(len(personwise_joints)):
292 |                 index = personwise_joints[n][np.array(POSE_PAIRS[i])]
293 |                 if -1 in index:
294 |                     continue
295 |                 B = np.int32(joints_list[index.astype(int), 0])
296 |                 A = np.int32(joints_list[index.astype(int), 1])
297 |                 cv.line(frame, (B[0], A[0]), (B[1], A[1]), COLORS[i], 3, cv.LINE_AA)
298 | 
299 |         # Write the frame with the detection boxes
300 |         if args.image:
301 |             cv.imwrite(out_file_path, frame)
302 |         else:
303 |             vid_writer.write(frame)
304 | 
305 |         cv.imshow("Detected Pose", frame)
306 | 
307 |     if not args.image:
308 |         vid_writer.release()
309 |     cap.release()
310 |     cv.destroyAllWindows()
311 | 
--------------------------------------------------------------------------------
/pose-estimator-using-caffemodel/estimator_single_person.py:
--------------------------------------------------------------------------------
1 | import cv2 as cv
2 | import os
3 | import argparse
4 | import sys
5 | from imutils.video import FPS
6 | from pathlib import Path
7 | 
8 | # parameter initialization
9 | FilePath = Path.cwd()
10 | out_file_path = Path(FilePath / "test_out/")
11 | threshold = 0.1
12 | input_width, input_height = 368, 368
13 | lineWidthMultiplier = 2
14 | 
15 | proto_file = ""
16 | weights_file = ""
17 | nPoints = 0
18 | POSE_PAIRS = []
19 | 
20 | 
21 | def choose_pose(model='COCO'):
22 |     """
23 |     Choose the skeleton model to output: MPI / COCO / BODY_25
24 |     """
25 |     global proto_file, weights_file, nPoints, POSE_PAIRS
26 |     if model == "COCO":
27 |         proto_file = str(FilePath / "model/coco/pose_deploy_linevec_faster_4_stages.prototxt")
28 |         weights_file = str(FilePath / "model/coco/pose_iter_440000.caffemodel")
29 |         nPoints = 18
30 |         POSE_PAIRS = [[1, 0], [1, 2], [1, 5], [2, 3], [3, 4], [5, 6], [6, 7], [1, 8], [8, 9], [9, 10], [1, 11], [11, 12],
31 |                       [12, 13], [0, 14], [0, 15], [14, 16], [15, 17]]
32 |     elif model == "MPI":
33 |         proto_file = str(FilePath / "model/mpi/pose_deploy_linevec_faster_4_stages.prototxt")
34 |         weights_file = str(FilePath / "model/mpi/pose_iter_160000.caffemodel")
35 |         nPoints = 15
36 |         POSE_PAIRS = [[0, 1], [1, 2], [2, 3], [3, 4], [1, 5], [5, 6], [6, 7], [1, 14], [14, 8], [8, 9], [9, 10], [14, 11],
37 |                       [11, 12], [12, 13]]
38 |     elif model == "BODY_25":
39 |         proto_file = str(FilePath / "model/body_25/pose_deploy.prototxt")
40 |         weights_file = str(FilePath / "model/body_25/pose_iter_584000.caffemodel")
41 |         nPoints = 25
42 |         POSE_PAIRS = [[1, 0], [1, 2], [1, 5], [2, 3], [3, 4], [5, 6], [6, 7], [1, 8], [8, 9], [9, 10], [10, 11], [11, 24],
43 |                       [11, 22], [22, 23], [8, 12], [12, 13], [13, 14], [14, 21], [14, 19], [19, 20], [0, 15], [15, 17],
44 |                       [0, 16], [16, 18]]
45 | 
46 | 
47 | def choose_run_mode():
48 |     """
49 |     Choose the input source: image / video / webcam
50 |     """
51 |     global out_file_path
52 |     if args.image:
53 |         # Open the image file
54 |         if not os.path.isfile(args.image):
55 |             print("Input image file ", args.image, " doesn't exist")
56 |             sys.exit(1)
57 |         cap = cv.VideoCapture(args.image)
58 |         out_file_path = str(out_file_path / (args.image[:-4] + '_out.jpg'))
59 |     elif args.video:
60 |         # Open the video file
61 |         if not os.path.isfile(args.video):
62 |             print("Input video file ", args.video, " doesn't exist")
63 |             sys.exit(1)
64 |         cap = cv.VideoCapture(args.video)
65 |         out_file_path = str(out_file_path / (args.video[:-4] + '_out.mp4'))
66 |     else:
67 |         # Webcam input
68 |         cap = cv.VideoCapture(0)
69 |         out_file_path = str(out_file_path / 'webcam_out.mp4')
70 |     return cap
71 | 
72 | 
73 | def load_pretrain_model():
74 |     """
75 |     Load the pretrained pose Caffe model
76 |     """
77 |     net = cv.dnn.readNetFromCaffe(proto_file, weights_file)
78 |     print('POSE caffe model loaded successfully')
79 |     # use the OpenCL GPU target; OpenCV currently supports Intel GPUs only
80 |     # tested with Intel GPUs only, or it will automatically switch to CPU
81 |     net.setPreferableBackend(cv.dnn.DNN_BACKEND_OPENCV)
82 |     net.setPreferableTarget(cv.dnn.DNN_TARGET_OPENCL)
83 |     return net
84 | 
85 | 
86 | def show_fps():
87 |     if not args.image:
88 |         fps.update()
89 |         fps.stop()
90 |         fps_label = "FPS: {:.2f}".format(fps.fps())
91 |         cv.putText(frame, fps_label, (0, origin_h - 25), cv.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 255), lineWidthMultiplier)
92 | 
93 | 
94 | if __name__ == "__main__":
95 |     # Usage example: python estimator_single_person.py --video=test.mp4
96 |     parser = argparse.ArgumentParser(description='OpenPose for pose skeleton in python')
97 |     parser.add_argument('--image', help='Path to image file.')
98 |     parser.add_argument('--video', help='Path to video file.')
99 |     args = parser.parse_args()
100 | 
101 |     choose_pose('COCO')
102 |     cap = choose_run_mode()
103 |     net = load_pretrain_model()
104 |     fps = FPS().start()
105 |     vid_writer = cv.VideoWriter(out_file_path, cv.VideoWriter_fourcc(*'mp4v'), 30,
106 |                                 (round(cap.get(cv.CAP_PROP_FRAME_WIDTH)),
107 |                                  round(cap.get(cv.CAP_PROP_FRAME_HEIGHT))))
108 |     while cv.waitKey(1) < 0:
109 |         hasFrame, frame = cap.read()
110 |         if not hasFrame:
111 |             print("Output file is stored as ", out_file_path)
112 |             cv.waitKey(3000)
113 |             break
114 | 
115 |         origin_h, origin_w = frame.shape[:2]
116 |         # first, normalize the pixel values to (0, 1); then specify the image size; the mean to subtract is (0, 0, 0)
117 |         # the blobFromImage parameters differ between algorithms and pretrained models; see the OpenCV GitHub samples:
118 |         # https://github.com/opencv/opencv/tree/master/samples/dnn
119 |         blob = cv.dnn.blobFromImage(frame, 1.0 / 255, (input_width, input_height), 0, swapRB=False, crop=False)
120 |         net.setInput(blob)
121 |         # the first dimension is the image id (when several images are passed to the network)
122 |         # the second dimension is the map index; the model outputs Confidence Maps and Part Affinity Maps
123 |         # for the COCO model there are 57 maps: 18 keypoint confidence maps + 1 background + 19 * 2 part affinity maps
124 |         # the third dimension is the height of the output map
125 |         # the fourth dimension is the width of the output map
126 |         detections = net.forward()
127 |         H = detections.shape[2]
128 |         W = detections.shape[3]
129 |         # detected keypoints
130 |         points = []
131 | 
132 |         for i in range(nPoints):
133 |             probability_map = detections[0, i, :, :]
134 |             # take the global maximum of this joint's confidence map
135 |             min_value, confidence, min_loc, point = cv.minMaxLoc(probability_map)
136 |             # scale from map coordinates back to the original image
137 |             x = int(origin_w * (point[0] / W))
138 |             y = int(origin_h * (point[1] / H))
139 | 
140 |             if confidence > threshold:
141 |                 cv.circle(frame, (x, y), lineWidthMultiplier * 3, (0, 255, 255), -1, cv.FILLED)
142 |                 # cv.putText(frame, "{}".format(i), (x, y-15), cv.FONT_HERSHEY_SIMPLEX, 0.8, (0, 0, 255), 1, cv.LINE_AA)
143 |                 points.append((x, y))
144 |             else:
145 |                 points.append(None)
146 | 
147 |         # draw the skeleton
148 |         for pair in POSE_PAIRS:
149 |             A, B = pair[0], pair[1]
150 |             if points[A] and points[B]:
151 |                 cv.line(frame, points[A], points[B], (0, 0, 255), lineWidthMultiplier, cv.LINE_AA)
152 |                 # cv.circle(frame, points[A], 8, (0, 0, 255), thickness=-1, lineType=cv.FILLED)
153 |                 # cv.circle(frame, points[B], 8, (0, 0, 255), thickness=-1, lineType=cv.FILLED)
154 |         # show real-time FPS
155 |         show_fps()
156 | 
157 |         # Write the frame with the detection boxes
158 |         if args.image:
159 |             cv.imwrite(out_file_path, frame)
160 |         else:
161 |             vid_writer.write(frame)
162 | 
163 |         winName = 'Pose Skeleton from OpenPose'
164 |         # cv.namedWindow(winName, cv.WINDOW_NORMAL)
165 |         cv.imshow(winName, frame)
166 | 
167 |     if not args.image:
168 |         vid_writer.release()
169 |     cap.release()
170 |     cv.destroyAllWindows()
171 | 
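# Usage (assuming the test file sits next to this script and the corresponding
# .caffemodel files were downloaded as described in the top-level README):
#   python3 estimator_single_person.py --image=test.jpg
#   python3 estimator_single_person.py --video=test.mp4
# With no argument the webcam is used. To switch the skeleton format, call
# choose_pose('MPI') or choose_pose('BODY_25') in __main__ instead of 'COCO'.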
-------------------------------------------------------------------------------- /pose-estimator-using-caffemodel/model/body_25/keypoints_pose_25.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LZQthePlane/OpenPose-Rebuilt-Python/d7c927ec2d3dd724a8822bb58c3c6a936450eb14/pose-estimator-using-caffemodel/model/body_25/keypoints_pose_25.png -------------------------------------------------------------------------------- /pose-estimator-using-caffemodel/model/coco/keypoints_pose_18.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LZQthePlane/OpenPose-Rebuilt-Python/d7c927ec2d3dd724a8822bb58c3c6a936450eb14/pose-estimator-using-caffemodel/model/coco/keypoints_pose_18.png -------------------------------------------------------------------------------- /pose-estimator-using-caffemodel/model/coco/pose_deploy_linevec_faster_4_stages.prototxt: -------------------------------------------------------------------------------- 1 | input: "image" 2 | input_dim: 1 3 | input_dim: 3 4 | input_dim: 1 # This value will be defined at runtime 5 | input_dim: 1 # This value will be defined at runtime 6 | layer { 7 | name: "conv1_1" 8 | type: "Convolution" 9 | bottom: "image" 10 | top: "conv1_1" 11 | param { 12 | lr_mult: 1.0 13 | decay_mult: 1 14 | } 15 | param { 16 | lr_mult: 2.0 17 | decay_mult: 0 18 | } 19 | convolution_param { 20 | num_output: 64 21 | pad: 1 22 | kernel_size: 3 23 | weight_filler { 24 | type: "gaussian" 25 | std: 0.01 26 | } 27 | bias_filler { 28 | type: "constant" 29 | } 30 | } 31 | } 32 | layer { 33 | name: "relu1_1" 34 | type: "ReLU" 35 | bottom: "conv1_1" 36 | top: "conv1_1" 37 | } 38 | layer { 39 | name: "conv1_2" 40 | type: "Convolution" 41 | bottom: "conv1_1" 42 | top: "conv1_2" 43 | param { 44 | lr_mult: 1.0 45 | decay_mult: 1 46 | } 47 | param { 48 | lr_mult: 2.0 49 | decay_mult: 0 50 | } 51 | convolution_param { 52 | num_output: 64 53 | pad: 1 54 | kernel_size: 3 55 | weight_filler { 56 | type: "gaussian" 57 | std: 0.01 58 | } 59 | bias_filler { 60 | type: "constant" 61 | } 62 | } 63 | } 64 | layer { 65 | name: "relu1_2" 66 | type: "ReLU" 67 | bottom: "conv1_2" 68 | top: "conv1_2" 69 | } 70 | layer { 71 | name: "pool1_stage1" 72 | type: "Pooling" 73 | bottom: "conv1_2" 74 | top: "pool1_stage1" 75 | pooling_param { 76 | pool: MAX 77 | kernel_size: 2 78 | stride: 2 79 | } 80 | } 81 | layer { 82 | name: "conv2_1" 83 | type: "Convolution" 84 | bottom: "pool1_stage1" 85 | top: "conv2_1" 86 | param { 87 | lr_mult: 1.0 88 | decay_mult: 1 89 | } 90 | param { 91 | lr_mult: 2.0 92 | decay_mult: 0 93 | } 94 | convolution_param { 95 | num_output: 128 96 | pad: 1 97 | kernel_size: 3 98 | weight_filler { 99 | type: "gaussian" 100 | std: 0.01 101 | } 102 | bias_filler { 103 | type: "constant" 104 | } 105 | } 106 | } 107 | layer { 108 | name: "relu2_1" 109 | type: "ReLU" 110 | bottom: "conv2_1" 111 | top: "conv2_1" 112 | } 113 | layer { 114 | name: "conv2_2" 115 | type: "Convolution" 116 | bottom: "conv2_1" 117 | top: "conv2_2" 118 | param { 119 | lr_mult: 1.0 120 | decay_mult: 1 121 | } 122 | param { 123 | lr_mult: 2.0 124 | decay_mult: 0 125 | } 126 | convolution_param { 127 | num_output: 128 128 | pad: 1 129 | kernel_size: 3 130 | weight_filler { 131 | type: "gaussian" 132 | std: 0.01 133 | } 134 | bias_filler { 135 | type: "constant" 136 | } 137 | } 138 | } 139 | layer { 140 | name: "relu2_2" 141 | type: "ReLU" 142 | bottom: "conv2_2" 143 | top: "conv2_2" 
144 | } 145 | layer { 146 | name: "pool2_stage1" 147 | type: "Pooling" 148 | bottom: "conv2_2" 149 | top: "pool2_stage1" 150 | pooling_param { 151 | pool: MAX 152 | kernel_size: 2 153 | stride: 2 154 | } 155 | } 156 | layer { 157 | name: "conv3_1" 158 | type: "Convolution" 159 | bottom: "pool2_stage1" 160 | top: "conv3_1" 161 | param { 162 | lr_mult: 1.0 163 | decay_mult: 1 164 | } 165 | param { 166 | lr_mult: 2.0 167 | decay_mult: 0 168 | } 169 | convolution_param { 170 | num_output: 256 171 | pad: 1 172 | kernel_size: 3 173 | weight_filler { 174 | type: "gaussian" 175 | std: 0.01 176 | } 177 | bias_filler { 178 | type: "constant" 179 | } 180 | } 181 | } 182 | layer { 183 | name: "relu3_1" 184 | type: "ReLU" 185 | bottom: "conv3_1" 186 | top: "conv3_1" 187 | } 188 | layer { 189 | name: "conv3_2" 190 | type: "Convolution" 191 | bottom: "conv3_1" 192 | top: "conv3_2" 193 | param { 194 | lr_mult: 1.0 195 | decay_mult: 1 196 | } 197 | param { 198 | lr_mult: 2.0 199 | decay_mult: 0 200 | } 201 | convolution_param { 202 | num_output: 256 203 | pad: 1 204 | kernel_size: 3 205 | weight_filler { 206 | type: "gaussian" 207 | std: 0.01 208 | } 209 | bias_filler { 210 | type: "constant" 211 | } 212 | } 213 | } 214 | layer { 215 | name: "relu3_2" 216 | type: "ReLU" 217 | bottom: "conv3_2" 218 | top: "conv3_2" 219 | } 220 | layer { 221 | name: "conv3_3" 222 | type: "Convolution" 223 | bottom: "conv3_2" 224 | top: "conv3_3" 225 | param { 226 | lr_mult: 1.0 227 | decay_mult: 1 228 | } 229 | param { 230 | lr_mult: 2.0 231 | decay_mult: 0 232 | } 233 | convolution_param { 234 | num_output: 256 235 | pad: 1 236 | kernel_size: 3 237 | weight_filler { 238 | type: "gaussian" 239 | std: 0.01 240 | } 241 | bias_filler { 242 | type: "constant" 243 | } 244 | } 245 | } 246 | layer { 247 | name: "relu3_3" 248 | type: "ReLU" 249 | bottom: "conv3_3" 250 | top: "conv3_3" 251 | } 252 | layer { 253 | name: "conv3_4" 254 | type: "Convolution" 255 | bottom: "conv3_3" 256 | top: "conv3_4" 257 | param { 258 | lr_mult: 1.0 259 | decay_mult: 1 260 | } 261 | param { 262 | lr_mult: 2.0 263 | decay_mult: 0 264 | } 265 | convolution_param { 266 | num_output: 256 267 | pad: 1 268 | kernel_size: 3 269 | weight_filler { 270 | type: "gaussian" 271 | std: 0.01 272 | } 273 | bias_filler { 274 | type: "constant" 275 | } 276 | } 277 | } 278 | layer { 279 | name: "relu3_4" 280 | type: "ReLU" 281 | bottom: "conv3_4" 282 | top: "conv3_4" 283 | } 284 | layer { 285 | name: "pool3_stage1" 286 | type: "Pooling" 287 | bottom: "conv3_4" 288 | top: "pool3_stage1" 289 | pooling_param { 290 | pool: MAX 291 | kernel_size: 2 292 | stride: 2 293 | } 294 | } 295 | layer { 296 | name: "conv4_1" 297 | type: "Convolution" 298 | bottom: "pool3_stage1" 299 | top: "conv4_1" 300 | param { 301 | lr_mult: 1.0 302 | decay_mult: 1 303 | } 304 | param { 305 | lr_mult: 2.0 306 | decay_mult: 0 307 | } 308 | convolution_param { 309 | num_output: 512 310 | pad: 1 311 | kernel_size: 3 312 | weight_filler { 313 | type: "gaussian" 314 | std: 0.01 315 | } 316 | bias_filler { 317 | type: "constant" 318 | } 319 | } 320 | } 321 | layer { 322 | name: "relu4_1" 323 | type: "ReLU" 324 | bottom: "conv4_1" 325 | top: "conv4_1" 326 | } 327 | layer { 328 | name: "conv4_2" 329 | type: "Convolution" 330 | bottom: "conv4_1" 331 | top: "conv4_2" 332 | param { 333 | lr_mult: 1.0 334 | decay_mult: 1 335 | } 336 | param { 337 | lr_mult: 2.0 338 | decay_mult: 0 339 | } 340 | convolution_param { 341 | num_output: 512 342 | pad: 1 343 | kernel_size: 3 344 | weight_filler { 345 | type: 
"gaussian" 346 | std: 0.01 347 | } 348 | bias_filler { 349 | type: "constant" 350 | } 351 | } 352 | } 353 | layer { 354 | name: "relu4_2" 355 | type: "ReLU" 356 | bottom: "conv4_2" 357 | top: "conv4_2" 358 | } 359 | layer { 360 | name: "conv4_3_CPM" 361 | type: "Convolution" 362 | bottom: "conv4_2" 363 | top: "conv4_3_CPM" 364 | param { 365 | lr_mult: 1.0 366 | decay_mult: 1 367 | } 368 | param { 369 | lr_mult: 2.0 370 | decay_mult: 0 371 | } 372 | convolution_param { 373 | num_output: 256 374 | pad: 1 375 | kernel_size: 3 376 | weight_filler { 377 | type: "gaussian" 378 | std: 0.01 379 | } 380 | bias_filler { 381 | type: "constant" 382 | } 383 | } 384 | } 385 | layer { 386 | name: "relu4_3_CPM" 387 | type: "ReLU" 388 | bottom: "conv4_3_CPM" 389 | top: "conv4_3_CPM" 390 | } 391 | layer { 392 | name: "conv4_4_CPM" 393 | type: "Convolution" 394 | bottom: "conv4_3_CPM" 395 | top: "conv4_4_CPM" 396 | param { 397 | lr_mult: 1.0 398 | decay_mult: 1 399 | } 400 | param { 401 | lr_mult: 2.0 402 | decay_mult: 0 403 | } 404 | convolution_param { 405 | num_output: 128 406 | pad: 1 407 | kernel_size: 3 408 | weight_filler { 409 | type: "gaussian" 410 | std: 0.01 411 | } 412 | bias_filler { 413 | type: "constant" 414 | } 415 | } 416 | } 417 | layer { 418 | name: "relu4_4_CPM" 419 | type: "ReLU" 420 | bottom: "conv4_4_CPM" 421 | top: "conv4_4_CPM" 422 | } 423 | layer { 424 | name: "conv5_1_CPM_L1" 425 | type: "Convolution" 426 | bottom: "conv4_4_CPM" 427 | top: "conv5_1_CPM_L1" 428 | param { 429 | lr_mult: 1.0 430 | decay_mult: 1 431 | } 432 | param { 433 | lr_mult: 2.0 434 | decay_mult: 0 435 | } 436 | convolution_param { 437 | num_output: 128 438 | pad: 1 439 | kernel_size: 3 440 | weight_filler { 441 | type: "gaussian" 442 | std: 0.01 443 | } 444 | bias_filler { 445 | type: "constant" 446 | } 447 | } 448 | } 449 | layer { 450 | name: "relu5_1_CPM_L1" 451 | type: "ReLU" 452 | bottom: "conv5_1_CPM_L1" 453 | top: "conv5_1_CPM_L1" 454 | } 455 | layer { 456 | name: "conv5_1_CPM_L2" 457 | type: "Convolution" 458 | bottom: "conv4_4_CPM" 459 | top: "conv5_1_CPM_L2" 460 | param { 461 | lr_mult: 1.0 462 | decay_mult: 1 463 | } 464 | param { 465 | lr_mult: 2.0 466 | decay_mult: 0 467 | } 468 | convolution_param { 469 | num_output: 128 470 | pad: 1 471 | kernel_size: 3 472 | weight_filler { 473 | type: "gaussian" 474 | std: 0.01 475 | } 476 | bias_filler { 477 | type: "constant" 478 | } 479 | } 480 | } 481 | layer { 482 | name: "relu5_1_CPM_L2" 483 | type: "ReLU" 484 | bottom: "conv5_1_CPM_L2" 485 | top: "conv5_1_CPM_L2" 486 | } 487 | layer { 488 | name: "conv5_2_CPM_L1" 489 | type: "Convolution" 490 | bottom: "conv5_1_CPM_L1" 491 | top: "conv5_2_CPM_L1" 492 | param { 493 | lr_mult: 1.0 494 | decay_mult: 1 495 | } 496 | param { 497 | lr_mult: 2.0 498 | decay_mult: 0 499 | } 500 | convolution_param { 501 | num_output: 128 502 | pad: 1 503 | kernel_size: 3 504 | weight_filler { 505 | type: "gaussian" 506 | std: 0.01 507 | } 508 | bias_filler { 509 | type: "constant" 510 | } 511 | } 512 | } 513 | layer { 514 | name: "relu5_2_CPM_L1" 515 | type: "ReLU" 516 | bottom: "conv5_2_CPM_L1" 517 | top: "conv5_2_CPM_L1" 518 | } 519 | layer { 520 | name: "conv5_2_CPM_L2" 521 | type: "Convolution" 522 | bottom: "conv5_1_CPM_L2" 523 | top: "conv5_2_CPM_L2" 524 | param { 525 | lr_mult: 1.0 526 | decay_mult: 1 527 | } 528 | param { 529 | lr_mult: 2.0 530 | decay_mult: 0 531 | } 532 | convolution_param { 533 | num_output: 128 534 | pad: 1 535 | kernel_size: 3 536 | weight_filler { 537 | type: "gaussian" 538 | std: 0.01 539 | } 540 
| bias_filler { 541 | type: "constant" 542 | } 543 | } 544 | } 545 | layer { 546 | name: "relu5_2_CPM_L2" 547 | type: "ReLU" 548 | bottom: "conv5_2_CPM_L2" 549 | top: "conv5_2_CPM_L2" 550 | } 551 | layer { 552 | name: "conv5_3_CPM_L1" 553 | type: "Convolution" 554 | bottom: "conv5_2_CPM_L1" 555 | top: "conv5_3_CPM_L1" 556 | param { 557 | lr_mult: 1.0 558 | decay_mult: 1 559 | } 560 | param { 561 | lr_mult: 2.0 562 | decay_mult: 0 563 | } 564 | convolution_param { 565 | num_output: 128 566 | pad: 1 567 | kernel_size: 3 568 | weight_filler { 569 | type: "gaussian" 570 | std: 0.01 571 | } 572 | bias_filler { 573 | type: "constant" 574 | } 575 | } 576 | } 577 | layer { 578 | name: "relu5_3_CPM_L1" 579 | type: "ReLU" 580 | bottom: "conv5_3_CPM_L1" 581 | top: "conv5_3_CPM_L1" 582 | } 583 | layer { 584 | name: "conv5_3_CPM_L2" 585 | type: "Convolution" 586 | bottom: "conv5_2_CPM_L2" 587 | top: "conv5_3_CPM_L2" 588 | param { 589 | lr_mult: 1.0 590 | decay_mult: 1 591 | } 592 | param { 593 | lr_mult: 2.0 594 | decay_mult: 0 595 | } 596 | convolution_param { 597 | num_output: 128 598 | pad: 1 599 | kernel_size: 3 600 | weight_filler { 601 | type: "gaussian" 602 | std: 0.01 603 | } 604 | bias_filler { 605 | type: "constant" 606 | } 607 | } 608 | } 609 | layer { 610 | name: "relu5_3_CPM_L2" 611 | type: "ReLU" 612 | bottom: "conv5_3_CPM_L2" 613 | top: "conv5_3_CPM_L2" 614 | } 615 | layer { 616 | name: "conv5_4_CPM_L1" 617 | type: "Convolution" 618 | bottom: "conv5_3_CPM_L1" 619 | top: "conv5_4_CPM_L1" 620 | param { 621 | lr_mult: 1.0 622 | decay_mult: 1 623 | } 624 | param { 625 | lr_mult: 2.0 626 | decay_mult: 0 627 | } 628 | convolution_param { 629 | num_output: 512 630 | pad: 0 631 | kernel_size: 1 632 | weight_filler { 633 | type: "gaussian" 634 | std: 0.01 635 | } 636 | bias_filler { 637 | type: "constant" 638 | } 639 | } 640 | } 641 | layer { 642 | name: "relu5_4_CPM_L1" 643 | type: "ReLU" 644 | bottom: "conv5_4_CPM_L1" 645 | top: "conv5_4_CPM_L1" 646 | } 647 | layer { 648 | name: "conv5_4_CPM_L2" 649 | type: "Convolution" 650 | bottom: "conv5_3_CPM_L2" 651 | top: "conv5_4_CPM_L2" 652 | param { 653 | lr_mult: 1.0 654 | decay_mult: 1 655 | } 656 | param { 657 | lr_mult: 2.0 658 | decay_mult: 0 659 | } 660 | convolution_param { 661 | num_output: 512 662 | pad: 0 663 | kernel_size: 1 664 | weight_filler { 665 | type: "gaussian" 666 | std: 0.01 667 | } 668 | bias_filler { 669 | type: "constant" 670 | } 671 | } 672 | } 673 | layer { 674 | name: "relu5_4_CPM_L2" 675 | type: "ReLU" 676 | bottom: "conv5_4_CPM_L2" 677 | top: "conv5_4_CPM_L2" 678 | } 679 | layer { 680 | name: "conv5_5_CPM_L1" 681 | type: "Convolution" 682 | bottom: "conv5_4_CPM_L1" 683 | top: "conv5_5_CPM_L1" 684 | param { 685 | lr_mult: 1.0 686 | decay_mult: 1 687 | } 688 | param { 689 | lr_mult: 2.0 690 | decay_mult: 0 691 | } 692 | convolution_param { 693 | num_output: 28 694 | pad: 0 695 | kernel_size: 1 696 | weight_filler { 697 | type: "gaussian" 698 | std: 0.01 699 | } 700 | bias_filler { 701 | type: "constant" 702 | } 703 | } 704 | } 705 | layer { 706 | name: "conv5_5_CPM_L2" 707 | type: "Convolution" 708 | bottom: "conv5_4_CPM_L2" 709 | top: "conv5_5_CPM_L2" 710 | param { 711 | lr_mult: 1.0 712 | decay_mult: 1 713 | } 714 | param { 715 | lr_mult: 2.0 716 | decay_mult: 0 717 | } 718 | convolution_param { 719 | num_output: 16 720 | pad: 0 721 | kernel_size: 1 722 | weight_filler { 723 | type: "gaussian" 724 | std: 0.01 725 | } 726 | bias_filler { 727 | type: "constant" 728 | } 729 | } 730 | } 731 | layer { 732 | name: 
"concat_stage2" 733 | type: "Concat" 734 | bottom: "conv5_5_CPM_L1" 735 | bottom: "conv5_5_CPM_L2" 736 | bottom: "conv4_4_CPM" 737 | top: "concat_stage2" 738 | concat_param { 739 | axis: 1 740 | } 741 | } 742 | layer { 743 | name: "Mconv1_stage2_L1" 744 | type: "Convolution" 745 | bottom: "concat_stage2" 746 | top: "Mconv1_stage2_L1" 747 | param { 748 | lr_mult: 4.0 749 | decay_mult: 1 750 | } 751 | param { 752 | lr_mult: 8.0 753 | decay_mult: 0 754 | } 755 | convolution_param { 756 | num_output: 128 757 | pad: 3 758 | kernel_size: 7 759 | weight_filler { 760 | type: "gaussian" 761 | std: 0.01 762 | } 763 | bias_filler { 764 | type: "constant" 765 | } 766 | } 767 | } 768 | layer { 769 | name: "Mrelu1_stage2_L1" 770 | type: "ReLU" 771 | bottom: "Mconv1_stage2_L1" 772 | top: "Mconv1_stage2_L1" 773 | } 774 | layer { 775 | name: "Mconv1_stage2_L2" 776 | type: "Convolution" 777 | bottom: "concat_stage2" 778 | top: "Mconv1_stage2_L2" 779 | param { 780 | lr_mult: 4.0 781 | decay_mult: 1 782 | } 783 | param { 784 | lr_mult: 8.0 785 | decay_mult: 0 786 | } 787 | convolution_param { 788 | num_output: 128 789 | pad: 3 790 | kernel_size: 7 791 | weight_filler { 792 | type: "gaussian" 793 | std: 0.01 794 | } 795 | bias_filler { 796 | type: "constant" 797 | } 798 | } 799 | } 800 | layer { 801 | name: "Mrelu1_stage2_L2" 802 | type: "ReLU" 803 | bottom: "Mconv1_stage2_L2" 804 | top: "Mconv1_stage2_L2" 805 | } 806 | layer { 807 | name: "Mconv2_stage2_L1" 808 | type: "Convolution" 809 | bottom: "Mconv1_stage2_L1" 810 | top: "Mconv2_stage2_L1" 811 | param { 812 | lr_mult: 4.0 813 | decay_mult: 1 814 | } 815 | param { 816 | lr_mult: 8.0 817 | decay_mult: 0 818 | } 819 | convolution_param { 820 | num_output: 128 821 | pad: 3 822 | kernel_size: 7 823 | weight_filler { 824 | type: "gaussian" 825 | std: 0.01 826 | } 827 | bias_filler { 828 | type: "constant" 829 | } 830 | } 831 | } 832 | layer { 833 | name: "Mrelu2_stage2_L1" 834 | type: "ReLU" 835 | bottom: "Mconv2_stage2_L1" 836 | top: "Mconv2_stage2_L1" 837 | } 838 | layer { 839 | name: "Mconv2_stage2_L2" 840 | type: "Convolution" 841 | bottom: "Mconv1_stage2_L2" 842 | top: "Mconv2_stage2_L2" 843 | param { 844 | lr_mult: 4.0 845 | decay_mult: 1 846 | } 847 | param { 848 | lr_mult: 8.0 849 | decay_mult: 0 850 | } 851 | convolution_param { 852 | num_output: 128 853 | pad: 3 854 | kernel_size: 7 855 | weight_filler { 856 | type: "gaussian" 857 | std: 0.01 858 | } 859 | bias_filler { 860 | type: "constant" 861 | } 862 | } 863 | } 864 | layer { 865 | name: "Mrelu2_stage2_L2" 866 | type: "ReLU" 867 | bottom: "Mconv2_stage2_L2" 868 | top: "Mconv2_stage2_L2" 869 | } 870 | layer { 871 | name: "Mconv3_stage2_L1" 872 | type: "Convolution" 873 | bottom: "Mconv2_stage2_L1" 874 | top: "Mconv3_stage2_L1" 875 | param { 876 | lr_mult: 4.0 877 | decay_mult: 1 878 | } 879 | param { 880 | lr_mult: 8.0 881 | decay_mult: 0 882 | } 883 | convolution_param { 884 | num_output: 128 885 | pad: 3 886 | kernel_size: 7 887 | weight_filler { 888 | type: "gaussian" 889 | std: 0.01 890 | } 891 | bias_filler { 892 | type: "constant" 893 | } 894 | } 895 | } 896 | layer { 897 | name: "Mrelu3_stage2_L1" 898 | type: "ReLU" 899 | bottom: "Mconv3_stage2_L1" 900 | top: "Mconv3_stage2_L1" 901 | } 902 | layer { 903 | name: "Mconv3_stage2_L2" 904 | type: "Convolution" 905 | bottom: "Mconv2_stage2_L2" 906 | top: "Mconv3_stage2_L2" 907 | param { 908 | lr_mult: 4.0 909 | decay_mult: 1 910 | } 911 | param { 912 | lr_mult: 8.0 913 | decay_mult: 0 914 | } 915 | convolution_param { 916 | num_output: 128 917 | 
pad: 3 918 | kernel_size: 7 919 | weight_filler { 920 | type: "gaussian" 921 | std: 0.01 922 | } 923 | bias_filler { 924 | type: "constant" 925 | } 926 | } 927 | } 928 | layer { 929 | name: "Mrelu3_stage2_L2" 930 | type: "ReLU" 931 | bottom: "Mconv3_stage2_L2" 932 | top: "Mconv3_stage2_L2" 933 | } 934 | layer { 935 | name: "Mconv4_stage2_L1" 936 | type: "Convolution" 937 | bottom: "Mconv3_stage2_L1" 938 | top: "Mconv4_stage2_L1" 939 | param { 940 | lr_mult: 4.0 941 | decay_mult: 1 942 | } 943 | param { 944 | lr_mult: 8.0 945 | decay_mult: 0 946 | } 947 | convolution_param { 948 | num_output: 128 949 | pad: 3 950 | kernel_size: 7 951 | weight_filler { 952 | type: "gaussian" 953 | std: 0.01 954 | } 955 | bias_filler { 956 | type: "constant" 957 | } 958 | } 959 | } 960 | layer { 961 | name: "Mrelu4_stage2_L1" 962 | type: "ReLU" 963 | bottom: "Mconv4_stage2_L1" 964 | top: "Mconv4_stage2_L1" 965 | } 966 | layer { 967 | name: "Mconv4_stage2_L2" 968 | type: "Convolution" 969 | bottom: "Mconv3_stage2_L2" 970 | top: "Mconv4_stage2_L2" 971 | param { 972 | lr_mult: 4.0 973 | decay_mult: 1 974 | } 975 | param { 976 | lr_mult: 8.0 977 | decay_mult: 0 978 | } 979 | convolution_param { 980 | num_output: 128 981 | pad: 3 982 | kernel_size: 7 983 | weight_filler { 984 | type: "gaussian" 985 | std: 0.01 986 | } 987 | bias_filler { 988 | type: "constant" 989 | } 990 | } 991 | } 992 | layer { 993 | name: "Mrelu4_stage2_L2" 994 | type: "ReLU" 995 | bottom: "Mconv4_stage2_L2" 996 | top: "Mconv4_stage2_L2" 997 | } 998 | layer { 999 | name: "Mconv5_stage2_L1" 1000 | type: "Convolution" 1001 | bottom: "Mconv4_stage2_L1" 1002 | top: "Mconv5_stage2_L1" 1003 | param { 1004 | lr_mult: 4.0 1005 | decay_mult: 1 1006 | } 1007 | param { 1008 | lr_mult: 8.0 1009 | decay_mult: 0 1010 | } 1011 | convolution_param { 1012 | num_output: 128 1013 | pad: 3 1014 | kernel_size: 7 1015 | weight_filler { 1016 | type: "gaussian" 1017 | std: 0.01 1018 | } 1019 | bias_filler { 1020 | type: "constant" 1021 | } 1022 | } 1023 | } 1024 | layer { 1025 | name: "Mrelu5_stage2_L1" 1026 | type: "ReLU" 1027 | bottom: "Mconv5_stage2_L1" 1028 | top: "Mconv5_stage2_L1" 1029 | } 1030 | layer { 1031 | name: "Mconv5_stage2_L2" 1032 | type: "Convolution" 1033 | bottom: "Mconv4_stage2_L2" 1034 | top: "Mconv5_stage2_L2" 1035 | param { 1036 | lr_mult: 4.0 1037 | decay_mult: 1 1038 | } 1039 | param { 1040 | lr_mult: 8.0 1041 | decay_mult: 0 1042 | } 1043 | convolution_param { 1044 | num_output: 128 1045 | pad: 3 1046 | kernel_size: 7 1047 | weight_filler { 1048 | type: "gaussian" 1049 | std: 0.01 1050 | } 1051 | bias_filler { 1052 | type: "constant" 1053 | } 1054 | } 1055 | } 1056 | layer { 1057 | name: "Mrelu5_stage2_L2" 1058 | type: "ReLU" 1059 | bottom: "Mconv5_stage2_L2" 1060 | top: "Mconv5_stage2_L2" 1061 | } 1062 | layer { 1063 | name: "Mconv6_stage2_L1" 1064 | type: "Convolution" 1065 | bottom: "Mconv5_stage2_L1" 1066 | top: "Mconv6_stage2_L1" 1067 | param { 1068 | lr_mult: 4.0 1069 | decay_mult: 1 1070 | } 1071 | param { 1072 | lr_mult: 8.0 1073 | decay_mult: 0 1074 | } 1075 | convolution_param { 1076 | num_output: 128 1077 | pad: 0 1078 | kernel_size: 1 1079 | weight_filler { 1080 | type: "gaussian" 1081 | std: 0.01 1082 | } 1083 | bias_filler { 1084 | type: "constant" 1085 | } 1086 | } 1087 | } 1088 | layer { 1089 | name: "Mrelu6_stage2_L1" 1090 | type: "ReLU" 1091 | bottom: "Mconv6_stage2_L1" 1092 | top: "Mconv6_stage2_L1" 1093 | } 1094 | layer { 1095 | name: "Mconv6_stage2_L2" 1096 | type: "Convolution" 1097 | bottom: "Mconv5_stage2_L2" 1098 | 
top: "Mconv6_stage2_L2" 1099 | param { 1100 | lr_mult: 4.0 1101 | decay_mult: 1 1102 | } 1103 | param { 1104 | lr_mult: 8.0 1105 | decay_mult: 0 1106 | } 1107 | convolution_param { 1108 | num_output: 128 1109 | pad: 0 1110 | kernel_size: 1 1111 | weight_filler { 1112 | type: "gaussian" 1113 | std: 0.01 1114 | } 1115 | bias_filler { 1116 | type: "constant" 1117 | } 1118 | } 1119 | } 1120 | layer { 1121 | name: "Mrelu6_stage2_L2" 1122 | type: "ReLU" 1123 | bottom: "Mconv6_stage2_L2" 1124 | top: "Mconv6_stage2_L2" 1125 | } 1126 | layer { 1127 | name: "Mconv7_stage2_L1" 1128 | type: "Convolution" 1129 | bottom: "Mconv6_stage2_L1" 1130 | top: "Mconv7_stage2_L1" 1131 | param { 1132 | lr_mult: 4.0 1133 | decay_mult: 1 1134 | } 1135 | param { 1136 | lr_mult: 8.0 1137 | decay_mult: 0 1138 | } 1139 | convolution_param { 1140 | num_output: 28 1141 | pad: 0 1142 | kernel_size: 1 1143 | weight_filler { 1144 | type: "gaussian" 1145 | std: 0.01 1146 | } 1147 | bias_filler { 1148 | type: "constant" 1149 | } 1150 | } 1151 | } 1152 | layer { 1153 | name: "Mconv7_stage2_L2" 1154 | type: "Convolution" 1155 | bottom: "Mconv6_stage2_L2" 1156 | top: "Mconv7_stage2_L2" 1157 | param { 1158 | lr_mult: 4.0 1159 | decay_mult: 1 1160 | } 1161 | param { 1162 | lr_mult: 8.0 1163 | decay_mult: 0 1164 | } 1165 | convolution_param { 1166 | num_output: 16 1167 | pad: 0 1168 | kernel_size: 1 1169 | weight_filler { 1170 | type: "gaussian" 1171 | std: 0.01 1172 | } 1173 | bias_filler { 1174 | type: "constant" 1175 | } 1176 | } 1177 | } 1178 | layer { 1179 | name: "concat_stage3" 1180 | type: "Concat" 1181 | bottom: "Mconv7_stage2_L1" 1182 | bottom: "Mconv7_stage2_L2" 1183 | bottom: "conv4_4_CPM" 1184 | top: "concat_stage3" 1185 | concat_param { 1186 | axis: 1 1187 | } 1188 | } 1189 | layer { 1190 | name: "Mconv1_stage3_L1" 1191 | type: "Convolution" 1192 | bottom: "concat_stage3" 1193 | top: "Mconv1_stage3_L1" 1194 | param { 1195 | lr_mult: 4.0 1196 | decay_mult: 1 1197 | } 1198 | param { 1199 | lr_mult: 8.0 1200 | decay_mult: 0 1201 | } 1202 | convolution_param { 1203 | num_output: 128 1204 | pad: 3 1205 | kernel_size: 7 1206 | weight_filler { 1207 | type: "gaussian" 1208 | std: 0.01 1209 | } 1210 | bias_filler { 1211 | type: "constant" 1212 | } 1213 | } 1214 | } 1215 | layer { 1216 | name: "Mrelu1_stage3_L1" 1217 | type: "ReLU" 1218 | bottom: "Mconv1_stage3_L1" 1219 | top: "Mconv1_stage3_L1" 1220 | } 1221 | layer { 1222 | name: "Mconv1_stage3_L2" 1223 | type: "Convolution" 1224 | bottom: "concat_stage3" 1225 | top: "Mconv1_stage3_L2" 1226 | param { 1227 | lr_mult: 4.0 1228 | decay_mult: 1 1229 | } 1230 | param { 1231 | lr_mult: 8.0 1232 | decay_mult: 0 1233 | } 1234 | convolution_param { 1235 | num_output: 128 1236 | pad: 3 1237 | kernel_size: 7 1238 | weight_filler { 1239 | type: "gaussian" 1240 | std: 0.01 1241 | } 1242 | bias_filler { 1243 | type: "constant" 1244 | } 1245 | } 1246 | } 1247 | layer { 1248 | name: "Mrelu1_stage3_L2" 1249 | type: "ReLU" 1250 | bottom: "Mconv1_stage3_L2" 1251 | top: "Mconv1_stage3_L2" 1252 | } 1253 | layer { 1254 | name: "Mconv2_stage3_L1" 1255 | type: "Convolution" 1256 | bottom: "Mconv1_stage3_L1" 1257 | top: "Mconv2_stage3_L1" 1258 | param { 1259 | lr_mult: 4.0 1260 | decay_mult: 1 1261 | } 1262 | param { 1263 | lr_mult: 8.0 1264 | decay_mult: 0 1265 | } 1266 | convolution_param { 1267 | num_output: 128 1268 | pad: 3 1269 | kernel_size: 7 1270 | weight_filler { 1271 | type: "gaussian" 1272 | std: 0.01 1273 | } 1274 | bias_filler { 1275 | type: "constant" 1276 | } 1277 | } 1278 | } 1279 | 
layer { 1280 | name: "Mrelu2_stage3_L1" 1281 | type: "ReLU" 1282 | bottom: "Mconv2_stage3_L1" 1283 | top: "Mconv2_stage3_L1" 1284 | } 1285 | layer { 1286 | name: "Mconv2_stage3_L2" 1287 | type: "Convolution" 1288 | bottom: "Mconv1_stage3_L2" 1289 | top: "Mconv2_stage3_L2" 1290 | param { 1291 | lr_mult: 4.0 1292 | decay_mult: 1 1293 | } 1294 | param { 1295 | lr_mult: 8.0 1296 | decay_mult: 0 1297 | } 1298 | convolution_param { 1299 | num_output: 128 1300 | pad: 3 1301 | kernel_size: 7 1302 | weight_filler { 1303 | type: "gaussian" 1304 | std: 0.01 1305 | } 1306 | bias_filler { 1307 | type: "constant" 1308 | } 1309 | } 1310 | } 1311 | layer { 1312 | name: "Mrelu2_stage3_L2" 1313 | type: "ReLU" 1314 | bottom: "Mconv2_stage3_L2" 1315 | top: "Mconv2_stage3_L2" 1316 | } 1317 | layer { 1318 | name: "Mconv3_stage3_L1" 1319 | type: "Convolution" 1320 | bottom: "Mconv2_stage3_L1" 1321 | top: "Mconv3_stage3_L1" 1322 | param { 1323 | lr_mult: 4.0 1324 | decay_mult: 1 1325 | } 1326 | param { 1327 | lr_mult: 8.0 1328 | decay_mult: 0 1329 | } 1330 | convolution_param { 1331 | num_output: 128 1332 | pad: 3 1333 | kernel_size: 7 1334 | weight_filler { 1335 | type: "gaussian" 1336 | std: 0.01 1337 | } 1338 | bias_filler { 1339 | type: "constant" 1340 | } 1341 | } 1342 | } 1343 | layer { 1344 | name: "Mrelu3_stage3_L1" 1345 | type: "ReLU" 1346 | bottom: "Mconv3_stage3_L1" 1347 | top: "Mconv3_stage3_L1" 1348 | } 1349 | layer { 1350 | name: "Mconv3_stage3_L2" 1351 | type: "Convolution" 1352 | bottom: "Mconv2_stage3_L2" 1353 | top: "Mconv3_stage3_L2" 1354 | param { 1355 | lr_mult: 4.0 1356 | decay_mult: 1 1357 | } 1358 | param { 1359 | lr_mult: 8.0 1360 | decay_mult: 0 1361 | } 1362 | convolution_param { 1363 | num_output: 128 1364 | pad: 3 1365 | kernel_size: 7 1366 | weight_filler { 1367 | type: "gaussian" 1368 | std: 0.01 1369 | } 1370 | bias_filler { 1371 | type: "constant" 1372 | } 1373 | } 1374 | } 1375 | layer { 1376 | name: "Mrelu3_stage3_L2" 1377 | type: "ReLU" 1378 | bottom: "Mconv3_stage3_L2" 1379 | top: "Mconv3_stage3_L2" 1380 | } 1381 | layer { 1382 | name: "Mconv4_stage3_L1" 1383 | type: "Convolution" 1384 | bottom: "Mconv3_stage3_L1" 1385 | top: "Mconv4_stage3_L1" 1386 | param { 1387 | lr_mult: 4.0 1388 | decay_mult: 1 1389 | } 1390 | param { 1391 | lr_mult: 8.0 1392 | decay_mult: 0 1393 | } 1394 | convolution_param { 1395 | num_output: 128 1396 | pad: 3 1397 | kernel_size: 7 1398 | weight_filler { 1399 | type: "gaussian" 1400 | std: 0.01 1401 | } 1402 | bias_filler { 1403 | type: "constant" 1404 | } 1405 | } 1406 | } 1407 | layer { 1408 | name: "Mrelu4_stage3_L1" 1409 | type: "ReLU" 1410 | bottom: "Mconv4_stage3_L1" 1411 | top: "Mconv4_stage3_L1" 1412 | } 1413 | layer { 1414 | name: "Mconv4_stage3_L2" 1415 | type: "Convolution" 1416 | bottom: "Mconv3_stage3_L2" 1417 | top: "Mconv4_stage3_L2" 1418 | param { 1419 | lr_mult: 4.0 1420 | decay_mult: 1 1421 | } 1422 | param { 1423 | lr_mult: 8.0 1424 | decay_mult: 0 1425 | } 1426 | convolution_param { 1427 | num_output: 128 1428 | pad: 3 1429 | kernel_size: 7 1430 | weight_filler { 1431 | type: "gaussian" 1432 | std: 0.01 1433 | } 1434 | bias_filler { 1435 | type: "constant" 1436 | } 1437 | } 1438 | } 1439 | layer { 1440 | name: "Mrelu4_stage3_L2" 1441 | type: "ReLU" 1442 | bottom: "Mconv4_stage3_L2" 1443 | top: "Mconv4_stage3_L2" 1444 | } 1445 | layer { 1446 | name: "Mconv5_stage3_L1" 1447 | type: "Convolution" 1448 | bottom: "Mconv4_stage3_L1" 1449 | top: "Mconv5_stage3_L1" 1450 | param { 1451 | lr_mult: 4.0 1452 | decay_mult: 1 1453 | } 1454 | 
param { 1455 | lr_mult: 8.0 1456 | decay_mult: 0 1457 | } 1458 | convolution_param { 1459 | num_output: 128 1460 | pad: 3 1461 | kernel_size: 7 1462 | weight_filler { 1463 | type: "gaussian" 1464 | std: 0.01 1465 | } 1466 | bias_filler { 1467 | type: "constant" 1468 | } 1469 | } 1470 | } 1471 | layer { 1472 | name: "Mrelu5_stage3_L1" 1473 | type: "ReLU" 1474 | bottom: "Mconv5_stage3_L1" 1475 | top: "Mconv5_stage3_L1" 1476 | } 1477 | layer { 1478 | name: "Mconv5_stage3_L2" 1479 | type: "Convolution" 1480 | bottom: "Mconv4_stage3_L2" 1481 | top: "Mconv5_stage3_L2" 1482 | param { 1483 | lr_mult: 4.0 1484 | decay_mult: 1 1485 | } 1486 | param { 1487 | lr_mult: 8.0 1488 | decay_mult: 0 1489 | } 1490 | convolution_param { 1491 | num_output: 128 1492 | pad: 3 1493 | kernel_size: 7 1494 | weight_filler { 1495 | type: "gaussian" 1496 | std: 0.01 1497 | } 1498 | bias_filler { 1499 | type: "constant" 1500 | } 1501 | } 1502 | } 1503 | layer { 1504 | name: "Mrelu5_stage3_L2" 1505 | type: "ReLU" 1506 | bottom: "Mconv5_stage3_L2" 1507 | top: "Mconv5_stage3_L2" 1508 | } 1509 | layer { 1510 | name: "Mconv6_stage3_L1" 1511 | type: "Convolution" 1512 | bottom: "Mconv5_stage3_L1" 1513 | top: "Mconv6_stage3_L1" 1514 | param { 1515 | lr_mult: 4.0 1516 | decay_mult: 1 1517 | } 1518 | param { 1519 | lr_mult: 8.0 1520 | decay_mult: 0 1521 | } 1522 | convolution_param { 1523 | num_output: 128 1524 | pad: 0 1525 | kernel_size: 1 1526 | weight_filler { 1527 | type: "gaussian" 1528 | std: 0.01 1529 | } 1530 | bias_filler { 1531 | type: "constant" 1532 | } 1533 | } 1534 | } 1535 | layer { 1536 | name: "Mrelu6_stage3_L1" 1537 | type: "ReLU" 1538 | bottom: "Mconv6_stage3_L1" 1539 | top: "Mconv6_stage3_L1" 1540 | } 1541 | layer { 1542 | name: "Mconv6_stage3_L2" 1543 | type: "Convolution" 1544 | bottom: "Mconv5_stage3_L2" 1545 | top: "Mconv6_stage3_L2" 1546 | param { 1547 | lr_mult: 4.0 1548 | decay_mult: 1 1549 | } 1550 | param { 1551 | lr_mult: 8.0 1552 | decay_mult: 0 1553 | } 1554 | convolution_param { 1555 | num_output: 128 1556 | pad: 0 1557 | kernel_size: 1 1558 | weight_filler { 1559 | type: "gaussian" 1560 | std: 0.01 1561 | } 1562 | bias_filler { 1563 | type: "constant" 1564 | } 1565 | } 1566 | } 1567 | layer { 1568 | name: "Mrelu6_stage3_L2" 1569 | type: "ReLU" 1570 | bottom: "Mconv6_stage3_L2" 1571 | top: "Mconv6_stage3_L2" 1572 | } 1573 | layer { 1574 | name: "Mconv7_stage3_L1" 1575 | type: "Convolution" 1576 | bottom: "Mconv6_stage3_L1" 1577 | top: "Mconv7_stage3_L1" 1578 | param { 1579 | lr_mult: 4.0 1580 | decay_mult: 1 1581 | } 1582 | param { 1583 | lr_mult: 8.0 1584 | decay_mult: 0 1585 | } 1586 | convolution_param { 1587 | num_output: 28 1588 | pad: 0 1589 | kernel_size: 1 1590 | weight_filler { 1591 | type: "gaussian" 1592 | std: 0.01 1593 | } 1594 | bias_filler { 1595 | type: "constant" 1596 | } 1597 | } 1598 | } 1599 | layer { 1600 | name: "Mconv7_stage3_L2" 1601 | type: "Convolution" 1602 | bottom: "Mconv6_stage3_L2" 1603 | top: "Mconv7_stage3_L2" 1604 | param { 1605 | lr_mult: 4.0 1606 | decay_mult: 1 1607 | } 1608 | param { 1609 | lr_mult: 8.0 1610 | decay_mult: 0 1611 | } 1612 | convolution_param { 1613 | num_output: 16 1614 | pad: 0 1615 | kernel_size: 1 1616 | weight_filler { 1617 | type: "gaussian" 1618 | std: 0.01 1619 | } 1620 | bias_filler { 1621 | type: "constant" 1622 | } 1623 | } 1624 | } 1625 | layer { 1626 | name: "concat_stage4" 1627 | type: "Concat" 1628 | bottom: "Mconv7_stage3_L1" 1629 | bottom: "Mconv7_stage3_L2" 1630 | bottom: "conv4_4_CPM" 1631 | top: "concat_stage4" 1632 | 
concat_param { 1633 | axis: 1 1634 | } 1635 | } 1636 | layer { 1637 | name: "Mconv1_stage4_L1" 1638 | type: "Convolution" 1639 | bottom: "concat_stage4" 1640 | top: "Mconv1_stage4_L1" 1641 | param { 1642 | lr_mult: 4.0 1643 | decay_mult: 1 1644 | } 1645 | param { 1646 | lr_mult: 8.0 1647 | decay_mult: 0 1648 | } 1649 | convolution_param { 1650 | num_output: 128 1651 | pad: 3 1652 | kernel_size: 7 1653 | weight_filler { 1654 | type: "gaussian" 1655 | std: 0.01 1656 | } 1657 | bias_filler { 1658 | type: "constant" 1659 | } 1660 | } 1661 | } 1662 | layer { 1663 | name: "Mrelu1_stage4_L1" 1664 | type: "ReLU" 1665 | bottom: "Mconv1_stage4_L1" 1666 | top: "Mconv1_stage4_L1" 1667 | } 1668 | layer { 1669 | name: "Mconv1_stage4_L2" 1670 | type: "Convolution" 1671 | bottom: "concat_stage4" 1672 | top: "Mconv1_stage4_L2" 1673 | param { 1674 | lr_mult: 4.0 1675 | decay_mult: 1 1676 | } 1677 | param { 1678 | lr_mult: 8.0 1679 | decay_mult: 0 1680 | } 1681 | convolution_param { 1682 | num_output: 128 1683 | pad: 3 1684 | kernel_size: 7 1685 | weight_filler { 1686 | type: "gaussian" 1687 | std: 0.01 1688 | } 1689 | bias_filler { 1690 | type: "constant" 1691 | } 1692 | } 1693 | } 1694 | layer { 1695 | name: "Mrelu1_stage4_L2" 1696 | type: "ReLU" 1697 | bottom: "Mconv1_stage4_L2" 1698 | top: "Mconv1_stage4_L2" 1699 | } 1700 | layer { 1701 | name: "Mconv2_stage4_L1" 1702 | type: "Convolution" 1703 | bottom: "Mconv1_stage4_L1" 1704 | top: "Mconv2_stage4_L1" 1705 | param { 1706 | lr_mult: 4.0 1707 | decay_mult: 1 1708 | } 1709 | param { 1710 | lr_mult: 8.0 1711 | decay_mult: 0 1712 | } 1713 | convolution_param { 1714 | num_output: 128 1715 | pad: 3 1716 | kernel_size: 7 1717 | weight_filler { 1718 | type: "gaussian" 1719 | std: 0.01 1720 | } 1721 | bias_filler { 1722 | type: "constant" 1723 | } 1724 | } 1725 | } 1726 | layer { 1727 | name: "Mrelu2_stage4_L1" 1728 | type: "ReLU" 1729 | bottom: "Mconv2_stage4_L1" 1730 | top: "Mconv2_stage4_L1" 1731 | } 1732 | layer { 1733 | name: "Mconv2_stage4_L2" 1734 | type: "Convolution" 1735 | bottom: "Mconv1_stage4_L2" 1736 | top: "Mconv2_stage4_L2" 1737 | param { 1738 | lr_mult: 4.0 1739 | decay_mult: 1 1740 | } 1741 | param { 1742 | lr_mult: 8.0 1743 | decay_mult: 0 1744 | } 1745 | convolution_param { 1746 | num_output: 128 1747 | pad: 3 1748 | kernel_size: 7 1749 | weight_filler { 1750 | type: "gaussian" 1751 | std: 0.01 1752 | } 1753 | bias_filler { 1754 | type: "constant" 1755 | } 1756 | } 1757 | } 1758 | layer { 1759 | name: "Mrelu2_stage4_L2" 1760 | type: "ReLU" 1761 | bottom: "Mconv2_stage4_L2" 1762 | top: "Mconv2_stage4_L2" 1763 | } 1764 | layer { 1765 | name: "Mconv3_stage4_L1" 1766 | type: "Convolution" 1767 | bottom: "Mconv2_stage4_L1" 1768 | top: "Mconv3_stage4_L1" 1769 | param { 1770 | lr_mult: 4.0 1771 | decay_mult: 1 1772 | } 1773 | param { 1774 | lr_mult: 8.0 1775 | decay_mult: 0 1776 | } 1777 | convolution_param { 1778 | num_output: 128 1779 | pad: 3 1780 | kernel_size: 7 1781 | weight_filler { 1782 | type: "gaussian" 1783 | std: 0.01 1784 | } 1785 | bias_filler { 1786 | type: "constant" 1787 | } 1788 | } 1789 | } 1790 | layer { 1791 | name: "Mrelu3_stage4_L1" 1792 | type: "ReLU" 1793 | bottom: "Mconv3_stage4_L1" 1794 | top: "Mconv3_stage4_L1" 1795 | } 1796 | layer { 1797 | name: "Mconv3_stage4_L2" 1798 | type: "Convolution" 1799 | bottom: "Mconv2_stage4_L2" 1800 | top: "Mconv3_stage4_L2" 1801 | param { 1802 | lr_mult: 4.0 1803 | decay_mult: 1 1804 | } 1805 | param { 1806 | lr_mult: 8.0 1807 | decay_mult: 0 1808 | } 1809 | convolution_param { 1810 | 
num_output: 128 1811 | pad: 3 1812 | kernel_size: 7 1813 | weight_filler { 1814 | type: "gaussian" 1815 | std: 0.01 1816 | } 1817 | bias_filler { 1818 | type: "constant" 1819 | } 1820 | } 1821 | } 1822 | layer { 1823 | name: "Mrelu3_stage4_L2" 1824 | type: "ReLU" 1825 | bottom: "Mconv3_stage4_L2" 1826 | top: "Mconv3_stage4_L2" 1827 | } 1828 | layer { 1829 | name: "Mconv4_stage4_L1" 1830 | type: "Convolution" 1831 | bottom: "Mconv3_stage4_L1" 1832 | top: "Mconv4_stage4_L1" 1833 | param { 1834 | lr_mult: 4.0 1835 | decay_mult: 1 1836 | } 1837 | param { 1838 | lr_mult: 8.0 1839 | decay_mult: 0 1840 | } 1841 | convolution_param { 1842 | num_output: 128 1843 | pad: 3 1844 | kernel_size: 7 1845 | weight_filler { 1846 | type: "gaussian" 1847 | std: 0.01 1848 | } 1849 | bias_filler { 1850 | type: "constant" 1851 | } 1852 | } 1853 | } 1854 | layer { 1855 | name: "Mrelu4_stage4_L1" 1856 | type: "ReLU" 1857 | bottom: "Mconv4_stage4_L1" 1858 | top: "Mconv4_stage4_L1" 1859 | } 1860 | layer { 1861 | name: "Mconv4_stage4_L2" 1862 | type: "Convolution" 1863 | bottom: "Mconv3_stage4_L2" 1864 | top: "Mconv4_stage4_L2" 1865 | param { 1866 | lr_mult: 4.0 1867 | decay_mult: 1 1868 | } 1869 | param { 1870 | lr_mult: 8.0 1871 | decay_mult: 0 1872 | } 1873 | convolution_param { 1874 | num_output: 128 1875 | pad: 3 1876 | kernel_size: 7 1877 | weight_filler { 1878 | type: "gaussian" 1879 | std: 0.01 1880 | } 1881 | bias_filler { 1882 | type: "constant" 1883 | } 1884 | } 1885 | } 1886 | layer { 1887 | name: "Mrelu4_stage4_L2" 1888 | type: "ReLU" 1889 | bottom: "Mconv4_stage4_L2" 1890 | top: "Mconv4_stage4_L2" 1891 | } 1892 | layer { 1893 | name: "Mconv5_stage4_L1" 1894 | type: "Convolution" 1895 | bottom: "Mconv4_stage4_L1" 1896 | top: "Mconv5_stage4_L1" 1897 | param { 1898 | lr_mult: 4.0 1899 | decay_mult: 1 1900 | } 1901 | param { 1902 | lr_mult: 8.0 1903 | decay_mult: 0 1904 | } 1905 | convolution_param { 1906 | num_output: 128 1907 | pad: 3 1908 | kernel_size: 7 1909 | weight_filler { 1910 | type: "gaussian" 1911 | std: 0.01 1912 | } 1913 | bias_filler { 1914 | type: "constant" 1915 | } 1916 | } 1917 | } 1918 | layer { 1919 | name: "Mrelu5_stage4_L1" 1920 | type: "ReLU" 1921 | bottom: "Mconv5_stage4_L1" 1922 | top: "Mconv5_stage4_L1" 1923 | } 1924 | layer { 1925 | name: "Mconv5_stage4_L2" 1926 | type: "Convolution" 1927 | bottom: "Mconv4_stage4_L2" 1928 | top: "Mconv5_stage4_L2" 1929 | param { 1930 | lr_mult: 4.0 1931 | decay_mult: 1 1932 | } 1933 | param { 1934 | lr_mult: 8.0 1935 | decay_mult: 0 1936 | } 1937 | convolution_param { 1938 | num_output: 128 1939 | pad: 3 1940 | kernel_size: 7 1941 | weight_filler { 1942 | type: "gaussian" 1943 | std: 0.01 1944 | } 1945 | bias_filler { 1946 | type: "constant" 1947 | } 1948 | } 1949 | } 1950 | layer { 1951 | name: "Mrelu5_stage4_L2" 1952 | type: "ReLU" 1953 | bottom: "Mconv5_stage4_L2" 1954 | top: "Mconv5_stage4_L2" 1955 | } 1956 | layer { 1957 | name: "Mconv6_stage4_L1" 1958 | type: "Convolution" 1959 | bottom: "Mconv5_stage4_L1" 1960 | top: "Mconv6_stage4_L1" 1961 | param { 1962 | lr_mult: 4.0 1963 | decay_mult: 1 1964 | } 1965 | param { 1966 | lr_mult: 8.0 1967 | decay_mult: 0 1968 | } 1969 | convolution_param { 1970 | num_output: 128 1971 | pad: 0 1972 | kernel_size: 1 1973 | weight_filler { 1974 | type: "gaussian" 1975 | std: 0.01 1976 | } 1977 | bias_filler { 1978 | type: "constant" 1979 | } 1980 | } 1981 | } 1982 | layer { 1983 | name: "Mrelu6_stage4_L1" 1984 | type: "ReLU" 1985 | bottom: "Mconv6_stage4_L1" 1986 | top: "Mconv6_stage4_L1" 1987 | } 1988 | 
layer { 1989 | name: "Mconv6_stage4_L2" 1990 | type: "Convolution" 1991 | bottom: "Mconv5_stage4_L2" 1992 | top: "Mconv6_stage4_L2" 1993 | param { 1994 | lr_mult: 4.0 1995 | decay_mult: 1 1996 | } 1997 | param { 1998 | lr_mult: 8.0 1999 | decay_mult: 0 2000 | } 2001 | convolution_param { 2002 | num_output: 128 2003 | pad: 0 2004 | kernel_size: 1 2005 | weight_filler { 2006 | type: "gaussian" 2007 | std: 0.01 2008 | } 2009 | bias_filler { 2010 | type: "constant" 2011 | } 2012 | } 2013 | } 2014 | layer { 2015 | name: "Mrelu6_stage4_L2" 2016 | type: "ReLU" 2017 | bottom: "Mconv6_stage4_L2" 2018 | top: "Mconv6_stage4_L2" 2019 | } 2020 | layer { 2021 | name: "Mconv7_stage4_L1" 2022 | type: "Convolution" 2023 | bottom: "Mconv6_stage4_L1" 2024 | top: "Mconv7_stage4_L1" 2025 | param { 2026 | lr_mult: 4.0 2027 | decay_mult: 1 2028 | } 2029 | param { 2030 | lr_mult: 8.0 2031 | decay_mult: 0 2032 | } 2033 | convolution_param { 2034 | num_output: 28 2035 | pad: 0 2036 | kernel_size: 1 2037 | weight_filler { 2038 | type: "gaussian" 2039 | std: 0.01 2040 | } 2041 | bias_filler { 2042 | type: "constant" 2043 | } 2044 | } 2045 | } 2046 | layer { 2047 | name: "Mconv7_stage4_L2" 2048 | type: "Convolution" 2049 | bottom: "Mconv6_stage4_L2" 2050 | top: "Mconv7_stage4_L2" 2051 | param { 2052 | lr_mult: 4.0 2053 | decay_mult: 1 2054 | } 2055 | param { 2056 | lr_mult: 8.0 2057 | decay_mult: 0 2058 | } 2059 | convolution_param { 2060 | num_output: 16 2061 | pad: 0 2062 | kernel_size: 1 2063 | weight_filler { 2064 | type: "gaussian" 2065 | std: 0.01 2066 | } 2067 | bias_filler { 2068 | type: "constant" 2069 | } 2070 | } 2071 | } 2072 | layer { 2073 | name: "concat_stage7" 2074 | type: "Concat" 2075 | bottom: "Mconv7_stage4_L2" 2076 | bottom: "Mconv7_stage4_L1" 2077 | top: "net_output" 2078 | concat_param { 2079 | axis: 1 2080 | } 2081 | } 2082 | -------------------------------------------------------------------------------- /pose-estimator-using-caffemodel/model/mpi/pose_deploy_linevec_faster_4_stages.prototxt: -------------------------------------------------------------------------------- 1 | input: "image" 2 | input_dim: 1 3 | input_dim: 3 4 | input_dim: 1 # This value will be defined at runtime 5 | input_dim: 1 # This value will be defined at runtime 6 | layer { 7 | name: "conv1_1" 8 | type: "Convolution" 9 | bottom: "image" 10 | top: "conv1_1" 11 | param { 12 | lr_mult: 1.0 13 | decay_mult: 1 14 | } 15 | param { 16 | lr_mult: 2.0 17 | decay_mult: 0 18 | } 19 | convolution_param { 20 | num_output: 64 21 | pad: 1 22 | kernel_size: 3 23 | weight_filler { 24 | type: "gaussian" 25 | std: 0.01 26 | } 27 | bias_filler { 28 | type: "constant" 29 | } 30 | } 31 | } 32 | layer { 33 | name: "relu1_1" 34 | type: "ReLU" 35 | bottom: "conv1_1" 36 | top: "conv1_1" 37 | } 38 | layer { 39 | name: "conv1_2" 40 | type: "Convolution" 41 | bottom: "conv1_1" 42 | top: "conv1_2" 43 | param { 44 | lr_mult: 1.0 45 | decay_mult: 1 46 | } 47 | param { 48 | lr_mult: 2.0 49 | decay_mult: 0 50 | } 51 | convolution_param { 52 | num_output: 64 53 | pad: 1 54 | kernel_size: 3 55 | weight_filler { 56 | type: "gaussian" 57 | std: 0.01 58 | } 59 | bias_filler { 60 | type: "constant" 61 | } 62 | } 63 | } 64 | layer { 65 | name: "relu1_2" 66 | type: "ReLU" 67 | bottom: "conv1_2" 68 | top: "conv1_2" 69 | } 70 | layer { 71 | name: "pool1_stage1" 72 | type: "Pooling" 73 | bottom: "conv1_2" 74 | top: "pool1_stage1" 75 | pooling_param { 76 | pool: MAX 77 | kernel_size: 2 78 | stride: 2 79 | } 80 | } 81 | layer { 82 | name: "conv2_1" 83 | type: 
"Convolution" 84 | bottom: "pool1_stage1" 85 | top: "conv2_1" 86 | param { 87 | lr_mult: 1.0 88 | decay_mult: 1 89 | } 90 | param { 91 | lr_mult: 2.0 92 | decay_mult: 0 93 | } 94 | convolution_param { 95 | num_output: 128 96 | pad: 1 97 | kernel_size: 3 98 | weight_filler { 99 | type: "gaussian" 100 | std: 0.01 101 | } 102 | bias_filler { 103 | type: "constant" 104 | } 105 | } 106 | } 107 | layer { 108 | name: "relu2_1" 109 | type: "ReLU" 110 | bottom: "conv2_1" 111 | top: "conv2_1" 112 | } 113 | layer { 114 | name: "conv2_2" 115 | type: "Convolution" 116 | bottom: "conv2_1" 117 | top: "conv2_2" 118 | param { 119 | lr_mult: 1.0 120 | decay_mult: 1 121 | } 122 | param { 123 | lr_mult: 2.0 124 | decay_mult: 0 125 | } 126 | convolution_param { 127 | num_output: 128 128 | pad: 1 129 | kernel_size: 3 130 | weight_filler { 131 | type: "gaussian" 132 | std: 0.01 133 | } 134 | bias_filler { 135 | type: "constant" 136 | } 137 | } 138 | } 139 | layer { 140 | name: "relu2_2" 141 | type: "ReLU" 142 | bottom: "conv2_2" 143 | top: "conv2_2" 144 | } 145 | layer { 146 | name: "pool2_stage1" 147 | type: "Pooling" 148 | bottom: "conv2_2" 149 | top: "pool2_stage1" 150 | pooling_param { 151 | pool: MAX 152 | kernel_size: 2 153 | stride: 2 154 | } 155 | } 156 | layer { 157 | name: "conv3_1" 158 | type: "Convolution" 159 | bottom: "pool2_stage1" 160 | top: "conv3_1" 161 | param { 162 | lr_mult: 1.0 163 | decay_mult: 1 164 | } 165 | param { 166 | lr_mult: 2.0 167 | decay_mult: 0 168 | } 169 | convolution_param { 170 | num_output: 256 171 | pad: 1 172 | kernel_size: 3 173 | weight_filler { 174 | type: "gaussian" 175 | std: 0.01 176 | } 177 | bias_filler { 178 | type: "constant" 179 | } 180 | } 181 | } 182 | layer { 183 | name: "relu3_1" 184 | type: "ReLU" 185 | bottom: "conv3_1" 186 | top: "conv3_1" 187 | } 188 | layer { 189 | name: "conv3_2" 190 | type: "Convolution" 191 | bottom: "conv3_1" 192 | top: "conv3_2" 193 | param { 194 | lr_mult: 1.0 195 | decay_mult: 1 196 | } 197 | param { 198 | lr_mult: 2.0 199 | decay_mult: 0 200 | } 201 | convolution_param { 202 | num_output: 256 203 | pad: 1 204 | kernel_size: 3 205 | weight_filler { 206 | type: "gaussian" 207 | std: 0.01 208 | } 209 | bias_filler { 210 | type: "constant" 211 | } 212 | } 213 | } 214 | layer { 215 | name: "relu3_2" 216 | type: "ReLU" 217 | bottom: "conv3_2" 218 | top: "conv3_2" 219 | } 220 | layer { 221 | name: "conv3_3" 222 | type: "Convolution" 223 | bottom: "conv3_2" 224 | top: "conv3_3" 225 | param { 226 | lr_mult: 1.0 227 | decay_mult: 1 228 | } 229 | param { 230 | lr_mult: 2.0 231 | decay_mult: 0 232 | } 233 | convolution_param { 234 | num_output: 256 235 | pad: 1 236 | kernel_size: 3 237 | weight_filler { 238 | type: "gaussian" 239 | std: 0.01 240 | } 241 | bias_filler { 242 | type: "constant" 243 | } 244 | } 245 | } 246 | layer { 247 | name: "relu3_3" 248 | type: "ReLU" 249 | bottom: "conv3_3" 250 | top: "conv3_3" 251 | } 252 | layer { 253 | name: "conv3_4" 254 | type: "Convolution" 255 | bottom: "conv3_3" 256 | top: "conv3_4" 257 | param { 258 | lr_mult: 1.0 259 | decay_mult: 1 260 | } 261 | param { 262 | lr_mult: 2.0 263 | decay_mult: 0 264 | } 265 | convolution_param { 266 | num_output: 256 267 | pad: 1 268 | kernel_size: 3 269 | weight_filler { 270 | type: "gaussian" 271 | std: 0.01 272 | } 273 | bias_filler { 274 | type: "constant" 275 | } 276 | } 277 | } 278 | layer { 279 | name: "relu3_4" 280 | type: "ReLU" 281 | bottom: "conv3_4" 282 | top: "conv3_4" 283 | } 284 | layer { 285 | name: "pool3_stage1" 286 | type: "Pooling" 287 | 
bottom: "conv3_4" 288 | top: "pool3_stage1" 289 | pooling_param { 290 | pool: MAX 291 | kernel_size: 2 292 | stride: 2 293 | } 294 | } 295 | layer { 296 | name: "conv4_1" 297 | type: "Convolution" 298 | bottom: "pool3_stage1" 299 | top: "conv4_1" 300 | param { 301 | lr_mult: 1.0 302 | decay_mult: 1 303 | } 304 | param { 305 | lr_mult: 2.0 306 | decay_mult: 0 307 | } 308 | convolution_param { 309 | num_output: 512 310 | pad: 1 311 | kernel_size: 3 312 | weight_filler { 313 | type: "gaussian" 314 | std: 0.01 315 | } 316 | bias_filler { 317 | type: "constant" 318 | } 319 | } 320 | } 321 | layer { 322 | name: "relu4_1" 323 | type: "ReLU" 324 | bottom: "conv4_1" 325 | top: "conv4_1" 326 | } 327 | layer { 328 | name: "conv4_2" 329 | type: "Convolution" 330 | bottom: "conv4_1" 331 | top: "conv4_2" 332 | param { 333 | lr_mult: 1.0 334 | decay_mult: 1 335 | } 336 | param { 337 | lr_mult: 2.0 338 | decay_mult: 0 339 | } 340 | convolution_param { 341 | num_output: 512 342 | pad: 1 343 | kernel_size: 3 344 | weight_filler { 345 | type: "gaussian" 346 | std: 0.01 347 | } 348 | bias_filler { 349 | type: "constant" 350 | } 351 | } 352 | } 353 | layer { 354 | name: "relu4_2" 355 | type: "ReLU" 356 | bottom: "conv4_2" 357 | top: "conv4_2" 358 | } 359 | layer { 360 | name: "conv4_3_CPM" 361 | type: "Convolution" 362 | bottom: "conv4_2" 363 | top: "conv4_3_CPM" 364 | param { 365 | lr_mult: 1.0 366 | decay_mult: 1 367 | } 368 | param { 369 | lr_mult: 2.0 370 | decay_mult: 0 371 | } 372 | convolution_param { 373 | num_output: 256 374 | pad: 1 375 | kernel_size: 3 376 | weight_filler { 377 | type: "gaussian" 378 | std: 0.01 379 | } 380 | bias_filler { 381 | type: "constant" 382 | } 383 | } 384 | } 385 | layer { 386 | name: "relu4_3_CPM" 387 | type: "ReLU" 388 | bottom: "conv4_3_CPM" 389 | top: "conv4_3_CPM" 390 | } 391 | layer { 392 | name: "conv4_4_CPM" 393 | type: "Convolution" 394 | bottom: "conv4_3_CPM" 395 | top: "conv4_4_CPM" 396 | param { 397 | lr_mult: 1.0 398 | decay_mult: 1 399 | } 400 | param { 401 | lr_mult: 2.0 402 | decay_mult: 0 403 | } 404 | convolution_param { 405 | num_output: 128 406 | pad: 1 407 | kernel_size: 3 408 | weight_filler { 409 | type: "gaussian" 410 | std: 0.01 411 | } 412 | bias_filler { 413 | type: "constant" 414 | } 415 | } 416 | } 417 | layer { 418 | name: "relu4_4_CPM" 419 | type: "ReLU" 420 | bottom: "conv4_4_CPM" 421 | top: "conv4_4_CPM" 422 | } 423 | layer { 424 | name: "conv5_1_CPM_L1" 425 | type: "Convolution" 426 | bottom: "conv4_4_CPM" 427 | top: "conv5_1_CPM_L1" 428 | param { 429 | lr_mult: 1.0 430 | decay_mult: 1 431 | } 432 | param { 433 | lr_mult: 2.0 434 | decay_mult: 0 435 | } 436 | convolution_param { 437 | num_output: 128 438 | pad: 1 439 | kernel_size: 3 440 | weight_filler { 441 | type: "gaussian" 442 | std: 0.01 443 | } 444 | bias_filler { 445 | type: "constant" 446 | } 447 | } 448 | } 449 | layer { 450 | name: "relu5_1_CPM_L1" 451 | type: "ReLU" 452 | bottom: "conv5_1_CPM_L1" 453 | top: "conv5_1_CPM_L1" 454 | } 455 | layer { 456 | name: "conv5_1_CPM_L2" 457 | type: "Convolution" 458 | bottom: "conv4_4_CPM" 459 | top: "conv5_1_CPM_L2" 460 | param { 461 | lr_mult: 1.0 462 | decay_mult: 1 463 | } 464 | param { 465 | lr_mult: 2.0 466 | decay_mult: 0 467 | } 468 | convolution_param { 469 | num_output: 128 470 | pad: 1 471 | kernel_size: 3 472 | weight_filler { 473 | type: "gaussian" 474 | std: 0.01 475 | } 476 | bias_filler { 477 | type: "constant" 478 | } 479 | } 480 | } 481 | layer { 482 | name: "relu5_1_CPM_L2" 483 | type: "ReLU" 484 | bottom: 
"conv5_1_CPM_L2" 485 | top: "conv5_1_CPM_L2" 486 | } 487 | layer { 488 | name: "conv5_2_CPM_L1" 489 | type: "Convolution" 490 | bottom: "conv5_1_CPM_L1" 491 | top: "conv5_2_CPM_L1" 492 | param { 493 | lr_mult: 1.0 494 | decay_mult: 1 495 | } 496 | param { 497 | lr_mult: 2.0 498 | decay_mult: 0 499 | } 500 | convolution_param { 501 | num_output: 128 502 | pad: 1 503 | kernel_size: 3 504 | weight_filler { 505 | type: "gaussian" 506 | std: 0.01 507 | } 508 | bias_filler { 509 | type: "constant" 510 | } 511 | } 512 | } 513 | layer { 514 | name: "relu5_2_CPM_L1" 515 | type: "ReLU" 516 | bottom: "conv5_2_CPM_L1" 517 | top: "conv5_2_CPM_L1" 518 | } 519 | layer { 520 | name: "conv5_2_CPM_L2" 521 | type: "Convolution" 522 | bottom: "conv5_1_CPM_L2" 523 | top: "conv5_2_CPM_L2" 524 | param { 525 | lr_mult: 1.0 526 | decay_mult: 1 527 | } 528 | param { 529 | lr_mult: 2.0 530 | decay_mult: 0 531 | } 532 | convolution_param { 533 | num_output: 128 534 | pad: 1 535 | kernel_size: 3 536 | weight_filler { 537 | type: "gaussian" 538 | std: 0.01 539 | } 540 | bias_filler { 541 | type: "constant" 542 | } 543 | } 544 | } 545 | layer { 546 | name: "relu5_2_CPM_L2" 547 | type: "ReLU" 548 | bottom: "conv5_2_CPM_L2" 549 | top: "conv5_2_CPM_L2" 550 | } 551 | layer { 552 | name: "conv5_3_CPM_L1" 553 | type: "Convolution" 554 | bottom: "conv5_2_CPM_L1" 555 | top: "conv5_3_CPM_L1" 556 | param { 557 | lr_mult: 1.0 558 | decay_mult: 1 559 | } 560 | param { 561 | lr_mult: 2.0 562 | decay_mult: 0 563 | } 564 | convolution_param { 565 | num_output: 128 566 | pad: 1 567 | kernel_size: 3 568 | weight_filler { 569 | type: "gaussian" 570 | std: 0.01 571 | } 572 | bias_filler { 573 | type: "constant" 574 | } 575 | } 576 | } 577 | layer { 578 | name: "relu5_3_CPM_L1" 579 | type: "ReLU" 580 | bottom: "conv5_3_CPM_L1" 581 | top: "conv5_3_CPM_L1" 582 | } 583 | layer { 584 | name: "conv5_3_CPM_L2" 585 | type: "Convolution" 586 | bottom: "conv5_2_CPM_L2" 587 | top: "conv5_3_CPM_L2" 588 | param { 589 | lr_mult: 1.0 590 | decay_mult: 1 591 | } 592 | param { 593 | lr_mult: 2.0 594 | decay_mult: 0 595 | } 596 | convolution_param { 597 | num_output: 128 598 | pad: 1 599 | kernel_size: 3 600 | weight_filler { 601 | type: "gaussian" 602 | std: 0.01 603 | } 604 | bias_filler { 605 | type: "constant" 606 | } 607 | } 608 | } 609 | layer { 610 | name: "relu5_3_CPM_L2" 611 | type: "ReLU" 612 | bottom: "conv5_3_CPM_L2" 613 | top: "conv5_3_CPM_L2" 614 | } 615 | layer { 616 | name: "conv5_4_CPM_L1" 617 | type: "Convolution" 618 | bottom: "conv5_3_CPM_L1" 619 | top: "conv5_4_CPM_L1" 620 | param { 621 | lr_mult: 1.0 622 | decay_mult: 1 623 | } 624 | param { 625 | lr_mult: 2.0 626 | decay_mult: 0 627 | } 628 | convolution_param { 629 | num_output: 512 630 | pad: 0 631 | kernel_size: 1 632 | weight_filler { 633 | type: "gaussian" 634 | std: 0.01 635 | } 636 | bias_filler { 637 | type: "constant" 638 | } 639 | } 640 | } 641 | layer { 642 | name: "relu5_4_CPM_L1" 643 | type: "ReLU" 644 | bottom: "conv5_4_CPM_L1" 645 | top: "conv5_4_CPM_L1" 646 | } 647 | layer { 648 | name: "conv5_4_CPM_L2" 649 | type: "Convolution" 650 | bottom: "conv5_3_CPM_L2" 651 | top: "conv5_4_CPM_L2" 652 | param { 653 | lr_mult: 1.0 654 | decay_mult: 1 655 | } 656 | param { 657 | lr_mult: 2.0 658 | decay_mult: 0 659 | } 660 | convolution_param { 661 | num_output: 512 662 | pad: 0 663 | kernel_size: 1 664 | weight_filler { 665 | type: "gaussian" 666 | std: 0.01 667 | } 668 | bias_filler { 669 | type: "constant" 670 | } 671 | } 672 | } 673 | layer { 674 | name: "relu5_4_CPM_L2" 675 | 
type: "ReLU" 676 | bottom: "conv5_4_CPM_L2" 677 | top: "conv5_4_CPM_L2" 678 | } 679 | layer { 680 | name: "conv5_5_CPM_L1" 681 | type: "Convolution" 682 | bottom: "conv5_4_CPM_L1" 683 | top: "conv5_5_CPM_L1" 684 | param { 685 | lr_mult: 1.0 686 | decay_mult: 1 687 | } 688 | param { 689 | lr_mult: 2.0 690 | decay_mult: 0 691 | } 692 | convolution_param { 693 | num_output: 28 694 | pad: 0 695 | kernel_size: 1 696 | weight_filler { 697 | type: "gaussian" 698 | std: 0.01 699 | } 700 | bias_filler { 701 | type: "constant" 702 | } 703 | } 704 | } 705 | layer { 706 | name: "conv5_5_CPM_L2" 707 | type: "Convolution" 708 | bottom: "conv5_4_CPM_L2" 709 | top: "conv5_5_CPM_L2" 710 | param { 711 | lr_mult: 1.0 712 | decay_mult: 1 713 | } 714 | param { 715 | lr_mult: 2.0 716 | decay_mult: 0 717 | } 718 | convolution_param { 719 | num_output: 16 720 | pad: 0 721 | kernel_size: 1 722 | weight_filler { 723 | type: "gaussian" 724 | std: 0.01 725 | } 726 | bias_filler { 727 | type: "constant" 728 | } 729 | } 730 | } 731 | layer { 732 | name: "concat_stage2" 733 | type: "Concat" 734 | bottom: "conv5_5_CPM_L1" 735 | bottom: "conv5_5_CPM_L2" 736 | bottom: "conv4_4_CPM" 737 | top: "concat_stage2" 738 | concat_param { 739 | axis: 1 740 | } 741 | } 742 | layer { 743 | name: "Mconv1_stage2_L1" 744 | type: "Convolution" 745 | bottom: "concat_stage2" 746 | top: "Mconv1_stage2_L1" 747 | param { 748 | lr_mult: 4.0 749 | decay_mult: 1 750 | } 751 | param { 752 | lr_mult: 8.0 753 | decay_mult: 0 754 | } 755 | convolution_param { 756 | num_output: 128 757 | pad: 3 758 | kernel_size: 7 759 | weight_filler { 760 | type: "gaussian" 761 | std: 0.01 762 | } 763 | bias_filler { 764 | type: "constant" 765 | } 766 | } 767 | } 768 | layer { 769 | name: "Mrelu1_stage2_L1" 770 | type: "ReLU" 771 | bottom: "Mconv1_stage2_L1" 772 | top: "Mconv1_stage2_L1" 773 | } 774 | layer { 775 | name: "Mconv1_stage2_L2" 776 | type: "Convolution" 777 | bottom: "concat_stage2" 778 | top: "Mconv1_stage2_L2" 779 | param { 780 | lr_mult: 4.0 781 | decay_mult: 1 782 | } 783 | param { 784 | lr_mult: 8.0 785 | decay_mult: 0 786 | } 787 | convolution_param { 788 | num_output: 128 789 | pad: 3 790 | kernel_size: 7 791 | weight_filler { 792 | type: "gaussian" 793 | std: 0.01 794 | } 795 | bias_filler { 796 | type: "constant" 797 | } 798 | } 799 | } 800 | layer { 801 | name: "Mrelu1_stage2_L2" 802 | type: "ReLU" 803 | bottom: "Mconv1_stage2_L2" 804 | top: "Mconv1_stage2_L2" 805 | } 806 | layer { 807 | name: "Mconv2_stage2_L1" 808 | type: "Convolution" 809 | bottom: "Mconv1_stage2_L1" 810 | top: "Mconv2_stage2_L1" 811 | param { 812 | lr_mult: 4.0 813 | decay_mult: 1 814 | } 815 | param { 816 | lr_mult: 8.0 817 | decay_mult: 0 818 | } 819 | convolution_param { 820 | num_output: 128 821 | pad: 3 822 | kernel_size: 7 823 | weight_filler { 824 | type: "gaussian" 825 | std: 0.01 826 | } 827 | bias_filler { 828 | type: "constant" 829 | } 830 | } 831 | } 832 | layer { 833 | name: "Mrelu2_stage2_L1" 834 | type: "ReLU" 835 | bottom: "Mconv2_stage2_L1" 836 | top: "Mconv2_stage2_L1" 837 | } 838 | layer { 839 | name: "Mconv2_stage2_L2" 840 | type: "Convolution" 841 | bottom: "Mconv1_stage2_L2" 842 | top: "Mconv2_stage2_L2" 843 | param { 844 | lr_mult: 4.0 845 | decay_mult: 1 846 | } 847 | param { 848 | lr_mult: 8.0 849 | decay_mult: 0 850 | } 851 | convolution_param { 852 | num_output: 128 853 | pad: 3 854 | kernel_size: 7 855 | weight_filler { 856 | type: "gaussian" 857 | std: 0.01 858 | } 859 | bias_filler { 860 | type: "constant" 861 | } 862 | } 863 | } 864 | layer { 
865 | name: "Mrelu2_stage2_L2" 866 | type: "ReLU" 867 | bottom: "Mconv2_stage2_L2" 868 | top: "Mconv2_stage2_L2" 869 | } 870 | layer { 871 | name: "Mconv3_stage2_L1" 872 | type: "Convolution" 873 | bottom: "Mconv2_stage2_L1" 874 | top: "Mconv3_stage2_L1" 875 | param { 876 | lr_mult: 4.0 877 | decay_mult: 1 878 | } 879 | param { 880 | lr_mult: 8.0 881 | decay_mult: 0 882 | } 883 | convolution_param { 884 | num_output: 128 885 | pad: 3 886 | kernel_size: 7 887 | weight_filler { 888 | type: "gaussian" 889 | std: 0.01 890 | } 891 | bias_filler { 892 | type: "constant" 893 | } 894 | } 895 | } 896 | layer { 897 | name: "Mrelu3_stage2_L1" 898 | type: "ReLU" 899 | bottom: "Mconv3_stage2_L1" 900 | top: "Mconv3_stage2_L1" 901 | } 902 | layer { 903 | name: "Mconv3_stage2_L2" 904 | type: "Convolution" 905 | bottom: "Mconv2_stage2_L2" 906 | top: "Mconv3_stage2_L2" 907 | param { 908 | lr_mult: 4.0 909 | decay_mult: 1 910 | } 911 | param { 912 | lr_mult: 8.0 913 | decay_mult: 0 914 | } 915 | convolution_param { 916 | num_output: 128 917 | pad: 3 918 | kernel_size: 7 919 | weight_filler { 920 | type: "gaussian" 921 | std: 0.01 922 | } 923 | bias_filler { 924 | type: "constant" 925 | } 926 | } 927 | } 928 | layer { 929 | name: "Mrelu3_stage2_L2" 930 | type: "ReLU" 931 | bottom: "Mconv3_stage2_L2" 932 | top: "Mconv3_stage2_L2" 933 | } 934 | layer { 935 | name: "Mconv4_stage2_L1" 936 | type: "Convolution" 937 | bottom: "Mconv3_stage2_L1" 938 | top: "Mconv4_stage2_L1" 939 | param { 940 | lr_mult: 4.0 941 | decay_mult: 1 942 | } 943 | param { 944 | lr_mult: 8.0 945 | decay_mult: 0 946 | } 947 | convolution_param { 948 | num_output: 128 949 | pad: 3 950 | kernel_size: 7 951 | weight_filler { 952 | type: "gaussian" 953 | std: 0.01 954 | } 955 | bias_filler { 956 | type: "constant" 957 | } 958 | } 959 | } 960 | layer { 961 | name: "Mrelu4_stage2_L1" 962 | type: "ReLU" 963 | bottom: "Mconv4_stage2_L1" 964 | top: "Mconv4_stage2_L1" 965 | } 966 | layer { 967 | name: "Mconv4_stage2_L2" 968 | type: "Convolution" 969 | bottom: "Mconv3_stage2_L2" 970 | top: "Mconv4_stage2_L2" 971 | param { 972 | lr_mult: 4.0 973 | decay_mult: 1 974 | } 975 | param { 976 | lr_mult: 8.0 977 | decay_mult: 0 978 | } 979 | convolution_param { 980 | num_output: 128 981 | pad: 3 982 | kernel_size: 7 983 | weight_filler { 984 | type: "gaussian" 985 | std: 0.01 986 | } 987 | bias_filler { 988 | type: "constant" 989 | } 990 | } 991 | } 992 | layer { 993 | name: "Mrelu4_stage2_L2" 994 | type: "ReLU" 995 | bottom: "Mconv4_stage2_L2" 996 | top: "Mconv4_stage2_L2" 997 | } 998 | layer { 999 | name: "Mconv5_stage2_L1" 1000 | type: "Convolution" 1001 | bottom: "Mconv4_stage2_L1" 1002 | top: "Mconv5_stage2_L1" 1003 | param { 1004 | lr_mult: 4.0 1005 | decay_mult: 1 1006 | } 1007 | param { 1008 | lr_mult: 8.0 1009 | decay_mult: 0 1010 | } 1011 | convolution_param { 1012 | num_output: 128 1013 | pad: 3 1014 | kernel_size: 7 1015 | weight_filler { 1016 | type: "gaussian" 1017 | std: 0.01 1018 | } 1019 | bias_filler { 1020 | type: "constant" 1021 | } 1022 | } 1023 | } 1024 | layer { 1025 | name: "Mrelu5_stage2_L1" 1026 | type: "ReLU" 1027 | bottom: "Mconv5_stage2_L1" 1028 | top: "Mconv5_stage2_L1" 1029 | } 1030 | layer { 1031 | name: "Mconv5_stage2_L2" 1032 | type: "Convolution" 1033 | bottom: "Mconv4_stage2_L2" 1034 | top: "Mconv5_stage2_L2" 1035 | param { 1036 | lr_mult: 4.0 1037 | decay_mult: 1 1038 | } 1039 | param { 1040 | lr_mult: 8.0 1041 | decay_mult: 0 1042 | } 1043 | convolution_param { 1044 | num_output: 128 1045 | pad: 3 1046 | kernel_size: 7 
1047 | weight_filler { 1048 | type: "gaussian" 1049 | std: 0.01 1050 | } 1051 | bias_filler { 1052 | type: "constant" 1053 | } 1054 | } 1055 | } 1056 | layer { 1057 | name: "Mrelu5_stage2_L2" 1058 | type: "ReLU" 1059 | bottom: "Mconv5_stage2_L2" 1060 | top: "Mconv5_stage2_L2" 1061 | } 1062 | layer { 1063 | name: "Mconv6_stage2_L1" 1064 | type: "Convolution" 1065 | bottom: "Mconv5_stage2_L1" 1066 | top: "Mconv6_stage2_L1" 1067 | param { 1068 | lr_mult: 4.0 1069 | decay_mult: 1 1070 | } 1071 | param { 1072 | lr_mult: 8.0 1073 | decay_mult: 0 1074 | } 1075 | convolution_param { 1076 | num_output: 128 1077 | pad: 0 1078 | kernel_size: 1 1079 | weight_filler { 1080 | type: "gaussian" 1081 | std: 0.01 1082 | } 1083 | bias_filler { 1084 | type: "constant" 1085 | } 1086 | } 1087 | } 1088 | layer { 1089 | name: "Mrelu6_stage2_L1" 1090 | type: "ReLU" 1091 | bottom: "Mconv6_stage2_L1" 1092 | top: "Mconv6_stage2_L1" 1093 | } 1094 | layer { 1095 | name: "Mconv6_stage2_L2" 1096 | type: "Convolution" 1097 | bottom: "Mconv5_stage2_L2" 1098 | top: "Mconv6_stage2_L2" 1099 | param { 1100 | lr_mult: 4.0 1101 | decay_mult: 1 1102 | } 1103 | param { 1104 | lr_mult: 8.0 1105 | decay_mult: 0 1106 | } 1107 | convolution_param { 1108 | num_output: 128 1109 | pad: 0 1110 | kernel_size: 1 1111 | weight_filler { 1112 | type: "gaussian" 1113 | std: 0.01 1114 | } 1115 | bias_filler { 1116 | type: "constant" 1117 | } 1118 | } 1119 | } 1120 | layer { 1121 | name: "Mrelu6_stage2_L2" 1122 | type: "ReLU" 1123 | bottom: "Mconv6_stage2_L2" 1124 | top: "Mconv6_stage2_L2" 1125 | } 1126 | layer { 1127 | name: "Mconv7_stage2_L1" 1128 | type: "Convolution" 1129 | bottom: "Mconv6_stage2_L1" 1130 | top: "Mconv7_stage2_L1" 1131 | param { 1132 | lr_mult: 4.0 1133 | decay_mult: 1 1134 | } 1135 | param { 1136 | lr_mult: 8.0 1137 | decay_mult: 0 1138 | } 1139 | convolution_param { 1140 | num_output: 28 1141 | pad: 0 1142 | kernel_size: 1 1143 | weight_filler { 1144 | type: "gaussian" 1145 | std: 0.01 1146 | } 1147 | bias_filler { 1148 | type: "constant" 1149 | } 1150 | } 1151 | } 1152 | layer { 1153 | name: "Mconv7_stage2_L2" 1154 | type: "Convolution" 1155 | bottom: "Mconv6_stage2_L2" 1156 | top: "Mconv7_stage2_L2" 1157 | param { 1158 | lr_mult: 4.0 1159 | decay_mult: 1 1160 | } 1161 | param { 1162 | lr_mult: 8.0 1163 | decay_mult: 0 1164 | } 1165 | convolution_param { 1166 | num_output: 16 1167 | pad: 0 1168 | kernel_size: 1 1169 | weight_filler { 1170 | type: "gaussian" 1171 | std: 0.01 1172 | } 1173 | bias_filler { 1174 | type: "constant" 1175 | } 1176 | } 1177 | } 1178 | layer { 1179 | name: "concat_stage3" 1180 | type: "Concat" 1181 | bottom: "Mconv7_stage2_L1" 1182 | bottom: "Mconv7_stage2_L2" 1183 | bottom: "conv4_4_CPM" 1184 | top: "concat_stage3" 1185 | concat_param { 1186 | axis: 1 1187 | } 1188 | } 1189 | layer { 1190 | name: "Mconv1_stage3_L1" 1191 | type: "Convolution" 1192 | bottom: "concat_stage3" 1193 | top: "Mconv1_stage3_L1" 1194 | param { 1195 | lr_mult: 4.0 1196 | decay_mult: 1 1197 | } 1198 | param { 1199 | lr_mult: 8.0 1200 | decay_mult: 0 1201 | } 1202 | convolution_param { 1203 | num_output: 128 1204 | pad: 3 1205 | kernel_size: 7 1206 | weight_filler { 1207 | type: "gaussian" 1208 | std: 0.01 1209 | } 1210 | bias_filler { 1211 | type: "constant" 1212 | } 1213 | } 1214 | } 1215 | layer { 1216 | name: "Mrelu1_stage3_L1" 1217 | type: "ReLU" 1218 | bottom: "Mconv1_stage3_L1" 1219 | top: "Mconv1_stage3_L1" 1220 | } 1221 | layer { 1222 | name: "Mconv1_stage3_L2" 1223 | type: "Convolution" 1224 | bottom: "concat_stage3" 
1225 | top: "Mconv1_stage3_L2" 1226 | param { 1227 | lr_mult: 4.0 1228 | decay_mult: 1 1229 | } 1230 | param { 1231 | lr_mult: 8.0 1232 | decay_mult: 0 1233 | } 1234 | convolution_param { 1235 | num_output: 128 1236 | pad: 3 1237 | kernel_size: 7 1238 | weight_filler { 1239 | type: "gaussian" 1240 | std: 0.01 1241 | } 1242 | bias_filler { 1243 | type: "constant" 1244 | } 1245 | } 1246 | } 1247 | layer { 1248 | name: "Mrelu1_stage3_L2" 1249 | type: "ReLU" 1250 | bottom: "Mconv1_stage3_L2" 1251 | top: "Mconv1_stage3_L2" 1252 | } 1253 | layer { 1254 | name: "Mconv2_stage3_L1" 1255 | type: "Convolution" 1256 | bottom: "Mconv1_stage3_L1" 1257 | top: "Mconv2_stage3_L1" 1258 | param { 1259 | lr_mult: 4.0 1260 | decay_mult: 1 1261 | } 1262 | param { 1263 | lr_mult: 8.0 1264 | decay_mult: 0 1265 | } 1266 | convolution_param { 1267 | num_output: 128 1268 | pad: 3 1269 | kernel_size: 7 1270 | weight_filler { 1271 | type: "gaussian" 1272 | std: 0.01 1273 | } 1274 | bias_filler { 1275 | type: "constant" 1276 | } 1277 | } 1278 | } 1279 | layer { 1280 | name: "Mrelu2_stage3_L1" 1281 | type: "ReLU" 1282 | bottom: "Mconv2_stage3_L1" 1283 | top: "Mconv2_stage3_L1" 1284 | } 1285 | layer { 1286 | name: "Mconv2_stage3_L2" 1287 | type: "Convolution" 1288 | bottom: "Mconv1_stage3_L2" 1289 | top: "Mconv2_stage3_L2" 1290 | param { 1291 | lr_mult: 4.0 1292 | decay_mult: 1 1293 | } 1294 | param { 1295 | lr_mult: 8.0 1296 | decay_mult: 0 1297 | } 1298 | convolution_param { 1299 | num_output: 128 1300 | pad: 3 1301 | kernel_size: 7 1302 | weight_filler { 1303 | type: "gaussian" 1304 | std: 0.01 1305 | } 1306 | bias_filler { 1307 | type: "constant" 1308 | } 1309 | } 1310 | } 1311 | layer { 1312 | name: "Mrelu2_stage3_L2" 1313 | type: "ReLU" 1314 | bottom: "Mconv2_stage3_L2" 1315 | top: "Mconv2_stage3_L2" 1316 | } 1317 | layer { 1318 | name: "Mconv3_stage3_L1" 1319 | type: "Convolution" 1320 | bottom: "Mconv2_stage3_L1" 1321 | top: "Mconv3_stage3_L1" 1322 | param { 1323 | lr_mult: 4.0 1324 | decay_mult: 1 1325 | } 1326 | param { 1327 | lr_mult: 8.0 1328 | decay_mult: 0 1329 | } 1330 | convolution_param { 1331 | num_output: 128 1332 | pad: 3 1333 | kernel_size: 7 1334 | weight_filler { 1335 | type: "gaussian" 1336 | std: 0.01 1337 | } 1338 | bias_filler { 1339 | type: "constant" 1340 | } 1341 | } 1342 | } 1343 | layer { 1344 | name: "Mrelu3_stage3_L1" 1345 | type: "ReLU" 1346 | bottom: "Mconv3_stage3_L1" 1347 | top: "Mconv3_stage3_L1" 1348 | } 1349 | layer { 1350 | name: "Mconv3_stage3_L2" 1351 | type: "Convolution" 1352 | bottom: "Mconv2_stage3_L2" 1353 | top: "Mconv3_stage3_L2" 1354 | param { 1355 | lr_mult: 4.0 1356 | decay_mult: 1 1357 | } 1358 | param { 1359 | lr_mult: 8.0 1360 | decay_mult: 0 1361 | } 1362 | convolution_param { 1363 | num_output: 128 1364 | pad: 3 1365 | kernel_size: 7 1366 | weight_filler { 1367 | type: "gaussian" 1368 | std: 0.01 1369 | } 1370 | bias_filler { 1371 | type: "constant" 1372 | } 1373 | } 1374 | } 1375 | layer { 1376 | name: "Mrelu3_stage3_L2" 1377 | type: "ReLU" 1378 | bottom: "Mconv3_stage3_L2" 1379 | top: "Mconv3_stage3_L2" 1380 | } 1381 | layer { 1382 | name: "Mconv4_stage3_L1" 1383 | type: "Convolution" 1384 | bottom: "Mconv3_stage3_L1" 1385 | top: "Mconv4_stage3_L1" 1386 | param { 1387 | lr_mult: 4.0 1388 | decay_mult: 1 1389 | } 1390 | param { 1391 | lr_mult: 8.0 1392 | decay_mult: 0 1393 | } 1394 | convolution_param { 1395 | num_output: 128 1396 | pad: 3 1397 | kernel_size: 7 1398 | weight_filler { 1399 | type: "gaussian" 1400 | std: 0.01 1401 | } 1402 | bias_filler { 1403 | 
type: "constant" 1404 | } 1405 | } 1406 | } 1407 | layer { 1408 | name: "Mrelu4_stage3_L1" 1409 | type: "ReLU" 1410 | bottom: "Mconv4_stage3_L1" 1411 | top: "Mconv4_stage3_L1" 1412 | } 1413 | layer { 1414 | name: "Mconv4_stage3_L2" 1415 | type: "Convolution" 1416 | bottom: "Mconv3_stage3_L2" 1417 | top: "Mconv4_stage3_L2" 1418 | param { 1419 | lr_mult: 4.0 1420 | decay_mult: 1 1421 | } 1422 | param { 1423 | lr_mult: 8.0 1424 | decay_mult: 0 1425 | } 1426 | convolution_param { 1427 | num_output: 128 1428 | pad: 3 1429 | kernel_size: 7 1430 | weight_filler { 1431 | type: "gaussian" 1432 | std: 0.01 1433 | } 1434 | bias_filler { 1435 | type: "constant" 1436 | } 1437 | } 1438 | } 1439 | layer { 1440 | name: "Mrelu4_stage3_L2" 1441 | type: "ReLU" 1442 | bottom: "Mconv4_stage3_L2" 1443 | top: "Mconv4_stage3_L2" 1444 | } 1445 | layer { 1446 | name: "Mconv5_stage3_L1" 1447 | type: "Convolution" 1448 | bottom: "Mconv4_stage3_L1" 1449 | top: "Mconv5_stage3_L1" 1450 | param { 1451 | lr_mult: 4.0 1452 | decay_mult: 1 1453 | } 1454 | param { 1455 | lr_mult: 8.0 1456 | decay_mult: 0 1457 | } 1458 | convolution_param { 1459 | num_output: 128 1460 | pad: 3 1461 | kernel_size: 7 1462 | weight_filler { 1463 | type: "gaussian" 1464 | std: 0.01 1465 | } 1466 | bias_filler { 1467 | type: "constant" 1468 | } 1469 | } 1470 | } 1471 | layer { 1472 | name: "Mrelu5_stage3_L1" 1473 | type: "ReLU" 1474 | bottom: "Mconv5_stage3_L1" 1475 | top: "Mconv5_stage3_L1" 1476 | } 1477 | layer { 1478 | name: "Mconv5_stage3_L2" 1479 | type: "Convolution" 1480 | bottom: "Mconv4_stage3_L2" 1481 | top: "Mconv5_stage3_L2" 1482 | param { 1483 | lr_mult: 4.0 1484 | decay_mult: 1 1485 | } 1486 | param { 1487 | lr_mult: 8.0 1488 | decay_mult: 0 1489 | } 1490 | convolution_param { 1491 | num_output: 128 1492 | pad: 3 1493 | kernel_size: 7 1494 | weight_filler { 1495 | type: "gaussian" 1496 | std: 0.01 1497 | } 1498 | bias_filler { 1499 | type: "constant" 1500 | } 1501 | } 1502 | } 1503 | layer { 1504 | name: "Mrelu5_stage3_L2" 1505 | type: "ReLU" 1506 | bottom: "Mconv5_stage3_L2" 1507 | top: "Mconv5_stage3_L2" 1508 | } 1509 | layer { 1510 | name: "Mconv6_stage3_L1" 1511 | type: "Convolution" 1512 | bottom: "Mconv5_stage3_L1" 1513 | top: "Mconv6_stage3_L1" 1514 | param { 1515 | lr_mult: 4.0 1516 | decay_mult: 1 1517 | } 1518 | param { 1519 | lr_mult: 8.0 1520 | decay_mult: 0 1521 | } 1522 | convolution_param { 1523 | num_output: 128 1524 | pad: 0 1525 | kernel_size: 1 1526 | weight_filler { 1527 | type: "gaussian" 1528 | std: 0.01 1529 | } 1530 | bias_filler { 1531 | type: "constant" 1532 | } 1533 | } 1534 | } 1535 | layer { 1536 | name: "Mrelu6_stage3_L1" 1537 | type: "ReLU" 1538 | bottom: "Mconv6_stage3_L1" 1539 | top: "Mconv6_stage3_L1" 1540 | } 1541 | layer { 1542 | name: "Mconv6_stage3_L2" 1543 | type: "Convolution" 1544 | bottom: "Mconv5_stage3_L2" 1545 | top: "Mconv6_stage3_L2" 1546 | param { 1547 | lr_mult: 4.0 1548 | decay_mult: 1 1549 | } 1550 | param { 1551 | lr_mult: 8.0 1552 | decay_mult: 0 1553 | } 1554 | convolution_param { 1555 | num_output: 128 1556 | pad: 0 1557 | kernel_size: 1 1558 | weight_filler { 1559 | type: "gaussian" 1560 | std: 0.01 1561 | } 1562 | bias_filler { 1563 | type: "constant" 1564 | } 1565 | } 1566 | } 1567 | layer { 1568 | name: "Mrelu6_stage3_L2" 1569 | type: "ReLU" 1570 | bottom: "Mconv6_stage3_L2" 1571 | top: "Mconv6_stage3_L2" 1572 | } 1573 | layer { 1574 | name: "Mconv7_stage3_L1" 1575 | type: "Convolution" 1576 | bottom: "Mconv6_stage3_L1" 1577 | top: "Mconv7_stage3_L1" 1578 | param { 1579 | 
lr_mult: 4.0 1580 | decay_mult: 1 1581 | } 1582 | param { 1583 | lr_mult: 8.0 1584 | decay_mult: 0 1585 | } 1586 | convolution_param { 1587 | num_output: 28 1588 | pad: 0 1589 | kernel_size: 1 1590 | weight_filler { 1591 | type: "gaussian" 1592 | std: 0.01 1593 | } 1594 | bias_filler { 1595 | type: "constant" 1596 | } 1597 | } 1598 | } 1599 | layer { 1600 | name: "Mconv7_stage3_L2" 1601 | type: "Convolution" 1602 | bottom: "Mconv6_stage3_L2" 1603 | top: "Mconv7_stage3_L2" 1604 | param { 1605 | lr_mult: 4.0 1606 | decay_mult: 1 1607 | } 1608 | param { 1609 | lr_mult: 8.0 1610 | decay_mult: 0 1611 | } 1612 | convolution_param { 1613 | num_output: 16 1614 | pad: 0 1615 | kernel_size: 1 1616 | weight_filler { 1617 | type: "gaussian" 1618 | std: 0.01 1619 | } 1620 | bias_filler { 1621 | type: "constant" 1622 | } 1623 | } 1624 | } 1625 | layer { 1626 | name: "concat_stage4" 1627 | type: "Concat" 1628 | bottom: "Mconv7_stage3_L1" 1629 | bottom: "Mconv7_stage3_L2" 1630 | bottom: "conv4_4_CPM" 1631 | top: "concat_stage4" 1632 | concat_param { 1633 | axis: 1 1634 | } 1635 | } 1636 | layer { 1637 | name: "Mconv1_stage4_L1" 1638 | type: "Convolution" 1639 | bottom: "concat_stage4" 1640 | top: "Mconv1_stage4_L1" 1641 | param { 1642 | lr_mult: 4.0 1643 | decay_mult: 1 1644 | } 1645 | param { 1646 | lr_mult: 8.0 1647 | decay_mult: 0 1648 | } 1649 | convolution_param { 1650 | num_output: 128 1651 | pad: 3 1652 | kernel_size: 7 1653 | weight_filler { 1654 | type: "gaussian" 1655 | std: 0.01 1656 | } 1657 | bias_filler { 1658 | type: "constant" 1659 | } 1660 | } 1661 | } 1662 | layer { 1663 | name: "Mrelu1_stage4_L1" 1664 | type: "ReLU" 1665 | bottom: "Mconv1_stage4_L1" 1666 | top: "Mconv1_stage4_L1" 1667 | } 1668 | layer { 1669 | name: "Mconv1_stage4_L2" 1670 | type: "Convolution" 1671 | bottom: "concat_stage4" 1672 | top: "Mconv1_stage4_L2" 1673 | param { 1674 | lr_mult: 4.0 1675 | decay_mult: 1 1676 | } 1677 | param { 1678 | lr_mult: 8.0 1679 | decay_mult: 0 1680 | } 1681 | convolution_param { 1682 | num_output: 128 1683 | pad: 3 1684 | kernel_size: 7 1685 | weight_filler { 1686 | type: "gaussian" 1687 | std: 0.01 1688 | } 1689 | bias_filler { 1690 | type: "constant" 1691 | } 1692 | } 1693 | } 1694 | layer { 1695 | name: "Mrelu1_stage4_L2" 1696 | type: "ReLU" 1697 | bottom: "Mconv1_stage4_L2" 1698 | top: "Mconv1_stage4_L2" 1699 | } 1700 | layer { 1701 | name: "Mconv2_stage4_L1" 1702 | type: "Convolution" 1703 | bottom: "Mconv1_stage4_L1" 1704 | top: "Mconv2_stage4_L1" 1705 | param { 1706 | lr_mult: 4.0 1707 | decay_mult: 1 1708 | } 1709 | param { 1710 | lr_mult: 8.0 1711 | decay_mult: 0 1712 | } 1713 | convolution_param { 1714 | num_output: 128 1715 | pad: 3 1716 | kernel_size: 7 1717 | weight_filler { 1718 | type: "gaussian" 1719 | std: 0.01 1720 | } 1721 | bias_filler { 1722 | type: "constant" 1723 | } 1724 | } 1725 | } 1726 | layer { 1727 | name: "Mrelu2_stage4_L1" 1728 | type: "ReLU" 1729 | bottom: "Mconv2_stage4_L1" 1730 | top: "Mconv2_stage4_L1" 1731 | } 1732 | layer { 1733 | name: "Mconv2_stage4_L2" 1734 | type: "Convolution" 1735 | bottom: "Mconv1_stage4_L2" 1736 | top: "Mconv2_stage4_L2" 1737 | param { 1738 | lr_mult: 4.0 1739 | decay_mult: 1 1740 | } 1741 | param { 1742 | lr_mult: 8.0 1743 | decay_mult: 0 1744 | } 1745 | convolution_param { 1746 | num_output: 128 1747 | pad: 3 1748 | kernel_size: 7 1749 | weight_filler { 1750 | type: "gaussian" 1751 | std: 0.01 1752 | } 1753 | bias_filler { 1754 | type: "constant" 1755 | } 1756 | } 1757 | } 1758 | layer { 1759 | name: "Mrelu2_stage4_L2" 1760 | 
type: "ReLU" 1761 | bottom: "Mconv2_stage4_L2" 1762 | top: "Mconv2_stage4_L2" 1763 | } 1764 | layer { 1765 | name: "Mconv3_stage4_L1" 1766 | type: "Convolution" 1767 | bottom: "Mconv2_stage4_L1" 1768 | top: "Mconv3_stage4_L1" 1769 | param { 1770 | lr_mult: 4.0 1771 | decay_mult: 1 1772 | } 1773 | param { 1774 | lr_mult: 8.0 1775 | decay_mult: 0 1776 | } 1777 | convolution_param { 1778 | num_output: 128 1779 | pad: 3 1780 | kernel_size: 7 1781 | weight_filler { 1782 | type: "gaussian" 1783 | std: 0.01 1784 | } 1785 | bias_filler { 1786 | type: "constant" 1787 | } 1788 | } 1789 | } 1790 | layer { 1791 | name: "Mrelu3_stage4_L1" 1792 | type: "ReLU" 1793 | bottom: "Mconv3_stage4_L1" 1794 | top: "Mconv3_stage4_L1" 1795 | } 1796 | layer { 1797 | name: "Mconv3_stage4_L2" 1798 | type: "Convolution" 1799 | bottom: "Mconv2_stage4_L2" 1800 | top: "Mconv3_stage4_L2" 1801 | param { 1802 | lr_mult: 4.0 1803 | decay_mult: 1 1804 | } 1805 | param { 1806 | lr_mult: 8.0 1807 | decay_mult: 0 1808 | } 1809 | convolution_param { 1810 | num_output: 128 1811 | pad: 3 1812 | kernel_size: 7 1813 | weight_filler { 1814 | type: "gaussian" 1815 | std: 0.01 1816 | } 1817 | bias_filler { 1818 | type: "constant" 1819 | } 1820 | } 1821 | } 1822 | layer { 1823 | name: "Mrelu3_stage4_L2" 1824 | type: "ReLU" 1825 | bottom: "Mconv3_stage4_L2" 1826 | top: "Mconv3_stage4_L2" 1827 | } 1828 | layer { 1829 | name: "Mconv4_stage4_L1" 1830 | type: "Convolution" 1831 | bottom: "Mconv3_stage4_L1" 1832 | top: "Mconv4_stage4_L1" 1833 | param { 1834 | lr_mult: 4.0 1835 | decay_mult: 1 1836 | } 1837 | param { 1838 | lr_mult: 8.0 1839 | decay_mult: 0 1840 | } 1841 | convolution_param { 1842 | num_output: 128 1843 | pad: 3 1844 | kernel_size: 7 1845 | weight_filler { 1846 | type: "gaussian" 1847 | std: 0.01 1848 | } 1849 | bias_filler { 1850 | type: "constant" 1851 | } 1852 | } 1853 | } 1854 | layer { 1855 | name: "Mrelu4_stage4_L1" 1856 | type: "ReLU" 1857 | bottom: "Mconv4_stage4_L1" 1858 | top: "Mconv4_stage4_L1" 1859 | } 1860 | layer { 1861 | name: "Mconv4_stage4_L2" 1862 | type: "Convolution" 1863 | bottom: "Mconv3_stage4_L2" 1864 | top: "Mconv4_stage4_L2" 1865 | param { 1866 | lr_mult: 4.0 1867 | decay_mult: 1 1868 | } 1869 | param { 1870 | lr_mult: 8.0 1871 | decay_mult: 0 1872 | } 1873 | convolution_param { 1874 | num_output: 128 1875 | pad: 3 1876 | kernel_size: 7 1877 | weight_filler { 1878 | type: "gaussian" 1879 | std: 0.01 1880 | } 1881 | bias_filler { 1882 | type: "constant" 1883 | } 1884 | } 1885 | } 1886 | layer { 1887 | name: "Mrelu4_stage4_L2" 1888 | type: "ReLU" 1889 | bottom: "Mconv4_stage4_L2" 1890 | top: "Mconv4_stage4_L2" 1891 | } 1892 | layer { 1893 | name: "Mconv5_stage4_L1" 1894 | type: "Convolution" 1895 | bottom: "Mconv4_stage4_L1" 1896 | top: "Mconv5_stage4_L1" 1897 | param { 1898 | lr_mult: 4.0 1899 | decay_mult: 1 1900 | } 1901 | param { 1902 | lr_mult: 8.0 1903 | decay_mult: 0 1904 | } 1905 | convolution_param { 1906 | num_output: 128 1907 | pad: 3 1908 | kernel_size: 7 1909 | weight_filler { 1910 | type: "gaussian" 1911 | std: 0.01 1912 | } 1913 | bias_filler { 1914 | type: "constant" 1915 | } 1916 | } 1917 | } 1918 | layer { 1919 | name: "Mrelu5_stage4_L1" 1920 | type: "ReLU" 1921 | bottom: "Mconv5_stage4_L1" 1922 | top: "Mconv5_stage4_L1" 1923 | } 1924 | layer { 1925 | name: "Mconv5_stage4_L2" 1926 | type: "Convolution" 1927 | bottom: "Mconv4_stage4_L2" 1928 | top: "Mconv5_stage4_L2" 1929 | param { 1930 | lr_mult: 4.0 1931 | decay_mult: 1 1932 | } 1933 | param { 1934 | lr_mult: 8.0 1935 | decay_mult: 0 
1936 | } 1937 | convolution_param { 1938 | num_output: 128 1939 | pad: 3 1940 | kernel_size: 7 1941 | weight_filler { 1942 | type: "gaussian" 1943 | std: 0.01 1944 | } 1945 | bias_filler { 1946 | type: "constant" 1947 | } 1948 | } 1949 | } 1950 | layer { 1951 | name: "Mrelu5_stage4_L2" 1952 | type: "ReLU" 1953 | bottom: "Mconv5_stage4_L2" 1954 | top: "Mconv5_stage4_L2" 1955 | } 1956 | layer { 1957 | name: "Mconv6_stage4_L1" 1958 | type: "Convolution" 1959 | bottom: "Mconv5_stage4_L1" 1960 | top: "Mconv6_stage4_L1" 1961 | param { 1962 | lr_mult: 4.0 1963 | decay_mult: 1 1964 | } 1965 | param { 1966 | lr_mult: 8.0 1967 | decay_mult: 0 1968 | } 1969 | convolution_param { 1970 | num_output: 128 1971 | pad: 0 1972 | kernel_size: 1 1973 | weight_filler { 1974 | type: "gaussian" 1975 | std: 0.01 1976 | } 1977 | bias_filler { 1978 | type: "constant" 1979 | } 1980 | } 1981 | } 1982 | layer { 1983 | name: "Mrelu6_stage4_L1" 1984 | type: "ReLU" 1985 | bottom: "Mconv6_stage4_L1" 1986 | top: "Mconv6_stage4_L1" 1987 | } 1988 | layer { 1989 | name: "Mconv6_stage4_L2" 1990 | type: "Convolution" 1991 | bottom: "Mconv5_stage4_L2" 1992 | top: "Mconv6_stage4_L2" 1993 | param { 1994 | lr_mult: 4.0 1995 | decay_mult: 1 1996 | } 1997 | param { 1998 | lr_mult: 8.0 1999 | decay_mult: 0 2000 | } 2001 | convolution_param { 2002 | num_output: 128 2003 | pad: 0 2004 | kernel_size: 1 2005 | weight_filler { 2006 | type: "gaussian" 2007 | std: 0.01 2008 | } 2009 | bias_filler { 2010 | type: "constant" 2011 | } 2012 | } 2013 | } 2014 | layer { 2015 | name: "Mrelu6_stage4_L2" 2016 | type: "ReLU" 2017 | bottom: "Mconv6_stage4_L2" 2018 | top: "Mconv6_stage4_L2" 2019 | } 2020 | layer { 2021 | name: "Mconv7_stage4_L1" 2022 | type: "Convolution" 2023 | bottom: "Mconv6_stage4_L1" 2024 | top: "Mconv7_stage4_L1" 2025 | param { 2026 | lr_mult: 4.0 2027 | decay_mult: 1 2028 | } 2029 | param { 2030 | lr_mult: 8.0 2031 | decay_mult: 0 2032 | } 2033 | convolution_param { 2034 | num_output: 28 2035 | pad: 0 2036 | kernel_size: 1 2037 | weight_filler { 2038 | type: "gaussian" 2039 | std: 0.01 2040 | } 2041 | bias_filler { 2042 | type: "constant" 2043 | } 2044 | } 2045 | } 2046 | layer { 2047 | name: "Mconv7_stage4_L2" 2048 | type: "Convolution" 2049 | bottom: "Mconv6_stage4_L2" 2050 | top: "Mconv7_stage4_L2" 2051 | param { 2052 | lr_mult: 4.0 2053 | decay_mult: 1 2054 | } 2055 | param { 2056 | lr_mult: 8.0 2057 | decay_mult: 0 2058 | } 2059 | convolution_param { 2060 | num_output: 16 2061 | pad: 0 2062 | kernel_size: 1 2063 | weight_filler { 2064 | type: "gaussian" 2065 | std: 0.01 2066 | } 2067 | bias_filler { 2068 | type: "constant" 2069 | } 2070 | } 2071 | } 2072 | layer { 2073 | name: "concat_stage7" 2074 | type: "Concat" 2075 | bottom: "Mconv7_stage4_L2" 2076 | bottom: "Mconv7_stage4_L1" 2077 | top: "net_output" 2078 | concat_param { 2079 | axis: 1 2080 | } 2081 | } 2082 | -------------------------------------------------------------------------------- /pose-estimator-using-caffemodel/test_out/README.md: -------------------------------------------------------------------------------- 1 | This directory is for saving the post-processed outputs.
2 | -------------------------------------------------------------------------------- /test_out/CSBDGS_out.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LZQthePlane/OpenPose-Rebuilt-Python/d7c927ec2d3dd724a8822bb58c3c6a936450eb14/test_out/CSBDGS_out.jpg -------------------------------------------------------------------------------- /test_out/Friends_out.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LZQthePlane/OpenPose-Rebuilt-Python/d7c927ec2d3dd724a8822bb58c3c6a936450eb14/test_out/Friends_out.jpg -------------------------------------------------------------------------------- /test_out/Output-BODY_25.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LZQthePlane/OpenPose-Rebuilt-Python/d7c927ec2d3dd724a8822bb58c3c6a936450eb14/test_out/Output-BODY_25.jpg -------------------------------------------------------------------------------- /test_out/Output-COCO.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LZQthePlane/OpenPose-Rebuilt-Python/d7c927ec2d3dd724a8822bb58c3c6a936450eb14/test_out/Output-COCO.jpg -------------------------------------------------------------------------------- /test_out/Output-MPI.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LZQthePlane/OpenPose-Rebuilt-Python/d7c927ec2d3dd724a8822bb58c3c6a936450eb14/test_out/Output-MPI.jpg -------------------------------------------------------------------------------- /test_out/TBBT_out.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LZQthePlane/OpenPose-Rebuilt-Python/d7c927ec2d3dd724a8822bb58c3c6a936450eb14/test_out/TBBT_out.jpg -------------------------------------------------------------------------------- /test_out/WLWZ_out.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LZQthePlane/OpenPose-Rebuilt-Python/d7c927ec2d3dd724a8822bb58c3c6a936450eb14/test_out/WLWZ_out.jpg -------------------------------------------------------------------------------- /test_out/webcam_out.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LZQthePlane/OpenPose-Rebuilt-Python/d7c927ec2d3dd724a8822bb58c3c6a936450eb14/test_out/webcam_out.gif --------------------------------------------------------------------------------
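The MPI prototxt above ends with `concat_stage7`, which stacks the two stage-4 branches along the channel axis into a single 44-channel blob named `net_output`: `Mconv7_stage4_L2` comes first (16 channels: 15 MPI keypoint heatmaps plus one background map), followed by `Mconv7_stage4_L1` (28 channels: 14 limbs × 2 part-affinity-field components). Below is a minimal sketch (not part of the repository) of loading this network with OpenCV's DNN module and splitting the output; the weights filename `pose_mpi.caffemodel` and the input image `test.jpg` are placeholders, so substitute the MPI `.caffemodel` you downloaded. For the complete pipeline, see the scripts under `pose-estimator-using-caffemodel/`.

```python
import cv2

PROTO = "pose-estimator-using-caffemodel/model/mpi/pose_deploy_linevec_faster_4_stages.prototxt"
WEIGHTS = "pose-estimator-using-caffemodel/model/mpi/pose_mpi.caffemodel"  # placeholder filename

net = cv2.dnn.readNetFromCaffe(PROTO, WEIGHTS)

img = cv2.imread("test.jpg")  # placeholder input
h, w = img.shape[:2]
# OpenPose Caffe models take BGR input scaled to [0, 1] with no mean subtraction.
blob = cv2.dnn.blobFromImage(img, 1.0 / 255, (368, 368), (0, 0, 0),
                             swapRB=False, crop=False)
net.setInput(blob)
out = net.forward()      # shape (1, 44, H', W') -- the "net_output" of concat_stage7

heatmaps = out[0, :16]   # L2 branch: 15 MPI keypoint heatmaps + 1 background map
pafs = out[0, 16:]       # L1 branch: 14 limbs x 2 vector components

# Example: locate the strongest response of keypoint 0 (the head, in the MPI layout)
# and map it from network-output coordinates back to image coordinates.
_, conf, _, loc = cv2.minMaxLoc(heatmaps[0])
x = int(loc[0] * w / out.shape[3])
y = int(loc[1] * h / out.shape[2])
print(f"head: ({x}, {y})  confidence={conf:.2f}")
```

Note the channel order: because `concat_stage7` lists the `L2` bottom before the `L1` bottom, the confidence heatmaps occupy the first 16 channels of `net_output` and the part-affinity fields the remaining 28, so the slicing above must follow that order.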