├── .gitignore
├── README.md
├── demo.py
├── demo_camera.py
├── demo_video.py
├── images
    ├── body_preview.jpg
    ├── body_preview_estimation.jpg
    ├── body_preview_keypoints.jpg
    ├── demo.jpg
    ├── demo_preview.png
    ├── detect_hand_preview.jpg
    ├── hand.jpg
    ├── hand_preview.png
    ├── hand_preview_estimation.png
    ├── kc-e129SBb4-sample.processed.gif
    ├── keypoints_hand.png
    ├── keypoints_pose_18.png
    ├── skeleton.jpg
    ├── ski.jpg
    └── yOAmYSW3WyU-sample.small.processed.gif
├── model
    └── .gitkeep
├── notebooks
    ├── detectHand.ipynb
    ├── hand.ipynb
    └── network_graph.ipynb
├── requirements.txt
└── src
    ├── __init__.py
    ├── body.py
    ├── hand.py
    ├── hand_model_output_size.json
    ├── hand_model_outputsize.py
    ├── model.py
    └── util.py

/.gitignore:
--------------------------------------------------------------------------------
1 | hiddenlayer/
2 | *.pth
3 | *.caffemodel
4 | .idea/
5 | .ipynb_checkpoints/
6 | __pycache__
7 | *.prototxt
8 | videos/
9 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | ## pytorch-openpose
2 | 
3 | A pytorch implementation of [openpose](https://github.com/CMU-Perceptual-Computing-Lab/openpose), including **Body and Hand Pose Estimation**. The pytorch models are directly converted from the [openpose](https://github.com/CMU-Perceptual-Computing-Lab/openpose) caffemodel with [caffemodel2pytorch](https://github.com/vadimkantorov/caffemodel2pytorch). You could implement face keypoint detection in the same way if you are interested. Note that the face keypoint detector was trained using the procedure described in [Simon et al. 2017] for hands.
4 | 
5 | openpose detects hands from the result of body pose estimation; please refer to the code in [handDetector.cpp](https://github.com/CMU-Perceptual-Computing-Lab/openpose/blob/master/src/openpose/hand/handDetector.cpp).
6 | The paper describes it as follows:
7 | ```
8 | This is an important detail: to use the keypoint detector in any practical situation,
9 | we need a way to generate this bounding box.
10 | We directly use the body pose estimation models from [29] and [4],
11 | and use the wrist and elbow position to approximate the hand location,
12 | assuming the hand extends 0.15 times the length of the forearm in the same direction.
13 | ```
14 | 
15 | If you want a pure python wrapper, please refer to my [pytorch implementation](https://github.com/Hzzone/pytorch-openpose) of openpose; it may help you implement a standalone hand keypoint detector.
16 | 
17 | Please star this repo if it helps your research.
18 | 
19 | ### Getting Started
20 | 
21 | #### Install Requirements
22 | 
23 | Create a python 3.7 environment, e.g.:
24 | 
25 |     conda create -n pytorch-openpose python=3.7
26 |     conda activate pytorch-openpose
27 | 
28 | Install pytorch by following the quick start guide here (use pip): https://download.pytorch.org/whl/torch_stable.html
29 | 
30 | Install the other requirements with pip:
31 | 
32 |     pip install -r requirements.txt
33 | 
34 | #### Download the Models
35 | 
36 | * [dropbox](https://www.dropbox.com/sh/7xbup2qsn7vvjxo/AABWFksdlgOMXR_r5v3RwKRYa?dl=0)
37 | * [baiduyun](https://pan.baidu.com/s/1IlkvuSi0ocNckwbnUe7j-g)
38 | * [google drive](https://drive.google.com/drive/folders/1JsvI4M4ZTg98fmnCZLFM-3TeovnCRElG?usp=sharing)
39 | 
40 | The `*.pth` files are the pytorch models; you can also download the caffemodel files if you want to use caffe as the backend.
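The demo scripts load `model/body_pose_model.pth` and `model/hand_pose_model.pth`, so after downloading, the project should contain:

    model/
    ├── body_pose_model.pth
    └── hand_pose_model.pth

(The caffemodel files are only needed if you use the caffe backend.)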
41 | 
42 | Download the pytorch models and put them in a directory named `model` in the project root directory.
43 | 
44 | #### Run the Demo
45 | 
46 | Run:
47 | 
48 |     python demo_camera.py
49 | 
50 | to run a demo with a feed from your webcam, or run
51 | 
52 |     python demo.py
53 | 
54 | to use an image from the images folder, or run
55 | 
56 |     python demo_video.py
57 | 
58 | to process a video file (requires [ffmpeg-python][ffmpeg]).
59 | 
60 | [ffmpeg]: https://pypi.org/project/ffmpeg-python/
61 | 
62 | ### Todo list
63 | - [x] convert caffemodel to pytorch.
64 | - [x] Body Pose Estimation.
65 | - [x] Hand Pose Estimation.
66 | - [ ] Performance test.
67 | - [ ] Speed up.
68 | 
69 | ### Demo
70 | #### Skeleton
71 | 
72 | ![](images/skeleton.jpg)
73 | #### Body Pose Estimation
74 | 
75 | ![](images/body_preview.jpg)
76 | 
77 | #### Hand Pose Estimation
78 | ![](images/hand_preview.png)
79 | 
80 | #### Body + Hand
81 | ![](images/demo_preview.png)
82 | 
83 | #### Video Body
84 | 
85 | ![](images/kc-e129SBb4-sample.processed.gif)
86 | 
87 | Attribution: [this video](https://www.youtube.com/watch?v=kc-e129SBb4).
88 | 
89 | #### Video Hand
90 | 
91 | ![](images/yOAmYSW3WyU-sample.small.processed.gif)
92 | 
93 | Attribution: [this video](https://www.youtube.com/watch?v=yOAmYSW3WyU).
94 | 
95 | ### Citation
96 | Please cite these papers in your publications if this repo helps your research (the face keypoint detector was trained using the procedure described in [Simon et al. 2017] for hands):
97 | 
98 | ```
99 | @inproceedings{cao2017realtime,
100 |   author = {Zhe Cao and Tomas Simon and Shih-En Wei and Yaser Sheikh},
101 |   booktitle = {CVPR},
102 |   title = {Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields},
103 |   year = {2017}
104 | }
105 | 
106 | @inproceedings{simon2017hand,
107 |   author = {Tomas Simon and Hanbyul Joo and Iain Matthews and Yaser Sheikh},
108 |   booktitle = {CVPR},
109 |   title = {Hand Keypoint Detection in Single Images using Multiview Bootstrapping},
110 |   year = {2017}
111 | }
112 | 
113 | @inproceedings{wei2016cpm,
114 |   author = {Shih-En Wei and Varun Ramakrishna and Takeo Kanade and Yaser Sheikh},
115 |   booktitle = {CVPR},
116 |   title = {Convolutional Pose Machines},
117 |   year = {2016}
118 | }
119 | ```
120 | 
--------------------------------------------------------------------------------
/demo.py:
--------------------------------------------------------------------------------
1 | import cv2
2 | import matplotlib.pyplot as plt
3 | import copy
4 | import numpy as np
5 | 
6 | from src import model
7 | from src import util
8 | from src.body import Body
9 | from src.hand import Hand
10 | 
11 | body_estimation = Body('model/body_pose_model.pth')
12 | hand_estimation = Hand('model/hand_pose_model.pth')
13 | 
14 | test_image = 'images/demo.jpg'
15 | oriImg = cv2.imread(test_image)  # B,G,R order
16 | candidate, subset = body_estimation(oriImg)
17 | canvas = copy.deepcopy(oriImg)
18 | canvas = util.draw_bodypose(canvas, candidate, subset)
19 | # detect hand
20 | hands_list = util.handDetect(candidate, subset, oriImg)
21 | 
22 | all_hand_peaks = []
23 | for x, y, w, is_left in hands_list:
24 |     # cv2.rectangle(canvas, (x, y), (x+w, y+w), (0, 255, 0), 2, lineType=cv2.LINE_AA)
25 |     # cv2.putText(canvas, 'left' if is_left else 'right', (x, y), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2)
26 | 
27 |     # if is_left:
28 |     #     plt.imshow(oriImg[y:y+w, x:x+w, :][:, :, [2, 1, 0]])
29 |     #     plt.show()
30 |     peaks = hand_estimation(oriImg[y:y+w, x:x+w, :])
31 |     peaks[:, 0] = np.where(peaks[:, 0]==0, peaks[:, 0], peaks[:, 0]+x)
32 |     peaks[:, 1] = np.where(peaks[:, 1]==0, peaks[:, 1], peaks[:, 1]+y)
33 |     # else:
34 |     #     peaks = hand_estimation(cv2.flip(oriImg[y:y+w, x:x+w, :], 1))
35 |     #     peaks[:, 0] = np.where(peaks[:, 0]==0, peaks[:, 0], w-peaks[:, 0]-1+x)
36 |     #     peaks[:, 1] = np.where(peaks[:, 1]==0, peaks[:, 1], peaks[:, 1]+y)
37 |     # print(peaks)
38 |     all_hand_peaks.append(peaks)
39 | 
40 | canvas = util.draw_handpose(canvas, all_hand_peaks)
41 | 
42 | plt.imshow(canvas[:, :, [2, 1, 0]])
43 | plt.axis('off')
44 | plt.show()
45 | 
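# Note on the interfaces used above (as documented in src/body.py and src/hand.py):
# body_estimation(img) returns (candidate, subset): candidate rows are [x, y, score, id],
# and subset has one row per detected person, where columns 0-17 index into candidate,
# column 18 is the total score and column 19 the number of detected parts.
# hand_estimation(crop) returns a 21 x 2 array of [x, y] peaks in crop coordinates, with
# [0, 0] for keypoints that were not found; that is why the np.where calls above add the
# x/y offsets only to non-zero peaks when mapping them back into the full image.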
--------------------------------------------------------------------------------
/demo_camera.py:
--------------------------------------------------------------------------------
1 | import cv2
2 | import matplotlib.pyplot as plt
3 | import copy
4 | import numpy as np
5 | import torch
6 | 
7 | from src import model
8 | from src import util
9 | from src.body import Body
10 | from src.hand import Hand
11 | 
12 | body_estimation = Body('model/body_pose_model.pth')
13 | hand_estimation = Hand('model/hand_pose_model.pth')
14 | 
15 | print(f"Torch device: {torch.cuda.get_device_name() if torch.cuda.is_available() else 'cpu'}")
16 | 
17 | cap = cv2.VideoCapture(0)
18 | cap.set(3, 640)
19 | cap.set(4, 480)
20 | while True:
21 |     ret, oriImg = cap.read()
22 |     candidate, subset = body_estimation(oriImg)
23 |     canvas = copy.deepcopy(oriImg)
24 |     canvas = util.draw_bodypose(canvas, candidate, subset)
25 | 
26 |     # detect hand
27 |     hands_list = util.handDetect(candidate, subset, oriImg)
28 | 
29 |     all_hand_peaks = []
30 |     for x, y, w, is_left in hands_list:
31 |         peaks = hand_estimation(oriImg[y:y+w, x:x+w, :])
32 |         peaks[:, 0] = np.where(peaks[:, 0]==0, peaks[:, 0], peaks[:, 0]+x)
33 |         peaks[:, 1] = np.where(peaks[:, 1]==0, peaks[:, 1], peaks[:, 1]+y)
34 |         all_hand_peaks.append(peaks)
35 | 
36 |     canvas = util.draw_handpose(canvas, all_hand_peaks)
37 | 
38 |     cv2.imshow('demo', canvas)  # a window to display the original video
39 |     if cv2.waitKey(1) & 0xFF == ord('q'):
40 |         break
41 | 
42 | cap.release()
43 | cv2.destroyAllWindows()
44 | 
45 | 
--------------------------------------------------------------------------------
/demo_video.py:
--------------------------------------------------------------------------------
1 | import copy
2 | import numpy as np
3 | import cv2
4 | from glob import glob
5 | import os
6 | import argparse
7 | import json
8 | 
9 | # video file processing setup
10 | # from: https://stackoverflow.com/a/61927951
11 | import argparse
12 | import subprocess
13 | import sys
14 | from pathlib import Path
15 | from typing import NamedTuple
16 | 
17 | 
18 | class FFProbeResult(NamedTuple):
19 |     return_code: int
20 |     json: str
21 |     error: str
22 | 
23 | 
24 | def ffprobe(file_path) -> FFProbeResult:
25 |     command_array = ["ffprobe",
26 |                      "-v", "quiet",
27 |                      "-print_format", "json",
28 |                      "-show_format",
29 |                      "-show_streams",
30 |                      file_path]
31 |     result = subprocess.run(command_array, stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)
32 |     return FFProbeResult(return_code=result.returncode,
33 |                          json=result.stdout,
34 |                          error=result.stderr)
35 | 
36 | 
37 | # openpose setup
38 | from src import model
39 | from src import util
40 | from src.body import Body
41 | from src.hand import Hand
42 | 
43 | body_estimation = Body('model/body_pose_model.pth')
44 | hand_estimation = Hand('model/hand_pose_model.pth')
45 | 
46 | def process_frame(frame, body=True, hands=True):
47 |     canvas = copy.deepcopy(frame)
48 |     if body:
49 |         candidate, subset = body_estimation(frame)
50 |         canvas = util.draw_bodypose(canvas, candidate, subset)
51 |     if hands:
52 |         hands_list = util.handDetect(candidate, subset, frame)
53 |         all_hand_peaks = []
54 |         for x, y, w, is_left in hands_list:
55 |             peaks = hand_estimation(frame[y:y+w, x:x+w, :])
56 |             peaks[:, 0] = np.where(peaks[:, 0]==0, peaks[:, 0], peaks[:, 0]+x)
57 |             peaks[:, 1] = np.where(peaks[:, 1]==0, peaks[:, 1], peaks[:, 1]+y)
58 |             all_hand_peaks.append(peaks)
59 |         canvas = util.draw_handpose(canvas, all_hand_peaks)
60 |     return canvas
61 | 
62 | # writing video with ffmpeg because cv2 writer failed
63 | # https://stackoverflow.com/questions/61036822/opencv-videowriter-produces-cant-find-starting-number-error
64 | import ffmpeg
65 | 
66 | # open specified video
67 | parser = argparse.ArgumentParser(
68 |     description="Process a video annotating poses detected.")
69 | parser.add_argument('file', type=str, help='Video file location to process.')
70 | parser.add_argument('--no_hands', action='store_true', help='No hand pose')
71 | parser.add_argument('--no_body', action='store_true', help='No body pose')
72 | args = parser.parse_args()
73 | video_file = args.file
74 | cap = cv2.VideoCapture(video_file)
75 | 
76 | # get video file info
77 | ffprobe_result = ffprobe(args.file)
78 | info = json.loads(ffprobe_result.json)
79 | videoinfo = [i for i in info["streams"] if i["codec_type"] == "video"][0]
80 | input_fps = videoinfo["avg_frame_rate"]
81 | # input_fps = float(input_fps[0])/float(input_fps[1])
82 | input_pix_fmt = videoinfo["pix_fmt"]
83 | input_vcodec = videoinfo["codec_name"]
84 | 
85 | # define a writer object to write to a modified file
86 | postfix = info["format"]["format_name"].split(",")[0]
87 | output_file = ".".join(video_file.split(".")[:-1])+".processed." + postfix
88 | 
89 | 
90 | class Writer():
91 |     def __init__(self, output_file, input_fps, input_framesize, input_pix_fmt,
92 |                  input_vcodec):
93 |         if os.path.exists(output_file):
94 |             os.remove(output_file)
95 |         self.ff_proc = (
96 |             ffmpeg
97 |             .input('pipe:',
98 |                    format='rawvideo',
99 |                    pix_fmt="bgr24",
100 |                    s='%sx%s'%(input_framesize[1],input_framesize[0]),
101 |                    r=input_fps)
102 |             .output(output_file, pix_fmt=input_pix_fmt, vcodec=input_vcodec)
103 |             .overwrite_output()
104 |             .run_async(pipe_stdin=True)
105 |         )
106 | 
107 |     def __call__(self, frame):
108 |         self.ff_proc.stdin.write(frame.tobytes())
109 | 
110 |     def close(self):
111 |         self.ff_proc.stdin.close()
112 |         self.ff_proc.wait()
113 | 
114 | 
115 | writer = None
116 | while(cap.isOpened()):
117 |     ret, frame = cap.read()
118 |     if frame is None:
119 |         break
120 | 
121 |     posed_frame = process_frame(frame, body=not args.no_body,
122 |                                 hands=not args.no_hands)
123 | 
124 |     if writer is None:
125 |         input_framesize = posed_frame.shape[:2]
126 |         writer = Writer(output_file, input_fps, input_framesize, input_pix_fmt,
127 |                         input_vcodec)
128 | 
129 |     cv2.imshow('frame', posed_frame)
130 | 
131 |     # write the frame
132 |     writer(posed_frame)
133 | 
134 |     if cv2.waitKey(1) & 0xFF == ord('q'):
135 |         break
136 | 
137 | cap.release()
138 | writer.close()
139 | cv2.destroyAllWindows()
140 | 
--------------------------------------------------------------------------------
/images/body_preview.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Hzzone/pytorch-openpose/5ee71dc10020403dc3def2bb68f9b77c40337ae2/images/body_preview.jpg
--------------------------------------------------------------------------------
/images/body_preview_estimation.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Hzzone/pytorch-openpose/5ee71dc10020403dc3def2bb68f9b77c40337ae2/images/body_preview_estimation.jpg -------------------------------------------------------------------------------- /images/body_preview_keypoints.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Hzzone/pytorch-openpose/5ee71dc10020403dc3def2bb68f9b77c40337ae2/images/body_preview_keypoints.jpg -------------------------------------------------------------------------------- /images/demo.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Hzzone/pytorch-openpose/5ee71dc10020403dc3def2bb68f9b77c40337ae2/images/demo.jpg -------------------------------------------------------------------------------- /images/demo_preview.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Hzzone/pytorch-openpose/5ee71dc10020403dc3def2bb68f9b77c40337ae2/images/demo_preview.png -------------------------------------------------------------------------------- /images/detect_hand_preview.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Hzzone/pytorch-openpose/5ee71dc10020403dc3def2bb68f9b77c40337ae2/images/detect_hand_preview.jpg -------------------------------------------------------------------------------- /images/hand.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Hzzone/pytorch-openpose/5ee71dc10020403dc3def2bb68f9b77c40337ae2/images/hand.jpg -------------------------------------------------------------------------------- /images/hand_preview.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Hzzone/pytorch-openpose/5ee71dc10020403dc3def2bb68f9b77c40337ae2/images/hand_preview.png -------------------------------------------------------------------------------- /images/hand_preview_estimation.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Hzzone/pytorch-openpose/5ee71dc10020403dc3def2bb68f9b77c40337ae2/images/hand_preview_estimation.png -------------------------------------------------------------------------------- /images/kc-e129SBb4-sample.processed.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Hzzone/pytorch-openpose/5ee71dc10020403dc3def2bb68f9b77c40337ae2/images/kc-e129SBb4-sample.processed.gif -------------------------------------------------------------------------------- /images/keypoints_hand.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Hzzone/pytorch-openpose/5ee71dc10020403dc3def2bb68f9b77c40337ae2/images/keypoints_hand.png -------------------------------------------------------------------------------- /images/keypoints_pose_18.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Hzzone/pytorch-openpose/5ee71dc10020403dc3def2bb68f9b77c40337ae2/images/keypoints_pose_18.png -------------------------------------------------------------------------------- /images/skeleton.jpg: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/Hzzone/pytorch-openpose/5ee71dc10020403dc3def2bb68f9b77c40337ae2/images/skeleton.jpg -------------------------------------------------------------------------------- /images/ski.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Hzzone/pytorch-openpose/5ee71dc10020403dc3def2bb68f9b77c40337ae2/images/ski.jpg -------------------------------------------------------------------------------- /images/yOAmYSW3WyU-sample.small.processed.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Hzzone/pytorch-openpose/5ee71dc10020403dc3def2bb68f9b77c40337ae2/images/yOAmYSW3WyU-sample.small.processed.gif -------------------------------------------------------------------------------- /model/.gitkeep: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Hzzone/pytorch-openpose/5ee71dc10020403dc3def2bb68f9b77c40337ae2/model/.gitkeep -------------------------------------------------------------------------------- /notebooks/network_graph.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "import sys\n", 10 | "sys.path.insert(0, '../python')\n", 11 | "sys.path.insert(0, '../')" 12 | ] 13 | }, 14 | { 15 | "cell_type": "code", 16 | "execution_count": 2, 17 | "metadata": {}, 18 | "outputs": [], 19 | "source": [ 20 | "import torch\n", 21 | "import torchvision.models\n", 22 | "import hiddenlayer as hl\n", 23 | "from model import bodypose_model, handpose_model" 24 | ] 25 | }, 26 | { 27 | "cell_type": "code", 28 | "execution_count": 3, 29 | "metadata": { 30 | "scrolled": false 31 | }, 32 | "outputs": [ 33 | { 34 | "data": { 35 | "image/svg+xml": [ 36 | "\n", 37 | "\n", 39 | "\n", 41 | "\n", 42 | "\n", 44 | "\n", 45 | "%3\n", 46 | "\n", 47 | "\n", 48 | "\n", 49 | "bodypose_model/Sequential[model0]/MaxPool2d[pool1_stage1]/outputs/189\n", 50 | "\n", 51 | "MaxPool2x2\n", 52 | "\n", 53 | "\n", 54 | "\n", 55 | "13064966811616914696\n", 56 | "\n", 57 | "Conv3x3 > Relu\n", 58 | "x2\n", 59 | "\n", 60 | "\n", 61 | "\n", 62 | "bodypose_model/Sequential[model0]/MaxPool2d[pool1_stage1]/outputs/189->13064966811616914696\n", 63 | "\n", 64 | "\n", 65 | "\n", 66 | "\n", 67 | "\n", 68 | "bodypose_model/Sequential[model0]/MaxPool2d[pool2_stage1]/outputs/194\n", 69 | "\n", 70 | "MaxPool2x2\n", 71 | "\n", 72 | "\n", 73 | "\n", 74 | "15125386261721430011\n", 75 | "\n", 76 | "Conv3x3 > Relu\n", 77 | "x4\n", 78 | "\n", 79 | "\n", 80 | "\n", 81 | "bodypose_model/Sequential[model0]/MaxPool2d[pool2_stage1]/outputs/194->15125386261721430011\n", 82 | "\n", 83 | "\n", 84 | "\n", 85 | "\n", 86 | "\n", 87 | "bodypose_model/Sequential[model0]/MaxPool2d[pool3_stage1]/outputs/203\n", 88 | "\n", 89 | "MaxPool2x2\n", 90 | "\n", 91 | "\n", 92 | "\n", 93 | "3416934003633053932\n", 94 | "\n", 95 | "Conv3x3 > Relu\n", 96 | "x4\n", 97 | "\n", 98 | "\n", 99 | "\n", 100 | "bodypose_model/Sequential[model0]/MaxPool2d[pool3_stage1]/outputs/203->3416934003633053932\n", 101 | "\n", 102 | "\n", 103 | "\n", 104 | "\n", 105 | "\n", 106 | "bodypose_model/Sequential[model1_1]/Conv2d[conv5_5_CPM_L1]/outputs/220\n", 107 | "\n", 108 | "Conv1x1\n", 109 | "\n", 110 | "\n", 111 | "\n", 112 | "bodypose_model/outputs/230\n", 113 | "\n", 114 | "Concat\n", 115 | "\n", 116 | 
"\n", 117 | "\n", 118 | "bodypose_model/Sequential[model1_1]/Conv2d[conv5_5_CPM_L1]/outputs/220->bodypose_model/outputs/230\n", 119 | "\n", 120 | "\n", 121 | "1x38x46x46\n", 122 | "\n", 123 | "\n", 124 | "\n", 125 | "bodypose_model/Sequential[model1_2]/Conv2d[conv5_5_CPM_L2]/outputs/229\n", 126 | "\n", 127 | "Conv1x1\n", 128 | "\n", 129 | "\n", 130 | "\n", 131 | "bodypose_model/Sequential[model1_2]/Conv2d[conv5_5_CPM_L2]/outputs/229->bodypose_model/outputs/230\n", 132 | "\n", 133 | "\n", 134 | "1x19x46x46\n", 135 | "\n", 136 | "\n", 137 | "\n", 138 | "14230139936353591237\n", 139 | "\n", 140 | "Conv7x7 > Relu\n", 141 | "x6\n", 142 | "\n", 143 | "\n", 144 | "\n", 145 | "bodypose_model/outputs/230->14230139936353591237\n", 146 | "\n", 147 | "\n", 148 | "\n", 149 | "\n", 150 | "\n", 151 | "5665934897048359346\n", 152 | "\n", 153 | "Conv7x7 > Relu\n", 154 | "x6\n", 155 | "\n", 156 | "\n", 157 | "\n", 158 | "bodypose_model/outputs/230->5665934897048359346\n", 159 | "\n", 160 | "\n", 161 | "\n", 162 | "\n", 163 | "\n", 164 | "bodypose_model/Sequential[model2_1]/Conv2d[Mconv7_stage2_L1]/outputs/243\n", 165 | "\n", 166 | "Conv1x1\n", 167 | "\n", 168 | "\n", 169 | "\n", 170 | "bodypose_model/outputs/257\n", 171 | "\n", 172 | "Concat\n", 173 | "\n", 174 | "\n", 175 | "\n", 176 | "bodypose_model/Sequential[model2_1]/Conv2d[Mconv7_stage2_L1]/outputs/243->bodypose_model/outputs/257\n", 177 | "\n", 178 | "\n", 179 | "1x38x46x46\n", 180 | "\n", 181 | "\n", 182 | "\n", 183 | "bodypose_model/Sequential[model2_2]/Conv2d[Mconv7_stage2_L2]/outputs/256\n", 184 | "\n", 185 | "Conv1x1\n", 186 | "\n", 187 | "\n", 188 | "\n", 189 | "bodypose_model/Sequential[model2_2]/Conv2d[Mconv7_stage2_L2]/outputs/256->bodypose_model/outputs/257\n", 190 | "\n", 191 | "\n", 192 | "1x19x46x46\n", 193 | "\n", 194 | "\n", 195 | "\n", 196 | "4916951163406574479\n", 197 | "\n", 198 | "Conv7x7 > Relu\n", 199 | "x6\n", 200 | "\n", 201 | "\n", 202 | "\n", 203 | "bodypose_model/outputs/257->4916951163406574479\n", 204 | "\n", 205 | "\n", 206 | "\n", 207 | "\n", 208 | "\n", 209 | "10021701631470090739\n", 210 | "\n", 211 | "Conv7x7 > Relu\n", 212 | "x6\n", 213 | "\n", 214 | "\n", 215 | "\n", 216 | "bodypose_model/outputs/257->10021701631470090739\n", 217 | "\n", 218 | "\n", 219 | "\n", 220 | "\n", 221 | "\n", 222 | "bodypose_model/Sequential[model3_1]/Conv2d[Mconv7_stage3_L1]/outputs/270\n", 223 | "\n", 224 | "Conv1x1\n", 225 | "\n", 226 | "\n", 227 | "\n", 228 | "bodypose_model/outputs/284\n", 229 | "\n", 230 | "Concat\n", 231 | "\n", 232 | "\n", 233 | "\n", 234 | "bodypose_model/Sequential[model3_1]/Conv2d[Mconv7_stage3_L1]/outputs/270->bodypose_model/outputs/284\n", 235 | "\n", 236 | "\n", 237 | "1x38x46x46\n", 238 | "\n", 239 | "\n", 240 | "\n", 241 | "bodypose_model/Sequential[model3_2]/Conv2d[Mconv7_stage3_L2]/outputs/283\n", 242 | "\n", 243 | "Conv1x1\n", 244 | "\n", 245 | "\n", 246 | "\n", 247 | "bodypose_model/Sequential[model3_2]/Conv2d[Mconv7_stage3_L2]/outputs/283->bodypose_model/outputs/284\n", 248 | "\n", 249 | "\n", 250 | "1x19x46x46\n", 251 | "\n", 252 | "\n", 253 | "\n", 254 | "10019260897339890291\n", 255 | "\n", 256 | "Conv7x7 > Relu\n", 257 | "x6\n", 258 | "\n", 259 | "\n", 260 | "\n", 261 | "bodypose_model/outputs/284->10019260897339890291\n", 262 | "\n", 263 | "\n", 264 | "\n", 265 | "\n", 266 | "\n", 267 | "14236905943657278176\n", 268 | "\n", 269 | "Conv7x7 > Relu\n", 270 | "x6\n", 271 | "\n", 272 | "\n", 273 | "\n", 274 | "bodypose_model/outputs/284->14236905943657278176\n", 275 | "\n", 276 | "\n", 277 | "\n", 278 
| "\n", 279 | "\n", 280 | "bodypose_model/Sequential[model4_1]/Conv2d[Mconv7_stage4_L1]/outputs/297\n", 281 | "\n", 282 | "Conv1x1\n", 283 | "\n", 284 | "\n", 285 | "\n", 286 | "bodypose_model/outputs/311\n", 287 | "\n", 288 | "Concat\n", 289 | "\n", 290 | "\n", 291 | "\n", 292 | "bodypose_model/Sequential[model4_1]/Conv2d[Mconv7_stage4_L1]/outputs/297->bodypose_model/outputs/311\n", 293 | "\n", 294 | "\n", 295 | "1x38x46x46\n", 296 | "\n", 297 | "\n", 298 | "\n", 299 | "bodypose_model/Sequential[model4_2]/Conv2d[Mconv7_stage4_L2]/outputs/310\n", 300 | "\n", 301 | "Conv1x1\n", 302 | "\n", 303 | "\n", 304 | "\n", 305 | "bodypose_model/Sequential[model4_2]/Conv2d[Mconv7_stage4_L2]/outputs/310->bodypose_model/outputs/311\n", 306 | "\n", 307 | "\n", 308 | "1x19x46x46\n", 309 | "\n", 310 | "\n", 311 | "\n", 312 | "6940965854023558923\n", 313 | "\n", 314 | "Conv7x7 > Relu\n", 315 | "x6\n", 316 | "\n", 317 | "\n", 318 | "\n", 319 | "bodypose_model/outputs/311->6940965854023558923\n", 320 | "\n", 321 | "\n", 322 | "\n", 323 | "\n", 324 | "\n", 325 | "8536199470902917928\n", 326 | "\n", 327 | "Conv7x7 > Relu\n", 328 | "x6\n", 329 | "\n", 330 | "\n", 331 | "\n", 332 | "bodypose_model/outputs/311->8536199470902917928\n", 333 | "\n", 334 | "\n", 335 | "\n", 336 | "\n", 337 | "\n", 338 | "bodypose_model/Sequential[model5_1]/Conv2d[Mconv7_stage5_L1]/outputs/324\n", 339 | "\n", 340 | "Conv1x1\n", 341 | "\n", 342 | "\n", 343 | "\n", 344 | "bodypose_model/outputs/338\n", 345 | "\n", 346 | "Concat\n", 347 | "\n", 348 | "\n", 349 | "\n", 350 | "bodypose_model/Sequential[model5_1]/Conv2d[Mconv7_stage5_L1]/outputs/324->bodypose_model/outputs/338\n", 351 | "\n", 352 | "\n", 353 | "1x38x46x46\n", 354 | "\n", 355 | "\n", 356 | "\n", 357 | "bodypose_model/Sequential[model5_2]/Conv2d[Mconv7_stage5_L2]/outputs/337\n", 358 | "\n", 359 | "Conv1x1\n", 360 | "\n", 361 | "\n", 362 | "\n", 363 | "bodypose_model/Sequential[model5_2]/Conv2d[Mconv7_stage5_L2]/outputs/337->bodypose_model/outputs/338\n", 364 | "\n", 365 | "\n", 366 | "1x19x46x46\n", 367 | "\n", 368 | "\n", 369 | "\n", 370 | "9469636319838519469\n", 371 | "\n", 372 | "Conv7x7 > Relu\n", 373 | "x6\n", 374 | "\n", 375 | "\n", 376 | "\n", 377 | "bodypose_model/outputs/338->9469636319838519469\n", 378 | "\n", 379 | "\n", 380 | "\n", 381 | "\n", 382 | "\n", 383 | "3442820789049449491\n", 384 | "\n", 385 | "Conv7x7 > Relu\n", 386 | "x7\n", 387 | "\n", 388 | "\n", 389 | "\n", 390 | "bodypose_model/outputs/338->3442820789049449491\n", 391 | "\n", 392 | "\n", 393 | "\n", 394 | "\n", 395 | "\n", 396 | "bodypose_model/Sequential[model6_1]/Conv2d[Mconv7_stage6_L1]/outputs/351\n", 397 | "\n", 398 | "Conv1x1\n", 399 | "\n", 400 | "\n", 401 | "\n", 402 | "1274573194035219541\n", 403 | "\n", 404 | "Conv3x3 > Relu\n", 405 | "x2\n", 406 | "\n", 407 | "\n", 408 | "\n", 409 | "1274573194035219541->bodypose_model/Sequential[model0]/MaxPool2d[pool1_stage1]/outputs/189\n", 410 | "\n", 411 | "\n", 412 | "\n", 413 | "\n", 414 | "\n", 415 | "13064966811616914696->bodypose_model/Sequential[model0]/MaxPool2d[pool2_stage1]/outputs/194\n", 416 | "\n", 417 | "\n", 418 | "\n", 419 | "\n", 420 | "\n", 421 | "15125386261721430011->bodypose_model/Sequential[model0]/MaxPool2d[pool3_stage1]/outputs/203\n", 422 | "\n", 423 | "\n", 424 | "\n", 425 | "\n", 426 | "\n", 427 | "3416934003633053932->bodypose_model/outputs/230\n", 428 | "\n", 429 | "\n", 430 | "\n", 431 | "\n", 432 | "\n", 433 | "3416934003633053932->bodypose_model/outputs/257\n", 434 | "\n", 435 | "\n", 436 | "\n", 437 | "\n", 438 | 
"\n", 439 | "3416934003633053932->bodypose_model/outputs/284\n", 440 | "\n", 441 | "\n", 442 | "\n", 443 | "\n", 444 | "\n", 445 | "3416934003633053932->bodypose_model/outputs/311\n", 446 | "\n", 447 | "\n", 448 | "\n", 449 | "\n", 450 | "\n", 451 | "3416934003633053932->bodypose_model/outputs/338\n", 452 | "\n", 453 | "\n", 454 | "\n", 455 | "\n", 456 | "\n", 457 | "8144613976419209978\n", 458 | "\n", 459 | "Conv3x3 > Relu\n", 460 | "x4\n", 461 | "\n", 462 | "\n", 463 | "\n", 464 | "3416934003633053932->8144613976419209978\n", 465 | "\n", 466 | "\n", 467 | "\n", 468 | "\n", 469 | "\n", 470 | "14481921719748847651\n", 471 | "\n", 472 | "Conv3x3 > Relu\n", 473 | "x4\n", 474 | "\n", 475 | "\n", 476 | "\n", 477 | "3416934003633053932->14481921719748847651\n", 478 | "\n", 479 | "\n", 480 | "\n", 481 | "\n", 482 | "\n", 483 | "8144613976419209978->bodypose_model/Sequential[model1_1]/Conv2d[conv5_5_CPM_L1]/outputs/220\n", 484 | "\n", 485 | "\n", 486 | "\n", 487 | "\n", 488 | "\n", 489 | "14481921719748847651->bodypose_model/Sequential[model1_2]/Conv2d[conv5_5_CPM_L2]/outputs/229\n", 490 | "\n", 491 | "\n", 492 | "\n", 493 | "\n", 494 | "\n", 495 | "14230139936353591237->bodypose_model/Sequential[model2_1]/Conv2d[Mconv7_stage2_L1]/outputs/243\n", 496 | "\n", 497 | "\n", 498 | "\n", 499 | "\n", 500 | "\n", 501 | "5665934897048359346->bodypose_model/Sequential[model2_2]/Conv2d[Mconv7_stage2_L2]/outputs/256\n", 502 | "\n", 503 | "\n", 504 | "\n", 505 | "\n", 506 | "\n", 507 | "4916951163406574479->bodypose_model/Sequential[model3_1]/Conv2d[Mconv7_stage3_L1]/outputs/270\n", 508 | "\n", 509 | "\n", 510 | "\n", 511 | "\n", 512 | "\n", 513 | "10021701631470090739->bodypose_model/Sequential[model3_2]/Conv2d[Mconv7_stage3_L2]/outputs/283\n", 514 | "\n", 515 | "\n", 516 | "\n", 517 | "\n", 518 | "\n", 519 | "10019260897339890291->bodypose_model/Sequential[model4_1]/Conv2d[Mconv7_stage4_L1]/outputs/297\n", 520 | "\n", 521 | "\n", 522 | "\n", 523 | "\n", 524 | "\n", 525 | "14236905943657278176->bodypose_model/Sequential[model4_2]/Conv2d[Mconv7_stage4_L2]/outputs/310\n", 526 | "\n", 527 | "\n", 528 | "\n", 529 | "\n", 530 | "\n", 531 | "6940965854023558923->bodypose_model/Sequential[model5_1]/Conv2d[Mconv7_stage5_L1]/outputs/324\n", 532 | "\n", 533 | "\n", 534 | "\n", 535 | "\n", 536 | "\n", 537 | "8536199470902917928->bodypose_model/Sequential[model5_2]/Conv2d[Mconv7_stage5_L2]/outputs/337\n", 538 | "\n", 539 | "\n", 540 | "\n", 541 | "\n", 542 | "\n", 543 | "9469636319838519469->bodypose_model/Sequential[model6_1]/Conv2d[Mconv7_stage6_L1]/outputs/351\n", 544 | "\n", 545 | "\n", 546 | "\n", 547 | "\n", 548 | "\n" 549 | ], 550 | "text/plain": [ 551 | "" 552 | ] 553 | }, 554 | "execution_count": 3, 555 | "metadata": {}, 556 | "output_type": "execute_result" 557 | } 558 | ], 559 | "source": [ 560 | "bodymodel = bodypose_model()\n", 561 | "hl.build_graph(bodymodel, torch.zeros([1, 3, 368, 368]))" 562 | ] 563 | }, 564 | { 565 | "cell_type": "code", 566 | "execution_count": 4, 567 | "metadata": { 568 | "scrolled": false 569 | }, 570 | "outputs": [ 571 | { 572 | "data": { 573 | "image/svg+xml": [ 574 | "\n", 575 | "\n", 577 | "\n", 579 | "\n", 580 | "\n", 582 | "\n", 583 | "%3\n", 584 | "\n", 585 | "\n", 586 | "\n", 587 | "handpose_model/Sequential[model1_0]/MaxPool2d[pool1_stage1]/outputs/109\n", 588 | "\n", 589 | "MaxPool2x2\n", 590 | "\n", 591 | "\n", 592 | "\n", 593 | "11513693159356215371\n", 594 | "\n", 595 | "Conv3x3 > Relu\n", 596 | "x2\n", 597 | "\n", 598 | "\n", 599 | "\n", 600 | 
"handpose_model/Sequential[model1_0]/MaxPool2d[pool1_stage1]/outputs/109->11513693159356215371\n", 601 | "\n", 602 | "\n", 603 | "\n", 604 | "\n", 605 | "\n", 606 | "handpose_model/Sequential[model1_0]/MaxPool2d[pool2_stage1]/outputs/114\n", 607 | "\n", 608 | "MaxPool2x2\n", 609 | "\n", 610 | "\n", 611 | "\n", 612 | "7803407982992576603\n", 613 | "\n", 614 | "Conv3x3 > Relu\n", 615 | "x4\n", 616 | "\n", 617 | "\n", 618 | "\n", 619 | "handpose_model/Sequential[model1_0]/MaxPool2d[pool2_stage1]/outputs/114->7803407982992576603\n", 620 | "\n", 621 | "\n", 622 | "\n", 623 | "\n", 624 | "\n", 625 | "handpose_model/Sequential[model1_0]/MaxPool2d[pool3_stage1]/outputs/123\n", 626 | "\n", 627 | "MaxPool2x2\n", 628 | "\n", 629 | "\n", 630 | "\n", 631 | "12968857949380585058\n", 632 | "\n", 633 | "Conv3x3 > Relu\n", 634 | "x7\n", 635 | "\n", 636 | "\n", 637 | "\n", 638 | "handpose_model/Sequential[model1_0]/MaxPool2d[pool3_stage1]/outputs/123->12968857949380585058\n", 639 | "\n", 640 | "\n", 641 | "\n", 642 | "\n", 643 | "\n", 644 | "handpose_model/Sequential[model1_1]/Conv2d[conv6_2_CPM]/outputs/140\n", 645 | "\n", 646 | "Conv1x1\n", 647 | "\n", 648 | "\n", 649 | "\n", 650 | "handpose_model/outputs/141\n", 651 | "\n", 652 | "Concat\n", 653 | "\n", 654 | "\n", 655 | "\n", 656 | "handpose_model/Sequential[model1_1]/Conv2d[conv6_2_CPM]/outputs/140->handpose_model/outputs/141\n", 657 | "\n", 658 | "\n", 659 | "1x22x46x46\n", 660 | "\n", 661 | "\n", 662 | "\n", 663 | "1509426316506609987\n", 664 | "\n", 665 | "Conv7x7 > Relu\n", 666 | "x6\n", 667 | "\n", 668 | "\n", 669 | "\n", 670 | "handpose_model/outputs/141->1509426316506609987\n", 671 | "\n", 672 | "\n", 673 | "\n", 674 | "\n", 675 | "\n", 676 | "handpose_model/Sequential[model2]/Conv2d[Mconv7_stage2]/outputs/154\n", 677 | "\n", 678 | "Conv1x1\n", 679 | "\n", 680 | "\n", 681 | "\n", 682 | "handpose_model/outputs/155\n", 683 | "\n", 684 | "Concat\n", 685 | "\n", 686 | "\n", 687 | "\n", 688 | "handpose_model/Sequential[model2]/Conv2d[Mconv7_stage2]/outputs/154->handpose_model/outputs/155\n", 689 | "\n", 690 | "\n", 691 | "1x22x46x46\n", 692 | "\n", 693 | "\n", 694 | "\n", 695 | "5669622596333078851\n", 696 | "\n", 697 | "Conv7x7 > Relu\n", 698 | "x6\n", 699 | "\n", 700 | "\n", 701 | "\n", 702 | "handpose_model/outputs/155->5669622596333078851\n", 703 | "\n", 704 | "\n", 705 | "\n", 706 | "\n", 707 | "\n", 708 | "handpose_model/Sequential[model3]/Conv2d[Mconv7_stage3]/outputs/168\n", 709 | "\n", 710 | "Conv1x1\n", 711 | "\n", 712 | "\n", 713 | "\n", 714 | "handpose_model/outputs/169\n", 715 | "\n", 716 | "Concat\n", 717 | "\n", 718 | "\n", 719 | "\n", 720 | "handpose_model/Sequential[model3]/Conv2d[Mconv7_stage3]/outputs/168->handpose_model/outputs/169\n", 721 | "\n", 722 | "\n", 723 | "1x22x46x46\n", 724 | "\n", 725 | "\n", 726 | "\n", 727 | "5355895542688103140\n", 728 | "\n", 729 | "Conv7x7 > Relu\n", 730 | "x6\n", 731 | "\n", 732 | "\n", 733 | "\n", 734 | "handpose_model/outputs/169->5355895542688103140\n", 735 | "\n", 736 | "\n", 737 | "\n", 738 | "\n", 739 | "\n", 740 | "handpose_model/Sequential[model4]/Conv2d[Mconv7_stage4]/outputs/182\n", 741 | "\n", 742 | "Conv1x1\n", 743 | "\n", 744 | "\n", 745 | "\n", 746 | "handpose_model/outputs/183\n", 747 | "\n", 748 | "Concat\n", 749 | "\n", 750 | "\n", 751 | "\n", 752 | "handpose_model/Sequential[model4]/Conv2d[Mconv7_stage4]/outputs/182->handpose_model/outputs/183\n", 753 | "\n", 754 | "\n", 755 | "1x22x46x46\n", 756 | "\n", 757 | "\n", 758 | "\n", 759 | "17276234060089573109\n", 760 | "\n", 761 | 
"Conv7x7 > Relu\n", 762 | "x6\n", 763 | "\n", 764 | "\n", 765 | "\n", 766 | "handpose_model/outputs/183->17276234060089573109\n", 767 | "\n", 768 | "\n", 769 | "\n", 770 | "\n", 771 | "\n", 772 | "handpose_model/Sequential[model5]/Conv2d[Mconv7_stage5]/outputs/196\n", 773 | "\n", 774 | "Conv1x1\n", 775 | "\n", 776 | "\n", 777 | "\n", 778 | "handpose_model/outputs/197\n", 779 | "\n", 780 | "Concat\n", 781 | "\n", 782 | "\n", 783 | "\n", 784 | "handpose_model/Sequential[model5]/Conv2d[Mconv7_stage5]/outputs/196->handpose_model/outputs/197\n", 785 | "\n", 786 | "\n", 787 | "1x22x46x46\n", 788 | "\n", 789 | "\n", 790 | "\n", 791 | "15790283256455772952\n", 792 | "\n", 793 | "Conv7x7 > Relu\n", 794 | "x6\n", 795 | "\n", 796 | "\n", 797 | "\n", 798 | "handpose_model/outputs/197->15790283256455772952\n", 799 | "\n", 800 | "\n", 801 | "\n", 802 | "\n", 803 | "\n", 804 | "handpose_model/Sequential[model6]/Conv2d[Mconv7_stage6]/outputs/210\n", 805 | "\n", 806 | "Conv1x1\n", 807 | "\n", 808 | "\n", 809 | "\n", 810 | "1285738913264440487\n", 811 | "\n", 812 | "Conv1x1 > Relu\n", 813 | "\n", 814 | "\n", 815 | "\n", 816 | "1285738913264440487->handpose_model/Sequential[model1_1]/Conv2d[conv6_2_CPM]/outputs/140\n", 817 | "\n", 818 | "\n", 819 | "\n", 820 | "\n", 821 | "\n", 822 | "6698621066210107444\n", 823 | "\n", 824 | "Conv3x3 > Relu\n", 825 | "x2\n", 826 | "\n", 827 | "\n", 828 | "\n", 829 | "6698621066210107444->handpose_model/Sequential[model1_0]/MaxPool2d[pool1_stage1]/outputs/109\n", 830 | "\n", 831 | "\n", 832 | "\n", 833 | "\n", 834 | "\n", 835 | "11513693159356215371->handpose_model/Sequential[model1_0]/MaxPool2d[pool2_stage1]/outputs/114\n", 836 | "\n", 837 | "\n", 838 | "\n", 839 | "\n", 840 | "\n", 841 | "7803407982992576603->handpose_model/Sequential[model1_0]/MaxPool2d[pool3_stage1]/outputs/123\n", 842 | "\n", 843 | "\n", 844 | "\n", 845 | "\n", 846 | "\n", 847 | "12968857949380585058->handpose_model/outputs/141\n", 848 | "\n", 849 | "\n", 850 | "\n", 851 | "\n", 852 | "\n", 853 | "12968857949380585058->handpose_model/outputs/155\n", 854 | "\n", 855 | "\n", 856 | "\n", 857 | "\n", 858 | "\n", 859 | "12968857949380585058->handpose_model/outputs/169\n", 860 | "\n", 861 | "\n", 862 | "\n", 863 | "\n", 864 | "\n", 865 | "12968857949380585058->handpose_model/outputs/183\n", 866 | "\n", 867 | "\n", 868 | "\n", 869 | "\n", 870 | "\n", 871 | "12968857949380585058->handpose_model/outputs/197\n", 872 | "\n", 873 | "\n", 874 | "\n", 875 | "\n", 876 | "\n", 877 | "12968857949380585058->1285738913264440487\n", 878 | "\n", 879 | "\n", 880 | "\n", 881 | "\n", 882 | "\n", 883 | "1509426316506609987->handpose_model/Sequential[model2]/Conv2d[Mconv7_stage2]/outputs/154\n", 884 | "\n", 885 | "\n", 886 | "\n", 887 | "\n", 888 | "\n", 889 | "5669622596333078851->handpose_model/Sequential[model3]/Conv2d[Mconv7_stage3]/outputs/168\n", 890 | "\n", 891 | "\n", 892 | "\n", 893 | "\n", 894 | "\n", 895 | "5355895542688103140->handpose_model/Sequential[model4]/Conv2d[Mconv7_stage4]/outputs/182\n", 896 | "\n", 897 | "\n", 898 | "\n", 899 | "\n", 900 | "\n", 901 | "17276234060089573109->handpose_model/Sequential[model5]/Conv2d[Mconv7_stage5]/outputs/196\n", 902 | "\n", 903 | "\n", 904 | "\n", 905 | "\n", 906 | "\n", 907 | "15790283256455772952->handpose_model/Sequential[model6]/Conv2d[Mconv7_stage6]/outputs/210\n", 908 | "\n", 909 | "\n", 910 | "\n", 911 | "\n", 912 | "\n" 913 | ], 914 | "text/plain": [ 915 | "" 916 | ] 917 | }, 918 | "execution_count": 4, 919 | "metadata": {}, 920 | "output_type": "execute_result" 921 
| } 922 | ], 923 | "source": [ 924 | "handmodel = handpose_model()\n", 925 | "hl.build_graph(handmodel, torch.zeros([1, 3, 368, 368]))" 926 | ] 927 | }, 928 | { 929 | "cell_type": "code", 930 | "execution_count": null, 931 | "metadata": {}, 932 | "outputs": [], 933 | "source": [] 934 | } 935 | ], 936 | "metadata": { 937 | "kernelspec": { 938 | "display_name": "Python 3", 939 | "language": "python", 940 | "name": "python3" 941 | }, 942 | "language_info": { 943 | "codemirror_mode": { 944 | "name": "ipython", 945 | "version": 3 946 | }, 947 | "file_extension": ".py", 948 | "mimetype": "text/x-python", 949 | "name": "python", 950 | "nbconvert_exporter": "python", 951 | "pygments_lexer": "ipython3", 952 | "version": "3.6.6" 953 | } 954 | }, 955 | "nbformat": 4, 956 | "nbformat_minor": 2 957 | } 958 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | numpy 2 | matplotlib 3 | opencv-python 4 | scipy 5 | scikit-image 6 | tqdm -------------------------------------------------------------------------------- /src/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Hzzone/pytorch-openpose/5ee71dc10020403dc3def2bb68f9b77c40337ae2/src/__init__.py -------------------------------------------------------------------------------- /src/body.py: -------------------------------------------------------------------------------- 1 | import cv2 2 | import numpy as np 3 | import math 4 | import time 5 | from scipy.ndimage.filters import gaussian_filter 6 | import matplotlib.pyplot as plt 7 | import matplotlib 8 | import torch 9 | from torchvision import transforms 10 | 11 | from src import util 12 | from src.model import bodypose_model 13 | 14 | class Body(object): 15 | def __init__(self, model_path): 16 | self.model = bodypose_model() 17 | if torch.cuda.is_available(): 18 | self.model = self.model.cuda() 19 | model_dict = util.transfer(self.model, torch.load(model_path)) 20 | self.model.load_state_dict(model_dict) 21 | self.model.eval() 22 | 23 | def __call__(self, oriImg): 24 | # scale_search = [0.5, 1.0, 1.5, 2.0] 25 | scale_search = [0.5] 26 | boxsize = 368 27 | stride = 8 28 | padValue = 128 29 | thre1 = 0.1 30 | thre2 = 0.05 31 | multiplier = [x * boxsize / oriImg.shape[0] for x in scale_search] 32 | heatmap_avg = np.zeros((oriImg.shape[0], oriImg.shape[1], 19)) 33 | paf_avg = np.zeros((oriImg.shape[0], oriImg.shape[1], 38)) 34 | 35 | for m in range(len(multiplier)): 36 | scale = multiplier[m] 37 | imageToTest = cv2.resize(oriImg, (0, 0), fx=scale, fy=scale, interpolation=cv2.INTER_CUBIC) 38 | imageToTest_padded, pad = util.padRightDownCorner(imageToTest, stride, padValue) 39 | im = np.transpose(np.float32(imageToTest_padded[:, :, :, np.newaxis]), (3, 2, 0, 1)) / 256 - 0.5 40 | im = np.ascontiguousarray(im) 41 | 42 | data = torch.from_numpy(im).float() 43 | if torch.cuda.is_available(): 44 | data = data.cuda() 45 | # data = data.permute([2, 0, 1]).unsqueeze(0).float() 46 | with torch.no_grad(): 47 | Mconv7_stage6_L1, Mconv7_stage6_L2 = self.model(data) 48 | Mconv7_stage6_L1 = Mconv7_stage6_L1.cpu().numpy() 49 | Mconv7_stage6_L2 = Mconv7_stage6_L2.cpu().numpy() 50 | 51 | # extract outputs, resize, and remove padding 52 | # heatmap = np.transpose(np.squeeze(net.blobs[output_blobs.keys()[1]].data), (1, 2, 0)) # output 1 is heatmaps 53 | heatmap = np.transpose(np.squeeze(Mconv7_stage6_L2), (1, 2, 0)) # 
output 1 is heatmaps 54 | heatmap = cv2.resize(heatmap, (0, 0), fx=stride, fy=stride, interpolation=cv2.INTER_CUBIC) 55 | heatmap = heatmap[:imageToTest_padded.shape[0] - pad[2], :imageToTest_padded.shape[1] - pad[3], :] 56 | heatmap = cv2.resize(heatmap, (oriImg.shape[1], oriImg.shape[0]), interpolation=cv2.INTER_CUBIC) 57 | 58 | # paf = np.transpose(np.squeeze(net.blobs[output_blobs.keys()[0]].data), (1, 2, 0)) # output 0 is PAFs 59 | paf = np.transpose(np.squeeze(Mconv7_stage6_L1), (1, 2, 0)) # output 0 is PAFs 60 | paf = cv2.resize(paf, (0, 0), fx=stride, fy=stride, interpolation=cv2.INTER_CUBIC) 61 | paf = paf[:imageToTest_padded.shape[0] - pad[2], :imageToTest_padded.shape[1] - pad[3], :] 62 | paf = cv2.resize(paf, (oriImg.shape[1], oriImg.shape[0]), interpolation=cv2.INTER_CUBIC) 63 | 64 | heatmap_avg += heatmap_avg + heatmap / len(multiplier) 65 | paf_avg += + paf / len(multiplier) 66 | 67 | all_peaks = [] 68 | peak_counter = 0 69 | 70 | for part in range(18): 71 | map_ori = heatmap_avg[:, :, part] 72 | one_heatmap = gaussian_filter(map_ori, sigma=3) 73 | 74 | map_left = np.zeros(one_heatmap.shape) 75 | map_left[1:, :] = one_heatmap[:-1, :] 76 | map_right = np.zeros(one_heatmap.shape) 77 | map_right[:-1, :] = one_heatmap[1:, :] 78 | map_up = np.zeros(one_heatmap.shape) 79 | map_up[:, 1:] = one_heatmap[:, :-1] 80 | map_down = np.zeros(one_heatmap.shape) 81 | map_down[:, :-1] = one_heatmap[:, 1:] 82 | 83 | peaks_binary = np.logical_and.reduce( 84 | (one_heatmap >= map_left, one_heatmap >= map_right, one_heatmap >= map_up, one_heatmap >= map_down, one_heatmap > thre1)) 85 | peaks = list(zip(np.nonzero(peaks_binary)[1], np.nonzero(peaks_binary)[0])) # note reverse 86 | peaks_with_score = [x + (map_ori[x[1], x[0]],) for x in peaks] 87 | peak_id = range(peak_counter, peak_counter + len(peaks)) 88 | peaks_with_score_and_id = [peaks_with_score[i] + (peak_id[i],) for i in range(len(peak_id))] 89 | 90 | all_peaks.append(peaks_with_score_and_id) 91 | peak_counter += len(peaks) 92 | 93 | # find connection in the specified sequence, center 29 is in the position 15 94 | limbSeq = [[2, 3], [2, 6], [3, 4], [4, 5], [6, 7], [7, 8], [2, 9], [9, 10], \ 95 | [10, 11], [2, 12], [12, 13], [13, 14], [2, 1], [1, 15], [15, 17], \ 96 | [1, 16], [16, 18], [3, 17], [6, 18]] 97 | # the middle joints heatmap correpondence 98 | mapIdx = [[31, 32], [39, 40], [33, 34], [35, 36], [41, 42], [43, 44], [19, 20], [21, 22], \ 99 | [23, 24], [25, 26], [27, 28], [29, 30], [47, 48], [49, 50], [53, 54], [51, 52], \ 100 | [55, 56], [37, 38], [45, 46]] 101 | 102 | connection_all = [] 103 | special_k = [] 104 | mid_num = 10 105 | 106 | for k in range(len(mapIdx)): 107 | score_mid = paf_avg[:, :, [x - 19 for x in mapIdx[k]]] 108 | candA = all_peaks[limbSeq[k][0] - 1] 109 | candB = all_peaks[limbSeq[k][1] - 1] 110 | nA = len(candA) 111 | nB = len(candB) 112 | indexA, indexB = limbSeq[k] 113 | if (nA != 0 and nB != 0): 114 | connection_candidate = [] 115 | for i in range(nA): 116 | for j in range(nB): 117 | vec = np.subtract(candB[j][:2], candA[i][:2]) 118 | norm = math.sqrt(vec[0] * vec[0] + vec[1] * vec[1]) 119 | norm = max(0.001, norm) 120 | vec = np.divide(vec, norm) 121 | 122 | startend = list(zip(np.linspace(candA[i][0], candB[j][0], num=mid_num), \ 123 | np.linspace(candA[i][1], candB[j][1], num=mid_num))) 124 | 125 | vec_x = np.array([score_mid[int(round(startend[I][1])), int(round(startend[I][0])), 0] \ 126 | for I in range(len(startend))]) 127 | vec_y = np.array([score_mid[int(round(startend[I][1])), 
int(round(startend[I][0])), 1] \ 128 | for I in range(len(startend))]) 129 | 130 | score_midpts = np.multiply(vec_x, vec[0]) + np.multiply(vec_y, vec[1]) 131 | score_with_dist_prior = sum(score_midpts) / len(score_midpts) + min( 132 | 0.5 * oriImg.shape[0] / norm - 1, 0) 133 | criterion1 = len(np.nonzero(score_midpts > thre2)[0]) > 0.8 * len(score_midpts) 134 | criterion2 = score_with_dist_prior > 0 135 | if criterion1 and criterion2: 136 | connection_candidate.append( 137 | [i, j, score_with_dist_prior, score_with_dist_prior + candA[i][2] + candB[j][2]]) 138 | 139 | connection_candidate = sorted(connection_candidate, key=lambda x: x[2], reverse=True) 140 | connection = np.zeros((0, 5)) 141 | for c in range(len(connection_candidate)): 142 | i, j, s = connection_candidate[c][0:3] 143 | if (i not in connection[:, 3] and j not in connection[:, 4]): 144 | connection = np.vstack([connection, [candA[i][3], candB[j][3], s, i, j]]) 145 | if (len(connection) >= min(nA, nB)): 146 | break 147 | 148 | connection_all.append(connection) 149 | else: 150 | special_k.append(k) 151 | connection_all.append([]) 152 | 153 | # last number in each row is the total parts number of that person 154 | # the second last number in each row is the score of the overall configuration 155 | subset = -1 * np.ones((0, 20)) 156 | candidate = np.array([item for sublist in all_peaks for item in sublist]) 157 | 158 | for k in range(len(mapIdx)): 159 | if k not in special_k: 160 | partAs = connection_all[k][:, 0] 161 | partBs = connection_all[k][:, 1] 162 | indexA, indexB = np.array(limbSeq[k]) - 1 163 | 164 | for i in range(len(connection_all[k])): # = 1:size(temp,1) 165 | found = 0 166 | subset_idx = [-1, -1] 167 | for j in range(len(subset)): # 1:size(subset,1): 168 | if subset[j][indexA] == partAs[i] or subset[j][indexB] == partBs[i]: 169 | subset_idx[found] = j 170 | found += 1 171 | 172 | if found == 1: 173 | j = subset_idx[0] 174 | if subset[j][indexB] != partBs[i]: 175 | subset[j][indexB] = partBs[i] 176 | subset[j][-1] += 1 177 | subset[j][-2] += candidate[partBs[i].astype(int), 2] + connection_all[k][i][2] 178 | elif found == 2: # if found 2 and disjoint, merge them 179 | j1, j2 = subset_idx 180 | membership = ((subset[j1] >= 0).astype(int) + (subset[j2] >= 0).astype(int))[:-2] 181 | if len(np.nonzero(membership == 2)[0]) == 0: # merge 182 | subset[j1][:-2] += (subset[j2][:-2] + 1) 183 | subset[j1][-2:] += subset[j2][-2:] 184 | subset[j1][-2] += connection_all[k][i][2] 185 | subset = np.delete(subset, j2, 0) 186 | else: # as like found == 1 187 | subset[j1][indexB] = partBs[i] 188 | subset[j1][-1] += 1 189 | subset[j1][-2] += candidate[partBs[i].astype(int), 2] + connection_all[k][i][2] 190 | 191 | # if find no partA in the subset, create a new subset 192 | elif not found and k < 17: 193 | row = -1 * np.ones(20) 194 | row[indexA] = partAs[i] 195 | row[indexB] = partBs[i] 196 | row[-1] = 2 197 | row[-2] = sum(candidate[connection_all[k][i, :2].astype(int), 2]) + connection_all[k][i][2] 198 | subset = np.vstack([subset, row]) 199 | # delete some rows of subset which has few parts occur 200 | deleteIdx = [] 201 | for i in range(len(subset)): 202 | if subset[i][-1] < 4 or subset[i][-2] / subset[i][-1] < 0.4: 203 | deleteIdx.append(i) 204 | subset = np.delete(subset, deleteIdx, axis=0) 205 | 206 | # subset: n*20 array, 0-17 is the index in candidate, 18 is the total score, 19 is the total parts 207 | # candidate: x, y, score, id 208 | return candidate, subset 209 | 210 | if __name__ == "__main__": 211 | body_estimation = 
Body('../model/body_pose_model.pth')
212 | 
213 |     test_image = '../images/ski.jpg'
214 |     oriImg = cv2.imread(test_image)  # B,G,R order
215 |     candidate, subset = body_estimation(oriImg)
216 |     canvas = util.draw_bodypose(oriImg, candidate, subset)
217 |     plt.imshow(canvas[:, :, [2, 1, 0]])
218 |     plt.show()
219 | 
--------------------------------------------------------------------------------
/src/hand.py:
--------------------------------------------------------------------------------
1 | import cv2
2 | import json
3 | import numpy as np
4 | import math
5 | import time
6 | from scipy.ndimage.filters import gaussian_filter
7 | import matplotlib.pyplot as plt
8 | import matplotlib
9 | import torch
10 | from skimage.measure import label
11 | 
12 | from src.model import handpose_model
13 | from src import util
14 | 
15 | class Hand(object):
16 |     def __init__(self, model_path):
17 |         self.model = handpose_model()
18 |         if torch.cuda.is_available():
19 |             self.model = self.model.cuda()
20 |         model_dict = util.transfer(self.model, torch.load(model_path))
21 |         self.model.load_state_dict(model_dict)
22 |         self.model.eval()
23 | 
24 |     def __call__(self, oriImg):
25 |         scale_search = [0.5, 1.0, 1.5, 2.0]
26 |         # scale_search = [0.5]
27 |         boxsize = 368
28 |         stride = 8
29 |         padValue = 128
30 |         thre = 0.05
31 |         multiplier = [x * boxsize / oriImg.shape[0] for x in scale_search]
32 |         heatmap_avg = np.zeros((oriImg.shape[0], oriImg.shape[1], 22))
33 |         # paf_avg = np.zeros((oriImg.shape[0], oriImg.shape[1], 38))
34 | 
35 |         for m in range(len(multiplier)):
36 |             scale = multiplier[m]
37 |             imageToTest = cv2.resize(oriImg, (0, 0), fx=scale, fy=scale, interpolation=cv2.INTER_CUBIC)
38 |             imageToTest_padded, pad = util.padRightDownCorner(imageToTest, stride, padValue)
39 |             im = np.transpose(np.float32(imageToTest_padded[:, :, :, np.newaxis]), (3, 2, 0, 1)) / 256 - 0.5
40 |             im = np.ascontiguousarray(im)
41 | 
42 |             data = torch.from_numpy(im).float()
43 |             if torch.cuda.is_available():
44 |                 data = data.cuda()
45 |             # data = data.permute([2, 0, 1]).unsqueeze(0).float()
46 |             with torch.no_grad():
47 |                 output = self.model(data).cpu().numpy()
48 |                 # output = self.model(data).numpy()
49 | 
50 |             # extract outputs, resize, and remove padding
51 |             heatmap = np.transpose(np.squeeze(output), (1, 2, 0))  # output is heatmaps
52 |             heatmap = cv2.resize(heatmap, (0, 0), fx=stride, fy=stride, interpolation=cv2.INTER_CUBIC)
53 |             heatmap = heatmap[:imageToTest_padded.shape[0] - pad[2], :imageToTest_padded.shape[1] - pad[3], :]
54 |             heatmap = cv2.resize(heatmap, (oriImg.shape[1], oriImg.shape[0]), interpolation=cv2.INTER_CUBIC)
55 | 
56 |             heatmap_avg += heatmap / len(multiplier)
57 | 
58 |         all_peaks = []
59 |         for part in range(21):
60 |             map_ori = heatmap_avg[:, :, part]
61 |             one_heatmap = gaussian_filter(map_ori, sigma=3)
62 |             binary = np.ascontiguousarray(one_heatmap > thre, dtype=np.uint8)
63 |             # all values below threshold
64 |             if np.sum(binary) == 0:
65 |                 all_peaks.append([0, 0])
66 |                 continue
67 |             label_img, label_numbers = label(binary, return_num=True, connectivity=binary.ndim)
68 |             max_index = np.argmax([np.sum(map_ori[label_img == i]) for i in range(1, label_numbers + 1)]) + 1
69 |             label_img[label_img != max_index] = 0
70 |             map_ori[label_img == 0] = 0
71 | 
72 |             y, x = util.npmax(map_ori)
73 |             all_peaks.append([x, y])
74 |         return np.array(all_peaks)
75 | 
76 | if __name__ == "__main__":
77 |     hand_estimation = Hand('../model/hand_pose_model.pth')
78 | 
79 |     # test_image = '../images/hand.jpg'
80 |     test_image = '../images/hand.jpg'
81 |     oriImg = cv2.imread(test_image) 
# B,G,R order 82 | peaks = hand_estimation(oriImg) 83 | canvas = util.draw_handpose(oriImg, peaks, True) 84 | cv2.imshow('', canvas) 85 | cv2.waitKey(0) -------------------------------------------------------------------------------- /src/hand_model_output_size.json: -------------------------------------------------------------------------------- 1 | { 2 | "10":1, 3 | "11":1, 4 | "12":1, 5 | "13":1, 6 | "14":1, 7 | "15":1, 8 | "16":2, 9 | "17":2, 10 | "18":2, 11 | "19":2, 12 | "20":2, 13 | "21":2, 14 | "22":2, 15 | "23":2, 16 | "24":3, 17 | "25":3, 18 | "26":3, 19 | "27":3, 20 | "28":3, 21 | "29":3, 22 | "30":3, 23 | "31":3, 24 | "32":4, 25 | "33":4, 26 | "34":4, 27 | "35":4, 28 | "36":4, 29 | "37":4, 30 | "38":4, 31 | "39":4, 32 | "40":5, 33 | "41":5, 34 | "42":5, 35 | "43":5, 36 | "44":5, 37 | "45":5, 38 | "46":5, 39 | "47":5, 40 | "48":6, 41 | "49":6, 42 | "50":6, 43 | "51":6, 44 | "52":6, 45 | "53":6, 46 | "54":6, 47 | "55":6, 48 | "56":7, 49 | "57":7, 50 | "58":7, 51 | "59":7, 52 | "60":7, 53 | "61":7, 54 | "62":7, 55 | "63":7, 56 | "64":8, 57 | "65":8, 58 | "66":8, 59 | "67":8, 60 | "68":8, 61 | "69":8, 62 | "70":8, 63 | "71":8, 64 | "72":9, 65 | "73":9, 66 | "74":9, 67 | "75":9, 68 | "76":9, 69 | "77":9, 70 | "78":9, 71 | "79":9, 72 | "80":10, 73 | "81":10, 74 | "82":10, 75 | "83":10, 76 | "84":10, 77 | "85":10, 78 | "86":10, 79 | "87":10, 80 | "88":11, 81 | "89":11, 82 | "90":11, 83 | "91":11, 84 | "92":11, 85 | "93":11, 86 | "94":11, 87 | "95":11, 88 | "96":12, 89 | "97":12, 90 | "98":12, 91 | "99":12, 92 | "100":12, 93 | "101":12, 94 | "102":12, 95 | "103":12, 96 | "104":13, 97 | "105":13, 98 | "106":13, 99 | "107":13, 100 | "108":13, 101 | "109":13, 102 | "110":13, 103 | "111":13, 104 | "112":14, 105 | "113":14, 106 | "114":14, 107 | "115":14, 108 | "116":14, 109 | "117":14, 110 | "118":14, 111 | "119":14, 112 | "120":15, 113 | "121":15, 114 | "122":15, 115 | "123":15, 116 | "124":15, 117 | "125":15, 118 | "126":15, 119 | "127":15, 120 | "128":16, 121 | "129":16, 122 | "130":16, 123 | "131":16, 124 | "132":16, 125 | "133":16, 126 | "134":16, 127 | "135":16, 128 | "136":17, 129 | "137":17, 130 | "138":17, 131 | "139":17, 132 | "140":17, 133 | "141":17, 134 | "142":17, 135 | "143":17, 136 | "144":18, 137 | "145":18, 138 | "146":18, 139 | "147":18, 140 | "148":18, 141 | "149":18, 142 | "150":18, 143 | "151":18, 144 | "152":19, 145 | "153":19, 146 | "154":19, 147 | "155":19, 148 | "156":19, 149 | "157":19, 150 | "158":19, 151 | "159":19, 152 | "160":20, 153 | "161":20, 154 | "162":20, 155 | "163":20, 156 | "164":20, 157 | "165":20, 158 | "166":20, 159 | "167":20, 160 | "168":21, 161 | "169":21, 162 | "170":21, 163 | "171":21, 164 | "172":21, 165 | "173":21, 166 | "174":21, 167 | "175":21, 168 | "176":22, 169 | "177":22, 170 | "178":22, 171 | "179":22, 172 | "180":22, 173 | "181":22, 174 | "182":22, 175 | "183":22, 176 | "184":23, 177 | "185":23, 178 | "186":23, 179 | "187":23, 180 | "188":23, 181 | "189":23, 182 | "190":23, 183 | "191":23, 184 | "192":24, 185 | "193":24, 186 | "194":24, 187 | "195":24, 188 | "196":24, 189 | "197":24, 190 | "198":24, 191 | "199":24, 192 | "200":25, 193 | "201":25, 194 | "202":25, 195 | "203":25, 196 | "204":25, 197 | "205":25, 198 | "206":25, 199 | "207":25, 200 | "208":26, 201 | "209":26, 202 | "210":26, 203 | "211":26, 204 | "212":26, 205 | "213":26, 206 | "214":26, 207 | "215":26, 208 | "216":27, 209 | "217":27, 210 | "218":27, 211 | "219":27, 212 | "220":27, 213 | "221":27, 214 | "222":27, 215 | "223":27, 216 | "224":28, 217 | "225":28, 218 | 
"226":28, 219 | "227":28, 220 | "228":28, 221 | "229":28, 222 | "230":28, 223 | "231":28, 224 | "232":29, 225 | "233":29, 226 | "234":29, 227 | "235":29, 228 | "236":29, 229 | "237":29, 230 | "238":29, 231 | "239":29, 232 | "240":30, 233 | "241":30, 234 | "242":30, 235 | "243":30, 236 | "244":30, 237 | "245":30, 238 | "246":30, 239 | "247":30, 240 | "248":31, 241 | "249":31, 242 | "250":31, 243 | "251":31, 244 | "252":31, 245 | "253":31, 246 | "254":31, 247 | "255":31, 248 | "256":32, 249 | "257":32, 250 | "258":32, 251 | "259":32, 252 | "260":32, 253 | "261":32, 254 | "262":32, 255 | "263":32, 256 | "264":33, 257 | "265":33, 258 | "266":33, 259 | "267":33, 260 | "268":33, 261 | "269":33, 262 | "270":33, 263 | "271":33, 264 | "272":34, 265 | "273":34, 266 | "274":34, 267 | "275":34, 268 | "276":34, 269 | "277":34, 270 | "278":34, 271 | "279":34, 272 | "280":35, 273 | "281":35, 274 | "282":35, 275 | "283":35, 276 | "284":35, 277 | "285":35, 278 | "286":35, 279 | "287":35, 280 | "288":36, 281 | "289":36, 282 | "290":36, 283 | "291":36, 284 | "292":36, 285 | "293":36, 286 | "294":36, 287 | "295":36, 288 | "296":37, 289 | "297":37, 290 | "298":37, 291 | "299":37, 292 | "300":37, 293 | "301":37, 294 | "302":37, 295 | "303":37, 296 | "304":38, 297 | "305":38, 298 | "306":38, 299 | "307":38, 300 | "308":38, 301 | "309":38, 302 | "310":38, 303 | "311":38, 304 | "312":39, 305 | "313":39, 306 | "314":39, 307 | "315":39, 308 | "316":39, 309 | "317":39, 310 | "318":39, 311 | "319":39, 312 | "320":40, 313 | "321":40, 314 | "322":40, 315 | "323":40, 316 | "324":40, 317 | "325":40, 318 | "326":40, 319 | "327":40, 320 | "328":41, 321 | "329":41, 322 | "330":41, 323 | "331":41, 324 | "332":41, 325 | "333":41, 326 | "334":41, 327 | "335":41, 328 | "336":42, 329 | "337":42, 330 | "338":42, 331 | "339":42, 332 | "340":42, 333 | "341":42, 334 | "342":42, 335 | "343":42, 336 | "344":43, 337 | "345":43, 338 | "346":43, 339 | "347":43, 340 | "348":43, 341 | "349":43, 342 | "350":43, 343 | "351":43, 344 | "352":44, 345 | "353":44, 346 | "354":44, 347 | "355":44, 348 | "356":44, 349 | "357":44, 350 | "358":44, 351 | "359":44, 352 | "360":45, 353 | "361":45, 354 | "362":45, 355 | "363":45, 356 | "364":45, 357 | "365":45, 358 | "366":45, 359 | "367":45, 360 | "368":46, 361 | "369":46, 362 | "370":46, 363 | "371":46, 364 | "372":46, 365 | "373":46, 366 | "374":46, 367 | "375":46, 368 | "376":47, 369 | "377":47, 370 | "378":47, 371 | "379":47, 372 | "380":47, 373 | "381":47, 374 | "382":47, 375 | "383":47, 376 | "384":48, 377 | "385":48, 378 | "386":48, 379 | "387":48, 380 | "388":48, 381 | "389":48, 382 | "390":48, 383 | "391":48, 384 | "392":49, 385 | "393":49, 386 | "394":49, 387 | "395":49, 388 | "396":49, 389 | "397":49, 390 | "398":49, 391 | "399":49, 392 | "400":50, 393 | "401":50, 394 | "402":50, 395 | "403":50, 396 | "404":50, 397 | "405":50, 398 | "406":50, 399 | "407":50, 400 | "408":51, 401 | "409":51, 402 | "410":51, 403 | "411":51, 404 | "412":51, 405 | "413":51, 406 | "414":51, 407 | "415":51, 408 | "416":52, 409 | "417":52, 410 | "418":52, 411 | "419":52, 412 | "420":52, 413 | "421":52, 414 | "422":52, 415 | "423":52, 416 | "424":53, 417 | "425":53, 418 | "426":53, 419 | "427":53, 420 | "428":53, 421 | "429":53, 422 | "430":53, 423 | "431":53, 424 | "432":54, 425 | "433":54, 426 | "434":54, 427 | "435":54, 428 | "436":54, 429 | "437":54, 430 | "438":54, 431 | "439":54, 432 | "440":55, 433 | "441":55, 434 | "442":55, 435 | "443":55, 436 | "444":55, 437 | "445":55, 438 | "446":55, 439 | "447":55, 440 | 
"448":56, 441 | "449":56, 442 | "450":56, 443 | "451":56, 444 | "452":56, 445 | "453":56, 446 | "454":56, 447 | "455":56, 448 | "456":57, 449 | "457":57, 450 | "458":57, 451 | "459":57, 452 | "460":57, 453 | "461":57, 454 | "462":57, 455 | "463":57, 456 | "464":58, 457 | "465":58, 458 | "466":58, 459 | "467":58, 460 | "468":58, 461 | "469":58, 462 | "470":58, 463 | "471":58, 464 | "472":59, 465 | "473":59, 466 | "474":59, 467 | "475":59, 468 | "476":59, 469 | "477":59, 470 | "478":59, 471 | "479":59, 472 | "480":60, 473 | "481":60, 474 | "482":60, 475 | "483":60, 476 | "484":60, 477 | "485":60, 478 | "486":60, 479 | "487":60, 480 | "488":61, 481 | "489":61, 482 | "490":61, 483 | "491":61, 484 | "492":61, 485 | "493":61, 486 | "494":61, 487 | "495":61, 488 | "496":62, 489 | "497":62, 490 | "498":62, 491 | "499":62, 492 | "500":62, 493 | "501":62, 494 | "502":62, 495 | "503":62, 496 | "504":63, 497 | "505":63, 498 | "506":63, 499 | "507":63, 500 | "508":63, 501 | "509":63, 502 | "510":63, 503 | "511":63, 504 | "512":64, 505 | "513":64, 506 | "514":64, 507 | "515":64, 508 | "516":64, 509 | "517":64, 510 | "518":64, 511 | "519":64, 512 | "520":65, 513 | "521":65, 514 | "522":65, 515 | "523":65, 516 | "524":65, 517 | "525":65, 518 | "526":65, 519 | "527":65, 520 | "528":66, 521 | "529":66, 522 | "530":66, 523 | "531":66, 524 | "532":66, 525 | "533":66, 526 | "534":66, 527 | "535":66, 528 | "536":67, 529 | "537":67, 530 | "538":67, 531 | "539":67, 532 | "540":67, 533 | "541":67, 534 | "542":67, 535 | "543":67, 536 | "544":68, 537 | "545":68, 538 | "546":68, 539 | "547":68, 540 | "548":68, 541 | "549":68, 542 | "550":68, 543 | "551":68, 544 | "552":69, 545 | "553":69, 546 | "554":69, 547 | "555":69, 548 | "556":69, 549 | "557":69, 550 | "558":69, 551 | "559":69, 552 | "560":70, 553 | "561":70, 554 | "562":70, 555 | "563":70, 556 | "564":70, 557 | "565":70, 558 | "566":70, 559 | "567":70, 560 | "568":71, 561 | "569":71, 562 | "570":71, 563 | "571":71, 564 | "572":71, 565 | "573":71, 566 | "574":71, 567 | "575":71, 568 | "576":72, 569 | "577":72, 570 | "578":72, 571 | "579":72, 572 | "580":72, 573 | "581":72, 574 | "582":72, 575 | "583":72, 576 | "584":73, 577 | "585":73, 578 | "586":73, 579 | "587":73, 580 | "588":73, 581 | "589":73, 582 | "590":73, 583 | "591":73, 584 | "592":74, 585 | "593":74, 586 | "594":74, 587 | "595":74, 588 | "596":74, 589 | "597":74, 590 | "598":74, 591 | "599":74, 592 | "600":75, 593 | "601":75, 594 | "602":75, 595 | "603":75, 596 | "604":75, 597 | "605":75, 598 | "606":75, 599 | "607":75, 600 | "608":76, 601 | "609":76, 602 | "610":76, 603 | "611":76, 604 | "612":76, 605 | "613":76, 606 | "614":76, 607 | "615":76, 608 | "616":77, 609 | "617":77, 610 | "618":77, 611 | "619":77, 612 | "620":77, 613 | "621":77, 614 | "622":77, 615 | "623":77, 616 | "624":78, 617 | "625":78, 618 | "626":78, 619 | "627":78, 620 | "628":78, 621 | "629":78, 622 | "630":78, 623 | "631":78, 624 | "632":79, 625 | "633":79, 626 | "634":79, 627 | "635":79, 628 | "636":79, 629 | "637":79, 630 | "638":79, 631 | "639":79, 632 | "640":80, 633 | "641":80, 634 | "642":80, 635 | "643":80, 636 | "644":80, 637 | "645":80, 638 | "646":80, 639 | "647":80, 640 | "648":81, 641 | "649":81, 642 | "650":81, 643 | "651":81, 644 | "652":81, 645 | "653":81, 646 | "654":81, 647 | "655":81, 648 | "656":82, 649 | "657":82, 650 | "658":82, 651 | "659":82, 652 | "660":82, 653 | "661":82, 654 | "662":82, 655 | "663":82, 656 | "664":83, 657 | "665":83, 658 | "666":83, 659 | "667":83, 660 | "668":83, 661 | "669":83, 662 | 
"670":83, 663 | "671":83, 664 | "672":84, 665 | "673":84, 666 | "674":84, 667 | "675":84, 668 | "676":84, 669 | "677":84, 670 | "678":84, 671 | "679":84, 672 | "680":85, 673 | "681":85, 674 | "682":85, 675 | "683":85, 676 | "684":85, 677 | "685":85, 678 | "686":85, 679 | "687":85, 680 | "688":86, 681 | "689":86, 682 | "690":86, 683 | "691":86, 684 | "692":86, 685 | "693":86, 686 | "694":86, 687 | "695":86, 688 | "696":87, 689 | "697":87, 690 | "698":87, 691 | "699":87, 692 | "700":87, 693 | "701":87, 694 | "702":87, 695 | "703":87, 696 | "704":88, 697 | "705":88, 698 | "706":88, 699 | "707":88, 700 | "708":88, 701 | "709":88, 702 | "710":88, 703 | "711":88, 704 | "712":89, 705 | "713":89, 706 | "714":89, 707 | "715":89, 708 | "716":89, 709 | "717":89, 710 | "718":89, 711 | "719":89, 712 | "720":90, 713 | "721":90, 714 | "722":90, 715 | "723":90, 716 | "724":90, 717 | "725":90, 718 | "726":90, 719 | "727":90, 720 | "728":91, 721 | "729":91, 722 | "730":91, 723 | "731":91, 724 | "732":91, 725 | "733":91, 726 | "734":91, 727 | "735":91, 728 | "736":92, 729 | "737":92, 730 | "738":92, 731 | "739":92, 732 | "740":92, 733 | "741":92, 734 | "742":92, 735 | "743":92, 736 | "744":93, 737 | "745":93, 738 | "746":93, 739 | "747":93, 740 | "748":93, 741 | "749":93, 742 | "750":93, 743 | "751":93, 744 | "752":94, 745 | "753":94, 746 | "754":94, 747 | "755":94, 748 | "756":94, 749 | "757":94, 750 | "758":94, 751 | "759":94, 752 | "760":95, 753 | "761":95, 754 | "762":95, 755 | "763":95, 756 | "764":95, 757 | "765":95, 758 | "766":95, 759 | "767":95, 760 | "768":96, 761 | "769":96, 762 | "770":96, 763 | "771":96, 764 | "772":96, 765 | "773":96, 766 | "774":96, 767 | "775":96, 768 | "776":97, 769 | "777":97, 770 | "778":97, 771 | "779":97, 772 | "780":97, 773 | "781":97, 774 | "782":97, 775 | "783":97, 776 | "784":98, 777 | "785":98, 778 | "786":98, 779 | "787":98, 780 | "788":98, 781 | "789":98, 782 | "790":98, 783 | "791":98, 784 | "792":99, 785 | "793":99, 786 | "794":99, 787 | "795":99, 788 | "796":99, 789 | "797":99, 790 | "798":99, 791 | "799":99, 792 | "800":100, 793 | "801":100, 794 | "802":100, 795 | "803":100, 796 | "804":100, 797 | "805":100, 798 | "806":100, 799 | "807":100, 800 | "808":101, 801 | "809":101, 802 | "810":101, 803 | "811":101, 804 | "812":101, 805 | "813":101, 806 | "814":101, 807 | "815":101, 808 | "816":102, 809 | "817":102, 810 | "818":102, 811 | "819":102, 812 | "820":102, 813 | "821":102, 814 | "822":102, 815 | "823":102, 816 | "824":103, 817 | "825":103, 818 | "826":103, 819 | "827":103, 820 | "828":103, 821 | "829":103, 822 | "830":103, 823 | "831":103, 824 | "832":104, 825 | "833":104, 826 | "834":104, 827 | "835":104, 828 | "836":104, 829 | "837":104, 830 | "838":104, 831 | "839":104, 832 | "840":105, 833 | "841":105, 834 | "842":105, 835 | "843":105, 836 | "844":105, 837 | "845":105, 838 | "846":105, 839 | "847":105, 840 | "848":106, 841 | "849":106, 842 | "850":106, 843 | "851":106, 844 | "852":106, 845 | "853":106, 846 | "854":106, 847 | "855":106, 848 | "856":107, 849 | "857":107, 850 | "858":107, 851 | "859":107, 852 | "860":107, 853 | "861":107, 854 | "862":107, 855 | "863":107, 856 | "864":108, 857 | "865":108, 858 | "866":108, 859 | "867":108, 860 | "868":108, 861 | "869":108, 862 | "870":108, 863 | "871":108, 864 | "872":109, 865 | "873":109, 866 | "874":109, 867 | "875":109, 868 | "876":109, 869 | "877":109, 870 | "878":109, 871 | "879":109, 872 | "880":110, 873 | "881":110, 874 | "882":110, 875 | "883":110, 876 | "884":110, 877 | "885":110, 878 | "886":110, 
879 | "887":110, 880 | "888":111, 881 | "889":111, 882 | "890":111, 883 | "891":111, 884 | "892":111, 885 | "893":111, 886 | "894":111, 887 | "895":111, 888 | "896":112, 889 | "897":112, 890 | "898":112, 891 | "899":112, 892 | "900":112, 893 | "901":112, 894 | "902":112, 895 | "903":112, 896 | "904":113, 897 | "905":113, 898 | "906":113, 899 | "907":113, 900 | "908":113, 901 | "909":113, 902 | "910":113, 903 | "911":113, 904 | "912":114, 905 | "913":114, 906 | "914":114, 907 | "915":114, 908 | "916":114, 909 | "917":114, 910 | "918":114, 911 | "919":114, 912 | "920":115, 913 | "921":115, 914 | "922":115, 915 | "923":115, 916 | "924":115, 917 | "925":115, 918 | "926":115, 919 | "927":115, 920 | "928":116, 921 | "929":116, 922 | "930":116, 923 | "931":116, 924 | "932":116, 925 | "933":116, 926 | "934":116, 927 | "935":116, 928 | "936":117, 929 | "937":117, 930 | "938":117, 931 | "939":117, 932 | "940":117, 933 | "941":117, 934 | "942":117, 935 | "943":117, 936 | "944":118, 937 | "945":118, 938 | "946":118, 939 | "947":118, 940 | "948":118, 941 | "949":118, 942 | "950":118, 943 | "951":118, 944 | "952":119, 945 | "953":119, 946 | "954":119, 947 | "955":119, 948 | "956":119, 949 | "957":119, 950 | "958":119, 951 | "959":119, 952 | "960":120, 953 | "961":120, 954 | "962":120, 955 | "963":120, 956 | "964":120, 957 | "965":120, 958 | "966":120, 959 | "967":120, 960 | "968":121, 961 | "969":121, 962 | "970":121, 963 | "971":121, 964 | "972":121, 965 | "973":121, 966 | "974":121, 967 | "975":121, 968 | "976":122, 969 | "977":122, 970 | "978":122, 971 | "979":122, 972 | "980":122, 973 | "981":122, 974 | "982":122, 975 | "983":122, 976 | "984":123, 977 | "985":123, 978 | "986":123, 979 | "987":123, 980 | "988":123, 981 | "989":123, 982 | "990":123, 983 | "991":123, 984 | "992":124, 985 | "993":124, 986 | "994":124, 987 | "995":124, 988 | "996":124, 989 | "997":124, 990 | "998":124, 991 | "999":124 992 | } -------------------------------------------------------------------------------- /src/hand_model_outputsize.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from tqdm import tqdm 3 | import json 4 | 5 | from src.model import handpose_model 6 | 7 | model = handpose_model() 8 | 9 | size = {} 10 | for i in tqdm(range(10, 1000)): 11 | data = torch.randn(1, 3, i, i) 12 | if torch.cuda.is_available(): 13 | data = data.cuda() 14 | size[i] = model(data).size(2) 15 | 16 | with open('hand_model_output_size.json') as f: 17 | json.dump(size, f) 18 | -------------------------------------------------------------------------------- /src/model.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from collections import OrderedDict 3 | 4 | import torch 5 | import torch.nn as nn 6 | 7 | def make_layers(block, no_relu_layers): 8 | layers = [] 9 | for layer_name, v in block.items(): 10 | if 'pool' in layer_name: 11 | layer = nn.MaxPool2d(kernel_size=v[0], stride=v[1], 12 | padding=v[2]) 13 | layers.append((layer_name, layer)) 14 | else: 15 | conv2d = nn.Conv2d(in_channels=v[0], out_channels=v[1], 16 | kernel_size=v[2], stride=v[3], 17 | padding=v[4]) 18 | layers.append((layer_name, conv2d)) 19 | if layer_name not in no_relu_layers: 20 | layers.append(('relu_'+layer_name, nn.ReLU(inplace=True))) 21 | 22 | return nn.Sequential(OrderedDict(layers)) 23 | 24 | class bodypose_model(nn.Module): 25 | def __init__(self): 26 | super(bodypose_model, self).__init__() 27 | 28 | # these layers have no relu layer 29 | 
no_relu_layers = ['conv5_5_CPM_L1', 'conv5_5_CPM_L2', 'Mconv7_stage2_L1',\ 30 | 'Mconv7_stage2_L2', 'Mconv7_stage3_L1', 'Mconv7_stage3_L2',\ 31 | 'Mconv7_stage4_L1', 'Mconv7_stage4_L2', 'Mconv7_stage5_L1',\ 32 | 'Mconv7_stage5_L2', 'Mconv7_stage6_L1', 'Mconv7_stage6_L1'] 33 | blocks = {} 34 | block0 = OrderedDict([ 35 | ('conv1_1', [3, 64, 3, 1, 1]), 36 | ('conv1_2', [64, 64, 3, 1, 1]), 37 | ('pool1_stage1', [2, 2, 0]), 38 | ('conv2_1', [64, 128, 3, 1, 1]), 39 | ('conv2_2', [128, 128, 3, 1, 1]), 40 | ('pool2_stage1', [2, 2, 0]), 41 | ('conv3_1', [128, 256, 3, 1, 1]), 42 | ('conv3_2', [256, 256, 3, 1, 1]), 43 | ('conv3_3', [256, 256, 3, 1, 1]), 44 | ('conv3_4', [256, 256, 3, 1, 1]), 45 | ('pool3_stage1', [2, 2, 0]), 46 | ('conv4_1', [256, 512, 3, 1, 1]), 47 | ('conv4_2', [512, 512, 3, 1, 1]), 48 | ('conv4_3_CPM', [512, 256, 3, 1, 1]), 49 | ('conv4_4_CPM', [256, 128, 3, 1, 1]) 50 | ]) 51 | 52 | 53 | # Stage 1 54 | block1_1 = OrderedDict([ 55 | ('conv5_1_CPM_L1', [128, 128, 3, 1, 1]), 56 | ('conv5_2_CPM_L1', [128, 128, 3, 1, 1]), 57 | ('conv5_3_CPM_L1', [128, 128, 3, 1, 1]), 58 | ('conv5_4_CPM_L1', [128, 512, 1, 1, 0]), 59 | ('conv5_5_CPM_L1', [512, 38, 1, 1, 0]) 60 | ]) 61 | 62 | block1_2 = OrderedDict([ 63 | ('conv5_1_CPM_L2', [128, 128, 3, 1, 1]), 64 | ('conv5_2_CPM_L2', [128, 128, 3, 1, 1]), 65 | ('conv5_3_CPM_L2', [128, 128, 3, 1, 1]), 66 | ('conv5_4_CPM_L2', [128, 512, 1, 1, 0]), 67 | ('conv5_5_CPM_L2', [512, 19, 1, 1, 0]) 68 | ]) 69 | blocks['block1_1'] = block1_1 70 | blocks['block1_2'] = block1_2 71 | 72 | self.model0 = make_layers(block0, no_relu_layers) 73 | 74 | # Stages 2 - 6 75 | for i in range(2, 7): 76 | blocks['block%d_1' % i] = OrderedDict([ 77 | ('Mconv1_stage%d_L1' % i, [185, 128, 7, 1, 3]), 78 | ('Mconv2_stage%d_L1' % i, [128, 128, 7, 1, 3]), 79 | ('Mconv3_stage%d_L1' % i, [128, 128, 7, 1, 3]), 80 | ('Mconv4_stage%d_L1' % i, [128, 128, 7, 1, 3]), 81 | ('Mconv5_stage%d_L1' % i, [128, 128, 7, 1, 3]), 82 | ('Mconv6_stage%d_L1' % i, [128, 128, 1, 1, 0]), 83 | ('Mconv7_stage%d_L1' % i, [128, 38, 1, 1, 0]) 84 | ]) 85 | 86 | blocks['block%d_2' % i] = OrderedDict([ 87 | ('Mconv1_stage%d_L2' % i, [185, 128, 7, 1, 3]), 88 | ('Mconv2_stage%d_L2' % i, [128, 128, 7, 1, 3]), 89 | ('Mconv3_stage%d_L2' % i, [128, 128, 7, 1, 3]), 90 | ('Mconv4_stage%d_L2' % i, [128, 128, 7, 1, 3]), 91 | ('Mconv5_stage%d_L2' % i, [128, 128, 7, 1, 3]), 92 | ('Mconv6_stage%d_L2' % i, [128, 128, 1, 1, 0]), 93 | ('Mconv7_stage%d_L2' % i, [128, 19, 1, 1, 0]) 94 | ]) 95 | 96 | for k in blocks.keys(): 97 | blocks[k] = make_layers(blocks[k], no_relu_layers) 98 | 99 | self.model1_1 = blocks['block1_1'] 100 | self.model2_1 = blocks['block2_1'] 101 | self.model3_1 = blocks['block3_1'] 102 | self.model4_1 = blocks['block4_1'] 103 | self.model5_1 = blocks['block5_1'] 104 | self.model6_1 = blocks['block6_1'] 105 | 106 | self.model1_2 = blocks['block1_2'] 107 | self.model2_2 = blocks['block2_2'] 108 | self.model3_2 = blocks['block3_2'] 109 | self.model4_2 = blocks['block4_2'] 110 | self.model5_2 = blocks['block5_2'] 111 | self.model6_2 = blocks['block6_2'] 112 | 113 | 114 | def forward(self, x): 115 | 116 | out1 = self.model0(x) 117 | 118 | out1_1 = self.model1_1(out1) 119 | out1_2 = self.model1_2(out1) 120 | out2 = torch.cat([out1_1, out1_2, out1], 1) 121 | 122 | out2_1 = self.model2_1(out2) 123 | out2_2 = self.model2_2(out2) 124 | out3 = torch.cat([out2_1, out2_2, out1], 1) 125 | 126 | out3_1 = self.model3_1(out3) 127 | out3_2 = self.model3_2(out3) 128 | out4 = torch.cat([out3_1, out3_2, out1], 1) 129 | 130 | out4_1 = 
self.model4_1(out4) 131 | out4_2 = self.model4_2(out4) 132 | out5 = torch.cat([out4_1, out4_2, out1], 1) 133 | 134 | out5_1 = self.model5_1(out5) 135 | out5_2 = self.model5_2(out5) 136 | out6 = torch.cat([out5_1, out5_2, out1], 1) 137 | 138 | out6_1 = self.model6_1(out6) 139 | out6_2 = self.model6_2(out6) 140 | 141 | return out6_1, out6_2 142 | 143 | class handpose_model(nn.Module): 144 | def __init__(self): 145 | super(handpose_model, self).__init__() 146 | 147 | # these layers have no relu layer 148 | no_relu_layers = ['conv6_2_CPM', 'Mconv7_stage2', 'Mconv7_stage3',\ 149 | 'Mconv7_stage4', 'Mconv7_stage5', 'Mconv7_stage6'] 150 | # stage 1 151 | block1_0 = OrderedDict([ 152 | ('conv1_1', [3, 64, 3, 1, 1]), 153 | ('conv1_2', [64, 64, 3, 1, 1]), 154 | ('pool1_stage1', [2, 2, 0]), 155 | ('conv2_1', [64, 128, 3, 1, 1]), 156 | ('conv2_2', [128, 128, 3, 1, 1]), 157 | ('pool2_stage1', [2, 2, 0]), 158 | ('conv3_1', [128, 256, 3, 1, 1]), 159 | ('conv3_2', [256, 256, 3, 1, 1]), 160 | ('conv3_3', [256, 256, 3, 1, 1]), 161 | ('conv3_4', [256, 256, 3, 1, 1]), 162 | ('pool3_stage1', [2, 2, 0]), 163 | ('conv4_1', [256, 512, 3, 1, 1]), 164 | ('conv4_2', [512, 512, 3, 1, 1]), 165 | ('conv4_3', [512, 512, 3, 1, 1]), 166 | ('conv4_4', [512, 512, 3, 1, 1]), 167 | ('conv5_1', [512, 512, 3, 1, 1]), 168 | ('conv5_2', [512, 512, 3, 1, 1]), 169 | ('conv5_3_CPM', [512, 128, 3, 1, 1]) 170 | ]) 171 | 172 | block1_1 = OrderedDict([ 173 | ('conv6_1_CPM', [128, 512, 1, 1, 0]), 174 | ('conv6_2_CPM', [512, 22, 1, 1, 0]) 175 | ]) 176 | 177 | blocks = {} 178 | blocks['block1_0'] = block1_0 179 | blocks['block1_1'] = block1_1 180 | 181 | # stage 2-6 182 | for i in range(2, 7): 183 | blocks['block%d' % i] = OrderedDict([ 184 | ('Mconv1_stage%d' % i, [150, 128, 7, 1, 3]), 185 | ('Mconv2_stage%d' % i, [128, 128, 7, 1, 3]), 186 | ('Mconv3_stage%d' % i, [128, 128, 7, 1, 3]), 187 | ('Mconv4_stage%d' % i, [128, 128, 7, 1, 3]), 188 | ('Mconv5_stage%d' % i, [128, 128, 7, 1, 3]), 189 | ('Mconv6_stage%d' % i, [128, 128, 1, 1, 0]), 190 | ('Mconv7_stage%d' % i, [128, 22, 1, 1, 0]) 191 | ]) 192 | 193 | for k in blocks.keys(): 194 | blocks[k] = make_layers(blocks[k], no_relu_layers) 195 | 196 | self.model1_0 = blocks['block1_0'] 197 | self.model1_1 = blocks['block1_1'] 198 | self.model2 = blocks['block2'] 199 | self.model3 = blocks['block3'] 200 | self.model4 = blocks['block4'] 201 | self.model5 = blocks['block5'] 202 | self.model6 = blocks['block6'] 203 | 204 | def forward(self, x): 205 | out1_0 = self.model1_0(x) 206 | out1_1 = self.model1_1(out1_0) 207 | concat_stage2 = torch.cat([out1_1, out1_0], 1) 208 | out_stage2 = self.model2(concat_stage2) 209 | concat_stage3 = torch.cat([out_stage2, out1_0], 1) 210 | out_stage3 = self.model3(concat_stage3) 211 | concat_stage4 = torch.cat([out_stage3, out1_0], 1) 212 | out_stage4 = self.model4(concat_stage4) 213 | concat_stage5 = torch.cat([out_stage4, out1_0], 1) 214 | out_stage5 = self.model5(concat_stage5) 215 | concat_stage6 = torch.cat([out_stage5, out1_0], 1) 216 | out_stage6 = self.model6(concat_stage6) 217 | return out_stage6 218 | 219 | 220 | -------------------------------------------------------------------------------- /src/util.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import math 3 | import cv2 4 | import matplotlib 5 | from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas 6 | from matplotlib.figure import Figure 7 | import numpy as np 8 | import matplotlib.pyplot as plt 9 | import cv2 10 | 
11 | 12 | def padRightDownCorner(img, stride, padValue): 13 | h = img.shape[0] 14 | w = img.shape[1] 15 | 16 | pad = 4 * [None] 17 | pad[0] = 0 # up 18 | pad[1] = 0 # left 19 | pad[2] = 0 if (h % stride == 0) else stride - (h % stride) # down 20 | pad[3] = 0 if (w % stride == 0) else stride - (w % stride) # right 21 | 22 | img_padded = img 23 | pad_up = np.tile(img_padded[0:1, :, :]*0 + padValue, (pad[0], 1, 1)) 24 | img_padded = np.concatenate((pad_up, img_padded), axis=0) 25 | pad_left = np.tile(img_padded[:, 0:1, :]*0 + padValue, (1, pad[1], 1)) 26 | img_padded = np.concatenate((pad_left, img_padded), axis=1) 27 | pad_down = np.tile(img_padded[-2:-1, :, :]*0 + padValue, (pad[2], 1, 1)) 28 | img_padded = np.concatenate((img_padded, pad_down), axis=0) 29 | pad_right = np.tile(img_padded[:, -2:-1, :]*0 + padValue, (1, pad[3], 1)) 30 | img_padded = np.concatenate((img_padded, pad_right), axis=1) 31 | 32 | return img_padded, pad 33 | 34 | # transfer caffe model to pytorch which will match the layer name 35 | def transfer(model, model_weights): 36 | transfered_model_weights = {} 37 | for weights_name in model.state_dict().keys(): 38 | transfered_model_weights[weights_name] = model_weights['.'.join(weights_name.split('.')[1:])] 39 | return transfered_model_weights 40 | 41 | # draw the body keypoint and lims 42 | def draw_bodypose(canvas, candidate, subset): 43 | stickwidth = 4 44 | limbSeq = [[2, 3], [2, 6], [3, 4], [4, 5], [6, 7], [7, 8], [2, 9], [9, 10], \ 45 | [10, 11], [2, 12], [12, 13], [13, 14], [2, 1], [1, 15], [15, 17], \ 46 | [1, 16], [16, 18], [3, 17], [6, 18]] 47 | 48 | colors = [[255, 0, 0], [255, 85, 0], [255, 170, 0], [255, 255, 0], [170, 255, 0], [85, 255, 0], [0, 255, 0], \ 49 | [0, 255, 85], [0, 255, 170], [0, 255, 255], [0, 170, 255], [0, 85, 255], [0, 0, 255], [85, 0, 255], \ 50 | [170, 0, 255], [255, 0, 255], [255, 0, 170], [255, 0, 85]] 51 | for i in range(18): 52 | for n in range(len(subset)): 53 | index = int(subset[n][i]) 54 | if index == -1: 55 | continue 56 | x, y = candidate[index][0:2] 57 | cv2.circle(canvas, (int(x), int(y)), 4, colors[i], thickness=-1) 58 | for i in range(17): 59 | for n in range(len(subset)): 60 | index = subset[n][np.array(limbSeq[i]) - 1] 61 | if -1 in index: 62 | continue 63 | cur_canvas = canvas.copy() 64 | Y = candidate[index.astype(int), 0] 65 | X = candidate[index.astype(int), 1] 66 | mX = np.mean(X) 67 | mY = np.mean(Y) 68 | length = ((X[0] - X[1]) ** 2 + (Y[0] - Y[1]) ** 2) ** 0.5 69 | angle = math.degrees(math.atan2(X[0] - X[1], Y[0] - Y[1])) 70 | polygon = cv2.ellipse2Poly((int(mY), int(mX)), (int(length / 2), stickwidth), int(angle), 0, 360, 1) 71 | cv2.fillConvexPoly(cur_canvas, polygon, colors[i]) 72 | canvas = cv2.addWeighted(canvas, 0.4, cur_canvas, 0.6, 0) 73 | # plt.imsave("preview.jpg", canvas[:, :, [2, 1, 0]]) 74 | # plt.imshow(canvas[:, :, [2, 1, 0]]) 75 | return canvas 76 | 77 | def draw_handpose(canvas, all_hand_peaks, show_number=False): 78 | edges = [[0, 1], [1, 2], [2, 3], [3, 4], [0, 5], [5, 6], [6, 7], [7, 8], [0, 9], [9, 10], \ 79 | [10, 11], [11, 12], [0, 13], [13, 14], [14, 15], [15, 16], [0, 17], [17, 18], [18, 19], [19, 20]] 80 | fig = Figure(figsize=plt.figaspect(canvas)) 81 | 82 | fig.subplots_adjust(0, 0, 1, 1) 83 | fig.subplots_adjust(bottom=0, top=1, left=0, right=1) 84 | bg = FigureCanvas(fig) 85 | ax = fig.subplots() 86 | ax.axis('off') 87 | ax.imshow(canvas) 88 | 89 | width, height = ax.figure.get_size_inches() * ax.figure.get_dpi() 90 | 91 | for peaks in all_hand_peaks: 92 | for ie, e in enumerate(edges): 
93 | if np.sum(np.all(peaks[e], axis=1)==0)==0: 94 | x1, y1 = peaks[e[0]] 95 | x2, y2 = peaks[e[1]] 96 | ax.plot([x1, x2], [y1, y2], color=matplotlib.colors.hsv_to_rgb([ie/float(len(edges)), 1.0, 1.0])) 97 | 98 | for i, keypoint in enumerate(peaks): 99 | x, y = keypoint 100 | ax.plot(x, y, 'r.') 101 | if show_number: 102 | ax.text(x, y, str(i)) 103 | bg.draw() 104 | canvas = np.fromstring(bg.tostring_rgb(), dtype='uint8').reshape(int(height), int(width), 3) 105 | return canvas 106 | 107 | # drawing with opencv looks worse than the matplotlib version above. 108 | def draw_handpose_by_opencv(canvas, peaks, show_number=False): 109 | edges = [[0, 1], [1, 2], [2, 3], [3, 4], [0, 5], [5, 6], [6, 7], [7, 8], [0, 9], [9, 10], \ 110 | [10, 11], [11, 12], [0, 13], [13, 14], [14, 15], [15, 16], [0, 17], [17, 18], [18, 19], [19, 20]] 111 | # cv2.rectangle(canvas, (x, y), (x+w, y+w), (0, 255, 0), 2, lineType=cv2.LINE_AA) 112 | # cv2.putText(canvas, 'left' if is_left else 'right', (x, y), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2) 113 | for ie, e in enumerate(edges): 114 | if np.sum(np.all(peaks[e], axis=1)==0)==0: 115 | x1, y1 = peaks[e[0]] 116 | x2, y2 = peaks[e[1]] 117 | cv2.line(canvas, (x1, y1), (x2, y2), matplotlib.colors.hsv_to_rgb([ie/float(len(edges)), 1.0, 1.0])*255, thickness=2) 118 | 119 | for i, keypoint in enumerate(peaks): 120 | x, y = keypoint 121 | cv2.circle(canvas, (x, y), 4, (0, 0, 255), thickness=-1) 122 | if show_number: 123 | cv2.putText(canvas, str(i), (x, y), cv2.FONT_HERSHEY_SIMPLEX, 0.3, (0, 0, 0), lineType=cv2.LINE_AA) 124 | return canvas 125 | 126 | # detect hand regions from the body pose keypoints 127 | # please refer to https://github.com/CMU-Perceptual-Computing-Lab/openpose/blob/master/src/openpose/hand/handDetector.cpp 128 | def handDetect(candidate, subset, oriImg): 129 | # right hand: wrist 4, elbow 3, shoulder 2 130 | # left hand: wrist 7, elbow 6, shoulder 5 131 | ratioWristElbow = 0.33 132 | detect_result = [] 133 | image_height, image_width = oriImg.shape[0:2] 134 | for person in subset.astype(int): 135 | # a hand is only usable if its shoulder, elbow and wrist are all detected 136 | has_left = np.sum(person[[5, 6, 7]] == -1) == 0 137 | has_right = np.sum(person[[2, 3, 4]] == -1) == 0 138 | if not (has_left or has_right): 139 | continue 140 | hands = [] 141 | # left hand 142 | if has_left: 143 | left_shoulder_index, left_elbow_index, left_wrist_index = person[[5, 6, 7]] 144 | x1, y1 = candidate[left_shoulder_index][:2] 145 | x2, y2 = candidate[left_elbow_index][:2] 146 | x3, y3 = candidate[left_wrist_index][:2] 147 | hands.append([x1, y1, x2, y2, x3, y3, True]) 148 | # right hand 149 | if has_right: 150 | right_shoulder_index, right_elbow_index, right_wrist_index = person[[2, 3, 4]] 151 | x1, y1 = candidate[right_shoulder_index][:2] 152 | x2, y2 = candidate[right_elbow_index][:2] 153 | x3, y3 = candidate[right_wrist_index][:2] 154 | hands.append([x1, y1, x2, y2, x3, y3, False]) 155 | 156 | for x1, y1, x2, y2, x3, y3, is_left in hands: 157 | # pos_hand = pos_wrist + ratio * (pos_wrist - pos_elbow) = (1 + ratio) * pos_wrist - ratio * pos_elbow 158 | # handRectangle.x = posePtr[wrist*3] + ratioWristElbow * (posePtr[wrist*3] - posePtr[elbow*3]); 159 | # handRectangle.y = posePtr[wrist*3+1] + ratioWristElbow * (posePtr[wrist*3+1] - posePtr[elbow*3+1]); 160 | # const auto distanceWristElbow = getDistance(poseKeypoints, person, wrist, elbow); 161 | # const auto distanceElbowShoulder = getDistance(poseKeypoints, person, elbow, shoulder); 162 | # handRectangle.width = 1.5f * fastMax(distanceWristElbow, 0.9f * distanceElbowShoulder); 163 | x = x3 + 
ratioWristElbow * (x3 - x2) 164 | y = y3 + ratioWristElbow * (y3 - y2) 165 | distanceWristElbow = math.sqrt((x3 - x2) ** 2 + (y3 - y2) ** 2) 166 | distanceElbowShoulder = math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2) 167 | width = 1.5 * max(distanceWristElbow, 0.9 * distanceElbowShoulder) 168 | # x-y refers to the center --> offset to topLeft point 169 | # handRectangle.x -= handRectangle.width / 2.f; 170 | # handRectangle.y -= handRectangle.height / 2.f; 171 | x -= width / 2 172 | y -= width / 2 # width = height 173 | # clip the box so it stays inside the image 174 | if x < 0: x = 0 175 | if y < 0: y = 0 176 | width1 = width 177 | width2 = width 178 | if x + width > image_width: width1 = image_width - x 179 | if y + width > image_height: width2 = image_height - y 180 | width = min(width1, width2) 181 | # discard hand boxes smaller than 20 pixels 182 | if width >= 20: 183 | detect_result.append([int(x), int(y), int(width), is_left]) 184 | 185 | ''' 186 | return value: [[x, y, w, True if left hand else False]]. 187 | width = height since the network requires a square input. 188 | (x, y) is the coordinate of the top-left corner. 189 | ''' 190 | return detect_result 191 | 192 | # get the (row, column) index of the maximum of a 2d array 193 | def npmax(array): 194 | arrayindex = array.argmax(1) 195 | arrayvalue = array.max(1) 196 | i = arrayvalue.argmax() 197 | j = arrayindex[i] 198 | return i, j 199 | --------------------------------------------------------------------------------
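Following the source listing, a minimal sketch of how `util.handDetect` turns body keypoints into a square hand crop. This example is not part of the repository: the `candidate`/`subset` arrays below are hand-built stand-ins for the output of `src.body.Body`, so treat their exact layout as an assumption; only the right shoulder/elbow/wrist entries that `handDetect` actually reads are filled in.

```
import numpy as np
from src import util

# 18 body keypoints as rows of [x, y, score, id]; only the right arm is filled in
candidate = np.zeros((18, 4))
candidate[:, 3] = np.arange(18)
candidate[2, :2] = (100, 200)   # right shoulder
candidate[3, :2] = (200, 200)   # right elbow
candidate[4, :2] = (300, 200)   # right wrist

# one detected person: entry i holds the candidate index of body part i, -1 if missing
# (a real subset row also carries a score and a part count in its last two entries)
person = np.full(20, -1.0)
person[[2, 3, 4]] = [2, 3, 4]
subset = person[None, :]

dummy_img = np.zeros((480, 640, 3), dtype=np.uint8)
print(util.handDetect(candidate, subset, dummy_img))
# center   = wrist + 0.33 * (wrist - elbow)                       = (333, 200)
# width    = 1.5 * max(|wrist - elbow|, 0.9 * |elbow - shoulder|) = 150
# top-left = center - width / 2                                   -> [[258, 125, 150, False]]
```

The numbers in the trailing comments follow directly from the OpenPose formulas quoted in the comments inside `handDetect`; the returned `[x, y, w, is_left]` box is square because, as the function's docstring notes, the hand network requires a square input.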