├── .gitignore ├── DATASET.md ├── README.md ├── configs ├── all_linear │ ├── mpii_heatmap.json │ ├── pushup_heatmap.json │ └── pushup_regression.json ├── mpii │ ├── config_blazepose_mpii_heatmap_bce.json │ ├── config_blazepose_mpii_heatmap_bce_regress_huber.json │ ├── config_blazepose_mpii_pushup_heatmap_bce.json │ ├── config_blazepose_mpii_pushup_heatmap_bce_regress_huber.json │ └── pushup_2head.json └── pushup_recognition.json ├── convert_to_onnx.py ├── images └── blazepose_full.png ├── requirements.txt ├── run_video.py ├── src ├── __init__.py ├── data_loaders │ ├── __init__.py │ ├── augmentation.py │ ├── augmentation2.py │ ├── augmentation_utils.py │ ├── humanpose.py │ ├── humanpose_2head.py │ └── pushup_recognition.py ├── metrics │ ├── f1.py │ ├── mae.py │ └── pck.py ├── models │ ├── __init__.py │ ├── blazepose_all_linear.py │ ├── blazepose_full.py │ ├── blazepose_layers.py │ ├── blazepose_legacy.py │ ├── blazepose_with_pushup_classify.py │ └── pushup_recognition.py ├── train_phase.py ├── trainers │ ├── __init__.py │ ├── blazepose_trainer.py │ ├── losses.py │ └── pushup_recognition_trainer.py └── utils │ ├── __init__.py │ ├── heatmap.py │ ├── keypoints.py │ ├── pre_processing.py │ └── visualizer.py ├── test.py ├── tools ├── lsp_data_to_json.py ├── merge_lsp.py ├── merge_lsp_lspet_pushup.py ├── merge_mpii_pushup.py ├── process_data.py ├── process_pushup_data.py ├── split_data_mpii.py ├── split_lsp_lspet.py ├── split_lsp_lspet_7points.py ├── split_mpii.py └── visualize_data.py └── train.py /.gitignore: -------------------------------------------------------------------------------- 1 | *.pyc 2 | __pycache__ 3 | /.vscode 4 | /experiments* 5 | !/experiments/.gitkeep 6 | /data 7 | !/data/.gitkeep 8 | /.vscode 9 | /trained_models 10 | !/trained_models/.gitkeep -------------------------------------------------------------------------------- /DATASET.md: -------------------------------------------------------------------------------- 1 | # DATASET 2 | 3 | ### I. Our custom dataset format 4 | 5 | All data should be converted to our custom dataset format before being used for training. Our format has this folder structure: 6 | ``` 7 | dataset_name/ 8 | images/ 9 | train.json 10 | val.json 11 | test.json 12 | ``` 13 | 14 | - `images` is a folder containing image files. 15 | - `train.json`, `val.json`, `test.json` are annotation files. Here are an example of labels in these files: 16 | 17 | ``` 18 | [ 19 | { 20 | "image": "001.png", 21 | "points": [[280, 540], [315, 468], [356, 354], [354, 243], [471, 331], [514, 440], [546, 540]], 22 | "visibility": [1, 1, 1, 1, 0, 0, 1] 23 | } 24 | { 25 | "image": "002.png", 26 | "points": [[269, 529], [289, 465], [305, 410], [310, 309], [455, 358], [542, 429], [560, 542]], 27 | "visibility": [1, 0, 0, 1, 1, 1, 1] 28 | }, 29 | ... 30 | ] 31 | ``` 32 | 33 | ### II. LSP and LSPET 34 | 35 | - Link to LSP dataset: . 36 | - Link to LSPET dataset: . 37 | 38 | #### 1. Convert annotation to JSON format 39 | 40 | - The annotation contains x and y locations and a binary value indicating the visbility of joints. 41 | - Use `tools/lsp_data_to_json.py` to convert LSP and LSPET annotation files to json format: 42 | - **NOTE:** We removed 6061 images from LSPET dataset due to missing points. 
43 | 44 | ``` 45 | python tools/lsp_data_to_json.py --image_folder=data/lsp_dataset/images --input_file data/lsp_dataset/joints.mat --output_file data/lsp_dataset/labels.json 46 | python tools/lsp_data_to_json.py --image_folder=data/lspet_dataset/images --input_file data/lspet_dataset/joints.mat --output_file data/lspet_dataset/labels.json 47 | ``` 48 | 49 | #### 2. Merge 2 dataset and divide into subsets 50 | 51 | + Training: 3739 from LSPET and 1800 from LSP. 52 | + Validation: 100 from LSPET and 100 from LSP. 53 | + Test: 100 from LSPET and 100 from LSP. 54 | 55 | Please update paths to LSP and LSPET in `tools/split_lsp_lspet.py` and run: 56 | 57 | ``` 58 | python tools/split_lsp_lspet.py 59 | ``` 60 | 61 | 62 | ### III. MPII Humanpose 63 | 64 | - We only use images with numOtherPeople = 0. The original dataset are divided into 3 subsets: 65 | 66 | + Training: 9503 images. 67 | + Validation: 1000 images. 68 | + Test: 1000 images. 69 | 70 | 71 | ### IV. PushUp dataset 72 | 73 | We have push-up 420 videos, divided in 3 sets: 74 | 75 | + Training: 8837 images from 317 videos. 76 | + Validation: 1189 images from 41 videos. 77 | + Test: 1013 images from 62 videos. -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # BlazePose Tensorflow 2.x 2 | 3 | This is an implementation of Google BlazePose in Tensorflow 2.x. The original paper is "BlazePose: On-device Real-time Body Pose tracking" by Valentin Bazarevsky, Ivan Grishchenko, Karthik Raveendran, Tyler Zhu, Fan Zhang, and Matthias Grundmann, which is available on [arXiv](https://arxiv.org/abs/2006.10204). You can find some demonstrations of BlazePose from [Google blog](https://ai.googleblog.com/2020/08/on-device-real-time-body-pose-tracking.html). 4 | 5 | Currently, the model being developed in this repo is based on TFLite (.tflite) model from [here](https://github.com/PINTO0309/PINTO_model_zoo/tree/master/058_BlazePose_Full_Keypoints/01_Accurate). I use [Netron.app](https://netron.app/) to visualize the architecture and try to mimic that architecture in my implementation. The visualized model architecture can be found [here](images/blazepose_full.png). Other architectures will be added in the future. 6 | 7 | **Note:** This repository is still under active development. 8 | 9 | **Update 14/12/2020:** Our PushUp Counter App is using this BlazePose model to count pushups from videos/webcam. [***Read more.***](https://github.com/vietanhdev/pushup-counter-app) 10 | 11 | ## TODOs 12 | 13 | - [ ] Implementation 14 | 15 | - [x] Initialize code for model from .tflite file. 16 | 17 | - [x] Basic dataset loader 18 | 19 | - [x] Implement loss function. 20 | 21 | - [x] Implement training code. 22 | 23 | - [x] Advanced augmentation: Random occlusion (BlazePose paper) 24 | 25 | - [x] Implement demo code for video and webcam. 26 | 27 | - [x] Support PCK metric. 28 | 29 | - [ ] Implement testing code. 30 | 31 | - [ ] Add training graph and pretrained models. 32 | 33 | - [ ] Support offset maps. 34 | 35 | - [ ] Experiment with other loss functions. 36 | 37 | - [ ] Workout counting from keypoints. 38 | 39 | - [ ] Rewrite in eager mode. 40 | 41 | - [ ] Datasets 42 | 43 | - [x] Support LSP dataset and LSPET dataset (partially). [More](DATASET.md). 44 | 45 | - [x] Support PushUps dataset. 46 | 47 | - [x] Support MPII dataset. 48 | 49 | - [ ] Support YOGA-82 dataset. 50 | 51 | - [ ] Custom dataset. 
52 | 53 | - [ ] Convert and run model in TF Lite format. 54 | 55 | - [ ] Convert and run model in TensorRT. 56 | 57 | - [ ] Convert and run model in Tensorflow.js. 58 | 59 | ## Demo 60 | 61 | - Download pretrained model for PushUp dataset [here](https://1drv.ms/u/s!Av71xxzl6mYZgddJ7IdF0wfjwI3sgw?e=l94WL5) and put into `trained_models/blazepose_pushup_v1.h5`. Test with your webcam: 62 | 63 | ``` 64 | python run_video.py -c configs/mpii/config_blazepose_mpii_pushup_heatmap_bce_regress_huber.json -m trained_models/blazepose_pushup_v1.h5 -v webcam --confidence 0.3 65 | ``` 66 | 67 | The pretrained model is only in experimental state now. It only detects 7 keypoints for Push Up counting and it may not produce a good result now. I will update other models in the future. 68 | 69 | ## Training 70 | 71 | **NOTE:** Currently, I only focus on PushUp datase, which contains 7 keypoints. Due to the copyright of this dataset, I don't have permission to publish it on the Internet. You can read the instruction and try with your own dataset. 72 | 73 | - Prepare dataset using instruction from [DATASET.md](DATASET.md). 74 | 75 | - Training heatmap branch: 76 | 77 | ``` 78 | python train.py -c configs/mpii/config_blazepose_mpii_pushup_heatmap_bce.json 79 | ``` 80 | 81 | - After heatmap branch converged, set `load_weights` to `true` and update the `pretrained_weights_path` to the best model, and continue with the regression branch: 82 | 83 | ``` 84 | python train.py -c configs/mpii/config_blazepose_mpii_pushup_heatmap_bce_regress_huber.json 85 | ``` 86 | 87 | ## Reference 88 | 89 | - Cite the original paper: 90 | 91 | ```tex 92 | @article{Bazarevsky2020BlazePoseOR, 93 | title={BlazePose: On-device Real-time Body Pose tracking}, 94 | author={Valentin Bazarevsky and I. Grishchenko and K. Raveendran and Tyler Lixuan Zhu and Fangfang Zhang and M. Grundmann}, 95 | journal={ArXiv}, 96 | year={2020}, 97 | volume={abs/2006.10204} 98 | } 99 | ``` 100 | 101 | This source code uses some code and ideas from these repos: 102 | 103 | - https://fairyonice.github.io/Achieving-top-5-in-Kaggles-facial-keypoints-detection-using-FCN.html 104 | - https://github.com/yuanyuanli85/Stacked_Hourglass_Network_Keras 105 | 106 | ## Contributions 107 | 108 | Please feel free to [submit an issue](https://github.com/vietanhdev/tf-blazepose/issues) or [pull a request](https://github.com/vietanhdev/tf-blazepose/pulls). 
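## Single-image inference

For quick tests without the video loop, the pipeline in `run_video.py` can be reduced to a few lines. The sketch below is a minimal example assembled from the same calls used in `run_video.py`; the config and model paths come from the Demo section above, and `example.jpg` is a placeholder input image:

```
import importlib
import json

import cv2
import numpy as np

from src.utils.heatmap import find_keypoints_from_heatmap

with open("configs/mpii/config_blazepose_mpii_pushup_heatmap_bce_regress_huber.json") as f:
    config = json.load(f)

# Load the trainer and data loader modules named in the config
trainer = importlib.import_module("src.trainers.{}".format(config["trainer"]))
datalib = importlib.import_module("src.data_loaders.{}".format(config["data_loader"]))
model = trainer.load_model(config, "trained_models/blazepose_pushup_v1.h5")

# Preprocess a single image and run the model
image = cv2.imread("example.jpg")
net_input = cv2.resize(image, (config["model"]["im_width"], config["model"]["im_height"]))
net_input = datalib.DataSequence.preprocess_images(np.array([net_input]))
regress_kps, heatmap = model.predict(net_input)

# Decode heatmap keypoints and scale them back to the original image size
heatmap_kps = np.array(find_keypoints_from_heatmap(heatmap)[0], dtype=float)
scale = np.array([image.shape[1] / config["model"]["im_width"],
                  image.shape[0] / config["model"]["im_height"]])
stride = np.array([config["model"]["im_width"] / config["model"]["heatmap_width"],
                   config["model"]["im_height"] / config["model"]["heatmap_height"]])
heatmap_kps[:, :2] = heatmap_kps[:, :2] * stride * scale
print(heatmap_kps)  # one (x, y, confidence) row per keypoint
```

The regression-head output `regress_kps` is normalized to [0, 1]; multiply it by the original image width and height to get pixel coordinates, as `run_video.py` does.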
109 | 110 | -------------------------------------------------------------------------------- /configs/all_linear/mpii_heatmap.json: -------------------------------------------------------------------------------- 1 | { 2 | "experiment_name": "all_linear_mpii_heatmap", 3 | "trainer": "blazepose_trainer", 4 | "data_loader": "humanpose", 5 | "data": { 6 | "train_images": "data/mpii/images/", 7 | "train_labels": "data/mpii/train.json", 8 | "val_images": "data/mpii/images/", 9 | "val_labels": "data/mpii/val.json", 10 | "test_images": "data/mpii/images/", 11 | "test_labels": "data/mpii/test.json", 12 | "symmetry_point_ids": [[12, 13], [10, 15], [11, 14], [2, 3], [1, 4], [0, 5]] 13 | }, 14 | "model" : { 15 | "im_width": 256, 16 | "im_height": 256, 17 | "heatmap_width": 128, 18 | "heatmap_height": 128, 19 | "heatmap_kp_sigma": 4, 20 | "num_keypoints": 16, 21 | "model_type": "ALL_LINEAR_TWO_HEAD" 22 | }, 23 | "train": { 24 | "train_phase": "HEATMAP", 25 | "heatmap_loss": "huber", 26 | "keypoint_loss": "huber", 27 | "loss_weights": {"heatmap": 1.0, "joints": 0.0}, 28 | "train_batch_size": 32, 29 | "val_batch_size": 32, 30 | "nb_epochs": 1000, 31 | "learning_rate": 5e-4, 32 | "load_weights": false, 33 | "pretrained_weights_path": "", 34 | "initial_epoch": 0 35 | }, 36 | "test": { 37 | "pck_ref_points_idxs" : [8, 9], 38 | "pck_thresh": 0.5 39 | } 40 | } 41 | -------------------------------------------------------------------------------- /configs/all_linear/pushup_heatmap.json: -------------------------------------------------------------------------------- 1 | { 2 | "experiment_name": "pushup_heatmap", 3 | "trainer": "blazepose_trainer", 4 | "data_loader": "humanpose", 5 | "data": { 6 | "train_images": "data/mpii_pushup/images/", 7 | "train_labels": "data/mpii_pushup/train.json", 8 | "val_images": "data/mpii_pushup/images/", 9 | "val_labels": "data/mpii_pushup/val.json", 10 | "test_images": "data/mpii_pushup/images/", 11 | "test_labels": "data/mpii_pushup/test.json", 12 | "symmetry_point_ids": [[0,6], [1,5], [2,4]] 13 | }, 14 | "model" : { 15 | "im_width": 256, 16 | "im_height": 256, 17 | "heatmap_width": 128, 18 | "heatmap_height": 128, 19 | "heatmap_kp_sigma": 4, 20 | "num_keypoints": 7, 21 | "model_type": "ALL_LINEAR_TWO_HEAD" 22 | }, 23 | "train": { 24 | "train_phase": "HEATMAP", 25 | "heatmap_loss": "huber", 26 | "keypoint_loss": "huber", 27 | "loss_weights": {"heatmap": 1.0, "joints": 0.0}, 28 | "train_batch_size": 32, 29 | "val_batch_size": 32, 30 | "nb_epochs": 1000, 31 | "learning_rate": 1e-4, 32 | "load_weights": false, 33 | "pretrained_weights_path": "", 34 | "initial_epoch": 0 35 | }, 36 | "test": { 37 | "pck_ref_points_idxs" : [2, 4], 38 | "pck_thresh": 0.25 39 | } 40 | } 41 | -------------------------------------------------------------------------------- /configs/all_linear/pushup_regression.json: -------------------------------------------------------------------------------- 1 | { 2 | "experiment_name": "pushup_regression", 3 | "trainer": "blazepose_trainer", 4 | "data_loader": "humanpose", 5 | "data": { 6 | "train_images": "data/mpii_pushup/images/", 7 | "train_labels": "data/mpii_pushup/train.json", 8 | "val_images": "data/mpii_pushup/images/", 9 | "val_labels": "data/mpii_pushup/val.json", 10 | "test_images": "data/mpii_pushup/images/", 11 | "test_labels": "data/mpii_pushup/test.json", 12 | "symmetry_point_ids": [[0,6], [1,5], [2,4]] 13 | }, 14 | "model" : { 15 | "im_width": 256, 16 | "im_height": 256, 17 | "heatmap_width": 128, 18 | "heatmap_height": 128, 19 | "heatmap_kp_sigma": 
4, 20 | "num_keypoints": 7, 21 | "model_type": "ALL_LINEAR_TWO_HEAD" 22 | }, 23 | "train": { 24 | "train_phase": "REGRESSION", 25 | "heatmap_loss": "huber", 26 | "keypoint_loss": "huber", 27 | "loss_weights": {"heatmap": 0.0, "joints": 1.0}, 28 | "train_batch_size": 32, 29 | "val_batch_size": 32, 30 | "nb_epochs": 1000, 31 | "learning_rate": 1e-4, 32 | "load_weights": false, 33 | "pretrained_weights_path": "", 34 | "initial_epoch": 0 35 | }, 36 | "test": { 37 | "pck_ref_points_idxs" : [2, 4], 38 | "pck_thresh": 0.25 39 | } 40 | } 41 | -------------------------------------------------------------------------------- /configs/mpii/config_blazepose_mpii_heatmap_bce.json: -------------------------------------------------------------------------------- 1 | { 2 | "experiment_name": "blazepose_mpii_heatmap_bce", 3 | "trainer": "blazepose_trainer", 4 | "data_loader": "humanpose", 5 | "data": { 6 | "train_images": "data/mpii/images/", 7 | "train_labels": "data/mpii/train.json", 8 | "val_images": "data/mpii/images/", 9 | "val_labels": "data/mpii/val.json", 10 | "test_images": "data/mpii/images/", 11 | "test_labels": "data/mpii/test.json", 12 | "symmetry_point_ids": [[12, 13], [10, 15], [11, 14], [2, 3], [1, 4], [0, 5]] 13 | }, 14 | "model" : { 15 | "im_width": 256, 16 | "im_height": 256, 17 | "heatmap_width": 128, 18 | "heatmap_height": 128, 19 | "heatmap_kp_sigma": 4, 20 | "num_keypoints": 16, 21 | "model_type": "SIGMOID_HEATMAP_SIGMOID_REGRESS_TWO_HEAD" 22 | }, 23 | "train": { 24 | "train_phase": "HEATMAP", 25 | "heatmap_loss": "binary_crossentropy", 26 | "keypoint_loss": "huber", 27 | "loss_weights": {"heatmap": 1.0, "joints": 0.0}, 28 | "train_batch_size": 16, 29 | "val_batch_size": 16, 30 | "nb_epochs": 1000, 31 | "learning_rate": 1e-4, 32 | "load_weights": false, 33 | "pretrained_weights_path": "", 34 | "initial_epoch": 0 35 | }, 36 | "test": { 37 | "pck_ref_points_idxs" : [8, 9], 38 | "pck_thresh": 0.5 39 | } 40 | } 41 | -------------------------------------------------------------------------------- /configs/mpii/config_blazepose_mpii_heatmap_bce_regress_huber.json: -------------------------------------------------------------------------------- 1 | { 2 | "experiment_name": "blazepose_mpii_heatmap_bce_regress_huber", 3 | "trainer": "blazepose_trainer", 4 | "data_loader": "humanpose", 5 | "data": { 6 | "train_images": "data/mpii/images/", 7 | "train_labels": "data/mpii/train.json", 8 | "val_images": "data/mpii/images/", 9 | "val_labels": "data/mpii/val.json", 10 | "test_images": "data/mpii/images/", 11 | "test_labels": "data/mpii/test.json", 12 | "symmetry_point_ids": [[12, 13], [10, 15], [11, 14], [2, 3], [1, 4], [0, 5]] 13 | }, 14 | "model" : { 15 | "im_width": 256, 16 | "im_height": 256, 17 | "heatmap_width": 128, 18 | "heatmap_height": 128, 19 | "heatmap_kp_sigma": 4, 20 | "num_keypoints": 7, 21 | "model_type": "SIGMOID_HEATMAP_LINEAR_REGRESS_TWO_HEAD" 22 | }, 23 | "train": { 24 | "train_phase": "REGRESSION", 25 | "heatmap_loss": "binary_crossentropy", 26 | "keypoint_loss": "huber", 27 | "loss_weights": {"heatmap": 0.0, "joints": 1.0}, 28 | "train_batch_size": 32, 29 | "val_batch_size": 32, 30 | "nb_epochs": 1000, 31 | "learning_rate": 1e-3, 32 | "load_weights": false, 33 | "pretrained_weights_path": "", 34 | "initial_epoch": 0 35 | }, 36 | "test": { 37 | "pck_ref_points_idxs" : [8, 9], 38 | "pck_thresh": 0.5 39 | } 40 | } 41 | -------------------------------------------------------------------------------- /configs/mpii/config_blazepose_mpii_pushup_heatmap_bce.json: 
-------------------------------------------------------------------------------- 1 | { 2 | "experiment_name": "blazepose_mpii_pushup_heatmap_bce", 3 | "trainer": "blazepose_trainer", 4 | "data_loader": "humanpose", 5 | "data": { 6 | "train_images": "data/mpii_pushup/images/", 7 | "train_labels": "data/mpii_pushup/train.json", 8 | "val_images": "data/mpii_pushup/images/", 9 | "val_labels": "data/mpii_pushup/val.json", 10 | "test_images": "data/mpii_pushup/images/", 11 | "test_labels": "data/mpii_pushup/test.json", 12 | "symmetry_point_ids": [[0,6], [1,5], [2,4]] 13 | }, 14 | "model" : { 15 | "im_width": 256, 16 | "im_height": 256, 17 | "heatmap_width": 128, 18 | "heatmap_height": 128, 19 | "heatmap_kp_sigma": 4, 20 | "num_keypoints": 7, 21 | "model_type": "SIGMOID_HEATMAP_LINEAR_REGRESS_TWO_HEAD" 22 | }, 23 | "train": { 24 | "train_phase": "HEATMAP", 25 | "heatmap_loss": "binary_crossentropy", 26 | "keypoint_loss": "huber", 27 | "loss_weights": {"heatmap": 1.0, "joints": 0.0}, 28 | "train_batch_size": 16, 29 | "val_batch_size": 16, 30 | "nb_epochs": 1000, 31 | "learning_rate": 1e-4, 32 | "load_weights": false, 33 | "pretrained_weights_path": "", 34 | "initial_epoch": 0 35 | }, 36 | "test": { 37 | "pck_ref_points_idxs" : [2, 4], 38 | "pck_thresh": 0.25 39 | } 40 | } 41 | -------------------------------------------------------------------------------- /configs/mpii/config_blazepose_mpii_pushup_heatmap_bce_regress_huber.json: -------------------------------------------------------------------------------- 1 | { 2 | "experiment_name": "config_blazepose_mpii_pushup_heatmap_bce_regress_huber", 3 | "trainer": "blazepose_trainer", 4 | "data_loader": "humanpose", 5 | "data": { 6 | "train_images": "data/mpii_pushup/images/", 7 | "train_labels": "data/mpii_pushup/train.json", 8 | "val_images": "data/mpii_pushup/images/", 9 | "val_labels": "data/mpii_pushup/val.json", 10 | "test_images": "data/mpii_pushup/images/", 11 | "test_labels": "data/mpii_pushup/test.json", 12 | "symmetry_point_ids": [[0,6], [1,5], [2,4]] 13 | }, 14 | "model" : { 15 | "im_width": 256, 16 | "im_height": 256, 17 | "heatmap_width": 128, 18 | "heatmap_height": 128, 19 | "heatmap_kp_sigma": 4, 20 | "num_keypoints": 7, 21 | "model_type": "SIGMOID_HEATMAP_LINEAR_REGRESS_TWO_HEAD" 22 | }, 23 | "train": { 24 | "train_phase": "REGRESSION", 25 | "heatmap_loss": "binary_crossentropy", 26 | "keypoint_loss": "huber", 27 | "loss_weights": {"heatmap": 1.0, "joints": 1.0}, 28 | "train_batch_size": 16, 29 | "val_batch_size": 16, 30 | "nb_epochs": 1000, 31 | "learning_rate": 0, 32 | "load_weights": false, 33 | "pretrained_weights_path": "", 34 | "initial_epoch": 0 35 | }, 36 | "test": { 37 | "pck_ref_points_idxs" : [2, 4], 38 | "pck_thresh": 0.25 39 | } 40 | } 41 | -------------------------------------------------------------------------------- /configs/mpii/pushup_2head.json: -------------------------------------------------------------------------------- 1 | { 2 | "experiment_name": "2head_model", 3 | "trainer": "blazepose_trainer2", 4 | "data_loader": "humanpose_2head", 5 | "data": { 6 | "train_images": "data/mpii_pushup/images/", 7 | "train_labels": "data/mpii_pushup/train2.json", 8 | "val_images": "data/mpii_pushup/images/", 9 | "val_labels": "data/mpii_pushup/val2.json", 10 | "test_images": "data/mpii_pushup/images/", 11 | "test_labels": "data/mpii_pushup/test2.json", 12 | "symmetry_point_ids": [[12, 13], [10, 15], [11, 14], [2, 3], [1, 4], [0, 5]] 13 | }, 14 | "model" : { 15 | "im_width": 256, 16 | "im_height": 256, 17 | "heatmap_width": 
128, 18 | "heatmap_height": 128, 19 | "heatmap_kp_sigma": 4, 20 | "num_keypoints": 16, 21 | "model_type": "BLAZEPOSE_WITH_PUSHUP_CLASSIFY" 22 | }, 23 | "train": { 24 | "train_phase": "HEATMAP", 25 | "heatmap_loss": "focal_tversky", 26 | "is_pushup_loss": "binary_crossentropy", 27 | "loss_weights": {"heatmap": 1.0, "is_pushup": 1.0}, 28 | "train_batch_size": 16, 29 | "val_batch_size": 16, 30 | "nb_epochs": 1000, 31 | "learning_rate": 1e-4, 32 | "load_weights": false, 33 | "pretrained_weights_path": "", 34 | "initial_epoch": 0 35 | }, 36 | "test": { 37 | "pck_ref_points_idxs" : [12, 13], 38 | "pck_thresh": 0.25 39 | } 40 | } 41 | -------------------------------------------------------------------------------- /configs/pushup_recognition.json: -------------------------------------------------------------------------------- 1 | { 2 | "experiment_name": "pushup_recognition2", 3 | "trainer": "pushup_recognition_trainer", 4 | "data_loader": "pushup_recognition", 5 | "data": { 6 | "train_images": "data/mpii_pushup/images/", 7 | "train_labels": "data/mpii_pushup/train.json", 8 | "val_images": "data/mpii_pushup/images/", 9 | "val_labels": "data/mpii_pushup/val.json", 10 | "test_images": "data/mpii_pushup/images/", 11 | "test_labels": "data/mpii_pushup/test.json" 12 | }, 13 | "model" : { 14 | "im_width": 224, 15 | "im_height": 224, 16 | "model_type": "PUSHUP_RECOGNITION" 17 | }, 18 | "train": { 19 | "loss": "binary_crossentropy", 20 | "train_batch_size": 16, 21 | "val_batch_size": 16, 22 | "nb_epochs": 1000, 23 | "learning_rate": 5e-4, 24 | "load_weights": false, 25 | "pretrained_weights_path": "experiments/pushup_recognition/models/model_ep003.h5", 26 | "initial_epoch": 0 27 | }, 28 | "test": {} 29 | } 30 | -------------------------------------------------------------------------------- /convert_to_onnx.py: -------------------------------------------------------------------------------- 1 | 2 | import tensorflow.keras.backend as K 3 | K.set_learning_phase(0) 4 | import tensorflow as tf 5 | import keras2onnx 6 | from tensorflow.keras.models import load_model 7 | 8 | MODEL_PATH = "" 9 | model = load_model(MODEL_PATH) 10 | submodel = tf.keras.models.Model(inputs=model.inputs, outputs=model.get_layer("joints").outputs) 11 | submodel._name = "blazepose_heatmap_v1" 12 | print(submodel.summary()) 13 | onnx_model = keras2onnx.convert_keras(submodel, submodel.name) 14 | 15 | file = open("blazepose_heatmap_v1.1.onnx", "wb") 16 | file.write(onnx_model.SerializeToString()) 17 | file.close() -------------------------------------------------------------------------------- /images/blazepose_full.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vietanhdev/tf-blazepose/95db5ced1a4084c6a424161fbfc38c4e70612c97/images/blazepose_full.png -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vietanhdev/tf-blazepose/95db5ced1a4084c6a424161fbfc38c4e70612c97/requirements.txt -------------------------------------------------------------------------------- /run_video.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import importlib 3 | import json 4 | import cv2 5 | import numpy as np 6 | from src.utils.heatmap import find_keypoints_from_heatmap 7 | from src.utils.visualizer import visualize_keypoints 8 | import tensorflow as tf 9 | 10 | 
for gpu in tf.config.experimental.list_physical_devices('GPU'): 11 | tf.compat.v2.config.experimental.set_memory_growth(gpu, True) 12 | 13 | parser = argparse.ArgumentParser() 14 | parser.add_argument( 15 | '-c', 16 | '--conf_file', default="config.json", 17 | help='Configuration file') 18 | parser.add_argument( 19 | '-m', 20 | '--model', default="model.h5", 21 | help='Path to h5 model') 22 | parser.add_argument( 23 | '-confidence', 24 | '--confidence', 25 | default=0.05, 26 | help='Confidence for heatmap point') 27 | parser.add_argument( 28 | '-v', 29 | '--video', 30 | help='Path to video file') 31 | 32 | args = parser.parse_args() 33 | 34 | # Webcam 35 | if args.video == "webcam": 36 | args.video = 0 37 | 38 | confth = float(args.confidence) 39 | 40 | # Open and load the config json 41 | with open(args.conf_file) as config_buffer: 42 | config = json.loads(config_buffer.read()) 43 | 44 | # Load model 45 | trainer = importlib.import_module("src.trainers.{}".format(config["trainer"])) 46 | model = trainer.load_model(config, args.model) 47 | 48 | # Dataloader 49 | datalib = importlib.import_module("src.data_loaders.{}".format(config["data_loader"])) 50 | DataSequence = datalib.DataSequence 51 | 52 | cap = cv2.VideoCapture(args.video) 53 | cv2.namedWindow("Result", cv2.WINDOW_NORMAL) 54 | while(True): 55 | 56 | ret, origin_frame = cap.read() 57 | 58 | scale = np.array([float(origin_frame.shape[1]) / config["model"]["im_width"], 59 | float(origin_frame.shape[0]) / config["model"]["im_height"]], dtype=float) 60 | 61 | img = cv2.resize(origin_frame, (config["model"]["im_width"], config["model"]["im_height"])) 62 | input_x = DataSequence.preprocess_images(np.array([img])) 63 | 64 | regress_kps, heatmap = model.predict(input_x) 65 | heatmap_kps = find_keypoints_from_heatmap(heatmap)[0] 66 | heatmap_kps = np.array(heatmap_kps) 67 | 68 | # Scale heatmap keypoint 69 | heatmap_stride = np.array([config["model"]["im_width"] / config["model"]["heatmap_width"], 70 | config["model"]["im_height"] / config["model"]["heatmap_height"]], dtype=float) 71 | heatmap_kps[:, :2] = heatmap_kps[:, :2] * scale * heatmap_stride 72 | 73 | # Scale regression keypoint 74 | regress_kps = regress_kps.reshape((-1, 3)) 75 | regress_kps[:, :2] = regress_kps[:, :2] * np.array([origin_frame.shape[1], origin_frame.shape[0]]) 76 | 77 | # Filter heatmap keypoint by confidence 78 | heatmap_kps_visibility = np.ones((len(heatmap_kps),), dtype=int) 79 | for i in range(len(heatmap_kps)): 80 | if heatmap_kps[i, 2] < confth: 81 | heatmap_kps[i, :2] = [-1, -1] 82 | heatmap_kps_visibility[i] = 0 83 | 84 | regress_kps_visibility = np.ones((len(regress_kps),), dtype=int) 85 | for i in range(len(regress_kps)): 86 | if regress_kps[i, 2] < 0.5: 87 | regress_kps[i, :2] = [-1, -1] 88 | regress_kps_visibility[i] = 0 89 | 90 | edges = [[0,1,2,3,4,5,6]] 91 | 92 | draw = origin_frame.copy() 93 | draw = visualize_keypoints(draw, regress_kps[:, :2], visibility=regress_kps_visibility, edges=edges, point_color=(0, 255, 0), text_color=(255, 0, 0)) 94 | draw = visualize_keypoints(draw, heatmap_kps[:, :2], visibility=heatmap_kps_visibility, edges=edges, point_color=(0, 255, 0), text_color=(0, 0, 255)) 95 | cv2.imshow('Result', draw) 96 | 97 | heatmap = np.sum(heatmap[0], axis=2) 98 | heatmap = cv2.resize(heatmap, None, fx=3, fy=3) 99 | heatmap = heatmap * 1.5 100 | cv2.imshow('Heatmap', heatmap) 101 | if cv2.waitKey(1) & 0xFF == ord('q'): 102 | break 103 | 104 | # When everything done, release the capture 105 | cap.release() 106 | cv2.destroyAllWindows() 107 
| -------------------------------------------------------------------------------- /src/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vietanhdev/tf-blazepose/95db5ced1a4084c6a424161fbfc38c4e70612c97/src/__init__.py -------------------------------------------------------------------------------- /src/data_loaders/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vietanhdev/tf-blazepose/95db5ced1a4084c6a424161fbfc38c4e70612c97/src/data_loaders/__init__.py -------------------------------------------------------------------------------- /src/data_loaders/augmentation.py: -------------------------------------------------------------------------------- 1 | import random 2 | 3 | import cv2 4 | import imgaug as ia 5 | import numpy as np 6 | from imgaug import augmenters as iaa 7 | 8 | from .augmentation_utils import add_vertical_reflection 9 | 10 | seq = [None] 11 | 12 | 13 | def load_aug(): 14 | 15 | def sometimes(aug): return iaa.Sometimes(0.2, aug) 16 | 17 | seq[0] = iaa.Sequential( 18 | [ 19 | # crop images by -5% to 10% of their height/width 20 | sometimes(iaa.CropAndPad( 21 | percent=(-0.05, 0.1), 22 | pad_mode=ia.ALL, 23 | pad_cval=(0, 255) 24 | )), 25 | sometimes(iaa.Affine( 26 | scale={"x": (0.9, 1.1), "y": (0.9, 1.1)}, 27 | translate_percent={"x": (-0.05, 0.05), "y": (-0.05, 0.05)}, 28 | rotate=(-10, 10), 29 | shear=(-5, 5), 30 | order=[0, 1], 31 | # if mode is constant, use a cval between 0 and 255 32 | cval=(0, 255), 33 | # use any of scikit-image's warping modes (see 2nd image from the top for examples) 34 | mode=ia.ALL 35 | )), 36 | iaa.Sometimes(0.1, iaa.MotionBlur(k=15, angle=[-45, 45])), 37 | # execute 0 to 5 of the following (less important) augmenters per image 38 | # don't execute all of them, as that would often be way too strong 39 | iaa.SomeOf((0, 5), 40 | [ 41 | iaa.OneOf([ 42 | iaa.GaussianBlur((0, 3.0)), 43 | iaa.AverageBlur(k=(2, 5)), 44 | iaa.MedianBlur(k=(3, 5)), 45 | ]), 46 | iaa.Sharpen(alpha=(0, 1.0), lightness=( 47 | 0.75, 1.5)), # sharpen images 48 | # add gaussian noise to images 49 | iaa.AdditiveGaussianNoise(loc=0, scale=( 50 | 0.0, 0.05*255), per_channel=0.5), 51 | # change brightness of images (by -10 to 10 of original value) 52 | iaa.Add((-10, 10), per_channel=0.5), 53 | # change hue and saturation 54 | iaa.AddToHueAndSaturation((-20, 20)), 55 | # either change the brightness of the whole image (sometimes 56 | # per channel) or change the brightness of subareas 57 | iaa.OneOf([ 58 | iaa.Multiply((0.5, 1.5), per_channel=0.5), 59 | iaa.FrequencyNoiseAlpha( 60 | exponent=(-4, 0), 61 | first=iaa.Multiply((0.5, 1.5), per_channel=True), 62 | second=iaa.LinearContrast((0.5, 2.0)) 63 | ) 64 | ]), 65 | # improve or worsen the contrast 66 | iaa.LinearContrast((0.5, 2.0), per_channel=0.5), 67 | iaa.Grayscale(alpha=(0.0, 1.0)), 68 | ], 69 | random_order=True 70 | ) 71 | ], 72 | random_order=True 73 | ) 74 | 75 | 76 | def augment_img(image, landmark=None): 77 | if seq[0] is None: 78 | load_aug() 79 | 80 | if landmark is None: 81 | image_aug = seq[0](images=np.array([image])) 82 | return image_aug[0] 83 | else: 84 | 85 | landmark_xy = landmark[:, :2] 86 | image_aug, landmark_xy = seq[0](images=np.array( 87 | [image]), keypoints=np.array([landmark_xy])) 88 | image_aug = image_aug[0] 89 | landmark_xy = landmark_xy[0] 90 | 91 | # Simulate reflection 92 | if random.random() < 0.1: 93 | image_aug = 
add_vertical_reflection(image_aug, landmark_xy) 94 | 95 | landmark[:, :2] = landmark_xy 96 | # draw = image_aug.copy() 97 | # for i in range(landmark.shape[0]): 98 | # draw = cv2.circle(draw, (int(landmark[i][0]), int(landmark[i][1])), 2, (0,255,0), 2) 99 | # cv2.imshow("draw", draw) 100 | # cv2.waitKey(0) 101 | return image_aug, landmark 102 | -------------------------------------------------------------------------------- /src/data_loaders/augmentation2.py: -------------------------------------------------------------------------------- 1 | import random 2 | 3 | import cv2 4 | import imgaug as ia 5 | import numpy as np 6 | from imgaug import augmenters as iaa 7 | 8 | from .augmentation_utils import add_vertical_reflection 9 | 10 | seq = [None] 11 | 12 | 13 | def load_aug(): 14 | 15 | def sometimes(aug): return iaa.Sometimes(0.2, aug) 16 | 17 | seq[0] = iaa.Sequential( 18 | [ 19 | iaa.Crop(percent=(0, 0.3)), # random crops 20 | iaa.Fliplr(0.5), 21 | # crop images by -5% to 10% of their height/width 22 | sometimes(iaa.CropAndPad( 23 | percent=(-0.2, 0.3), 24 | pad_mode=ia.ALL, 25 | pad_cval=(0, 255) 26 | )), 27 | sometimes(iaa.Affine( 28 | scale={"x": (0.7, 1.3), "y": (0.7, 1.3)}, 29 | translate_percent={"x": (-0.2, 0.2), "y": (-0.2, 0.2)}, 30 | rotate=(-5, 5), 31 | shear=(-5, 5), 32 | order=[0, 1], 33 | # if mode is constant, use a cval between 0 and 255 34 | cval=(0, 255), 35 | # use any of scikit-image's warping modes (see 2nd image from the top for examples) 36 | mode=ia.ALL 37 | )), 38 | iaa.Sometimes(0.1, iaa.MotionBlur(k=15, angle=[-45, 45])), 39 | # execute 0 to 5 of the following (less important) augmenters per image 40 | # don't execute all of them, as that would often be way too strong 41 | iaa.SomeOf((0, 5), 42 | [ 43 | iaa.OneOf([ 44 | iaa.GaussianBlur((0, 3.0)), 45 | iaa.AverageBlur(k=(2, 5)), 46 | iaa.MedianBlur(k=(3, 5)), 47 | ]), 48 | iaa.Sharpen(alpha=(0, 1.0), lightness=( 49 | 0.75, 1.5)), # sharpen images 50 | # add gaussian noise to images 51 | iaa.AdditiveGaussianNoise(loc=0, scale=( 52 | 0.0, 0.05*255), per_channel=0.5), 53 | # change brightness of images (by -10 to 10 of original value) 54 | iaa.Add((-10, 10), per_channel=0.5), 55 | # change hue and saturation 56 | iaa.AddToHueAndSaturation((-20, 20)), 57 | # either change the brightness of the whole image (sometimes 58 | # per channel) or change the brightness of subareas 59 | iaa.OneOf([ 60 | iaa.Multiply((0.5, 1.5), per_channel=0.5), 61 | iaa.FrequencyNoiseAlpha( 62 | exponent=(-4, 0), 63 | first=iaa.Multiply((0.5, 1.5), per_channel=True), 64 | second=iaa.LinearContrast((0.5, 2.0)) 65 | ) 66 | ]), 67 | # improve or worsen the contrast 68 | iaa.LinearContrast((0.3, 3.0), per_channel=0.5), 69 | iaa.Grayscale(alpha=(0.0, 1.0)), 70 | iaa.Multiply((0.333, 3), per_channel=0.5), 71 | ], 72 | random_order=True 73 | ) 74 | ], 75 | random_order=True 76 | ) 77 | 78 | 79 | def crop(image): 80 | # print(image.shape) 81 | old_size = (image.shape[1], image.shape[0]) 82 | im_height = image.shape[0] 83 | max_y = random.randint(int(0.6 * im_height), int(0.9 * im_height)) 84 | # print(max_y) 85 | 86 | image = image[0:max_y, :, :].copy() 87 | image = cv2.resize(image, (old_size[0], old_size[1])) 88 | # print(image.shape) 89 | return image 90 | 91 | def crop0(image): 92 | # print(image.shape) 93 | old_size = (image.shape[1], image.shape[0]) 94 | im_height = image.shape[0] 95 | max_y = random.randint(int(0.0 * im_height), int(0.5 * im_height)) 96 | # print(max_y) 97 | 98 | image = image[max_y:, :, :].copy() 99 | image = 
cv2.resize(image, (old_size[0], old_size[1])) 100 | # print(image.shape) 101 | return image 102 | 103 | def crop2(image): 104 | # print(image.shape) 105 | old_size = (image.shape[1], image.shape[0]) 106 | im_width = image.shape[1] 107 | max_x = random.randint(int(0.0 * im_width), int(0.3 * im_width)) 108 | # print(max_y) 109 | 110 | image = image[:, max_x:, :].copy() 111 | image = cv2.resize(image, (old_size[0], old_size[1])) 112 | # print(image.shape) 113 | return image 114 | 115 | def crop3(image): 116 | # print(image.shape) 117 | old_size = (image.shape[1], image.shape[0]) 118 | im_width = image.shape[1] 119 | max_x = random.randint(int(0.7 * im_width), int(0.95 * im_width)) 120 | # print(max_y) 121 | 122 | image = image[:, :max_x, :].copy() 123 | image = cv2.resize(image, (old_size[0], old_size[1])) 124 | # print(image.shape) 125 | return image 126 | 127 | def augment_img(image, y, landmark=None): 128 | if seq[0] is None: 129 | load_aug() 130 | 131 | if random.random() < 0.5 and y: 132 | if random.random() < 0.5: 133 | image = crop(image) 134 | else: 135 | image = crop0(image) 136 | 137 | image = crop2(image) 138 | image = crop3(image) 139 | 140 | if landmark is None: 141 | image_aug = seq[0](images=np.array([image])) 142 | return image_aug[0] 143 | else: 144 | image_aug, landmark = seq[0](images=np.array( 145 | [image]), keypoints=np.array([landmark])) 146 | image_aug = image_aug[0] 147 | landmark = landmark[0] 148 | 149 | # Simulate reflection 150 | if random.random() < 0.1: 151 | image_aug = add_vertical_reflection(image_aug, landmark) 152 | 153 | # draw = image_aug.copy() 154 | # for i in range(landmark.shape[0]): 155 | # draw = cv2.circle(draw, (int(landmark[i][0]), int(landmark[i][1])), 2, (0,255,0), 2) 156 | # cv2.imshow("draw", draw) 157 | # cv2.waitKey(0) 158 | return image_aug, landmark 159 | -------------------------------------------------------------------------------- /src/data_loaders/augmentation_utils.py: -------------------------------------------------------------------------------- 1 | import random 2 | import numpy as np 3 | import cv2 4 | 5 | 6 | def add_vertical_reflection(image, keypoints, min_height=0.1): 7 | """Add vertical reflection 8 | 9 | Args: 10 | image: Input image 11 | keypoints: Keypoints 12 | min_height [int]: Min height ratio of reflection (over image height) 13 | 14 | Return: 15 | Augmented image 16 | """ 17 | 18 | im_height = image.shape[0] 19 | max_y = np.max(np.array(keypoints)[:, 1]) 20 | reflection_height = min(im_height - max_y - 1, max_y) 21 | 22 | if reflection_height < min_height * im_height: 23 | return image 24 | 25 | alpha = random.uniform(0.5, 0.9) 26 | beta = (1.0 - alpha) 27 | image[max_y:max_y+reflection_height, :, :] = cv2.addWeighted(image[max_y:max_y+reflection_height, :, :], 28 | alpha, 29 | cv2.flip( 30 | image[max_y-reflection_height:max_y, :, :], 0), 31 | beta, 0.0) 32 | 33 | return image 34 | 35 | 36 | def random_occlusion(image, keypoints, visibility=None, rect_ratio=None, rect_color="random"): 37 | """Generate random rectangle to occlude points 38 | From BlazePose paper: "To support the prediction of invisible points, we simulate occlusions (random 39 | rectangles filled with various colors) during training and introduce a per-point 40 | visibility classifier that indicates whether a particular point is occluded and 41 | if the position prediction is deemed inaccurate." 42 | 43 | Args: 44 | image: Input image 45 | keypoints: Keypoints in format [[x1, y1], [x2, y2], ...] 
46 | visibility [list]: List of visibilities of keypoints. 0: occluded by rectangle, 1: visible 47 | rect_ratio: Rect ratio wrt image width and height. Format ((min_width, max_width), (min_height, max_height)) 48 | Example: ((0.2, 0.5), (0.2, 0.5)) 49 | rect_color: Scalar indicating color to fill in the rectangle 50 | 51 | Return: 52 | image: Generated image 53 | visibility [list]: List of visibilities of keypoints. 0: occluded by rectangle, 1: visible 54 | """ 55 | 56 | if rect_ratio is None: 57 | rect_ratio = ((0.2, 0.5), (0.2, 0.5)) 58 | 59 | im_height, im_width = image.shape[:2] 60 | rect_width = int(im_width * random.uniform(*rect_ratio[0])) 61 | rect_height = int(im_height * random.uniform(*rect_ratio[1])) 62 | rect_x = random.randint(0, im_width - rect_width) 63 | rect_y = random.randint(0, im_height - rect_height) 64 | 65 | gen_image = image.copy() 66 | if rect_color == "random": 67 | rect_color = (random.randint(0, 255), random.randint( 68 | 0, 255), random.randint(0, 255)) 69 | gen_image = cv2.rectangle(gen_image, (rect_x, rect_y), 70 | (rect_x + rect_width, rect_y + rect_height), rect_color, -1) 71 | 72 | if visibility is None: 73 | visibility = [1] * len(keypoints) 74 | for i in range(len(visibility)): 75 | if rect_x < keypoints[i][0] and keypoints[i][0] < rect_x + rect_width \ 76 | and rect_y < keypoints[i][1] and keypoints[i][1] < rect_y + rect_height: 77 | visibility[i] = 0 78 | 79 | return gen_image, visibility 80 | -------------------------------------------------------------------------------- /src/data_loaders/humanpose.py: -------------------------------------------------------------------------------- 1 | import json 2 | import math 3 | import os 4 | import random 5 | 6 | import cv2 7 | import numpy as np 8 | from tensorflow.keras.utils import Sequence 9 | 10 | from ..utils.heatmap import gen_gt_heatmap 11 | from ..utils.keypoints import normalize_landmark 12 | from ..utils.pre_processing import square_crop_with_keypoints 13 | from ..utils.visualizer import visualize_keypoints 14 | from .augmentation import augment_img 15 | from .augmentation_utils import random_occlusion 16 | 17 | 18 | class DataSequence(Sequence): 19 | 20 | def __init__(self, image_folder, label_file, batch_size=8, input_size=(256, 256), output_heatmap=True, heatmap_size=(128, 128), heatmap_sigma=4, n_points=16, shuffle=True, augment=False, random_flip=False, random_rotate=False, random_scale_on_crop=False, clip_landmark=False, symmetry_point_ids=None): 21 | 22 | self.batch_size = batch_size 23 | self.input_size = input_size 24 | self.output_heatmap = output_heatmap 25 | self.heatmap_size = heatmap_size 26 | self.heatmap_sigma = heatmap_sigma 27 | self.image_folder = image_folder 28 | self.random_flip = random_flip 29 | self.random_rotate = random_rotate 30 | self.random_scale_on_crop = random_scale_on_crop 31 | self.augment = augment 32 | self.n_points = n_points 33 | self.symmetry_point_ids = symmetry_point_ids 34 | self.clip_landmark = clip_landmark # Clip value of landmark to range [0, 1] 35 | 36 | with open(label_file, "r") as fp: 37 | self.anno = json.load(fp) 38 | 39 | if shuffle: 40 | random.shuffle(self.anno) 41 | 42 | def __len__(self): 43 | """ 44 | Number of batch in the Sequence. 45 | :return: The number of batches in the Sequence. 46 | """ 47 | return math.ceil(len(self.anno) / float(self.batch_size)) 48 | 49 | def __getitem__(self, idx): 50 | """ 51 | Retrieve the mask and the image in batches at position idx 52 | :param idx: position of the batch in the Sequence. 
53 | :return: batches of image and the corresponding mask 54 | """ 55 | 56 | batch_data = self.anno[idx * 57 | self.batch_size: (1 + idx) * self.batch_size] 58 | 59 | batch_image = [] 60 | batch_landmark = [] 61 | batch_heatmap = [] 62 | 63 | for data in batch_data: 64 | 65 | # Load and augment data 66 | image, landmark, heatmap = self.load_data(self.image_folder, data) 67 | 68 | batch_image.append(image) 69 | batch_landmark.append(landmark) 70 | if self.output_heatmap: 71 | batch_heatmap.append(heatmap) 72 | 73 | batch_image = np.array(batch_image) 74 | batch_landmark = np.array(batch_landmark) 75 | if self.output_heatmap: 76 | batch_heatmap = np.array(batch_heatmap) 77 | 78 | batch_image = DataSequence.preprocess_images(batch_image) 79 | batch_landmark = self.preprocess_landmarks(batch_landmark) 80 | 81 | # Prevent values from going outside [0, 1] 82 | # Only applied for sigmoid output 83 | if self.clip_landmark: 84 | batch_landmark[batch_landmark < 0] = 0 85 | batch_landmark[batch_landmark > 1] = 1 86 | 87 | if self.output_heatmap: 88 | return batch_image, [batch_landmark, batch_heatmap] 89 | else: 90 | return batch_image, batch_landmark 91 | 92 | @staticmethod 93 | def preprocess_images(images): 94 | # Convert color to RGB 95 | for i in range(images.shape[0]): 96 | images[i] = cv2.cvtColor(images[i], cv2.COLOR_BGR2RGB) 97 | mean = np.array([0.5, 0.5, 0.5], dtype=np.float) 98 | images = np.array(images, dtype=np.float32) 99 | images = images / 255.0 100 | images -= mean 101 | return images 102 | 103 | def preprocess_landmarks(self, landmarks): 104 | 105 | first_dim = landmarks.shape[0] 106 | landmarks = landmarks.reshape((-1, 3)) 107 | landmarks = normalize_landmark(landmarks, self.input_size) 108 | landmarks = landmarks.reshape((first_dim, -1)) 109 | return landmarks 110 | 111 | def load_data(self, img_folder, data): 112 | 113 | # Load image 114 | path = os.path.join(img_folder, data["image"]) 115 | image = cv2.imread(path) 116 | 117 | # Load landmark and apply square cropping for image 118 | landmark = data["points"] 119 | bbox = data["bbox"] 120 | landmark = np.array(landmark) 121 | 122 | # Convert all (-1, -1) to (0, 0) 123 | for i in range(landmark.shape[0]): 124 | if landmark[i][0] == -1 and landmark[i][1] == -1: 125 | landmark[i, :] = [0, 0] 126 | 127 | # Generate visibility mask 128 | # visible = inside image + not occluded by simulated rectangle 129 | # (see BlazePose paper for more detail) 130 | visibility = np.ones((landmark.shape[0], 1), dtype=int) 131 | for i in range(len(visibility)): 132 | if 0 > landmark[i][0] or landmark[i][0] >= image.shape[1] \ 133 | or 0 > landmark[i][1] or landmark[i][1] >= image.shape[0]: 134 | visibility[i] = 0 135 | 136 | image, landmark = square_crop_with_keypoints( 137 | image, bbox, landmark, pad_value="random") 138 | landmark = np.array(landmark) 139 | 140 | # Resize image 141 | old_img_size = np.array([image.shape[1], image.shape[0]]) 142 | image = cv2.resize(image, self.input_size) 143 | landmark = ( 144 | landmark * np.divide(np.array(self.input_size).astype(float), old_img_size)).astype(int) 145 | 146 | # Horizontal flip 147 | # and update the order of landmark points 148 | if self.random_flip and random.choice([0, 1]): 149 | image = cv2.flip(image, 1) 150 | 151 | # Mark missing keypoints 152 | missing_idxs = [] 153 | for i in range(landmark.shape[0]): 154 | if landmark[i, 0] == 0 and landmark[i, 1] == 0: 155 | missing_idxs.append(i) 156 | 157 | # Flip landmark 158 | landmark[:, 0] = self.input_size[0] - landmark[:, 0] 159 | 160 | # 
Restore missing keypoints 161 | for i in missing_idxs: 162 | landmark[i, 0] = 0 163 | landmark[i, 1] = 0 164 | 165 | # Change the indices of landmark points and visibility 166 | if self.symmetry_point_ids is not None: 167 | for p1, p2 in self.symmetry_point_ids: 168 | l = landmark[p1, :].copy() 169 | landmark[p1, :] = landmark[p2, :].copy() 170 | landmark[p2, :] = l 171 | 172 | if self.augment: 173 | image, landmark = augment_img(image, landmark) 174 | 175 | # Random occlusion 176 | # (see BlazePose paper for more detail) 177 | if self.augment and random.random() < 0.2: 178 | landmark = landmark.reshape(-1, 2) 179 | image, visibility = random_occlusion(image, landmark, visibility=visibility, 180 | rect_ratio=((0.2, 0.5), (0.2, 0.5)), rect_color="random") 181 | 182 | # Concatenate visibility into landmark 183 | visibility = np.array(visibility) 184 | visibility = visibility.reshape((landmark.shape[0], 1)) 185 | landmark = np.hstack((landmark, visibility)) 186 | 187 | # Generate heatmap 188 | gtmap = None 189 | if self.output_heatmap: 190 | gtmap_kps = landmark.copy() 191 | gtmap_kps[:, :2] = (np.array(gtmap_kps[:, :2]).astype(float) 192 | * np.array(self.heatmap_size) / np.array(self.input_size)).astype(int) 193 | gtmap = gen_gt_heatmap( 194 | gtmap_kps, self.heatmap_sigma, self.heatmap_size) 195 | # gtmap = np.clip(np.sum(gtmap, axis=2, keepdims=True), None, 1) 196 | 197 | # Uncomment following lines to debug augmentation 198 | # draw = visualize_keypoints(image, landmark, visibility, text_color=(0,0,255)) 199 | # cv2.namedWindow("draw", cv2.WINDOW_NORMAL) 200 | # cv2.imshow("draw", draw) 201 | # if self.output_heatmap: 202 | # cv2.imshow("gtmap", gtmap.sum(axis=2)) 203 | # cv2.waitKey(0) 204 | 205 | landmark = np.array(landmark) 206 | return image, landmark, gtmap 207 | -------------------------------------------------------------------------------- /src/data_loaders/humanpose_2head.py: -------------------------------------------------------------------------------- 1 | import json 2 | import math 3 | import os 4 | import random 5 | 6 | import cv2 7 | import numpy as np 8 | from tensorflow.keras.utils import Sequence 9 | 10 | from ..utils.heatmap import gen_gt_heatmap 11 | from ..utils.keypoints import normalize_landmark 12 | from ..utils.pre_processing import square_crop_with_keypoints 13 | from ..utils.visualizer import visualize_keypoints 14 | from .augmentation import augment_img 15 | from .augmentation_utils import random_occlusion 16 | 17 | 18 | class DataSequence(Sequence): 19 | 20 | def __init__(self, image_folder, label_file, batch_size=8, input_size=(256, 256), output_heatmap=True, heatmap_size=(128, 128), heatmap_sigma=4, n_points=16, shuffle=True, augment=False, random_flip=False, random_rotate=False, random_scale_on_crop=False, clip_landmark=False, symmetry_point_ids=None): 21 | 22 | self.batch_size = batch_size 23 | self.input_size = input_size 24 | self.output_heatmap = output_heatmap 25 | self.heatmap_size = heatmap_size 26 | self.heatmap_sigma = heatmap_sigma 27 | self.image_folder = image_folder 28 | self.random_flip = random_flip 29 | self.random_rotate = random_rotate 30 | self.random_scale_on_crop = random_scale_on_crop 31 | self.augment = augment 32 | self.n_points = n_points 33 | self.symmetry_point_ids = symmetry_point_ids 34 | self.clip_landmark = clip_landmark # Clip value of landmark to range [0, 1] 35 | 36 | with open(label_file, "r") as fp: 37 | self.anno = json.load(fp) 38 | 39 | if shuffle: 40 | random.shuffle(self.anno) 41 | 42 | def __len__(self): 43 | """ 
44 | Number of batch in the Sequence. 45 | :return: The number of batches in the Sequence. 46 | """ 47 | return math.ceil(len(self.anno) / float(self.batch_size)) 48 | 49 | def __getitem__(self, idx): 50 | """ 51 | Retrieve the mask and the image in batches at position idx 52 | :param idx: position of the batch in the Sequence. 53 | :return: batches of image and the corresponding mask 54 | """ 55 | 56 | batch_data = self.anno[idx * 57 | self.batch_size: (1 + idx) * self.batch_size] 58 | 59 | batch_image = [] 60 | batch_landmark = [] 61 | batch_heatmap = [] 62 | batch_pushup = [] 63 | 64 | for data in batch_data: 65 | 66 | # Load and augment data 67 | image, landmark, heatmap, is_pushup = self.load_data(self.image_folder, data) 68 | 69 | batch_image.append(image) 70 | batch_landmark.append(landmark) 71 | batch_pushup.append(is_pushup) 72 | if self.output_heatmap: 73 | batch_heatmap.append(heatmap) 74 | 75 | batch_image = np.array(batch_image) 76 | batch_landmark = np.array(batch_landmark) 77 | if self.output_heatmap: 78 | batch_heatmap = np.array(batch_heatmap) 79 | 80 | batch_image = DataSequence.preprocess_images(batch_image) 81 | batch_landmark = self.preprocess_landmarks(batch_landmark) 82 | 83 | # print(batch_pushup) 84 | batch_pushup = np.array(batch_pushup) 85 | 86 | # Prevent values from going outside [0, 1] 87 | # Only applied for sigmoid output 88 | if self.clip_landmark: 89 | batch_landmark[batch_landmark < 0] = 0 90 | batch_landmark[batch_landmark > 1] = 1 91 | 92 | # if self.output_heatmap: 93 | # return batch_image, [batch_landmark, batch_heatmap] 94 | # else: 95 | # return batch_image, batch_landmark 96 | return batch_image, [batch_heatmap, batch_pushup] 97 | 98 | @staticmethod 99 | def preprocess_images(images): 100 | # Convert color to RGB 101 | for i in range(images.shape[0]): 102 | images[i] = cv2.cvtColor(images[i], cv2.COLOR_BGR2RGB) 103 | mean = np.array([0.5, 0.5, 0.5], dtype=np.float) 104 | images = np.array(images, dtype=np.float32) 105 | images = images / 255.0 106 | images -= mean 107 | return images 108 | 109 | def preprocess_landmarks(self, landmarks): 110 | 111 | first_dim = landmarks.shape[0] 112 | landmarks = landmarks.reshape((-1, 3)) 113 | landmarks = normalize_landmark(landmarks, self.input_size) 114 | landmarks = landmarks.reshape((first_dim, -1)) 115 | return landmarks 116 | 117 | def load_data(self, img_folder, data): 118 | 119 | # Load image 120 | path = os.path.join(img_folder, data["image"]) 121 | image = cv2.imread(path) 122 | 123 | # Load landmark and apply square cropping for image 124 | landmark = data["points"] 125 | bbox = data["bbox"] 126 | landmark = np.array(landmark) 127 | 128 | # Convert all (-1, -1) to (0, 0) 129 | for i in range(landmark.shape[0]): 130 | if landmark[i][0] == -1 and landmark[i][1] == -1: 131 | landmark[i, :] = [0, 0] 132 | 133 | # Generate visibility mask 134 | # visible = inside image + not occluded by simulated rectangle 135 | # (see BlazePose paper for more detail) 136 | visibility = np.ones((landmark.shape[0], 1), dtype=int) 137 | for i in range(len(visibility)): 138 | if 0 > landmark[i][0] or landmark[i][0] >= image.shape[1] \ 139 | or 0 > landmark[i][1] or landmark[i][1] >= image.shape[0] \ 140 | or (landmark[i][0] == 0 and landmark[i][1] == 0): 141 | visibility[i] = 0 142 | 143 | image, landmark = square_crop_with_keypoints( 144 | image, bbox, landmark, pad_value="random") 145 | landmark = np.array(landmark) 146 | 147 | # Resize image 148 | old_img_size = np.array([image.shape[1], image.shape[0]]) 149 | image = 
cv2.resize(image, self.input_size) 150 | landmark = ( 151 | landmark * np.divide(np.array(self.input_size).astype(float), old_img_size)).astype(int) 152 | 153 | # Horizontal flip 154 | # and update the order of landmark points 155 | if self.random_flip and random.choice([0, 1]): 156 | image = cv2.flip(image, 1) 157 | 158 | # Mark missing keypoints 159 | missing_idxs = [] 160 | for i in range(landmark.shape[0]): 161 | if landmark[i, 0] == 0 and landmark[i, 1] == 0: 162 | missing_idxs.append(i) 163 | 164 | # Flip landmark 165 | landmark[:, 0] = self.input_size[0] - landmark[:, 0] 166 | 167 | # Restore missing keypoints 168 | for i in missing_idxs: 169 | landmark[i, 0] = 0 170 | landmark[i, 1] = 0 171 | 172 | # Change the indices of landmark points and visibility 173 | if self.symmetry_point_ids is not None: 174 | for p1, p2 in self.symmetry_point_ids: 175 | l = landmark[p1, :].copy() 176 | landmark[p1, :] = landmark[p2, :].copy() 177 | landmark[p2, :] = l 178 | 179 | # Random occlusion 180 | # (see BlazePose paper for more detail) 181 | if self.augment and random.random() < 0.2: 182 | landmark = landmark.reshape(-1, 2) 183 | image, visibility = random_occlusion(image, landmark, visibility=visibility, 184 | rect_ratio=((0.2, 0.5), (0.2, 0.5)), rect_color="random") 185 | 186 | # Concatenate visibility into landmark 187 | visibility = np.array(visibility) 188 | visibility = visibility.reshape((landmark.shape[0], 1)) 189 | landmark = np.hstack((landmark, visibility)) 190 | 191 | if self.augment: 192 | 193 | # Mark missing keypoints 194 | missing_idxs = [] 195 | for i in range(landmark.shape[0]): 196 | if landmark[i, 0] == 0 and landmark[i, 1] == 0: 197 | missing_idxs.append(i) 198 | 199 | image, landmark = augment_img(image, landmark) 200 | 201 | # Restore missing keypoints 202 | for i in missing_idxs: 203 | landmark[i, 0] = 0 204 | landmark[i, 1] = 0 205 | 206 | 207 | # Generate heatmap 208 | gtmap = None 209 | if self.output_heatmap: 210 | gtmap_kps = landmark.copy() 211 | gtmap_kps[:, :2] = (np.array(gtmap_kps[:, :2]).astype(float) 212 | * np.array(self.heatmap_size) / np.array(self.input_size)).astype(int) 213 | gtmap = gen_gt_heatmap( 214 | gtmap_kps, self.heatmap_sigma, self.heatmap_size) 215 | 216 | # # Uncomment following lines to debug augmentation 217 | # draw = visualize_keypoints(image, landmark, visibility, text_color=(0,0,255)) 218 | # cv2.namedWindow("draw", cv2.WINDOW_NORMAL) 219 | # cv2.imshow("draw", draw) 220 | # if self.output_heatmap: 221 | # cv2.imshow("gtmap", gtmap.sum(axis=2)) 222 | # cv2.waitKey(0) 223 | 224 | landmark = np.array(landmark) 225 | return image, landmark, gtmap, int(data["is_pushing_up"]) 226 | -------------------------------------------------------------------------------- /src/data_loaders/pushup_recognition.py: -------------------------------------------------------------------------------- 1 | import json 2 | import math 3 | import os 4 | import random 5 | 6 | import cv2 7 | import numpy as np 8 | from tensorflow.keras.utils import Sequence 9 | 10 | from ..utils.heatmap import gen_gt_heatmap 11 | from ..utils.keypoints import normalize_landmark 12 | from ..utils.pre_processing import square_crop_with_keypoints 13 | from ..utils.visualizer import visualize_keypoints 14 | from .augmentation2 import augment_img 15 | from .augmentation_utils import random_occlusion 16 | 17 | 18 | class DataSequence(Sequence): 19 | 20 | def __init__(self, image_folder, label_file, batch_size=8, input_size=(256, 256), shuffle=True, augment=False, random_flip=False, 
random_rotate=False, random_scale_on_crop=False): 21 | 22 | self.batch_size = batch_size 23 | self.input_size = input_size 24 | self.image_folder = image_folder 25 | self.random_flip = random_flip 26 | self.random_rotate = random_rotate 27 | self.random_scale_on_crop = random_scale_on_crop 28 | self.augment = augment 29 | 30 | with open(label_file, "r") as fp: 31 | self.anno = json.load(fp) 32 | 33 | if shuffle: 34 | random.shuffle(self.anno) 35 | 36 | def __len__(self): 37 | """ 38 | Number of batch in the Sequence. 39 | :return: The number of batches in the Sequence. 40 | """ 41 | return math.ceil(len(self.anno) / float(self.batch_size)) 42 | 43 | def __getitem__(self, idx): 44 | """ 45 | Retrieve the mask and the image in batches at position idx 46 | :param idx: position of the batch in the Sequence. 47 | :return: batches of image and the corresponding mask 48 | """ 49 | 50 | batch_data = self.anno[idx * self.batch_size: (1 + idx) * self.batch_size] 51 | 52 | batch_image = [] 53 | batch_y = [] 54 | 55 | for data in batch_data: 56 | 57 | # Load and augment data 58 | image, y = self.load_data(self.image_folder, data) 59 | batch_image.append(image) 60 | batch_y.append(y) 61 | 62 | batch_image = np.array(batch_image) 63 | batch_y = np.array(batch_y) 64 | 65 | batch_image = DataSequence.preprocess_images(batch_image) 66 | 67 | return batch_image, batch_y 68 | 69 | @staticmethod 70 | def preprocess_images(images): 71 | # Convert color to RGB 72 | for i in range(images.shape[0]): 73 | images[i] = cv2.cvtColor(images[i], cv2.COLOR_BGR2RGB) 74 | mean = np.array([0.5, 0.5, 0.5], dtype=np.float) 75 | images = np.array(images, dtype=np.float32) 76 | images = images / 255.0 77 | images -= mean 78 | return images 79 | 80 | 81 | def load_data(self, img_folder, data): 82 | 83 | # Load image 84 | path = os.path.join(img_folder, data["image"]) 85 | image = cv2.imread(path) 86 | 87 | # Resize image 88 | image = cv2.resize(image, self.input_size) 89 | 90 | is_pushing_up = int(data["is_pushing_up"]) 91 | 92 | # Horizontal flip 93 | # and update the order of landmark points 94 | if self.random_flip and random.choice([0, 1]): 95 | image = cv2.flip(image, 1) 96 | 97 | if self.augment: 98 | image = augment_img(image, not is_pushing_up) 99 | 100 | # cv2.imshow("Image", image) 101 | # cv2.waitKey(0) 102 | 103 | 104 | 105 | return image, is_pushing_up 106 | -------------------------------------------------------------------------------- /src/metrics/f1.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | from tensorflow.keras.metrics import Precision, Recall 3 | 4 | class F1_Score(tf.keras.metrics.Metric): 5 | 6 | def __init__(self, name='f1_score', **kwargs): 7 | super().__init__(name=name, **kwargs) 8 | self.f1 = self.add_weight(name='f1', initializer='zeros') 9 | self.precision_fn = Precision(thresholds=0.5) 10 | self.recall_fn = Recall(thresholds=0.5) 11 | 12 | def update_state(self, y_true, y_pred, sample_weight=None): 13 | p = self.precision_fn(y_true, y_pred) 14 | r = self.recall_fn(y_true, y_pred) 15 | # since f1 is a variable, we use assign 16 | self.f1.assign(2 * ((p * r) / (p + r + 1e-6))) 17 | 18 | def result(self): 19 | return self.f1 20 | 21 | def reset_states(self): 22 | # we also need to reset the state of the precision and recall objects 23 | self.precision_fn.reset_states() 24 | self.recall_fn.reset_states() 25 | self.f1.assign(0) -------------------------------------------------------------------------------- /src/metrics/mae.py: 
-------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | import numpy as np 3 | 4 | from ..utils.heatmap import find_keypoints_from_heatmap 5 | 6 | 7 | @tf.function 8 | def calc_mae(batch_keypoints_true, batch_keypoints_pred, keypoint_thresh=0.1): 9 | 10 | mask = tf.greater(batch_keypoints_true[:, :, 2], keypoint_thresh) 11 | tf.boolean_mask(batch_keypoints_true[:, :, 0], mask) 12 | tf.boolean_mask(batch_keypoints_true[:, :, 1], mask) 13 | 14 | mask = tf.greater(batch_keypoints_pred[:, :, 2], keypoint_thresh) 15 | tf.boolean_mask(batch_keypoints_pred[:, :, 0], mask) 16 | tf.boolean_mask(batch_keypoints_pred[:, :, 1], mask) 17 | 18 | error = tf.abs(batch_keypoints_pred[:, :, :2] - batch_keypoints_true[:, :, :2]) 19 | n_points = tf.cast(tf.reduce_prod(tf.shape(error)), tf.float32) 20 | error = tf.reduce_sum(tf.cast(error, tf.float32)) 21 | 22 | return error, n_points 23 | 24 | 25 | def get_mae_metric(): 26 | 27 | class MAE(tf.keras.metrics.Metric): 28 | 29 | def __init__(self, name='mae', **kwargs): 30 | super(MAE, self).__init__(name=name, **kwargs) 31 | self.total_error = self.add_weight(name='total_error', initializer='zeros') 32 | self.n_total = self.add_weight(name='n_total', initializer='zeros') 33 | 34 | def reset_states(self): 35 | self.total_error.assign(0) 36 | self.n_total.assign(0) 37 | 38 | def update_state(self, y_true, y_pred, sample_weight=None): 39 | 40 | keypoint_thresh = 0.0 41 | if len(tf.shape(y_true)) == 4: # Heatmap 42 | batch_keypoints_pred = find_keypoints_from_heatmap(y_pred, normalize=True) 43 | batch_keypoints_true = find_keypoints_from_heatmap(y_true, normalize=True) 44 | keypoint_thresh = 0.1 45 | elif len(tf.shape(y_true)) == 2: # Regression 46 | batch_keypoints_pred = tf.reshape( 47 | y_pred, (tf.shape(y_pred)[0], -1, 3)) 48 | batch_keypoints_true = tf.reshape( 49 | y_true, (tf.shape(y_true)[0], -1, 3)) 50 | keypoint_thresh = 0.5 51 | else: 52 | tf.print("Error: Wrong MAE input shape.") 53 | exit(0) 54 | error, n_points = calc_mae( 55 | batch_keypoints_true, batch_keypoints_pred, keypoint_thresh=keypoint_thresh) 56 | self.total_error.assign_add(error) 57 | self.n_total.assign_add(n_points) 58 | 59 | def result(self): 60 | return self.total_error / (self.n_total + 1e-5) 61 | 62 | return MAE 63 | -------------------------------------------------------------------------------- /src/metrics/pck.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | import numpy as np 3 | 4 | from ..utils.heatmap import find_keypoints_from_heatmap 5 | 6 | 7 | @tf.function 8 | def calc_pck(batch_keypoints_true, batch_keypoints_pred, keypoint_thresh=0.0, ref_point_pair=(2, 4), pck_thresh=0.5): 9 | 10 | mask = tf.greater(batch_keypoints_true[:, :, 2], keypoint_thresh) 11 | tf.boolean_mask(batch_keypoints_true[:, :, 0], mask) 12 | tf.boolean_mask(batch_keypoints_true[:, :, 1], mask) 13 | 14 | mask = tf.greater(batch_keypoints_pred[:, :, 2], keypoint_thresh) 15 | tf.boolean_mask(batch_keypoints_pred[:, :, 0], mask) 16 | tf.boolean_mask(batch_keypoints_pred[:, :, 1], mask) 17 | 18 | ref_distance = tf.math.reduce_euclidean_norm( 19 | batch_keypoints_true[:, ref_point_pair[0], :2] - batch_keypoints_true[:, ref_point_pair[1], :2], axis=1, keepdims=True) 20 | error = tf.math.reduce_euclidean_norm( 21 | batch_keypoints_pred[:, :, :2] - batch_keypoints_true[:, :, :2], axis=2) 22 | wrong_matrix = tf.cast(error, tf.float32) > (tf.cast(ref_distance, tf.float32) * pck_thresh) 23 | 
n_wrongs = tf.reduce_sum(tf.cast(wrong_matrix, tf.float32)) 24 | return tf.cast(n_wrongs, tf.float32), tf.cast(tf.reduce_prod(tf.shape(wrong_matrix)), tf.float32) 25 | 26 | 27 | def get_pck_metric(ref_point_pair=(2, 4), thresh=0.2): 28 | 29 | class PCK(tf.keras.metrics.Metric): 30 | 31 | def __init__(self, name='pck', ref_point_pair=ref_point_pair, pck_thresh=thresh, **kwargs): 32 | super(PCK, self).__init__(name=name, **kwargs) 33 | self.ref_point_pair = ref_point_pair 34 | self.pck_thresh = pck_thresh 35 | self.n_wrongs = self.add_weight(name='n_wrongs', initializer='zeros') 36 | self.n_total = self.add_weight(name='n_total', initializer='zeros') 37 | 38 | def reset_states(self): 39 | self.n_wrongs.assign(0) 40 | self.n_total.assign(0) 41 | 42 | def update_state(self, y_true, y_pred, sample_weight=None): 43 | 44 | keypoint_thresh = 0.0 45 | if len(tf.shape(y_true)) == 4: # Heatmap 46 | batch_keypoints_pred = find_keypoints_from_heatmap(y_pred) 47 | batch_keypoints_true = find_keypoints_from_heatmap(y_true) 48 | keypoint_thresh = 0.1 49 | elif len(tf.shape(y_true)) == 2: # Regression 50 | batch_keypoints_pred = tf.reshape( 51 | y_pred, (tf.shape(y_pred)[0], -1, 3)) 52 | batch_keypoints_true = tf.reshape( 53 | y_true, (tf.shape(y_true)[0], -1, 3)) 54 | keypoint_thresh = 0.5 55 | else: 56 | tf.print("Error: Wrong PCK input shape.") 57 | exit(0) 58 | n_wrongs, n_points = calc_pck( 59 | batch_keypoints_true, batch_keypoints_pred, keypoint_thresh=keypoint_thresh, ref_point_pair=self.ref_point_pair, pck_thresh=self.pck_thresh) 60 | self.n_wrongs.assign_add(n_wrongs) 61 | self.n_total.assign_add(n_points) 62 | 63 | def result(self): 64 | return (self.n_total - self.n_wrongs) / (self.n_total + 1e-5) 65 | 66 | return PCK 67 | -------------------------------------------------------------------------------- /src/models/__init__.py: -------------------------------------------------------------------------------- 1 | from .blazepose_legacy import BlazePose as BlazePoseLegacy 2 | from .blazepose_full import BlazePose as BlazePoseFull 3 | from .blazepose_all_linear import BlazePose as BlazePoseAllLinear 4 | from .blazepose_with_pushup_classify import BlazePose as BlazePoseWithClassify 5 | from .pushup_recognition import PushUpRecognition 6 | 7 | class ModelCreator(): 8 | 9 | @staticmethod 10 | def create_model(model_name, n_points=0): 11 | 12 | if model_name == "SIGMOID_HEATMAP_SIGMOID_REGRESS_TWO_HEAD": 13 | return BlazePoseLegacy(n_points).build_model("TWO_HEAD") 14 | elif model_name == "SIGMOID_HEATMAP_SIGMOID_REGRESS_HEATMAP": 15 | return BlazePoseLegacy(n_points).build_model("HEATMAP") 16 | elif model_name == "SIGMOID_HEATMAP_SIGMOID_REGRESS_REGRESSION": 17 | return BlazePoseLegacy(n_points).build_model("REGRESSION") 18 | 19 | elif model_name == "SIGMOID_HEATMAP_LINEAR_REGRESS_TWO_HEAD": 20 | return BlazePoseFull(n_points).build_model("TWO_HEAD") 21 | elif model_name == "SIGMOID_HEATMAP_LINEAR_REGRESS_HEATMAP": 22 | return BlazePoseFull(n_points).build_model("HEATMAP") 23 | elif model_name == "SIGMOID_HEATMAP_LINEAR_REGRESS_REGRESSION": 24 | return BlazePoseFull(n_points).build_model("REGRESSION") 25 | 26 | elif model_name == "ALL_LINEAR_TWO_HEAD": 27 | return BlazePoseAllLinear(n_points).build_model("TWO_HEAD") 28 | elif model_name == "ALL_LINEAR_HEATMAP": 29 | return BlazePoseAllLinear(n_points).build_model("HEATMAP") 30 | elif model_name == "ALL_LINEAR_REGRESSION": 31 | return BlazePoseAllLinear(n_points).build_model("REGRESSION") 32 | 33 | elif model_name == "PUSHUP_RECOGNITION": 34 | return 
PushUpRecognition.build_model() 35 | 36 | elif model_name == "BLAZEPOSE_WITH_PUSHUP_CLASSIFY": 37 | return BlazePoseWithClassify(n_points).build_model("TWO_HEAD") 38 | -------------------------------------------------------------------------------- /src/models/blazepose_all_linear.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | from tensorflow.keras.models import Model 3 | from .blazepose_layers import BlazeBlock 4 | 5 | 6 | class BlazePose(): 7 | def __init__(self, num_keypoints: int): 8 | 9 | self.num_keypoints = num_keypoints 10 | 11 | self.conv1 = tf.keras.layers.Conv2D( 12 | filters=24, kernel_size=3, strides=(2, 2), padding='same', activation='relu' 13 | ) 14 | 15 | self.conv2_1 = tf.keras.models.Sequential([ 16 | tf.keras.layers.DepthwiseConv2D( 17 | kernel_size=3, padding='same', activation=None), 18 | tf.keras.layers.Conv2D(filters=24, kernel_size=1, activation=None) 19 | ]) 20 | 21 | self.conv2_2 = tf.keras.models.Sequential([ 22 | tf.keras.layers.DepthwiseConv2D( 23 | kernel_size=3, padding='same', activation=None), 24 | tf.keras.layers.Conv2D(filters=24, kernel_size=1, activation=None) 25 | ]) 26 | 27 | # === Heatmap === 28 | 29 | self.conv3 = BlazeBlock(block_num=3, channel=48) 30 | self.conv4 = BlazeBlock(block_num=4, channel=96) 31 | self.conv5 = BlazeBlock(block_num=5, channel=192) 32 | self.conv6 = BlazeBlock(block_num=6, channel=288) 33 | 34 | self.conv7a = tf.keras.models.Sequential([ 35 | tf.keras.layers.DepthwiseConv2D( 36 | kernel_size=3, padding="same", activation=None), 37 | tf.keras.layers.Conv2D( 38 | filters=48, kernel_size=1, activation="relu"), 39 | tf.keras.layers.UpSampling2D(size=(2, 2), interpolation="bilinear") 40 | ]) 41 | self.conv7b = tf.keras.models.Sequential([ 42 | tf.keras.layers.DepthwiseConv2D( 43 | kernel_size=3, padding="same", activation=None), 44 | tf.keras.layers.Conv2D( 45 | filters=48, kernel_size=1, activation="relu") 46 | ]) 47 | 48 | self.conv8a = tf.keras.layers.UpSampling2D( 49 | size=(2, 2), interpolation="bilinear") 50 | self.conv8b = tf.keras.models.Sequential([ 51 | tf.keras.layers.DepthwiseConv2D( 52 | kernel_size=3, padding="same", activation=None), 53 | tf.keras.layers.Conv2D( 54 | filters=48, kernel_size=1, activation="relu") 55 | ]) 56 | 57 | self.conv9a = tf.keras.layers.UpSampling2D( 58 | size=(2, 2), interpolation="bilinear") 59 | self.conv9b = tf.keras.models.Sequential([ 60 | tf.keras.layers.DepthwiseConv2D( 61 | kernel_size=3, padding="same", activation=None), 62 | tf.keras.layers.Conv2D( 63 | filters=48, kernel_size=1, activation="relu") 64 | ]) 65 | 66 | self.conv10a = tf.keras.models.Sequential([ 67 | tf.keras.layers.DepthwiseConv2D( 68 | kernel_size=3, padding="same", activation=None), 69 | tf.keras.layers.Conv2D( 70 | filters=8, kernel_size=1, activation="relu"), 71 | tf.keras.layers.UpSampling2D(size=(2, 2), interpolation="bilinear") 72 | ]) 73 | self.conv10b = tf.keras.models.Sequential([ 74 | tf.keras.layers.DepthwiseConv2D( 75 | kernel_size=3, padding="same", activation=None), 76 | tf.keras.layers.Conv2D(filters=8, kernel_size=1, activation="relu") 77 | ]) 78 | 79 | self.conv11 = tf.keras.models.Sequential([ 80 | tf.keras.layers.DepthwiseConv2D( 81 | kernel_size=3, padding="same", activation=None), 82 | tf.keras.layers.Conv2D( 83 | filters=8, kernel_size=1, activation="relu"), 84 | tf.keras.layers.Conv2D( 85 | filters=self.num_keypoints, kernel_size=3, padding="same", activation=None) # -> Heatmap output 86 | ], name="heatmap") 87 | 88 | # === 
Regression === 89 | 90 | # In: 1, 64, 64, 48) 91 | self.conv12a = BlazeBlock(block_num=4, channel=96, name_prefix="regression_conv12a_") # input res: 64 92 | self.conv12b = tf.keras.models.Sequential([ 93 | tf.keras.layers.DepthwiseConv2D( 94 | kernel_size=3, padding="same", activation=None, name="regression_conv12b_depthwise"), 95 | tf.keras.layers.Conv2D( 96 | filters=96, kernel_size=1, activation="relu", name="regression_conv12b_conv1x1") 97 | ], name="regression_conv12b") 98 | 99 | self.conv13a = BlazeBlock(block_num=5, channel=192, name_prefix="regression_conv13a_") # input res: 32 100 | self.conv13b = tf.keras.models.Sequential([ 101 | tf.keras.layers.DepthwiseConv2D( 102 | kernel_size=3, padding="same", activation=None, name="regression_conv13b_depthwise"), 103 | tf.keras.layers.Conv2D( 104 | filters=192, kernel_size=1, activation="relu", name="regression_conv13b_conv1x1") 105 | ], name="regression_conv13b") 106 | 107 | self.conv14a = BlazeBlock(block_num=6, channel=288, name_prefix="regression_conv14a_") # input res: 16 108 | self.conv14b = tf.keras.models.Sequential([ 109 | tf.keras.layers.DepthwiseConv2D( 110 | kernel_size=3, padding="same", activation=None, name="regression_conv14b_depthwise"), 111 | tf.keras.layers.Conv2D( 112 | filters=288, kernel_size=1, activation="relu", name="regression_conv14b_conv1x1") 113 | ], name="regression_conv14b") 114 | 115 | self.conv15 = tf.keras.models.Sequential([ 116 | BlazeBlock(block_num=7, channel=288, channel_padding=0, name_prefix="regression_conv15a_"), 117 | BlazeBlock(block_num=7, channel=288, channel_padding=0, name_prefix="regression_conv15b_") 118 | ], name="regression_conv15") 119 | 120 | self.conv16 = tf.keras.models.Sequential([ 121 | tf.keras.layers.Conv2D( 122 | filters=3*self.num_keypoints, kernel_size=2, activation=None), 123 | tf.keras.layers.Reshape((3*self.num_keypoints, 1), name="regression_final_dense") 124 | ], name="joints") 125 | 126 | def build_model(self, model_type): 127 | 128 | input_x = tf.keras.layers.Input(shape=(256, 256, 3)) 129 | 130 | # Block 1 131 | # In: 1x256x256x3 132 | x = self.conv1(input_x) 133 | 134 | # Block 2 135 | # In: 1x128x128x24 136 | x = x + self.conv2_1(x) 137 | x = tf.keras.activations.relu(x) 138 | 139 | # Block 3 140 | # In: 1x128x128x24 141 | x = x + self.conv2_2(x) 142 | y0 = tf.keras.activations.relu(x) 143 | 144 | # === Heatmap === 145 | 146 | # In: 1, 128, 128, 24 147 | y1 = self.conv3(y0) 148 | y2 = self.conv4(y1) 149 | y3 = self.conv5(y2) 150 | y4 = self.conv6(y3) 151 | 152 | x = self.conv7a(y4) + self.conv7b(y3) 153 | x = self.conv8a(x) + self.conv8b(y2) 154 | # In: 1, 32, 32, 96 155 | x = self.conv9a(x) + self.conv9b(y1) 156 | # In: 1, 64, 64, 48 157 | y = self.conv10a(x) + self.conv10b(y0) 158 | heatmap = self.conv11(y) 159 | 160 | # === Regression === 161 | 162 | # Stop gradient for regression on 2-head model 163 | if model_type == "TWO_HEAD": 164 | x = tf.keras.backend.stop_gradient(x) 165 | y2 = tf.keras.backend.stop_gradient(y2) 166 | y3 = tf.keras.backend.stop_gradient(y3) 167 | y4 = tf.keras.backend.stop_gradient(y4) 168 | 169 | x = self.conv12a(x) + self.conv12b(y2) 170 | # In: 1, 32, 32, 96 171 | x = self.conv13a(x) + self.conv13b(y3) 172 | # In: 1, 16, 16, 192 173 | x = self.conv14a(x) + self.conv14b(y4) 174 | # In: 1, 8, 8, 288 175 | x = self.conv15(x) 176 | # In: 1, 2, 2, 288 177 | joints = self.conv16(x) 178 | 179 | if model_type == "TWO_HEAD": 180 | return Model(inputs=input_x, outputs=[joints, heatmap]) 181 | elif model_type == "HEATMAP": 182 | return 
Model(inputs=input_x, outputs=heatmap) 183 | elif model_type == "REGRESSION": 184 | return Model(inputs=input_x, outputs=joints) 185 | else: 186 | raise ValueError("Wrong model type.") 187 | -------------------------------------------------------------------------------- /src/models/blazepose_full.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | from tensorflow.keras.models import Model 3 | from .blazepose_layers import BlazeBlock 4 | 5 | 6 | class BlazePose(): 7 | def __init__(self, num_keypoints: int): 8 | 9 | self.num_keypoints = num_keypoints 10 | 11 | self.conv1 = tf.keras.layers.Conv2D( 12 | filters=24, kernel_size=3, strides=(2, 2), padding='same', activation='relu' 13 | ) 14 | 15 | self.conv2_1 = tf.keras.models.Sequential([ 16 | tf.keras.layers.DepthwiseConv2D( 17 | kernel_size=3, padding='same', activation=None), 18 | tf.keras.layers.Conv2D(filters=24, kernel_size=1, activation=None) 19 | ]) 20 | 21 | self.conv2_2 = tf.keras.models.Sequential([ 22 | tf.keras.layers.DepthwiseConv2D( 23 | kernel_size=3, padding='same', activation=None), 24 | tf.keras.layers.Conv2D(filters=24, kernel_size=1, activation=None) 25 | ]) 26 | 27 | # === Heatmap === 28 | 29 | self.conv3 = BlazeBlock(block_num=3, channel=48) 30 | self.conv4 = BlazeBlock(block_num=4, channel=96) 31 | self.conv5 = BlazeBlock(block_num=5, channel=192) 32 | self.conv6 = BlazeBlock(block_num=6, channel=288) 33 | 34 | self.conv7a = tf.keras.models.Sequential([ 35 | tf.keras.layers.DepthwiseConv2D( 36 | kernel_size=3, padding="same", activation=None), 37 | tf.keras.layers.Conv2D( 38 | filters=48, kernel_size=1, activation="relu"), 39 | tf.keras.layers.UpSampling2D(size=(2, 2), interpolation="bilinear") 40 | ]) 41 | self.conv7b = tf.keras.models.Sequential([ 42 | tf.keras.layers.DepthwiseConv2D( 43 | kernel_size=3, padding="same", activation=None), 44 | tf.keras.layers.Conv2D( 45 | filters=48, kernel_size=1, activation="relu") 46 | ]) 47 | 48 | self.conv8a = tf.keras.layers.UpSampling2D( 49 | size=(2, 2), interpolation="bilinear") 50 | self.conv8b = tf.keras.models.Sequential([ 51 | tf.keras.layers.DepthwiseConv2D( 52 | kernel_size=3, padding="same", activation=None), 53 | tf.keras.layers.Conv2D( 54 | filters=48, kernel_size=1, activation="relu") 55 | ]) 56 | 57 | self.conv9a = tf.keras.layers.UpSampling2D( 58 | size=(2, 2), interpolation="bilinear") 59 | self.conv9b = tf.keras.models.Sequential([ 60 | tf.keras.layers.DepthwiseConv2D( 61 | kernel_size=3, padding="same", activation=None), 62 | tf.keras.layers.Conv2D( 63 | filters=48, kernel_size=1, activation="relu") 64 | ]) 65 | 66 | self.conv10a = tf.keras.models.Sequential([ 67 | tf.keras.layers.DepthwiseConv2D( 68 | kernel_size=3, padding="same", activation=None), 69 | tf.keras.layers.Conv2D( 70 | filters=8, kernel_size=1, activation="relu"), 71 | tf.keras.layers.UpSampling2D(size=(2, 2), interpolation="bilinear") 72 | ]) 73 | self.conv10b = tf.keras.models.Sequential([ 74 | tf.keras.layers.DepthwiseConv2D( 75 | kernel_size=3, padding="same", activation=None), 76 | tf.keras.layers.Conv2D(filters=8, kernel_size=1, activation="relu") 77 | ]) 78 | 79 | self.conv11 = tf.keras.models.Sequential([ 80 | tf.keras.layers.DepthwiseConv2D( 81 | kernel_size=3, padding="same", activation=None), 82 | tf.keras.layers.Conv2D( 83 | filters=8, kernel_size=1, activation="relu"), 84 | tf.keras.layers.Conv2D( 85 | filters=self.num_keypoints, kernel_size=3, padding="same", activation=None) # -> Heatmap output 86 | ]) 87 | 88 | # === 
Regression === 89 | 90 | # In: 1, 64, 64, 48) 91 | self.conv12a = BlazeBlock(block_num=4, channel=96, name_prefix="regression_conv12a_") # input res: 64 92 | self.conv12b = tf.keras.models.Sequential([ 93 | tf.keras.layers.DepthwiseConv2D( 94 | kernel_size=3, padding="same", activation=None, name="regression_conv12b_depthwise"), 95 | tf.keras.layers.Conv2D( 96 | filters=96, kernel_size=1, activation="relu", name="regression_conv12b_conv1x1") 97 | ], name="regression_conv12b") 98 | 99 | self.conv13a = BlazeBlock(block_num=5, channel=192, name_prefix="regression_conv13a_") # input res: 32 100 | self.conv13b = tf.keras.models.Sequential([ 101 | tf.keras.layers.DepthwiseConv2D( 102 | kernel_size=3, padding="same", activation=None, name="regression_conv13b_depthwise"), 103 | tf.keras.layers.Conv2D( 104 | filters=192, kernel_size=1, activation="relu", name="regression_conv13b_conv1x1") 105 | ], name="regression_conv13b") 106 | 107 | self.conv14a = BlazeBlock(block_num=6, channel=288, name_prefix="regression_conv14a_") # input res: 16 108 | self.conv14b = tf.keras.models.Sequential([ 109 | tf.keras.layers.DepthwiseConv2D( 110 | kernel_size=3, padding="same", activation=None, name="regression_conv14b_depthwise"), 111 | tf.keras.layers.Conv2D( 112 | filters=288, kernel_size=1, activation="relu", name="regression_conv14b_conv1x1") 113 | ], name="regression_conv14b") 114 | 115 | self.conv15 = tf.keras.models.Sequential([ 116 | BlazeBlock(block_num=7, channel=288, channel_padding=0, name_prefix="regression_conv15a_"), 117 | BlazeBlock(block_num=7, channel=288, channel_padding=0, name_prefix="regression_conv15b_") 118 | ], name="regression_conv15") 119 | 120 | self.conv16 = tf.keras.models.Sequential([ 121 | tf.keras.layers.Conv2D( 122 | filters=3*self.num_keypoints, kernel_size=2, activation=None), 123 | tf.keras.layers.Reshape((3*self.num_keypoints, 1), name="regression_final_dense") 124 | ], name="joints") 125 | 126 | def build_model(self, model_type): 127 | 128 | input_x = tf.keras.layers.Input(shape=(256, 256, 3)) 129 | 130 | # Block 1 131 | # In: 1x256x256x3 132 | x = self.conv1(input_x) 133 | 134 | # Block 2 135 | # In: 1x128x128x24 136 | x = x + self.conv2_1(x) 137 | x = tf.keras.activations.relu(x) 138 | 139 | # Block 3 140 | # In: 1x128x128x24 141 | x = x + self.conv2_2(x) 142 | y0 = tf.keras.activations.relu(x) 143 | 144 | # === Heatmap === 145 | 146 | # In: 1, 128, 128, 24 147 | y1 = self.conv3(y0) 148 | y2 = self.conv4(y1) 149 | y3 = self.conv5(y2) 150 | y4 = self.conv6(y3) 151 | 152 | x = self.conv7a(y4) + self.conv7b(y3) 153 | x = self.conv8a(x) + self.conv8b(y2) 154 | # In: 1, 32, 32, 96 155 | x = self.conv9a(x) + self.conv9b(y1) 156 | # In: 1, 64, 64, 48 157 | y = self.conv10a(x) + self.conv10b(y0) 158 | y = self.conv11(y) 159 | 160 | # In: 1, 128, 128, 8 161 | heatmap = tf.keras.layers.Activation("sigmoid", name="heatmap")(y) 162 | 163 | # === Regression === 164 | 165 | # Stop gradient for regression on 2-head model 166 | if model_type == "TWO_HEAD": 167 | x = tf.keras.backend.stop_gradient(x) 168 | y2 = tf.keras.backend.stop_gradient(y2) 169 | y3 = tf.keras.backend.stop_gradient(y3) 170 | y4 = tf.keras.backend.stop_gradient(y4) 171 | 172 | x = self.conv12a(x) + self.conv12b(y2) 173 | # In: 1, 32, 32, 96 174 | x = self.conv13a(x) + self.conv13b(y3) 175 | # In: 1, 16, 16, 192 176 | x = self.conv14a(x) + self.conv14b(y4) 177 | # In: 1, 8, 8, 288 178 | x = self.conv15(x) 179 | # In: 1, 2, 2, 288 180 | joints = self.conv16(x) 181 | 182 | if model_type == "TWO_HEAD": 183 | return 
Model(inputs=input_x, outputs=[joints, heatmap]) 184 | elif model_type == "HEATMAP": 185 | return Model(inputs=input_x, outputs=heatmap) 186 | elif model_type == "REGRESSION": 187 | return Model(inputs=input_x, outputs=joints) 188 | else: 189 | raise ValueError("Wrong model type.") 190 | -------------------------------------------------------------------------------- /src/models/blazepose_layers.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | 3 | 4 | class ChannelPadding(tf.keras.layers.Layer): 5 | def __init__(self, channels): 6 | super(ChannelPadding, self).__init__() 7 | self.channels = channels 8 | 9 | def build(self, input_shapes): 10 | self.pad_shape = tf.constant( 11 | [[0, 0], [0, 0], [0, 0], [0, self.channels - input_shapes[-1]]]) 12 | 13 | def call(self, x): 14 | return tf.pad(x, self.pad_shape) 15 | 16 | 17 | class BlazeBlock(tf.keras.Model): 18 | def __init__(self, block_num=3, channel=48, channel_padding=1, name_prefix=""): 19 | super(BlazeBlock, self).__init__() 20 | 21 | self.downsample_a = tf.keras.models.Sequential([ 22 | tf.keras.layers.DepthwiseConv2D(kernel_size=3, strides=( 23 | 2, 2), padding='same', activation=None, name=name_prefix+"downsample_a_depthwise"), 24 | tf.keras.layers.Conv2D( 25 | filters=channel, kernel_size=1, activation=None, name=name_prefix+"downsample_a_conv1x1") 26 | ]) 27 | if channel_padding: 28 | self.downsample_b = tf.keras.models.Sequential([ 29 | tf.keras.layers.MaxPool2D(pool_size=(2, 2)), 30 | ChannelPadding(channels=channel) 31 | ]) 32 | else: 33 | self.downsample_b = tf.keras.layers.MaxPool2D(pool_size=(2, 2)) 34 | 35 | self.conv = list() 36 | for i in range(block_num): 37 | self.conv.append(tf.keras.models.Sequential([ 38 | tf.keras.layers.DepthwiseConv2D( 39 | kernel_size=3, padding='same', activation=None, name=name_prefix+"conv_block_{}".format(i+1)), 40 | tf.keras.layers.Conv2D( 41 | filters=channel, kernel_size=1, activation=None) 42 | ])) 43 | 44 | def call(self, x): 45 | x = tf.keras.activations.relu( 46 | self.downsample_a(x) + self.downsample_b(x)) 47 | for i in range(len(self.conv)): 48 | x = tf.keras.activations.relu(x + self.conv[i](x)) 49 | return x 50 | -------------------------------------------------------------------------------- /src/models/blazepose_legacy.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | from tensorflow.keras.models import Model 3 | from .blazepose_layers import BlazeBlock 4 | 5 | 6 | class BlazePose(): 7 | def __init__(self, num_keypoints: int): 8 | 9 | self.num_keypoints = num_keypoints 10 | self.conv1 = tf.keras.layers.Conv2D( 11 | filters=24, kernel_size=3, strides=(2, 2), padding='same', activation='relu' 12 | ) 13 | 14 | self.conv2_1 = tf.keras.models.Sequential([ 15 | tf.keras.layers.DepthwiseConv2D( 16 | kernel_size=3, padding='same', activation=None), 17 | tf.keras.layers.Conv2D(filters=24, kernel_size=1, activation=None) 18 | ]) 19 | 20 | self.conv2_2 = tf.keras.models.Sequential([ 21 | tf.keras.layers.DepthwiseConv2D( 22 | kernel_size=3, padding='same', activation=None), 23 | tf.keras.layers.Conv2D(filters=24, kernel_size=1, activation=None) 24 | ]) 25 | 26 | # === Heatmap === 27 | 28 | self.conv3 = BlazeBlock(block_num=3, channel=48) # input res: 128 29 | self.conv4 = BlazeBlock(block_num=4, channel=96) # input res: 64 30 | self.conv5 = BlazeBlock(block_num=5, channel=192) # input res: 32 31 | self.conv6 = BlazeBlock(block_num=6, channel=288) # input res: 
16 32 | 33 | self.conv7a = tf.keras.models.Sequential([ 34 | tf.keras.layers.DepthwiseConv2D( 35 | kernel_size=3, padding="same", activation=None), 36 | tf.keras.layers.Conv2D( 37 | filters=48, kernel_size=1, activation="relu"), 38 | tf.keras.layers.UpSampling2D(size=(2, 2), interpolation="bilinear") 39 | ]) 40 | self.conv7b = tf.keras.models.Sequential([ 41 | tf.keras.layers.DepthwiseConv2D( 42 | kernel_size=3, padding="same", activation=None), 43 | tf.keras.layers.Conv2D( 44 | filters=48, kernel_size=1, activation="relu") 45 | ]) 46 | 47 | self.conv8a = tf.keras.layers.UpSampling2D( 48 | size=(2, 2), interpolation="bilinear") 49 | self.conv8b = tf.keras.models.Sequential([ 50 | tf.keras.layers.DepthwiseConv2D( 51 | kernel_size=3, padding="same", activation=None), 52 | tf.keras.layers.Conv2D( 53 | filters=48, kernel_size=1, activation="relu") 54 | ]) 55 | 56 | self.conv9a = tf.keras.layers.UpSampling2D( 57 | size=(2, 2), interpolation="bilinear") 58 | self.conv9b = tf.keras.models.Sequential([ 59 | tf.keras.layers.DepthwiseConv2D( 60 | kernel_size=3, padding="same", activation=None), 61 | tf.keras.layers.Conv2D( 62 | filters=48, kernel_size=1, activation="relu") 63 | ]) 64 | 65 | self.conv10a = tf.keras.models.Sequential([ 66 | tf.keras.layers.DepthwiseConv2D( 67 | kernel_size=3, padding="same", activation=None), 68 | tf.keras.layers.Conv2D( 69 | filters=8, kernel_size=1, activation="relu"), 70 | tf.keras.layers.UpSampling2D(size=(2, 2), interpolation="bilinear") 71 | ]) 72 | self.conv10b = tf.keras.models.Sequential([ 73 | tf.keras.layers.DepthwiseConv2D( 74 | kernel_size=3, padding="same", activation=None), 75 | tf.keras.layers.Conv2D(filters=8, kernel_size=1, activation="relu") 76 | ]) 77 | 78 | self.conv11 = tf.keras.models.Sequential([ 79 | tf.keras.layers.DepthwiseConv2D( 80 | kernel_size=3, padding="same", activation=None), 81 | tf.keras.layers.Conv2D( 82 | filters=8, kernel_size=1, activation="relu"), 83 | # heatmap 84 | tf.keras.layers.Conv2D( 85 | filters=self.num_keypoints, kernel_size=3, padding="same", activation=None) 86 | ]) 87 | 88 | # === Regression === 89 | 90 | # In: 1, 64, 64, 48) 91 | self.conv12a = BlazeBlock(block_num=4, channel=96, name_prefix="regression_conv12a_") # input res: 64 92 | self.conv12b = tf.keras.models.Sequential([ 93 | tf.keras.layers.DepthwiseConv2D( 94 | kernel_size=3, padding="same", activation=None, name="regression_conv12b_depthwise"), 95 | tf.keras.layers.Conv2D( 96 | filters=96, kernel_size=1, activation="relu", name="regression_conv12b_conv1x1") 97 | ], name="regression_conv12b") 98 | 99 | self.conv13a = BlazeBlock(block_num=5, channel=192, name_prefix="regression_conv13a_") # input res: 32 100 | self.conv13b = tf.keras.models.Sequential([ 101 | tf.keras.layers.DepthwiseConv2D( 102 | kernel_size=3, padding="same", activation=None, name="regression_conv13b_depthwise"), 103 | tf.keras.layers.Conv2D( 104 | filters=192, kernel_size=1, activation="relu", name="regression_conv13b_conv1x1") 105 | ], name="regression_conv13b") 106 | 107 | self.conv14a = BlazeBlock(block_num=6, channel=288, name_prefix="regression_conv14a_") # input res: 16 108 | self.conv14b = tf.keras.models.Sequential([ 109 | tf.keras.layers.DepthwiseConv2D( 110 | kernel_size=3, padding="same", activation=None, name="regression_conv14b_depthwise"), 111 | tf.keras.layers.Conv2D( 112 | filters=288, kernel_size=1, activation="relu", name="regression_conv14b_conv1x1") 113 | ], name="regression_conv14b") 114 | 115 | self.conv15 = tf.keras.models.Sequential([ 116 | 
BlazeBlock(block_num=7, channel=288, channel_padding=0, name_prefix="regression_conv15a_"), 117 | BlazeBlock(block_num=7, channel=288, channel_padding=0, name_prefix="regression_conv15b_") 118 | ], name="regression_conv15") 119 | 120 | self.conv16 = tf.keras.models.Sequential([ 121 | tf.keras.layers.GlobalAveragePooling2D(), 122 | # In: 1, 1, 1, 288 123 | tf.keras.layers.Dense(units=3*self.num_keypoints, 124 | activation=None, name="regression_final_dense"), 125 | ], name="regression_conv16") 126 | 127 | def build_model(self, model_type): 128 | 129 | input_x = tf.keras.layers.Input(shape=(256, 256, 3)) 130 | 131 | # Block 1 132 | # In: 1x256x256x3 133 | x = self.conv1(input_x) 134 | 135 | # Block 2 136 | # In: 1x128x128x24 137 | x = x + self.conv2_1(x) 138 | x = tf.keras.activations.relu(x) 139 | 140 | # Block 3 141 | # In: 1x128x128x24 142 | x = x + self.conv2_2(x) 143 | y0 = tf.keras.activations.relu(x) 144 | 145 | # === Heatmap === 146 | 147 | # In: 1, 128, 128, 24 148 | y1 = self.conv3(y0) 149 | y2 = self.conv4(y1) 150 | y3 = self.conv5(y2) 151 | y4 = self.conv6(y3) 152 | 153 | x = self.conv7a(y4) + self.conv7b(y3) 154 | x = self.conv8a(x) + self.conv8b(y2) 155 | # In: 1, 32, 32, 96 156 | x = self.conv9a(x) + self.conv9b(y1) 157 | # In: 1, 64, 64, 48 158 | y = self.conv10a(x) + self.conv10b(y0) 159 | # In: 1, 128, 128, 8 160 | heatmap = tf.keras.layers.Activation("sigmoid", name="heatmap")(self.conv11(y)) 161 | 162 | # === Regression === 163 | 164 | # Stop gradient for regression on 2-head model 165 | if model_type == "TWO_HEAD": 166 | x = tf.keras.backend.stop_gradient(x) 167 | y2 = tf.keras.backend.stop_gradient(y2) 168 | y3 = tf.keras.backend.stop_gradient(y3) 169 | y4 = tf.keras.backend.stop_gradient(y4) 170 | 171 | x = self.conv12a(x) + self.conv12b(y2) 172 | # In: 1, 32, 32, 96 173 | x = self.conv13a(x) + self.conv13b(y3) 174 | # In: 1, 16, 16, 192 175 | x = self.conv14a(x) + self.conv14b(y4) 176 | # In: 1, 8, 8, 288 177 | x = self.conv15(x) 178 | # In: 1, 2, 2, 288 179 | joints = self.conv16(x) 180 | joints = tf.keras.layers.Activation("sigmoid", name="joints")(joints) 181 | 182 | if model_type == "TWO_HEAD": 183 | return Model(inputs=input_x, outputs=[joints, heatmap]) 184 | elif model_type == "HEATMAP": 185 | return Model(inputs=input_x, outputs=heatmap) 186 | elif model_type == "REGRESSION": 187 | return Model(inputs=input_x, outputs=joints) 188 | else: 189 | raise ValueError("Wrong model type.") 190 | -------------------------------------------------------------------------------- /src/models/blazepose_with_pushup_classify.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | from tensorflow.keras.models import Model 3 | from .blazepose_layers import BlazeBlock 4 | 5 | 6 | class BlazePose(): 7 | def __init__(self, num_keypoints: int): 8 | 9 | self.num_keypoints = num_keypoints 10 | 11 | self.conv1 = tf.keras.layers.Conv2D( 12 | filters=24, kernel_size=3, strides=(2, 2), padding='same', activation='relu' 13 | ) 14 | 15 | self.conv2_1 = tf.keras.models.Sequential([ 16 | tf.keras.layers.DepthwiseConv2D( 17 | kernel_size=3, padding='same', activation=None), 18 | tf.keras.layers.Conv2D(filters=24, kernel_size=1, activation=None) 19 | ]) 20 | 21 | self.conv2_2 = tf.keras.models.Sequential([ 22 | tf.keras.layers.DepthwiseConv2D( 23 | kernel_size=3, padding='same', activation=None), 24 | tf.keras.layers.Conv2D(filters=24, kernel_size=1, activation=None) 25 | ]) 26 | 27 | # === Heatmap === 28 | 29 | self.conv3 = 
BlazeBlock(block_num=3, channel=48) 30 | self.conv4 = BlazeBlock(block_num=4, channel=96) 31 | self.conv5 = BlazeBlock(block_num=5, channel=192) 32 | self.conv6 = BlazeBlock(block_num=6, channel=288) 33 | 34 | self.conv7a = tf.keras.models.Sequential([ 35 | tf.keras.layers.DepthwiseConv2D( 36 | kernel_size=3, padding="same", activation=None), 37 | tf.keras.layers.Conv2D( 38 | filters=48, kernel_size=1, activation="relu"), 39 | tf.keras.layers.UpSampling2D(size=(2, 2), interpolation="bilinear") 40 | ]) 41 | self.conv7b = tf.keras.models.Sequential([ 42 | tf.keras.layers.DepthwiseConv2D( 43 | kernel_size=3, padding="same", activation=None), 44 | tf.keras.layers.Conv2D( 45 | filters=48, kernel_size=1, activation="relu") 46 | ]) 47 | 48 | self.conv8a = tf.keras.layers.UpSampling2D( 49 | size=(2, 2), interpolation="bilinear") 50 | self.conv8b = tf.keras.models.Sequential([ 51 | tf.keras.layers.DepthwiseConv2D( 52 | kernel_size=3, padding="same", activation=None), 53 | tf.keras.layers.Conv2D( 54 | filters=48, kernel_size=1, activation="relu") 55 | ]) 56 | 57 | self.conv9a = tf.keras.layers.UpSampling2D( 58 | size=(2, 2), interpolation="bilinear") 59 | self.conv9b = tf.keras.models.Sequential([ 60 | tf.keras.layers.DepthwiseConv2D( 61 | kernel_size=3, padding="same", activation=None), 62 | tf.keras.layers.Conv2D( 63 | filters=48, kernel_size=1, activation="relu") 64 | ]) 65 | 66 | self.conv10a = tf.keras.models.Sequential([ 67 | tf.keras.layers.DepthwiseConv2D( 68 | kernel_size=3, padding="same", activation=None), 69 | tf.keras.layers.Conv2D( 70 | filters=8, kernel_size=1, activation="relu"), 71 | tf.keras.layers.UpSampling2D(size=(2, 2), interpolation="bilinear") 72 | ]) 73 | self.conv10b = tf.keras.models.Sequential([ 74 | tf.keras.layers.DepthwiseConv2D( 75 | kernel_size=3, padding="same", activation=None), 76 | tf.keras.layers.Conv2D(filters=8, kernel_size=1, activation="relu") 77 | ]) 78 | 79 | self.conv11 = tf.keras.models.Sequential([ 80 | tf.keras.layers.DepthwiseConv2D( 81 | kernel_size=3, padding="same", activation=None), 82 | tf.keras.layers.Conv2D( 83 | filters=8, kernel_size=1, activation="relu"), 84 | tf.keras.layers.Conv2D( 85 | filters=self.num_keypoints, kernel_size=3, padding="same", activation=None) # -> Heatmap output 86 | ]) 87 | 88 | # === Regression === 89 | self.conv12 = tf.keras.models.Sequential([ 90 | tf.keras.layers.DepthwiseConv2D( 91 | kernel_size=3, padding="same", activation=None), 92 | tf.keras.layers.Conv2D( 93 | filters=48, kernel_size=1, activation="relu"), 94 | tf.keras.layers.DepthwiseConv2D( 95 | kernel_size=3, padding="same", activation=None), 96 | tf.keras.layers.Conv2D( 97 | filters=48, kernel_size=1, activation="relu"), 98 | ], name="regression_1") 99 | self.conv13 = tf.keras.models.Sequential([ 100 | tf.keras.layers.GlobalAveragePooling2D(), 101 | tf.keras.layers.Dropout(0.2), 102 | tf.keras.layers.Dense(units=1, activation=None, name="regression_2") 103 | ]) 104 | 105 | def build_model(self, model_type): 106 | 107 | input_x = tf.keras.layers.Input(shape=(256, 256, 3)) 108 | 109 | # Block 1 110 | # In: 1x256x256x3 111 | x = self.conv1(input_x) 112 | 113 | # Block 2 114 | # In: 1x128x128x24 115 | x = x + self.conv2_1(x) 116 | x = tf.keras.activations.relu(x) 117 | 118 | # Block 3 119 | # In: 1x128x128x24 120 | x = x + self.conv2_2(x) 121 | y0 = tf.keras.activations.relu(x) 122 | 123 | # === Heatmap === 124 | 125 | # In: 1, 128, 128, 24 126 | y1 = self.conv3(y0) 127 | y2 = self.conv4(y1) 128 | y3 = self.conv5(y2) 129 | y4 = self.conv6(y3) 130 | 131 | x = 
self.conv7a(y4) + self.conv7b(y3) 132 | x = self.conv8a(x) + self.conv8b(y2) 133 | # In: 1, 32, 32, 96 134 | x = self.conv9a(x) + self.conv9b(y1) 135 | # In: 1, 64, 64, 48 136 | y = self.conv10a(x) + self.conv10b(y0) 137 | y = self.conv11(y) 138 | 139 | # In: 1, 128, 128, 8 140 | heatmap = tf.keras.layers.Activation("sigmoid", name="heatmap")(y) 141 | 142 | # === Regression === 143 | 144 | # Stop gradient for regression on 2-head model 145 | # x = tf.keras.backend.stop_gradient(y4) 146 | x = self.conv12(y4) 147 | x = self.conv13(x) 148 | is_pushup = tf.keras.layers.Activation("sigmoid", name="is_pushup")(x) 149 | 150 | return Model(inputs=input_x, outputs=[heatmap, is_pushup]) -------------------------------------------------------------------------------- /src/models/pushup_recognition.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | from tensorflow.keras.models import Model 3 | 4 | class PushUpRecognition(): 5 | 6 | @staticmethod 7 | def build_model(): 8 | 9 | input_x = tf.keras.layers.Input(shape=(224, 224, 3)) 10 | x = tf.keras.applications.MobileNetV2(input_shape=(224, 224, 3), 11 | include_top=False, 12 | weights='imagenet', 13 | alpha=0.5)(input_x) 14 | x = tf.keras.layers.GlobalAveragePooling2D()(x) 15 | x = tf.keras.layers.Dropout(0.5)(x) 16 | x = tf.keras.layers.Dense(1, activation="sigmoid")(x) 17 | 18 | return Model(inputs=input_x, outputs=x) -------------------------------------------------------------------------------- /src/train_phase.py: -------------------------------------------------------------------------------- 1 | import enum 2 | 3 | 4 | class TrainPhase(enum.Enum): 5 | HEATMAP = "HEATMAP" 6 | REGRESSION = "REGRESSION" 7 | UNKNOWN = "UNKNOWN" -------------------------------------------------------------------------------- /src/trainers/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vietanhdev/tf-blazepose/95db5ced1a4084c6a424161fbfc38c4e70612c97/src/trainers/__init__.py -------------------------------------------------------------------------------- /src/trainers/blazepose_trainer.py: -------------------------------------------------------------------------------- 1 | import os 2 | import pathlib 3 | import importlib 4 | 5 | import tensorflow as tf 6 | from tensorflow.keras.callbacks import ModelCheckpoint, TensorBoard 7 | 8 | from ..train_phase import TrainPhase 9 | from ..models import ModelCreator 10 | 11 | from .losses import euclidean_distance_loss, focal_tversky, focal_loss, get_huber_loss, get_wing_loss 12 | from ..metrics.pck import get_pck_metric 13 | from ..metrics.mae import get_mae_metric 14 | 15 | def train(config): 16 | """Train model 17 | 18 | Args: 19 | config (dict): Training configuration from configuration file 20 | """ 21 | 22 | import tensorflow as tf 23 | 24 | train_config = config["train"] 25 | test_config = config["test"] 26 | model_config = config["model"] 27 | 28 | # Dataloader 29 | datalib = importlib.import_module("src.data_loaders.{}".format(config["data_loader"])) 30 | DataSequence = datalib.DataSequence 31 | 32 | # Initialize model 33 | model = ModelCreator.create_model(model_config["model_type"], model_config["num_keypoints"]) 34 | 35 | # Freeze regression branch when training heatmap 36 | train_phase = TrainPhase(train_config.get("train_phase", "UNKNOWN")) 37 | if train_phase == train_phase.HEATMAP: 38 | print("Freeze these layers:") 39 | for layer in model.layers: 40 | if 
layer.name.startswith("regression"): 41 | print(layer.name) 42 | layer.trainable = False 43 | # Freeze heatmap branch when training regression 44 | elif train_phase == train_phase.REGRESSION: 45 | print("Freeze these layers:") 46 | for layer in model.layers: 47 | if not layer.name.startswith("regression"): 48 | print(layer.name) 49 | layer.trainable = False 50 | 51 | print(model.summary()) 52 | 53 | loss_functions = { 54 | "heatmap": train_config["heatmap_loss"], 55 | "joints": train_config["keypoint_loss"] 56 | } 57 | 58 | # Replace all names with functions for custom losses 59 | for k in loss_functions.keys(): 60 | if loss_functions[k] == "euclidean_distance_loss": 61 | loss_functions[k] = euclidean_distance_loss 62 | elif loss_functions[k] == "focal_tversky": 63 | loss_functions[k] = focal_tversky 64 | elif loss_functions[k] == "huber": 65 | loss_functions[k] = get_huber_loss(delta=1.0, weights=(1.0, 1.0)) 66 | elif loss_functions[k] == "focal": 67 | loss_functions[k] = focal_loss(gamma=2, alpha=0.25) 68 | elif loss_functions[k] == "wing_loss": 69 | loss_functions[k] = get_wing_loss() 70 | 71 | 72 | loss_weights = train_config["loss_weights"] 73 | hm_pck_metric = get_pck_metric(ref_point_pair=test_config["pck_ref_points_idxs"], thresh=test_config["pck_thresh"])(name="pck1") 74 | hm_mae_metric = get_mae_metric()(name="mae1") 75 | kp_pck_metric = get_pck_metric(ref_point_pair=test_config["pck_ref_points_idxs"], thresh=test_config["pck_thresh"])(name="pck2") 76 | kp_mae_metric = get_mae_metric()(name="mae2") 77 | model.compile(optimizer=tf.optimizers.SGD(train_config["learning_rate"], momentum=0.9), 78 | loss=loss_functions, loss_weights=loss_weights, metrics={"heatmap": [hm_pck_metric, hm_mae_metric], "joints": [kp_pck_metric, kp_mae_metric]}) 79 | 80 | # Load pretrained model 81 | if train_config["load_weights"]: 82 | print("Loading model weights: " + 83 | train_config["pretrained_weights_path"]) 84 | model.load_weights(train_config["pretrained_weights_path"]) 85 | 86 | # Create experiment folder 87 | exp_path = os.path.join("experiments/{}".format(config["experiment_name"])) 88 | pathlib.Path(exp_path).mkdir(parents=True, exist_ok=True) 89 | 90 | # Define the callbacks 91 | tb_log_path = os.path.join(exp_path, "tb_logs") 92 | tb = TensorBoard(log_dir=tb_log_path, write_graph=True) 93 | model_folder_path = os.path.join(exp_path, "models") 94 | pathlib.Path(model_folder_path).mkdir(parents=True, exist_ok=True) 95 | mc = ModelCheckpoint(filepath=os.path.join( 96 | model_folder_path, "model_ep{epoch:03d}.h5"), save_weights_only=True, save_format="h5", verbose=1) 97 | 98 | # Load data 99 | train_dataset = DataSequence( 100 | config["data"]["train_images"], 101 | config["data"]["train_labels"], 102 | batch_size=train_config["train_batch_size"], 103 | input_size=(model_config["im_width"], model_config["im_height"]), 104 | heatmap_size=(model_config["heatmap_width"], model_config["heatmap_height"]), 105 | heatmap_sigma=model_config["heatmap_kp_sigma"], 106 | n_points=model_config["num_keypoints"], 107 | symmetry_point_ids=config["data"]["symmetry_point_ids"], 108 | shuffle=True, augment=True, random_flip=True, random_rotate=True, 109 | clip_landmark=train_config["keypoint_loss"] == "binary_crossentropy", 110 | random_scale_on_crop=True) 111 | val_dataset = DataSequence( 112 | config["data"]["val_images"], 113 | config["data"]["val_labels"], 114 | batch_size=train_config["val_batch_size"], 115 | input_size=(model_config["im_width"], model_config["im_height"]), 116 | 
heatmap_size=(model_config["heatmap_width"], model_config["heatmap_height"]), 117 | heatmap_sigma=model_config["heatmap_kp_sigma"], 118 | n_points=model_config["num_keypoints"], 119 | symmetry_point_ids=config["data"]["symmetry_point_ids"], 120 | shuffle=False, augment=False, random_flip=False, random_rotate=False, 121 | clip_landmark=train_config["keypoint_loss"] == "binary_crossentropy",random_scale_on_crop=False) 122 | 123 | test_dataset = DataSequence( 124 | config["data"]["test_images"], 125 | config["data"]["test_labels"], 126 | batch_size=train_config["val_batch_size"], 127 | input_size=(model_config["im_width"], model_config["im_height"]), 128 | heatmap_size=(model_config["heatmap_width"], model_config["heatmap_height"]), 129 | heatmap_sigma=model_config["heatmap_kp_sigma"], 130 | n_points=model_config["num_keypoints"], 131 | symmetry_point_ids=config["data"]["symmetry_point_ids"], 132 | shuffle=False, augment=False, random_flip=False, random_rotate=False, 133 | clip_landmark=train_config["keypoint_loss"] == "binary_crossentropy",random_scale_on_crop=False) 134 | 135 | # Initial epoch. Use when continue training 136 | initial_epoch = train_config.get("initial_epoch", 0) 137 | 138 | 139 | # Train 140 | model.fit(train_dataset, 141 | epochs=train_config["nb_epochs"], 142 | steps_per_epoch=len(train_dataset), 143 | validation_data=val_dataset, 144 | validation_steps=len(val_dataset), 145 | callbacks=[tb, mc], 146 | initial_epoch=initial_epoch, 147 | verbose=1) 148 | 149 | 150 | def load_model(config, model_path): 151 | """Load pretrained model 152 | 153 | Args: 154 | config (dict): Model configuration 155 | model (str): Path to h5 model to be tested 156 | """ 157 | 158 | model_config = config["model"] 159 | 160 | # Initialize model and load weights 161 | model = ModelCreator.create_model(model_config["model_type"], model_config["num_keypoints"]) 162 | model.compile() 163 | model.load_weights(model_path) 164 | 165 | return model -------------------------------------------------------------------------------- /src/trainers/losses.py: -------------------------------------------------------------------------------- 1 | from tensorflow.keras.losses import binary_crossentropy 2 | import tensorflow.keras.backend as K 3 | import tensorflow as tf 4 | import math 5 | 6 | epsilon = 1e-5 7 | smooth = 1 8 | 9 | def focal_loss(gamma=2., alpha=.25): 10 | def focal_loss_fixed(y_true, y_pred): 11 | pt_1 = tf.where(tf.equal(y_true, 1), y_pred, tf.ones_like(y_pred)) 12 | pt_0 = tf.where(tf.equal(y_true, 0), y_pred, tf.zeros_like(y_pred)) 13 | return -K.mean(alpha * K.pow(1. - pt_1, gamma) * K.log(pt_1)) - K.mean((1 - alpha) * K.pow(pt_0, gamma) * K.log(1. - pt_0)) 14 | return focal_loss_fixed 15 | 16 | def dsc(y_true, y_pred): 17 | smooth = 1. 18 | y_true_f = K.flatten(y_true) 19 | y_pred_f = K.flatten(y_pred) 20 | intersection = K.sum(y_true_f * y_pred_f) 21 | score = (2. 
* intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth) 22 | return score 23 | 24 | def dice_loss(y_true, y_pred): 25 | loss = 1 - dsc(y_true, y_pred) 26 | return loss 27 | 28 | def bce_dice_loss(y_true, y_pred): 29 | loss = binary_crossentropy(y_true, y_pred) + dice_loss(y_true, y_pred) 30 | return loss 31 | 32 | def confusion(y_true, y_pred): 33 | smooth=1 34 | y_pred_pos = K.clip(y_pred, 0, 1) 35 | y_pred_neg = 1 - y_pred_pos 36 | y_pos = K.clip(y_true, 0, 1) 37 | y_neg = 1 - y_pos 38 | tp = K.sum(y_pos * y_pred_pos) 39 | fp = K.sum(y_neg * y_pred_pos) 40 | fn = K.sum(y_pos * y_pred_neg) 41 | prec = (tp + smooth)/(tp+fp+smooth) 42 | recall = (tp+smooth)/(tp+fn+smooth) 43 | return prec, recall 44 | 45 | def tp(y_true, y_pred): 46 | smooth = 1 47 | y_pred_pos = K.round(K.clip(y_pred, 0, 1)) 48 | y_pos = K.round(K.clip(y_true, 0, 1)) 49 | tp = (K.sum(y_pos * y_pred_pos) + smooth)/ (K.sum(y_pos) + smooth) 50 | return tp 51 | 52 | def tn(y_true, y_pred): 53 | smooth = 1 54 | y_pred_pos = K.round(K.clip(y_pred, 0, 1)) 55 | y_pred_neg = 1 - y_pred_pos 56 | y_pos = K.round(K.clip(y_true, 0, 1)) 57 | y_neg = 1 - y_pos 58 | tn = (K.sum(y_neg * y_pred_neg) + smooth) / (K.sum(y_neg) + smooth ) 59 | return tn 60 | 61 | def tversky(y_true, y_pred): 62 | y_true_pos = K.flatten(y_true) 63 | y_pred_pos = K.flatten(y_pred) 64 | true_pos = K.sum(y_true_pos * y_pred_pos) 65 | false_neg = K.sum(y_true_pos * (1-y_pred_pos)) 66 | false_pos = K.sum((1-y_true_pos)*y_pred_pos) 67 | alpha = 0.7 68 | return (true_pos + smooth)/(true_pos + alpha*false_neg + (1-alpha)*false_pos + smooth) 69 | 70 | def tversky_loss(y_true, y_pred): 71 | return 1 - tversky(y_true,y_pred) 72 | 73 | def focal_tversky(y_true,y_pred): 74 | pt_1 = tversky(y_true, y_pred) 75 | gamma = 0.75 76 | return K.pow((1-pt_1), gamma) 77 | 78 | def euclidean_distance_loss(y_true, y_pred): 79 | """ 80 | Euclidean distance loss 81 | https://en.wikipedia.org/wiki/Euclidean_distance 82 | :param y_true: TensorFlow tensor 83 | :param y_pred: TensorFlow tensor of the same shape as y_true 84 | :return: float 85 | """ 86 | return K.sqrt(K.sum(K.square(y_pred - y_true), axis=-1)) 87 | 88 | def wing_loss(landmarks, labels, w=10.0, epsilon=2.0): 89 | """ 90 | Arguments: 91 | landmarks, labels: float tensors with shape [batch_size, num_landmarks, 2]. 92 | w, epsilon: a float numbers. 93 | Returns: 94 | a float tensor with shape []. 95 | """ 96 | with tf.name_scope('wing_loss'): 97 | x = landmarks - labels 98 | c = w * (1.0 - math.log(1.0 + w/epsilon)) 99 | absolute_x = tf.abs(x) 100 | losses = tf.where( 101 | tf.greater(w, absolute_x), 102 | w * tf.log(1.0 + absolute_x/epsilon), 103 | absolute_x - c 104 | ) 105 | loss = tf.reduce_mean(tf.reduce_sum(losses, axis=[1, 2]), axis=0) 106 | return loss 107 | 108 | 109 | def get_huber_loss2(delta=1.0, weights=1.0): 110 | # https://www.tensorflow.org/api_docs/python/tf/compat/v1/losses/huber_loss 111 | 112 | def huber_loss(y_true, y_pred): 113 | return tf.compat.v1.losses.huber_loss( 114 | y_true, y_pred, weights=weights, delta=delta 115 | ) 116 | 117 | return huber_loss 118 | 119 | def get_huber_loss(delta=1.0, weights=(1.0, 100.0)): 120 | ''' 121 | ' Huber loss. 
122 | ' https://jaromiru.com/2017/05/27/on-using-huber-loss-in-deep-q-learning/ 123 | ' https://en.wikipedia.org/wiki/Huber_loss 124 | ''' 125 | def huber_loss(y_true, y_pred, clip_delta=delta, weights=weights): 126 | error = y_true - y_pred 127 | cond = tf.keras.backend.abs(error) < clip_delta 128 | squared_loss = 0.5 * tf.keras.backend.square(error) 129 | linear_loss = clip_delta * (tf.keras.backend.abs(error) - 0.5 * clip_delta) 130 | total_loss = tf.where(cond, squared_loss, linear_loss) 131 | weights = (y_true * weights[1]) + weights[0] 132 | total_loss = total_loss * weights 133 | return total_loss 134 | 135 | ''' 136 | ' Same as above but returns the mean loss. 137 | ''' 138 | def huber_loss_mean(y_true, y_pred, clip_delta=delta): 139 | return tf.keras.backend.mean(huber_loss(y_true, y_pred, clip_delta)) 140 | 141 | return huber_loss 142 | 143 | 144 | def get_wing_loss(w=10.0, epsilon=2.0): 145 | """ 146 | Arguments: 147 | landmarks, labels: float tensors with shape [batch_size, num_landmarks, 2]. 148 | w, epsilon: a float numbers. 149 | Returns: 150 | a float tensor with shape []. 151 | """ 152 | 153 | def wing_loss(y_true, y_pred): 154 | with tf.name_scope('wing_loss'): 155 | x = y_pred - y_true 156 | c = w * (1.0 - math.log(1.0 + w/epsilon)) 157 | absolute_x = tf.abs(x) 158 | losses = tf.where( 159 | tf.greater(w, absolute_x), 160 | w * tf.log(1.0 + absolute_x/epsilon), 161 | absolute_x - c 162 | ) 163 | loss = tf.reduce_mean(tf.reduce_sum(losses, axis=[1, 2]), axis=0) 164 | return loss 165 | -------------------------------------------------------------------------------- /src/trainers/pushup_recognition_trainer.py: -------------------------------------------------------------------------------- 1 | import os 2 | import pathlib 3 | import importlib 4 | 5 | import tensorflow as tf 6 | from tensorflow.keras.callbacks import ModelCheckpoint, TensorBoard 7 | 8 | from ..models import ModelCreator 9 | 10 | from .losses import euclidean_distance_loss, focal_tversky, focal_loss, get_huber_loss 11 | from ..metrics.f1 import F1_Score, Recall, Precision 12 | 13 | def train(config): 14 | """Train model 15 | 16 | Args: 17 | config (dict): Training configuration from configuration file 18 | """ 19 | 20 | train_config = config["train"] 21 | test_config = config["test"] 22 | model_config = config["model"] 23 | 24 | # Dataloader 25 | datalib = importlib.import_module("src.data_loaders.{}".format(config["data_loader"])) 26 | DataSequence = datalib.DataSequence 27 | 28 | # Initialize model 29 | model = ModelCreator.create_model(model_config["model_type"]) 30 | 31 | print(model.summary()) 32 | model.compile(optimizer=tf.optimizers.Adam(train_config["learning_rate"]), 33 | loss=train_config["loss"], metrics=[F1_Score(), Recall(), Precision()]) 34 | 35 | 36 | # Load pretrained model 37 | if train_config["load_weights"]: 38 | print("Loading model weights: " + 39 | train_config["pretrained_weights_path"]) 40 | model.load_weights(train_config["pretrained_weights_path"]) 41 | 42 | # Create experiment folder 43 | exp_path = os.path.join("experiments/{}".format(config["experiment_name"])) 44 | pathlib.Path(exp_path).mkdir(parents=True, exist_ok=True) 45 | 46 | # Define the callbacks 47 | tb_log_path = os.path.join(exp_path, "tb_logs") 48 | tb = TensorBoard(log_dir=tb_log_path, write_graph=True) 49 | model_folder_path = os.path.join(exp_path, "models") 50 | pathlib.Path(model_folder_path).mkdir(parents=True, exist_ok=True) 51 | mc = ModelCheckpoint(filepath=os.path.join( 52 | model_folder_path, 
"model_ep{epoch:03d}.h5"), save_weights_only=False, save_format="h5", verbose=1) 53 | 54 | # Load data 55 | train_dataset = DataSequence( 56 | config["data"]["train_images"], 57 | config["data"]["train_labels"], 58 | batch_size=train_config["train_batch_size"], 59 | input_size=(model_config["im_width"], model_config["im_height"]), 60 | shuffle=True, augment=True, random_flip=True, random_rotate=True, 61 | random_scale_on_crop=True) 62 | val_dataset = DataSequence( 63 | config["data"]["val_images"], 64 | config["data"]["val_labels"], 65 | batch_size=train_config["val_batch_size"], 66 | input_size=(model_config["im_width"], model_config["im_height"]), 67 | shuffle=False, augment=False, random_flip=False, random_rotate=False,random_scale_on_crop=False) 68 | 69 | # Initial epoch. Use when continue training 70 | initial_epoch = train_config.get("initial_epoch", 0) 71 | 72 | # Train 73 | model.fit(train_dataset, 74 | epochs=train_config["nb_epochs"], 75 | steps_per_epoch=len(train_dataset), 76 | validation_data=val_dataset, 77 | validation_steps=len(val_dataset), 78 | callbacks=[tb, mc], 79 | initial_epoch=initial_epoch, 80 | verbose=1) 81 | 82 | 83 | def load_model(config, model_path): 84 | """Load pretrained model 85 | 86 | Args: 87 | config (dict): Model configuration 88 | model (str): Path to h5 model to be tested 89 | """ 90 | 91 | model_config = config["model"] 92 | 93 | # Initialize model and load weights 94 | model = ModelCreator.create_model(model_config["model_type"]) 95 | model.compile() 96 | model.load_weights(model_path) 97 | 98 | return model -------------------------------------------------------------------------------- /src/utils/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vietanhdev/tf-blazepose/95db5ced1a4084c6a424161fbfc38c4e70612c97/src/utils/__init__.py -------------------------------------------------------------------------------- /src/utils/heatmap.py: -------------------------------------------------------------------------------- 1 | from scipy.ndimage import gaussian_filter, maximum_filter 2 | import numpy as np 3 | import tensorflow as tf 4 | 5 | 6 | def gen_point_heatmap(img, pt, sigma, type='Gaussian'): 7 | """Draw label map for 1 point 8 | 9 | Args: 10 | img: Input image 11 | pt: Point in format (x, y) 12 | sigma: Sigma param in Gaussian or Cauchy kernel 13 | type (str, optional): Type of kernel used to generate heatmap. Defaults to 'Gaussian'. 
14 | 15 | Returns: 16 | np.array: Heatmap image 17 | """ 18 | # Draw a 2D gaussian 19 | # Adopted from https://github.com/anewell/pose-hg-train/blob/master/src/pypose/draw.py 20 | 21 | # Check that any part of the gaussian is in-bounds 22 | ul = [int(pt[0] - 3 * sigma), int(pt[1] - 3 * sigma)] 23 | br = [int(pt[0] + 3 * sigma + 1), int(pt[1] + 3 * sigma + 1)] 24 | if (ul[0] >= img.shape[1] or ul[1] >= img.shape[0] or 25 | br[0] < 0 or br[1] < 0): 26 | # If not, just return the image as is 27 | return img 28 | 29 | # Generate gaussian 30 | size = 6 * sigma + 1 31 | x = np.arange(0, size, 1, float) 32 | y = x[:, np.newaxis] 33 | x0 = y0 = size // 2 34 | # The gaussian is not normalized, we want the center value to equal 1 35 | if type == 'Gaussian': 36 | g = np.exp(- ((x - x0) ** 2 + (y - y0) ** 2) / (2 * sigma ** 2)) 37 | elif type == 'Cauchy': 38 | g = sigma / (((x - x0) ** 2 + (y - y0) ** 2 + sigma ** 2) ** 1.5) 39 | 40 | # Usable gaussian range 41 | g_x = max(0, -ul[0]), min(br[0], img.shape[1]) - ul[0] 42 | g_y = max(0, -ul[1]), min(br[1], img.shape[0]) - ul[1] 43 | # Image range 44 | img_x = max(0, ul[0]), min(br[0], img.shape[1]) 45 | img_y = max(0, ul[1]), min(br[1], img.shape[0]) 46 | 47 | img[img_y[0]:img_y[1], img_x[0]:img_x[1]] = g[g_y[0]:g_y[1], g_x[0]:g_x[1]] 48 | return img 49 | 50 | 51 | def gen_gt_heatmap(keypoints, sigma, heatmap_size): 52 | """Generate groundtruth heatmap 53 | 54 | Args: 55 | keypoints: Keypoints in format [[x1, y1], [x2, y2], ...] 56 | sigma: Sigma param in Gaussian 57 | heatmap_size: Heatmap size in format (width, height) 58 | 59 | Returns: 60 | Generated heatmap 61 | """ 62 | npart = keypoints.shape[0] 63 | gtmap = np.zeros(shape=(heatmap_size[1], heatmap_size[0], npart), dtype=float) 64 | for i in range(npart): 65 | if keypoints[i, 0] == 0 and keypoints[i, 1] == 0: 66 | continue 67 | is_visible = True 68 | if len(keypoints[0]) > 2: 69 | visibility = keypoints[i, 2] 70 | if visibility <= 0: 71 | is_visible = False 72 | gtmap[:, :, i] = gen_point_heatmap( 73 | gtmap[:, :, i], keypoints[i, :], sigma) 74 | if not is_visible: 75 | gtmap[:, :, i] *= 0.5 76 | return gtmap 77 | 78 | 79 | @tf.function 80 | def nms(heat, kernel=3): 81 | hmax = tf.nn.max_pool2d(heat, kernel, 1, padding='SAME') 82 | keep = tf.cast(tf.equal(heat, hmax), tf.float32) 83 | return heat*keep 84 | 85 | 86 | @tf.function 87 | def find_keypoints_from_heatmap(batch_heatmaps, normalize=False): 88 | 89 | batch, height, width, n_points = tf.shape(batch_heatmaps)[0], tf.shape( 90 | batch_heatmaps)[1], tf.shape(batch_heatmaps)[2], tf.shape(batch_heatmaps)[3] 91 | 92 | batch_heatmaps = nms(batch_heatmaps) 93 | 94 | flat_tensor = tf.reshape(batch_heatmaps, (batch, -1, n_points)) 95 | 96 | # Argmax of the flat tensor 97 | argmax = tf.argmax(flat_tensor, axis=1) 98 | argmax = tf.cast(argmax, tf.int32) 99 | scores = tf.math.reduce_max(flat_tensor, axis=1) 100 | 101 | # Convert indexes into 2D coordinates 102 | argmax_y = argmax // width 103 | argmax_x = argmax % width 104 | argmax_y = tf.cast(argmax_y, tf.float32) 105 | argmax_x = tf.cast(argmax_x, tf.float32) 106 | 107 | if normalize: 108 | argmax_x = argmax_x / tf.cast(width, tf.float32) 109 | argmax_y = argmax_y / tf.cast(height, tf.float32) 110 | 111 | # Shape: batch * 3 * n_points 112 | batch_keypoints = tf.stack((argmax_x, argmax_y, scores), axis=1) 113 | # Shape: batch * n_points * 3 114 | batch_keypoints = tf.transpose(batch_keypoints, [0, 2, 1]) 115 | 116 | return batch_keypoints 117 | 
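A usage sketch for the two heatmap utilities above (not part of the repository source): it shows how `gen_gt_heatmap` and `find_keypoints_from_heatmap` can round-trip a small set of keypoints. The coordinates, sigma, and heatmap size below are illustrative values only, and the import path assumes the repo layout shown here.

```
# Illustrative only: keypoints -> groundtruth heatmap -> recovered keypoints.
import numpy as np
import tensorflow as tf

from src.utils.heatmap import gen_gt_heatmap, find_keypoints_from_heatmap

# Three keypoints in heatmap pixel coordinates, format (x, y, visibility).
keypoints = np.array([[10, 20, 1],
                      [32, 40, 1],
                      [50, 12, 0]])

# Groundtruth heatmap with shape (height, width, n_points) = (64, 64, 3).
# The invisible third point has its peak scaled by 0.5, as done in gen_gt_heatmap.
gtmap = gen_gt_heatmap(keypoints, sigma=2, heatmap_size=(64, 64))

# find_keypoints_from_heatmap expects a batch: (batch, height, width, n_points).
batch_heatmaps = tf.convert_to_tensor(gtmap[np.newaxis, ...], dtype=tf.float32)
recovered = find_keypoints_from_heatmap(batch_heatmaps, normalize=False)
print(recovered.shape)  # (1, 3, 3): x, y and peak score for each keypoint
```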
-------------------------------------------------------------------------------- /src/utils/keypoints.py: -------------------------------------------------------------------------------- 1 | import os 2 | import random 3 | import numpy as np 4 | import cv2 5 | 6 | 7 | def unnormalize_landmark(landmark, image_size): 8 | """Unnormalize landmark by image size 9 | 10 | Args: 11 | landmark: Normalized keypoints in format [[x1, y1], [x2, y2], ...] 12 | image_size: Image size in format (width, height) 13 | 14 | Returns: 15 | Unnormalized landmark 16 | """ 17 | image_size = np.array(image_size) 18 | landmark[:, :2] = np.multiply( 19 | np.array(landmark[:, :2]), np.array(image_size).reshape((1, 2))) 20 | return landmark 21 | 22 | 23 | def normalize_landmark(landmark, image_size): 24 | """Normalize landmark by image size 25 | 26 | Args: 27 | landmark: Keypoints in format [[x1, y1], [x2, y2], ...] 28 | image_size: Image size in format (width, height) 29 | 30 | Returns: 31 | Normalized landmark 32 | """ 33 | image_size = np.array(image_size) 34 | landmark = np.array(landmark) 35 | landmark = landmark.astype(float) 36 | landmark[:, :2] = np.divide( 37 | landmark[:, :2], np.array(image_size).reshape((1, 2))) 38 | return landmark 39 | 40 | 41 | -------------------------------------------------------------------------------- /src/utils/pre_processing.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import cv2 3 | import random 4 | 5 | def calculate_bbox_from_keypoints(kps, padding=0.1): 6 | """Estimate body bounding box from all body keypoints 7 | 8 | Args: 9 | kps: Keypoints. Shape: (n, 2) 10 | padding: Padding the smallest keypoint bounding box to form body bounding box 11 | """ 12 | 13 | kps = np.array(kps) 14 | min_x = np.min(kps[:, 0]) 15 | min_y = np.min(kps[:, 1]) 16 | max_x = np.max(kps[:, 0]) 17 | max_y = np.max(kps[:, 1]) 18 | 19 | width = max_x - min_x 20 | height = max_y - min_y 21 | 22 | x1 = min_x - padding * width 23 | x2 = max_x + padding * width 24 | y1 = min_y - padding * height 25 | y2 = max_y + padding * height 26 | 27 | return [[x1, y1], [x2, y2]] 28 | 29 | 30 | def square_padding(im, desired_size=800, return_padding=False): 31 | 32 | old_size = im.shape[:2] # old_size is in (height, width) format 33 | 34 | ratio = float(desired_size) / max(old_size) 35 | new_size = tuple([int(x*ratio) for x in old_size]) 36 | 37 | im = cv2.resize(im, (new_size[1], new_size[0])) 38 | 39 | delta_w = desired_size - new_size[1] 40 | delta_h = desired_size - new_size[0] 41 | top, bottom = delta_h // 2, delta_h - (delta_h // 2) 42 | left, right = delta_w // 2, delta_w - (delta_w // 2) 43 | 44 | color = [0, 0, 0] 45 | new_im = cv2.copyMakeBorder(im, top, bottom, left, right, cv2.BORDER_CONSTANT, 46 | value=color) 47 | 48 | if not return_padding: 49 | return new_im 50 | else: 51 | h, w = new_im.shape[:2] 52 | padding = (top / h, left / w, bottom / h, right / w) 53 | return new_im, padding 54 | 55 | 56 | def square_crop_with_keypoints(image, bbox, keypoints, pad_value=0): 57 | """Square crop an image knowing a bounding box. This function also update keypoints accordingly 58 | Steps: Extend bbox to a square -> Pad image -> Crop image -> Recalculate keypoints 59 | 60 | Args: 61 | image: Input image 62 | bbox: Bounding box. Shape: (2, 2), Format: [[x1, y1], [x2, y2]] 63 | keypoints: Keypoints in format [[x1, y1], [x2, y2], ...] 
64 | pad_value: Scalar indicating padding color 65 | 66 | Returns: 67 | cropped_image, keypoints 68 | """ 69 | 70 | bbox_width = bbox[1][0] - bbox[0][0] 71 | bbox_height = bbox[1][1] - bbox[0][1] 72 | im_height, im_width = image.shape[:2] 73 | 74 | if bbox_width > bbox_height: # Padding on y-axis 75 | pad = int((bbox_width - bbox_height) / 2) 76 | bbox[0][1] -= pad 77 | bbox[1][1] = bbox[0][1] + bbox_width 78 | elif bbox_height > bbox_width: # Padding on x-axis 79 | pad = int((bbox_height - bbox_width) / 2) 80 | bbox[0][0] -= pad 81 | bbox[1][0] = bbox[0][0] + bbox_height 82 | 83 | pad_top = 0 84 | pad_bottom = 0 85 | pad_left = 0 86 | pad_right = 0 87 | if bbox[0][0] < 0: 88 | pad_left = -bbox[0][0] 89 | bbox[0][0] = 0 90 | bbox[1][0] += pad_left 91 | if bbox[0][1] < 0: 92 | pad_top = -bbox[0][1] 93 | bbox[0][1] = 0 94 | bbox[1][1] += pad_top 95 | if bbox[1][0] >= im_width: 96 | pad_right = bbox[1][0] - im_width + 1 97 | if bbox[1][1] >= im_height: 98 | pad_bottom = bbox[1][1] - im_height + 1 99 | 100 | if pad_value == "random": 101 | pad_value = (random.randint(0, 255), random.randint(0, 255), random.randint(0, 255)) 102 | padded_image = cv2.copyMakeBorder(image, pad_top, pad_bottom, pad_left, pad_right, cv2.BORDER_CONSTANT, value=pad_value) 103 | 104 | cropped_image = padded_image[bbox[0][1]:bbox[1][1], bbox[0][0]:bbox[1][0]] 105 | 106 | # Mark missing keypoints 107 | keypoints = np.array(keypoints) 108 | missing_idxs = [] 109 | for i in range(keypoints.shape[0]): 110 | if keypoints[i, 0] == 0 and keypoints[i, 1] == 0: 111 | missing_idxs.append(i) 112 | 113 | # Update keypoints 114 | keypoints[:, 0] = keypoints[:, 0] - bbox[0][0] + pad_left 115 | keypoints[:, 1] = keypoints[:, 1] - bbox[0][1] + pad_top 116 | 117 | # Restore missing keypoints 118 | for i in missing_idxs: 119 | keypoints[i, 0] = 0 120 | keypoints[i, 1] = 0 121 | 122 | return cropped_image, keypoints 123 | 124 | -------------------------------------------------------------------------------- /src/utils/visualizer.py: -------------------------------------------------------------------------------- 1 | import cv2 2 | 3 | 4 | def visualize_keypoints(image, keypoints, visibility=None, edges=None, point_color=(0, 255, 0), text_color=(0, 0, 0)): 5 | """Visualize keypoints 6 | 7 | Args: 8 | image: Input image 9 | keypoints: Keypoints in format [[x1, y1], [x2, y2], ...] 10 | visibility [list]: List of visibilities of keypoints. 
0: occluded, 1: visible 11 | 12 | Returns: 13 | Visualized image 14 | """ 15 | 16 | draw = image.copy() 17 | for i, p in enumerate(keypoints): 18 | x, y = p[0], p[1] 19 | tmp_point_color = point_color 20 | if visibility is not None and not int(visibility[i]): 21 | tmp_point_color = (100, 100, 100) 22 | draw = cv2.circle(draw, center=(int(x), int(y)), 23 | color=tmp_point_color, radius=5, thickness=-1) 24 | draw = cv2.putText(draw, str(i), (int(x), int(y)), cv2.FONT_HERSHEY_SIMPLEX, 25 | 0.5, text_color, 1, cv2.LINE_AA) 26 | 27 | if edges is not None and visibility is not None: 28 | for edge_chain in edges: 29 | for i in range(len(edge_chain) - 1): 30 | if visibility[edge_chain[i]] and visibility[edge_chain[i+1]]: 31 | p1 = tuple(keypoints[edge_chain[i]]) 32 | p2 = tuple(keypoints[edge_chain[i+1]]) 33 | cv2.line(draw, p1, p2, (0, 0, 255), 2) 34 | 35 | return draw 36 | -------------------------------------------------------------------------------- /test.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import importlib 3 | import json 4 | import tensorflow as tf 5 | 6 | for gpu in tf.config.experimental.list_physical_devices('GPU'): 7 | tf.compat.v2.config.experimental.set_memory_growth(gpu, True) 8 | 9 | parser = argparse.ArgumentParser() 10 | parser.add_argument( 11 | '-c', 12 | '--conf_file', default="config.json", 13 | help='Configuration file') 14 | parser.add_argument( 15 | '-m', 16 | '--model', default="model.h5", 17 | help='Path to h5 model') 18 | args = parser.parse_args() 19 | 20 | # Open and load the config json 21 | with open(args.conf_file) as config_buffer: 22 | config = json.loads(config_buffer.read()) 23 | 24 | trainer = importlib.import_module("src.trainers.{}".format(config["trainer"])) 25 | trainer.test(config, args.model) 26 | -------------------------------------------------------------------------------- /tools/lsp_data_to_json.py: -------------------------------------------------------------------------------- 1 | import os 2 | import argparse 3 | import json 4 | from scipy.io import loadmat 5 | 6 | parser = argparse.ArgumentParser() 7 | parser.add_argument( 8 | '-i', 9 | '--input_file', default="data/lsp_dataset/joints.mat", 10 | help='Path to lsp annotation file') 11 | parser.add_argument( 12 | '-f', 13 | '--image_folder', default="data/lsp_dataset/images", 14 | help='Image folder') 15 | parser.add_argument( 16 | '-o', 17 | '--output_file', default="data/lsp_dataset/labels.json", 18 | help='Output json file') 19 | args = parser.parse_args() 20 | 21 | # Load annotations 22 | annotations = loadmat(args.input_file) 23 | joints = annotations["joints"] 24 | joints_shape = joints.shape 25 | print(joints_shape) 26 | if joints_shape[0] == 3 and joints_shape[1] == 14: # LSP (3, 14, n) -> (n, 14, 3) 27 | joints = joints.swapaxes(0, 2) 28 | elif joints_shape[0] == 14 and joints_shape[1] == 3: # LSPET (14, 3, n) -> (n, 14, 3) 29 | joints = joints.swapaxes(0, 2) 30 | joints = joints.swapaxes(1, 2) 31 | 32 | # List image files 33 | images = [i for i in os.listdir(args.image_folder) if i.endswith("jpg")] 34 | images.sort() 35 | 36 | # Build new annotations 37 | labels = [] 38 | w_img_count = 0 39 | for i in range(len(images)): 40 | points = joints[i, :, :2] 41 | visibility = joints[i, :, 2] 42 | 43 | wrong_label = False 44 | for p in points: 45 | if p[0] <= 0 or p[1] <= 0: 46 | wrong_label = True 47 | break 48 | if wrong_label: 49 | w_img_count += 1 50 | print(w_img_count) 51 | continue 52 | 53 | label = {"image": 
images[i], "points": points.tolist(), "visibility": visibility.tolist()} 54 | labels.append(label) 55 | 56 | print("Len: ", len(labels)) 57 | 58 | with open(args.output_file, "w") as fp: 59 | json.dump(labels, fp) 60 | 61 | -------------------------------------------------------------------------------- /tools/merge_lsp.py: -------------------------------------------------------------------------------- 1 | import json 2 | import os 3 | 4 | with open("data/pushup/train.json", "r") as fp: 5 | train1 = json.load(fp) 6 | for l in train1: 7 | l["is_pushing_up"] = True 8 | with open("data/lsp_lspet_7points/train.json", "r") as fp: 9 | train2 = json.load(fp) 10 | for l in train2: 11 | l["is_pushing_up"] = False 12 | 13 | with open("data/pushup/val.json", "r") as fp: 14 | val1 = json.load(fp) 15 | for l in val1: 16 | l["is_pushing_up"] = True 17 | with open("data/lsp_lspet_7points/val.json", "r") as fp: 18 | val2 = json.load(fp) 19 | for l in val2: 20 | l["is_pushing_up"] = False 21 | 22 | with open("data/pushup/test.json", "r") as fp: 23 | test1 = json.load(fp) 24 | for l in test1: 25 | l["is_pushing_up"] = True 26 | with open("data/lsp_lspet_7points/test.json", "r") as fp: 27 | test2 = json.load(fp) 28 | for l in test2: 29 | l["is_pushing_up"] = False 30 | 31 | with open("data/lsp_lspet_pushup/train.json", "w") as fp: 32 | json.dump(train1 + train2, fp) 33 | with open("data/lsp_lspet_pushup/val.json", "w") as fp: 34 | json.dump(val1 + val2, fp) 35 | with open("data/lsp_lspet_pushup/test.json", "w") as fp: 36 | json.dump(test1 + test2, fp) 37 | 38 | os.system("mkdir -p data/lsp_lspet_pushup/images") 39 | os.system("cp data/pushup/images/* data/lsp_lspet_pushup/images") 40 | os.system("cp data/lsp_lspet_7points/images/* data/lsp_lspet_pushup/images") -------------------------------------------------------------------------------- /tools/merge_lsp_lspet_pushup.py: -------------------------------------------------------------------------------- 1 | import json 2 | import os 3 | 4 | with open("data/pushup/train.json", "r") as fp: 5 | train1 = json.load(fp) 6 | for l in train1: 7 | l["is_pushing_up"] = True 8 | with open("data/lsp_lspet_7points/train.json", "r") as fp: 9 | train2 = json.load(fp) 10 | for l in train2: 11 | l["is_pushing_up"] = False 12 | 13 | with open("data/pushup/val.json", "r") as fp: 14 | val1 = json.load(fp) 15 | for l in val1: 16 | l["is_pushing_up"] = True 17 | with open("data/lsp_lspet_7points/val.json", "r") as fp: 18 | val2 = json.load(fp) 19 | for l in val2: 20 | l["is_pushing_up"] = False 21 | 22 | with open("data/pushup/test.json", "r") as fp: 23 | test1 = json.load(fp) 24 | for l in test1: 25 | l["is_pushing_up"] = True 26 | with open("data/lsp_lspet_7points/test.json", "r") as fp: 27 | test2 = json.load(fp) 28 | for l in test2: 29 | l["is_pushing_up"] = False 30 | 31 | with open("data/lsp_lspet_pushup/train.json", "w") as fp: 32 | json.dump(train1 + train2, fp) 33 | with open("data/lsp_lspet_pushup/val.json", "w") as fp: 34 | json.dump(val1 + val2, fp) 35 | with open("data/lsp_lspet_pushup/test.json", "w") as fp: 36 | json.dump(test1 + test2, fp) 37 | 38 | os.system("mkdir -p data/lsp_lspet_pushup/images") 39 | os.system("cp data/pushup/images/* data/lsp_lspet_pushup/images") 40 | os.system("cp data/lsp_lspet_7points/images/* data/lsp_lspet_pushup/images") -------------------------------------------------------------------------------- /tools/merge_mpii_pushup.py: -------------------------------------------------------------------------------- 1 | import json 2 | import 
os 3 | 4 | with open("data/pushup/train.json", "r") as fp: 5 | train1 = json.load(fp) 6 | for l in train1: 7 | l["is_pushing_up"] = True 8 | with open("data/mpii/train.json", "r") as fp: 9 | train2 = json.load(fp) 10 | for l in train2: 11 | x = l["points"] 12 | l["points"] = [x[10], x[11], x[12], x[9], x[13], x[14], x[15]] 13 | x = l["visibility"] 14 | l["visibility"] = [x[10], x[11], x[12], x[9], x[13], x[14], x[15]] 15 | l["is_pushing_up"] = False 16 | 17 | with open("data/pushup/val.json", "r") as fp: 18 | val1 = json.load(fp) 19 | for l in val1: 20 | l["is_pushing_up"] = True 21 | with open("data/mpii/val.json", "r") as fp: 22 | val2 = json.load(fp) 23 | for l in val2: 24 | x = l["points"] 25 | l["points"] = [x[10], x[11], x[12], x[9], x[13], x[14], x[15]] 26 | x = l["visibility"] 27 | l["visibility"] = [x[10], x[11], x[12], x[9], x[13], x[14], x[15]] 28 | l["is_pushing_up"] = False 29 | 30 | with open("data/pushup/test.json", "r") as fp: 31 | test1 = json.load(fp) 32 | for l in test1: 33 | l["is_pushing_up"] = True 34 | with open("data/mpii/test.json", "r") as fp: 35 | test2 = json.load(fp) 36 | for l in test2: 37 | x = l["points"] 38 | l["points"] = [x[10], x[11], x[12], x[9], x[13], x[14], x[15]] 39 | x = l["visibility"] 40 | l["visibility"] = [x[10], x[11], x[12], x[9], x[13], x[14], x[15]] 41 | l["is_pushing_up"] = False 42 | 43 | os.system("mkdir -p data/mpii_pushup/images") 44 | with open("data/mpii_pushup/train.json", "w") as fp: 45 | json.dump(train1 + train2, fp) 46 | with open("data/mpii_pushup/val.json", "w") as fp: 47 | json.dump(val1 + val2, fp) 48 | with open("data/mpii_pushup/test.json", "w") as fp: 49 | json.dump(test1 + test2, fp) 50 | 51 | os.system("cp data/pushup/images/* data/mpii_pushup/images") 52 | os.system("cp data/mpii/images/* data/mpii_pushup/images") -------------------------------------------------------------------------------- /tools/process_data.py: -------------------------------------------------------------------------------- 1 | import json 2 | import random 3 | import cv2 4 | import os 5 | import copy 6 | import shutil 7 | 8 | data = [] 9 | with open("data/pushup/train.json") as fp: 10 | data1 = json.load(fp)["labels"] 11 | with open("data/pushup/test.json") as fp: 12 | data2 = json.load(fp)["labels"] 13 | with open("data/pushup/val.json") as fp: 14 | data3 = json.load(fp)["labels"] 15 | 16 | data = [d for d in data if d["contains_person"] and d["is_pushing_up"]] 17 | print(len(data)) 18 | 19 | mpii1 = [d for d in data1 if d["contains_person"] and not d["is_pushing_up"]] 20 | mpii2 = [d for d in data2 if d["contains_person"] and not d["is_pushing_up"]] 21 | mpii3 = [d for d in data3 if d["contains_person"] and not d["is_pushing_up"]] 22 | 23 | print(len(mpii1)) 24 | print(len(mpii2)) 25 | print(len(mpii3)) 26 | exit(0) 27 | 28 | kk = {} 29 | for d in data: 30 | d["video"] = d["image"].split("_")[0] 31 | kk[d["image"]] = d 32 | 33 | new_data = [] 34 | for k in kk.values(): 35 | new_data.append(k) 36 | 37 | data = new_data 38 | random.shuffle(data) 39 | 40 | for d in data: 41 | with open(os.path.join("labels", "{}.json".format(d["image"])), "w") as fp: 42 | json.dump(d["points"], fp) 43 | 44 | # for d in data: 45 | # # if d["video"] == "317": 46 | # # img = cv2.imread(os.path.join("data/pushup/images", d["image"])) 47 | # # cv2.imshow("Image", img) 48 | # # cv2.waitKey(0) 49 | # shutil.copy(os.path.join("data/pushup/images", d["image"]), 50 | # os.path.join("data/pushup/new_images", d["image"])) 51 | 52 | # videos = {} 53 | # for d in data: 54 | # 
if d["video"] not in videos: 55 | # videos[d["video"]] = 1 56 | # else: 57 | # videos[d["video"]] += 1 58 | 59 | # video_img_count = [] 60 | # for video, img_count in videos.items(): 61 | # print("{}:{}".format(video, img_count), end="; ") 62 | 63 | # print("We have {} videos", len(videos.keys())) -------------------------------------------------------------------------------- /tools/process_pushup_data.py: -------------------------------------------------------------------------------- 1 | import os 2 | import pathlib 3 | import json 4 | import shutil 5 | import random 6 | import cv2 7 | import numpy as np 8 | from src.utils.pre_processing import calculate_bbox_from_keypoints, square_crop_with_keypoints, random_occlusion 9 | 10 | 11 | label_files = list(os.listdir("data/pushup/labels")) 12 | label_files = [l for l in label_files if l.endswith("json")] 13 | 14 | labels = [] 15 | for lf in label_files: 16 | with open(os.path.join("data/pushup/labels", lf)) as fp: 17 | points = json.load(fp) 18 | if len(points) != 7: 19 | continue 20 | label = { 21 | "video": int(lf.split("_")[0]), 22 | "image": lf.replace(".json", ""), 23 | "points": points, 24 | "visibility": [1,1,1,1,1,1,1] 25 | } 26 | image = cv2.imread(os.path.join("data/pushup/images", label["image"])) 27 | label["bbox"] = [[0,0],[image.shape[1],image.shape[0]]] 28 | label["bbox"] = np.array(label["bbox"]).astype(int).tolist() 29 | labels.append(label) 30 | 31 | # for label in labels: 32 | # # Visualize 33 | # for label in labels: 34 | # image = cv2.imread(os.path.join("data/pushup/images", label["image"])) 35 | 36 | # draw = image.copy() 37 | # for i, p in enumerate(label["points"]): 38 | # x, y = p 39 | # color = (0, 0, 255) if int(label["visibility"][i]) else (255, 0, 0) 40 | # draw = cv2.circle(draw, center=(int(x), int(y)), color=color, radius=1, thickness=2) 41 | # draw = cv2.putText(draw, str(i), (int(x), int(y)), cv2.FONT_HERSHEY_SIMPLEX, 42 | # 0.5, (0, 255, 0), 1, cv2.LINE_AA) 43 | # p1 = tuple(label["bbox"][0]) 44 | # p2 = tuple(label["bbox"][1]) 45 | # draw = cv2.rectangle(draw, p1, p2, (0,0,255), 2) 46 | # cv2.imshow("Image", draw) 47 | # # cv2.waitKey(0) 48 | 49 | # cropped_image, keypoints = square_crop_with_keypoints(image, label["bbox"], label["points"], "random") 50 | # draw = cropped_image.copy() 51 | # for i, p in enumerate(keypoints): 52 | # x, y = p 53 | # color = (0, 0, 255) if int(label["visibility"][i]) else (255, 0, 0) 54 | # draw = cv2.circle(draw, center=(int(x), int(y)), color=color, radius=1, thickness=2) 55 | # draw = cv2.putText(draw, str(i), (int(x), int(y)), cv2.FONT_HERSHEY_SIMPLEX, 56 | # 0.5, (0, 255, 0), 1, cv2.LINE_AA) 57 | 58 | # cv2.imshow("Square cropped", draw) 59 | # # cv2.waitKey(0) 60 | 61 | # # Test random occlusion 62 | # cropped_image, visibility = random_occlusion(cropped_image, keypoints, visibility=None, rect_ratio=((0.2, 0.5), (0.2, 0.5))) 63 | # draw = cropped_image.copy() 64 | # for i, p in enumerate(keypoints): 65 | # print(i) 66 | # print(visibility[i]) 67 | # x, y = p 68 | # color = (0, 0, 255) if int(visibility[i]) else (255, 0, 0) 69 | # draw = cv2.circle(draw, center=(int(x), int(y)), color=color, radius=1, thickness=2) 70 | # draw = cv2.putText(draw, str(i), (int(x), int(y)), cv2.FONT_HERSHEY_SIMPLEX, 71 | # 0.5, (0, 255, 0), 1, cv2.LINE_AA) 72 | 73 | # cv2.imshow("Random occlusion", draw) 74 | # cv2.waitKey(0) 75 | 76 | print("n_labels:", len(labels)) 77 | 78 | # for label in labels: 79 | # label.pop("video", None) 80 | 81 | train_labels = [l for l in labels if l["video"] < 
480] 82 | # print("train_labels", len(train_labels)) 83 | # train_videos = set(l["video"] for l in train_labels) 84 | # print("train_videos", len(train_videos)) 85 | 86 | val_labels = [l for l in labels if l["video"] >= 480 and l["video"] < 550] 87 | # print("val_labels", len(val_labels)) 88 | # val_videos = set(l["video"] for l in val_labels) 89 | # print("val_videos", len(val_videos)) 90 | 91 | test_labels = [l for l in labels if l["video"] >= 550] 92 | # print("test_labels", len(test_labels)) 93 | # test_videos = set(l["video"] for l in test_labels) 94 | # print("test_videos", len(test_videos)) 95 | 96 | with open("data/pushup/train.json", "w") as fp: 97 | json.dump(train_labels, fp) 98 | 99 | with open("data/pushup/val.json", "w") as fp: 100 | json.dump(val_labels, fp) 101 | 102 | with open("data/pushup/test.json", "w") as fp: 103 | json.dump(test_labels, fp) 104 | 105 | 106 | -------------------------------------------------------------------------------- /tools/split_data_mpii.py: -------------------------------------------------------------------------------- 1 | import json 2 | import os 3 | import pathlib 4 | import json 5 | import shutil 6 | import random 7 | import cv2 8 | import numpy as np 9 | 10 | from src.utils.pre_processing import calculate_bbox_from_keypoints, square_crop_with_keypoints 11 | from src.data_loaders.augmentation_utils import random_occlusion 12 | 13 | image_folder = "data/mpii/images" 14 | jsonfile = "data/mpii_annotations.json" 15 | 16 | # load train or val annotation 17 | with open(jsonfile) as anno_file: 18 | anno = json.load(anno_file) 19 | 20 | val_anno, train_anno = [], [] 21 | for idx, val in enumerate(anno): 22 | if val['isValidation'] == True: 23 | val_anno.append(anno[idx]) 24 | else: 25 | train_anno.append(anno[idx]) 26 | 27 | anno = train_anno + val_anno 28 | 29 | count = 0 30 | labels = [] 31 | for a in anno: 32 | 33 | if a["numOtherPeople"] != 0: 34 | continue 35 | 36 | l = {"image": a["img_paths"]} 37 | points = np.array(a["joint_self"]) 38 | l["points"] = points[:, :2].tolist() 39 | l["visibility"] = points[:, 2].tolist() 40 | 41 | for p in l["points"]: 42 | if p[0] == 0 and p[1] == 0: 43 | p[0] = -1 44 | p[1] = -1 45 | 46 | 47 | inside_points = [p for p in l["points"] if p[0] != -1 or p[1] != -1] 48 | l["bbox"] = calculate_bbox_from_keypoints(inside_points) 49 | l["bbox"] = np.array(l["bbox"]).astype(int).tolist() 50 | 51 | # Crop image 52 | image = cv2.imread(os.path.join(image_folder, l["image"])) 53 | image, keypoints = square_crop_with_keypoints(image, l["bbox"], l["points"], "random") 54 | l["points"] = keypoints.tolist() 55 | 56 | # BBox again 57 | inside_points = [p for p in l["points"] if p[0] != -1 or p[1] != -1] 58 | l["bbox"] = calculate_bbox_from_keypoints(inside_points) 59 | l["bbox"] = np.array(l["bbox"]).astype(int).tolist() 60 | 61 | l["image"] = "mpii_crop_{}.png".format(count) 62 | count += 1 63 | 64 | # cv2.imwrite(os.path.join("data/mpii/images2", l["image"]), image) 65 | 66 | # draw = image.copy() 67 | # for i, p in enumerate(l["points"]): 68 | # x, y = p 69 | # color = (0, 0, 255) if int(l["visibility"][i]) else (255, 0, 0) 70 | # draw = cv2.circle(draw, center=(int(x), int(y)), color=color, radius=1, thickness=2) 71 | # draw = cv2.putText(draw, str(i), (int(x), int(y)), cv2.FONT_HERSHEY_SIMPLEX, 72 | # 0.5, (0, 255, 0), 1, cv2.LINE_AA) 73 | # draw = cv2.rectangle(draw, tuple(l["bbox"][0]), tuple(l["bbox"][1]), (0, 0, 255), 5) 74 | # cv2.imshow("Image", draw) 75 | # cv2.waitKey(0) 76 | 77 | 78 | labels.append(l) 79 | 80 
| # # Visualize 81 | # for label in labels: 82 | # image = cv2.imread(os.path.join(image_folder, label["image"])) 83 | 84 | # draw = image.copy() 85 | # for i, p in enumerate(label["points"]): 86 | # x, y = p 87 | # color = (0, 0, 255) if int(label["visibility"][i]) else (255, 0, 0) 88 | # draw = cv2.circle(draw, center=(int(x), int(y)), color=color, radius=1, thickness=2) 89 | # draw = cv2.putText(draw, str(i), (int(x), int(y)), cv2.FONT_HERSHEY_SIMPLEX, 90 | # 0.5, (0, 255, 0), 1, cv2.LINE_AA) 91 | # p1 = tuple(label["bbox"][0]) 92 | # p2 = tuple(label["bbox"][1]) 93 | # draw = cv2.rectangle(draw, p1, p2, (0,0,255), 2) 94 | # cv2.imshow("Image", draw) 95 | # # cv2.waitKey(0) 96 | 97 | # cropped_image, keypoints = square_crop_with_keypoints(image, label["bbox"], label["points"], "random") 98 | # draw = cropped_image.copy() 99 | # for i, p in enumerate(keypoints): 100 | # x, y = p 101 | # color = (0, 0, 255) if int(label["visibility"][i]) else (255, 0, 0) 102 | # draw = cv2.circle(draw, center=(int(x), int(y)), color=color, radius=1, thickness=2) 103 | # draw = cv2.putText(draw, str(i), (int(x), int(y)), cv2.FONT_HERSHEY_SIMPLEX, 104 | # 0.5, (0, 255, 0), 1, cv2.LINE_AA) 105 | 106 | # cv2.imshow("Square cropped", draw) 107 | # # cv2.waitKey(0) 108 | 109 | # # Test random occlusion 110 | # cropped_image, visibility = random_occlusion(cropped_image, keypoints, visibility=None, rect_ratio=((0.2, 0.5), (0.2, 0.5))) 111 | # draw = cropped_image.copy() 112 | # for i, p in enumerate(keypoints): 113 | # print(i) 114 | # print(visibility[i]) 115 | # x, y = p 116 | # color = (0, 0, 255) if int(visibility[i]) else (255, 0, 0) 117 | # draw = cv2.circle(draw, center=(int(x), int(y)), color=color, radius=1, thickness=2) 118 | # draw = cv2.putText(draw, str(i), (int(x), int(y)), cv2.FONT_HERSHEY_SIMPLEX, 119 | # 0.5, (0, 255, 0), 1, cv2.LINE_AA) 120 | 121 | # cv2.imshow("Random occlusion", draw) 122 | # cv2.waitKey(0) 123 | 124 | def save_label(file_name, labels): 125 | with open(os.path.join("data/mpii/", file_name), "w") as fp: 126 | json.dump(labels, fp) 127 | 128 | with open("data/mpii/labels.json", "w") as anno_file: 129 | json.dump(labels, anno_file) 130 | 131 | 132 | n_train = 9503 133 | n_val = 1000 134 | save_label("train.json", labels[:n_train]) 135 | save_label("val.json", labels[n_train:n_train+n_val]) 136 | save_label("test.json", labels[n_train+n_val:]) 137 | 138 | print(len(labels)) 139 | 140 | # with open("data/mpii/train.json", "w") as anno_file: 141 | # json.dump(train_anno, anno_file) 142 | 143 | # with open("data/mpii/val.json", "w") as anno_file: 144 | # json.dump(val_anno, anno_file) -------------------------------------------------------------------------------- /tools/split_lsp_lspet.py: -------------------------------------------------------------------------------- 1 | """ 2 | This script is used to concatenate datasets into a bigger dataset 3 | and split them into training, validation and test sets. 4 | """ 5 | import os 6 | import pathlib 7 | import json 8 | import shutil 9 | import random 10 | import cv2 11 | import numpy as np 12 | 13 | from src.utils.pre_processing import calculate_bbox_from_keypoints, square_crop_with_keypoints 14 | from src.data_loaders.augmentation_utils import random_occlusion 15 | 16 | class DatasetCreator: 17 | def __init__(self, root_folder): 18 | 19 | if os.path.exists(root_folder): 20 | print("Folder existed! Please choose a non-existed path. 
{}".format(root_folder)) 21 | exit(0) 22 | 23 | self.root_folder = root_folder 24 | self.image_folder = os.path.join(root_folder, "images") 25 | pathlib.Path(self.root_folder).mkdir(parents=True, exist_ok=True) 26 | pathlib.Path(self.image_folder).mkdir(parents=True, exist_ok=True) 27 | 28 | 29 | def save_label(self, file_name, labels): 30 | with open(os.path.join(self.root_folder, file_name), "w") as fp: 31 | json.dump(labels, fp) 32 | 33 | def add_set(self, image_folder, label_file, n_train, n_val, n_test, copy_images=True, shuffle=True): 34 | """Add subset into this dataset 35 | 36 | Args: 37 | image_folder (str): Path to image folder of subset 38 | label_file (str): Path to annotation file of subset 39 | n_train (int): Number of samples for training 40 | n_val (int): Number of samples for validation 41 | n_test (int): Number of samples for testing 42 | """ 43 | 44 | with open(label_file, "r") as fp: 45 | labels = json.load(fp) 46 | 47 | # Use all data? 48 | assert (len(labels) == n_train + n_val + n_test) 49 | 50 | if shuffle: 51 | random.seed(42) 52 | random.shuffle(labels) 53 | 54 | for label in labels: 55 | 56 | # Copy images 57 | if copy_images: 58 | 59 | # Rename image if duplicated 60 | if os.path.exists(os.path.join(self.image_folder, label["image"])): 61 | new_name = label["image"] 62 | filename, file_extension = os.path.splitext(label["image"]) 63 | extended_number = 2 64 | while True: 65 | new_name = "{}_ext{}{}".format(filename, extended_number, file_extension) 66 | if os.path.exists(os.path.join(self.image_folder, new_name)): 67 | extended_number += 1 68 | else: 69 | break 70 | shutil.copy( 71 | os.path.join(image_folder, label["image"]), 72 | os.path.join(self.image_folder, new_name), 73 | ) 74 | label["image"] = new_name 75 | else: 76 | shutil.copy( 77 | os.path.join(image_folder, label["image"]), 78 | os.path.join(self.image_folder, label["image"]), 79 | ) 80 | pass 81 | 82 | label["bbox"] = calculate_bbox_from_keypoints(label["points"]) 83 | label["bbox"] = np.array(label["bbox"]).astype(int).tolist() 84 | 85 | # # Visualize 86 | # for label in labels: 87 | # image = cv2.imread(os.path.join(self.image_folder, label["image"])) 88 | 89 | # draw = image.copy() 90 | # for i, p in enumerate(label["points"]): 91 | # x, y = p 92 | # color = (0, 0, 255) if int(label["visibility"][i]) else (255, 0, 0) 93 | # draw = cv2.circle(draw, center=(int(x), int(y)), color=color, radius=1, thickness=2) 94 | # draw = cv2.putText(draw, str(i), (int(x), int(y)), cv2.FONT_HERSHEY_SIMPLEX, 95 | # 0.5, (0, 255, 0), 1, cv2.LINE_AA) 96 | # p1 = tuple(label["bbox"][0]) 97 | # p2 = tuple(label["bbox"][1]) 98 | # draw = cv2.rectangle(draw, p1, p2, (0,0,255), 2) 99 | # cv2.imshow("Image", draw) 100 | # # cv2.waitKey(0) 101 | 102 | # cropped_image, keypoints = square_crop_with_keypoints(image, label["bbox"], label["points"], "random") 103 | # draw = cropped_image.copy() 104 | # for i, p in enumerate(keypoints): 105 | # x, y = p 106 | # color = (0, 0, 255) if int(label["visibility"][i]) else (255, 0, 0) 107 | # draw = cv2.circle(draw, center=(int(x), int(y)), color=color, radius=1, thickness=2) 108 | # draw = cv2.putText(draw, str(i), (int(x), int(y)), cv2.FONT_HERSHEY_SIMPLEX, 109 | # 0.5, (0, 255, 0), 1, cv2.LINE_AA) 110 | 111 | # cv2.imshow("Square cropped", draw) 112 | # # cv2.waitKey(0) 113 | 114 | # # Test random occlusion 115 | # cropped_image, visibility = random_occlusion(cropped_image, keypoints, visibility=None, rect_ratio=((0.2, 0.5), (0.2, 0.5))) 116 | # draw = cropped_image.copy() 117 | # 
for i, p in enumerate(keypoints): 118 | # print(i) 119 | # print(visibility[i]) 120 | # x, y = p 121 | # color = (0, 0, 255) if int(visibility[i]) else (255, 0, 0) 122 | # draw = cv2.circle(draw, center=(int(x), int(y)), color=color, radius=1, thickness=2) 123 | # draw = cv2.putText(draw, str(i), (int(x), int(y)), cv2.FONT_HERSHEY_SIMPLEX, 124 | # 0.5, (0, 255, 0), 1, cv2.LINE_AA) 125 | 126 | # cv2.imshow("Random occlusion", draw) 127 | # cv2.waitKey(0) 128 | 129 | self.save_label("train.json", labels[:n_train]) 130 | self.save_label("val.json", labels[n_train:n_train+n_val]) 131 | self.save_label("test.json", labels[n_train+n_val:]) 132 | 133 | dataset = DatasetCreator("data/lsp_lspet") 134 | dataset.add_set("data/lsp_dataset/images", "data/lsp_dataset/labels.json", 1800, 100, 100) 135 | dataset.add_set("data/lspet_dataset/images", "data/lspet_dataset/labels.json", 3739, 100, 100) 136 | -------------------------------------------------------------------------------- /tools/split_lsp_lspet_7points.py: -------------------------------------------------------------------------------- 1 | """ 2 | This script is used to concatenate datasets into a bigger dataset 3 | and split them into training, validation and test sets. 4 | """ 5 | import os 6 | import pathlib 7 | import json 8 | import shutil 9 | import random 10 | import cv2 11 | import numpy as np 12 | 13 | from src.utils.pre_processing import calculate_bbox_from_keypoints, square_crop_with_keypoints 14 | from src.data_loaders.augmentation_utils import random_occlusion 15 | 16 | 17 | class DatasetCreator: 18 | def __init__(self, root_folder): 19 | 20 | if os.path.exists(root_folder): 21 | print("Folder existed! Please choose a non-existed path. {}".format(root_folder)) 22 | exit(0) 23 | 24 | self.root_folder = root_folder 25 | self.image_folder = os.path.join(root_folder, "images") 26 | pathlib.Path(self.root_folder).mkdir(parents=True, exist_ok=True) 27 | pathlib.Path(self.image_folder).mkdir(parents=True, exist_ok=True) 28 | 29 | 30 | def save_label(self, file_name, labels): 31 | with open(os.path.join(self.root_folder, file_name), "w") as fp: 32 | json.dump(labels, fp) 33 | 34 | def add_set(self, image_folder, label_file, n_train, n_val, n_test, copy_images=True, shuffle=True): 35 | """Add subset into this dataset 36 | 37 | Args: 38 | image_folder (str): Path to image folder of subset 39 | label_file (str): Path to annotation file of subset 40 | n_train (int): Number of samples for training 41 | n_val (int): Number of samples for validation 42 | n_test (int): Number of samples for testing 43 | """ 44 | 45 | with open(label_file, "r") as fp: 46 | labels = json.load(fp) 47 | 48 | # Use all data? 
49 | assert (len(labels) == n_train + n_val + n_test) 50 | 51 | if shuffle: 52 | random.seed(42) 53 | random.shuffle(labels) 54 | 55 | for label in labels: 56 | 57 | # Copy images 58 | if copy_images: 59 | 60 | # Rename image if duplicated 61 | if os.path.exists(os.path.join(self.image_folder, label["image"])): 62 | new_name = label["image"] 63 | filename, file_extension = os.path.splitext(label["image"]) 64 | extended_number = 2 65 | while True: 66 | new_name = "{}_ext{}{}".format(filename, extended_number, file_extension) 67 | if os.path.exists(os.path.join(self.image_folder, new_name)): 68 | extended_number += 1 69 | else: 70 | break 71 | shutil.copy( 72 | os.path.join(image_folder, label["image"]), 73 | os.path.join(self.image_folder, new_name), 74 | ) 75 | label["image"] = new_name 76 | else: 77 | shutil.copy( 78 | os.path.join(image_folder, label["image"]), 79 | os.path.join(self.image_folder, label["image"]), 80 | ) 81 | pass 82 | 83 | label["bbox"] = calculate_bbox_from_keypoints(label["points"]) 84 | label["bbox"] = np.array(label["bbox"]).astype(int).tolist() 85 | 86 | # # Visualize 87 | # for label in labels: 88 | # image = cv2.imread(os.path.join(self.image_folder, label["image"])) 89 | 90 | # draw = image.copy() 91 | # for i, p in enumerate(label["points"]): 92 | # x, y = p 93 | # color = (0, 0, 255) if int(label["visibility"][i]) else (255, 0, 0) 94 | # draw = cv2.circle(draw, center=(int(x), int(y)), color=color, radius=1, thickness=2) 95 | # draw = cv2.putText(draw, str(i), (int(x), int(y)), cv2.FONT_HERSHEY_SIMPLEX, 96 | # 0.5, (0, 255, 0), 1, cv2.LINE_AA) 97 | # p1 = tuple(label["bbox"][0]) 98 | # p2 = tuple(label["bbox"][1]) 99 | # draw = cv2.rectangle(draw, p1, p2, (0,0,255), 2) 100 | # cv2.imshow("Image", draw) 101 | # # cv2.waitKey(0) 102 | 103 | # cropped_image, keypoints = square_crop_with_keypoints(image, label["bbox"], label["points"], "random") 104 | # draw = cropped_image.copy() 105 | # for i, p in enumerate(keypoints): 106 | # x, y = p 107 | # color = (0, 0, 255) if int(label["visibility"][i]) else (255, 0, 0) 108 | # draw = cv2.circle(draw, center=(int(x), int(y)), color=color, radius=1, thickness=2) 109 | # draw = cv2.putText(draw, str(i), (int(x), int(y)), cv2.FONT_HERSHEY_SIMPLEX, 110 | # 0.5, (0, 255, 0), 1, cv2.LINE_AA) 111 | 112 | # cv2.imshow("Square cropped", draw) 113 | # # cv2.waitKey(0) 114 | 115 | # # Test random occlusion 116 | # cropped_image, visibility = random_occlusion(cropped_image, keypoints, visibility=None, rect_ratio=((0.2, 0.5), (0.2, 0.5))) 117 | # draw = cropped_image.copy() 118 | # for i, p in enumerate(keypoints): 119 | # print(i) 120 | # print(visibility[i]) 121 | # x, y = p 122 | # color = (0, 0, 255) if int(visibility[i]) else (255, 0, 0) 123 | # draw = cv2.circle(draw, center=(int(x), int(y)), color=color, radius=1, thickness=2) 124 | # draw = cv2.putText(draw, str(i), (int(x), int(y)), cv2.FONT_HERSHEY_SIMPLEX, 125 | # 0.5, (0, 255, 0), 1, cv2.LINE_AA) 126 | 127 | # cv2.imshow("Random occlusion", draw) 128 | # cv2.waitKey(0) 129 | 130 | for i, label in enumerate(labels): 131 | ps = label["points"] 132 | labels[i]["points"] = [ps[6], ps[7], ps[8], ps[13], ps[9], ps[10], ps[11]] 133 | vs = label["visibility"] 134 | labels[i]["visibility"] = [vs[6], vs[7], vs[8], vs[13], vs[9], vs[10], vs[11]] 135 | 136 | 137 | self.save_label("train.json", labels[:n_train]) 138 | self.save_label("val.json", labels[n_train:n_train+n_val]) 139 | self.save_label("test.json", labels[n_train+n_val:]) 140 | 141 | dataset = 
DatasetCreator("data/lsp_lspet_7points") 142 | dataset.add_set("data/lsp_dataset/images", "data/lsp_dataset/labels.json", 1800, 100, 100) 143 | dataset.add_set("data/lspet_dataset/images", "data/lspet_dataset/labels.json", 3739, 100, 100) 144 | -------------------------------------------------------------------------------- /tools/split_mpii.py: -------------------------------------------------------------------------------- 1 | """ 2 | This script is used to concatenate datasets into a bigger dataset 3 | and split them into training, validation and test sets. 4 | """ 5 | import os 6 | import pathlib 7 | import json 8 | import shutil 9 | import random 10 | import cv2 11 | import numpy as np 12 | 13 | from src.utils.pre_processing import calculate_bbox_from_keypoints, square_crop_with_keypoints 14 | from src.data_loaders.augmentation_utils import random_occlusion 15 | 16 | class DatasetCreator: 17 | def __init__(self, root_folder): 18 | 19 | if os.path.exists(root_folder): 20 | print("Folder existed! Please choose a non-existed path. {}".format(root_folder)) 21 | exit(0) 22 | 23 | self.root_folder = root_folder 24 | self.image_folder = os.path.join(root_folder, "images") 25 | pathlib.Path(self.root_folder).mkdir(parents=True, exist_ok=True) 26 | pathlib.Path(self.image_folder).mkdir(parents=True, exist_ok=True) 27 | 28 | 29 | def save_label(self, file_name, labels): 30 | with open(os.path.join(self.root_folder, file_name), "w") as fp: 31 | json.dump(labels, fp) 32 | 33 | def add_set(self, image_folder, label_file, n_train, n_val, n_test, copy_images=True, shuffle=True): 34 | """Add subset into this dataset 35 | 36 | Args: 37 | image_folder (str): Path to image folder of subset 38 | label_file (str): Path to annotation file of subset 39 | n_train (int): Number of samples for training 40 | n_val (int): Number of samples for validation 41 | n_test (int): Number of samples for testing 42 | """ 43 | 44 | with open(label_file, "r") as fp: 45 | labels = json.load(fp) 46 | 47 | # Use all data? 
48 | assert (len(labels) == n_train + n_val + n_test) 49 | 50 | if shuffle: 51 | random.seed(42) 52 | random.shuffle(labels) 53 | 54 | for label in labels: 55 | 56 | # Copy images 57 | if copy_images: 58 | 59 | # Rename image if duplicated 60 | if os.path.exists(os.path.join(self.image_folder, label["image"])): 61 | new_name = label["image"] 62 | filename, file_extension = os.path.splitext(label["image"]) 63 | extended_number = 2 64 | while True: 65 | new_name = "{}_ext{}{}".format(filename, extended_number, file_extension) 66 | if os.path.exists(os.path.join(self.image_folder, new_name)): 67 | extended_number += 1 68 | else: 69 | break 70 | shutil.copy( 71 | os.path.join(image_folder, label["image"]), 72 | os.path.join(self.image_folder, new_name), 73 | ) 74 | label["image"] = new_name 75 | else: 76 | shutil.copy( 77 | os.path.join(image_folder, label["image"]), 78 | os.path.join(self.image_folder, label["image"]), 79 | ) 80 | pass 81 | 82 | label["bbox"] = calculate_bbox_from_keypoints(label["points"]) 83 | label["bbox"] = np.array(label["bbox"]).astype(int).tolist() 84 | 85 | # # Visualize 86 | # for label in labels: 87 | # image = cv2.imread(os.path.join(self.image_folder, label["image"])) 88 | 89 | # draw = image.copy() 90 | # for i, p in enumerate(label["points"]): 91 | # x, y = p 92 | # color = (0, 0, 255) if int(label["visibility"][i]) else (255, 0, 0) 93 | # draw = cv2.circle(draw, center=(int(x), int(y)), color=color, radius=1, thickness=2) 94 | # draw = cv2.putText(draw, str(i), (int(x), int(y)), cv2.FONT_HERSHEY_SIMPLEX, 95 | # 0.5, (0, 255, 0), 1, cv2.LINE_AA) 96 | # p1 = tuple(label["bbox"][0]) 97 | # p2 = tuple(label["bbox"][1]) 98 | # draw = cv2.rectangle(draw, p1, p2, (0,0,255), 2) 99 | # cv2.imshow("Image", draw) 100 | # # cv2.waitKey(0) 101 | 102 | # cropped_image, keypoints = square_crop_with_keypoints(image, label["bbox"], label["points"], "random") 103 | # draw = cropped_image.copy() 104 | # for i, p in enumerate(keypoints): 105 | # x, y = p 106 | # color = (0, 0, 255) if int(label["visibility"][i]) else (255, 0, 0) 107 | # draw = cv2.circle(draw, center=(int(x), int(y)), color=color, radius=1, thickness=2) 108 | # draw = cv2.putText(draw, str(i), (int(x), int(y)), cv2.FONT_HERSHEY_SIMPLEX, 109 | # 0.5, (0, 255, 0), 1, cv2.LINE_AA) 110 | 111 | # cv2.imshow("Square cropped", draw) 112 | # # cv2.waitKey(0) 113 | 114 | # # Test random occlusion 115 | # cropped_image, visibility = random_occlusion(cropped_image, keypoints, visibility=None, rect_ratio=((0.2, 0.5), (0.2, 0.5))) 116 | # draw = cropped_image.copy() 117 | # for i, p in enumerate(keypoints): 118 | # print(i) 119 | # print(visibility[i]) 120 | # x, y = p 121 | # color = (0, 0, 255) if int(visibility[i]) else (255, 0, 0) 122 | # draw = cv2.circle(draw, center=(int(x), int(y)), color=color, radius=1, thickness=2) 123 | # draw = cv2.putText(draw, str(i), (int(x), int(y)), cv2.FONT_HERSHEY_SIMPLEX, 124 | # 0.5, (0, 255, 0), 1, cv2.LINE_AA) 125 | 126 | # cv2.imshow("Random occlusion", draw) 127 | # cv2.waitKey(0) 128 | 129 | self.save_label("train.json", labels[:n_train]) 130 | self.save_label("val.json", labels[n_train:n_train+n_val]) 131 | self.save_label("test.json", labels[n_train+n_val:]) 132 | 133 | dataset = DatasetCreator("data/lsp_lspet") 134 | dataset.add_set("data/lsp_dataset/images", "data/lsp_dataset/labels.json", 1800, 100, 100) 135 | dataset.add_set("data/lspet_dataset/images", "data/lspet_dataset/labels.json", 3739, 100, 100) 136 | 
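# --- Illustrative sketch (not part of the original script) -------------------
# The crop pipeline used above in one place: estimate a body bounding box from
# keypoints with calculate_bbox_from_keypoints(), then square-crop the image
# while shifting the keypoints into the crop with square_crop_with_keypoints().
# The image path and keypoint values below are made-up examples.
#
# example_image = cv2.imread("data/lsp_dataset/images/im0001.jpg")
# example_kps = [[120, 80], [140, 150], [160, 220]]
# bbox = calculate_bbox_from_keypoints(example_kps, padding=0.1)
# bbox = np.array(bbox).astype(int).tolist()
# crop, new_kps = square_crop_with_keypoints(example_image, bbox, example_kps, "random")
# print(crop.shape, new_kps)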
-------------------------------------------------------------------------------- /tools/visualize_data.py: -------------------------------------------------------------------------------- 1 | import os 2 | import argparse 3 | import json 4 | import cv2 5 | 6 | parser = argparse.ArgumentParser() 7 | parser.add_argument( 8 | '-i', 9 | '--image_folder', default="data/lsp_dataset/images", 10 | help='Image folder') 11 | parser.add_argument( 12 | '-l', 13 | '--labels', default="data/lsp_dataset/labels.json", 14 | help='Label/Annotation file') 15 | args = parser.parse_args() 16 | 17 | with open(args.labels, "r") as fp: 18 | labels = json.load(fp) 19 | 20 | cv2.namedWindow("Image", cv2.WINDOW_NORMAL) 21 | for label in labels: 22 | image_name = label["image"] 23 | points = label["points"] 24 | visibility = label["visibility"] 25 | image = cv2.imread(os.path.join(args.image_folder, image_name)) 26 | for i, p in enumerate(points): 27 | x, y = p 28 | color = (0, 0, 255) if int(visibility[i]) else (255, 0, 0) 29 | cv2.circle(image, center=(int(x), int(y)), color=(255, 0, 0), radius=1, thickness=2) 30 | image = cv2.putText(image, str(i), (int(x), int(y)), cv2.FONT_HERSHEY_SIMPLEX, 31 | 0.5, (0, 255, 0), 1, cv2.LINE_AA) 32 | cv2.imshow("Image", image) 33 | cv2.waitKey(0) 34 | 35 | -------------------------------------------------------------------------------- /train.py: -------------------------------------------------------------------------------- 1 | import os 2 | import pathlib 3 | import shutil 4 | import argparse 5 | import importlib 6 | import json 7 | import tensorflow as tf 8 | 9 | for gpu in tf.config.experimental.list_physical_devices('GPU'): 10 | tf.compat.v2.config.experimental.set_memory_growth(gpu, True) 11 | 12 | parser = argparse.ArgumentParser() 13 | parser.add_argument( 14 | '-c', 15 | '--conf_file', default="config.json", 16 | help='Configuration file') 17 | args = parser.parse_args() 18 | 19 | # Open and load the config json 20 | with open(args.conf_file) as config_buffer: 21 | config = json.loads(config_buffer.read()) 22 | 23 | # Create experiment folder and copy configuration file 24 | exp_folder = os.path.join("experiments", config["experiment_name"]) 25 | pathlib.Path(exp_folder).mkdir(parents=True, exist_ok=True) 26 | shutil.copy(args.conf_file, exp_folder) 27 | 28 | # Train model 29 | trainer = importlib.import_module("src.trainers.{}".format(config["trainer"])) 30 | trainer.train(config) 31 | --------------------------------------------------------------------------------
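For reference, a minimal configuration sketch for `train.py`, assembled only from the keys the code above actually reads (`experiment_name`, `trainer`, `model`, `data`, and the training options consumed by the trainer). The name of the training-options block, the `model_type` value, the paths, and all numbers are placeholders and assumptions; the authoritative key names and values are in the JSON files under `configs/`.

```json
{
  "experiment_name": "example_experiment",
  "trainer": "blazepose_trainer",
  "model": {
    "model_type": "<see configs/ for valid values>",
    "im_width": 256,
    "im_height": 256
  },
  "data": {
    "train_images": "data/pushup/images/",
    "train_labels": "data/pushup/train.json",
    "val_images": "data/pushup/images/",
    "val_labels": "data/pushup/val.json"
  },
  "train": {
    "train_batch_size": 32,
    "val_batch_size": 32,
    "nb_epochs": 100,
    "initial_epoch": 0
  }
}
```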