├── GLOSSARY.md ├── LICENSE ├── README.md ├── bbpose ├── __init__.py ├── configs │ └── config.cfg ├── ingestion │ ├── __init__.py │ ├── ingest.py │ └── scripts │ │ └── data.sh ├── main.py ├── models │ ├── __init__.py │ ├── hourglass.py │ ├── model.py │ └── softgate.py ├── preprocess │ ├── __init__.py │ ├── preprocess.py │ └── utils.py ├── requirements.txt ├── test │ ├── __init__.py │ ├── metrics.py │ └── test.py ├── train │ ├── __init__.py │ ├── callbacks.py │ ├── losses.py │ ├── metrics.py │ ├── train.py │ └── utils.py └── utils.py └── setup.py /GLOSSARY.md: -------------------------------------------------------------------------------- 1 | # Glossary 2 | 3 | ## Main 4 | 5 | * **version**: The version number assigned to the model currently being trained. \ 6 | Note: CHECK VERSION NUMBER! Users should get into the habit of checking which version number is being set in order to avoid overwriting old models (unless that is the intended result). 7 | 8 | * **img_dir**: Absolute path to the directory where image files are stored. \ 9 | Note: do not set this path to a location inside the project's 'data' or 'results' directories. 10 | 11 | * **dataset**: Image set and accompanying data files which the user wishes to download. \ 12 | Supported datasets currently only include 'mpii'. 13 | 14 | * **use_records**: Will switch the model over to use TFRecords. \ 15 | Note: the TFRecords dataset will be much larger than its Tensorflow Dataset counterpart but will lead to faster training times. If training speed is more important than storage space, set this param to True. 16 | Note: unless the user has a reason to keep the videos after creating the TFRecords, it is recommended to set delete_videos to True in order to avoid massive use of storage space. 17 | 18 | * **use_cloud**: Set to True if the user wishes to retrieve data from a remote location. 19 | Note: the code has been set up to work with Google Cloud Storage and most likely will not work with any other service. 
20 | 21 | * **batch_per_replica**: Total number of samples passed to each replica. \ 22 | Do not set this to the global batch size, since this value will be passed to each available device. 23 | 24 | * **img_size**: Spatial length that each image should be resized/cropped to. \ 25 | Supported values include: 128, 192, 256, 320, 384, 448, and 512. \ 26 | Note: the value of img_size must be exactly 4 times the value of hm_size. 27 | 28 | * **hm_size**: Spatial length of each heatmap. \ 29 | Supported values include: 32, 48, 64, 80, 96, 112, and 128. \ 30 | Note: the value of hm_size must be exactly one quarter of img_size. 31 | 32 | * **num_stacks**: Total number of hourglass stacks to use. \ 33 | Supported values include: 2, 4, and 8. 34 | 35 | ## Ingestion 36 | 37 | * **download_images**: Set to False in order to bypass downloading of the images. This may be desired if the user already has a copy of the image dataset. \ 38 | Note: do not forget to set the img_dir param to the location of your copy. 39 | 40 | * **toy_set**: Set to True if the user wishes to work with a smaller subset to experiment with. \ 41 | Note: this is for experimentation only, as any results gathered from this toy set will not be accurate. 42 | 43 | * **toy_samples**: Total number of images to subset for the toy dataset. 44 | 45 | * **examples_per_record**: Total number of examples to be placed in each training TFRecord. 46 | 47 | * **interleave_cycle**: Total number of records to simultaneously interleave. \ 48 | As a visual example: in the simplest case, let your hands represent two records and your fingers represent the examples from each record. If you now interlock your hands together, this will represent the interleaving performed by this operation. \ 49 | This option is necessary for TFRecords since you cannot shuffle binary data. \ 50 | For further information, check the [Tensorflow](https://www.tensorflow.org/api_docs/python/tf/data/Dataset#interleave) documentation on this operation. 
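As a rough illustration of the hand-interlocking picture given for interleave_cycle, the sketch below models the resulting order in plain Python. It is a simplified, deterministic stand-in and not TensorFlow's implementation; the function name `interleave_sketch` and the toy records are assumptions made for this example. Here `cycle_length` plays the role of interleave_cycle and `block_length` the role of interleave_block.

```python
from itertools import islice

_NO_MORE = object()  # sentinel: no unopened records remain


def interleave_sketch(records, cycle_length, block_length):
    """Hypothetical helper: order in which examples come out of interleaved records."""
    pending = iter(records)        # records that have not been opened yet
    slots = [None] * cycle_length  # records currently being read
    inputs_done = False
    out = []
    while True:
        progressed = False
        for i in range(cycle_length):
            # Refill an empty slot with the next unopened record, if any.
            if slots[i] is None and not inputs_done:
                nxt = next(pending, _NO_MORE)
                if nxt is _NO_MORE:
                    inputs_done = True
                else:
                    slots[i] = iter(nxt)
            if slots[i] is None:
                continue
            # Take up to block_length consecutive examples from this record.
            block = list(islice(slots[i], block_length))
            if len(block) < block_length:
                slots[i] = None    # this record is exhausted
            out.extend(block)
            progressed = True
        if not progressed:
            return out


# Two "hands" (records), three "fingers" (examples) each.
left = ["L1", "L2", "L3"]
right = ["R1", "R2", "R3"]

print(interleave_sketch([left, right], cycle_length=2, block_length=1))
# ['L1', 'R1', 'L2', 'R2', 'L3', 'R3']
print(interleave_sketch([left, right], cycle_length=2, block_length=2))
# ['L1', 'L2', 'R1', 'R2', 'L3', 'R3']
```

In the pipeline itself the order is additionally randomized (the record names are shuffled and `interleave` is called with `deterministic=False`), so this sketch only shows the basic pattern.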
51 | 52 | * **interleave_block**: Total number of consecutive elements that are interwoven from each record at one time. \ 53 | This option just allows for blocks of examples from the same record to be placed next to one another when interwoven with examples (also in a block) from another record. 54 | For further information, check the [Tensorflow](https://www.tensorflow.org/api_docs/python/tf/data/Dataset#interleave) documentation on this operation. 55 | 56 | * **gcs_records**: Remote path where TFRecords are stored. 57 | Note: the code has been set up to work with Google Cloud Storage and most likely will not work with any other service. 58 | Note: The directory defined by this path should contain the 'train' and 'val' directories. But they should not be included in the param's path name. 59 | 60 | ## Preprocess 61 | 62 | * **mean**: Mean values calculated (from train dataset) for each RGB channel. 63 | 64 | * **sigma**: STD used for Gaussian kernel in creation of the heatmaps. 65 | 66 | ## Models 67 | 68 | * **arch**: Model architecture to be used. \ 69 | Supported architectures include: 'softgate' and 'hourglass' 70 | 71 | * **num_filters**: The base number of filters/channels used at each layer. All layers will either hold this number of filters or be a factor of this value. 72 | Supported values include: any multiple of 8. 73 | 74 | * **initializer**: Technique to initialize the model weights. \ 75 | Recommended initializers include: 'glorot_normal', 'glorot_uniform', 'he_normal', and 'he_uniform'. \ 76 | If the user wishes to use a different initializer not in this list, please check with the [tensorflow](https://www.tensorflow.org/api_docs/python/tf/keras/initializers) documentation first to determine if your requested initializer is supported. If not, you may receive an error at runtime. 77 | 78 | * **momentum**: Defines the momentum for the moving average used in each batch norm layer. 
79 | 80 | * **epsilon**: Value added to each batch norm's variance to avoid division by zero. 81 | 82 | * **dropout_rate**: The fraction of units to drop at a specific layer. \ 83 | Recommended value is 0.2 (if used). A value of 0 effectively disables dropout. 84 | 85 | ## Train / Test 86 | 87 | * **is_eager**: Set to True if user wishes to run tf.functions in eager mode. This can be set to True when debugging the code, but should be set to False when actually training the model. 88 | 89 | * **strategy**: Tensorflow Strategy to use during training. \ 90 | Supported strategies include: 'default', 'mirrored', and 'tpu'. \ 91 | Use default when using only a single GPU, use mirrored when using multiple GPUs, and use tpu when using one or multiple TPUs. 92 | 93 | * **tpu_address**: Set this param either to the TPU's name or the TPU's gRPC address. 94 | 95 | * **gcs_results**: Remote path where SavedModels are stored. \ 96 | Note: the code has been set up to work with Google Cloud Storage and most likely will not work with any other service. \ 97 | Note: do not include the version number at the end of the path name. 98 | 99 | * **mixed_precision**: Set to True if user wishes to use a mix of 16 and 32 bit floating-point types. This should greatly reduce the memory footprint with minimal loss of accuracy. \ 100 | Note: this param does not currently work when used with TPUs. Only available for GPUs. 101 | 102 | * **num_epochs**: Total number of epochs to train the model for. 103 | 104 | * **steps_per_execution**: Total number of batches to run in each tf.function call. \ 105 | This is a brand new feature, so there is not much documentation available yet (although faster training times are claimed), and the optimal value is left up to the user. If set to 1, the model will run as normal (one step at a time). 
\ 106 | Note: the only limitation on this param is that 'track_every' must be a multiple of 'steps_per_execution'. 107 | 108 | * **track_every**: Determines how often to save a set of values ('best', 'step', 'epoch') needed in case the model is interrupted and needs to be restored. \ 109 | Note: 'track_every' must be a multiple of 'steps_per_execution'. 110 | 111 | * **threshold**: Acceptable range in difference between the true joint position and the predicted joint position (when calculating the PCK metric). 112 | 113 | * **decay_epochs**: Epochs at which to apply a decay factor to the learning rate. 114 | 115 | * **decay_factor**: Factor to decay the learning rate by. 116 | 117 | * **learning_rate**: Initial learning rate used in calculating the training step size. \ 118 | Note: this value will be passed to all available devices (GPU or TPU), but it should be set as if only one device is available. The job of scaling the learning rate is left up to the 'scale' param. 119 | 120 | * **scale**: Learning rate scaling factor used in distributed training. A scale of 0.35 seems to be a good starting point, but the optimal value may require experimentation. Either way, it is not recommended to scale the learning rate with a value greater than 1. \ 121 | Note: The learning rate for each replica is determined by multiplying the learning_rate, num_replicas, and the scale. 122 | 123 | * **schedule_per_step**: Set to True if user wishes to use a per step schedule (exponential decay) on the learning rate. 124 | 125 | * **decay_rate**: Factor to decay the learning rate by when used in an exponential schedule. 126 | 127 | * **decay_steps**: Frequency of decay on the learning rate when used in an exponential schedule. 128 | 129 | ## Hidden 130 | 131 | * **switch**: Globally switches the whole codebase from using the base config file to a specific frozen config file (i.e. saved copy of a versioned model). 
\ 132 | Note: **DO NOT CHANGE THIS PARAMETER.** This param is defined and used by the model. 133 | 134 | * **num_replicas**: Defines the total number of devices (CPU, GPU, or TPU) available. \ 135 | Note: **DO NOT CHANGE THIS PARAMETER.** This param is defined and used by the model. 136 | 137 | * **train_size**: Defines the total number of training examples found in all TFRecords. \ 138 | Note: **DO NOT CHANGE THIS PARAMETER.** This param is defined and used by the model. \ 139 | Note: if this value is accidentally erased and truly gone (i.e. not in a frozen config), the value may be retrieved using DataGenerator's _get_dataset_size method. 140 | 141 | * **val_size**: Defines the total number of validation/testing examples found in all TFRecords. \ 142 | Note: **DO NOT CHANGE THIS PARAMETER.** This param is defined and used by the model. \ 143 | Note: if this value is accidentally erased and truly gone (i.e. not in a frozen config), the value may be retrieved using DataGenerator's _get_dataset_size method. 144 | 145 | 146 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2021 BB-Repos 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 
14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # BB-Pose 2 | 3 | BB-Pose is a Tensorflow 2 training pipeline for human pose estimation. The aim of this codebase is to set up a foundation on which future projects can build. Supported models currently include: 4 | 5 | * **Hourglass Networks** [[1]](https://arxiv.org/abs/1603.06937) 6 | * **Softgated Skip Connections** [[2]](https://arxiv.org/abs/2002.11098) 7 | 8 | Supported datasets currently include: 9 | * **MPII Dataset** [[3]](http://human-pose.mpi-inf.mpg.de/) 10 | 11 | **Note: This repository has only been tested on Ubuntu 18.04 and Debian (Sid).** 12 | 13 | ## Installation 14 | 15 | 1. Clone this repository: 16 | ``` 17 | git clone https://github.com/salinasJJ/BBpose.git 18 | ``` 19 | 2. Create a virtual environment (using Pipenv or Conda for example). 20 | 21 | 3. Install the project onto your system: 22 | ``` 23 | pip install -e BBpose 24 | ``` 25 | 4. Install dependencies: 26 | ``` 27 | pip install -r BBpose/bbpose/requirements.txt 28 | ``` 29 | 5. Make the ingestion script executable: 30 | ``` 31 | chmod +x BBpose/bbpose/ingestion/scripts/data.sh 32 | ``` 33 | 34 | ## Params 35 | 36 | This project is set up around a config file which contains numerous adjustable parameters. Any changes to how the project runs must be made there by updating the params. 
There are 3 main commands to update the params: 37 | 1. The 'reset' command will reset all params back to their default values: 38 | ``` 39 | python BBpose/bbpose/main.py reset 40 | ``` 41 | 2. The 'update' command will update all requested params to new values. For example: 42 | ``` 43 | python BBpose/bbpose/main.py update \ 44 | --version 0 \ 45 | --dataset mpii \ 46 | --batch_per_replica 32 \ 47 | --use_records True \ 48 | --num_epochs 200 \ 49 | --num_filters 144 50 | ``` 51 | 3. The 'force' command is a special command that updates a set of model-defined hidden params. It exists for a specific use case (i.e. restoring hidden params after an accidental update), but in general, users of this repository should never have to use this command. 52 | ``` 53 | python BBpose/bbpose/main.py force \ 54 | --train_size 22245 \ 55 | --validate_size 2958 56 | ``` 57 | 58 | **Note #1: The 'reset' command will clear out all user-defined values as well as those of the hidden params. Without these pre-defined params, the model will fail to work properly, if at all. Please use this command carefully.** 59 | 60 | **Note #2: CHECK VERSION NUMBER! Users should get into the practice of always checking which version number is being set in order to avoid overwriting old models.** 61 | 62 | **Note #3: Do not set the path to 'img_dir' within the data directory. Best to place it in a directory outside of the project.** 63 | 64 | There are many params, some of which are interconnected, and some of which have limitations. Please see [Glossary](GLOSSARY.md) for a full breakdown of all these params. 65 | 66 | That was the hard part. From here on out, the commands to create the datasets, train, and test are simple one-liners. 67 | 68 | ## Datasets 69 | 70 | In order to create the datasets, we can make use of the 'ingest' command. This command has two options: 71 | 1. setup: retrieves required data files and prepares the data for downloading. 72 | 2. 
generate: downloads the image set and generates Tensorflow datasets. 73 | 74 | To set up and start downloading, call: 75 | ``` 76 | python BBpose/bbpose/main.py ingest --setup --generate 77 | ``` 78 | 79 | **Note: The 'setup' option will clear everything in the data directory. So, if downloading is interrupted, make sure to only use 'generate' to restart downloading.** 80 | ``` 81 | python BBpose/bbpose/main.py ingest --generate 82 | ``` 83 | 84 | ## Training 85 | 86 | To start training on a brand new model, call: 87 | ``` 88 | python BBpose/bbpose/main.py train 89 | ``` 90 | If training is interrupted for any reason, you can easily restart from where you left off: 91 | ``` 92 | python BBpose/bbpose/main.py train --restore 93 | ``` 94 | 95 | ## Testing 96 | 97 | To evaluate your trained model: 98 | ``` 99 | python BBpose/bbpose/main.py test 100 | ``` 101 | 102 | ## Contribute 103 | 104 | Contributions from the community are welcome. 105 | 106 | ## License 107 | 108 | BB-Pose is licensed under MIT. 109 | 110 | ## References 111 | 112 | 1. A. Newell, K. Yang, J. Deng, **Stacked Hourglass Networks for Human Pose Estimation**, arXiv:1603.06937, 2016. 113 | 2. A. Bulat, J. Kossaifi, G. Tzimiropoulos, M. Pantic, **Toward Fast and Accurate Human Pose Estimation via Soft-Gated Skip Connections**, arXiv:2002.11098, 2020. 
114 | 115 | 116 | 117 | -------------------------------------------------------------------------------- /bbpose/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/salinasJJ/BBpose/4be635ead0f99b1788160ca0b1c7b3947ba05526/bbpose/__init__.py -------------------------------------------------------------------------------- /bbpose/configs/config.cfg: -------------------------------------------------------------------------------- 1 | [DEFAULT] 2 | version = 0 3 | img_dir = '/PATH/TO/IMG/DIR' 4 | dataset = 'mpii' 5 | use_records = False 6 | use_cloud = False 7 | img_size = 256 8 | hm_size = 64 9 | num_stacks = 4 10 | batch_per_replica = 24 11 | switch = False 12 | num_replicas = 1 13 | train_size = 0 14 | val_size = 0 15 | 16 | [Ingestion] 17 | download_images = False 18 | toy_set = False 19 | toy_samples = 250 20 | examples_per_record = 250 21 | interleave_cycle = -1 22 | interleave_block = 1 23 | gcs_records = '' 24 | 25 | [Preprocess] 26 | mean = [0.4624228775501251, 0.44416481256484985, 0.4025438725948334] 27 | sigma = 1 28 | 29 | [Models] 30 | arch = 'softgate' 31 | num_filters = 144 32 | initializer = 'glorot_uniform' 33 | momentum = 0.9 34 | epsilon = 0.001 35 | dropout_rate = 0.2 36 | 37 | [Train] 38 | is_eager = False 39 | strategy = 'default' 40 | tpu_address = '' 41 | gcs_results = '' 42 | mixed_precision = False 43 | num_epochs = 200 44 | steps_per_execution = 1 45 | track_every = 32 46 | threshold = 0.5 47 | decay_epochs = [75, 100, 150] 48 | decay_factor = 0.2 49 | learning_rate = 0.00025 50 | scale = 0.0 51 | schedule_per_step = False 52 | decay_rate = 0.96 53 | decay_steps = 2000 54 | 55 | [Test] 56 | toy_set = ${Ingestion:toy_set} 57 | is_eager = ${Train:is_eager} 58 | strategy = ${Train:strategy} 59 | tpu_address = ${Train:tpu_address} 60 | gcs_results = ${Train:gcs_results} 61 | mixed_precision = ${Train:mixed_precision} 62 | threshold = ${Train:threshold} 63 | 
schedule_per_step = ${Train:schedule_per_step} 64 | 65 | -------------------------------------------------------------------------------- /bbpose/ingestion/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/salinasJJ/BBpose/4be635ead0f99b1788160ca0b1c7b3947ba05526/bbpose/ingestion/__init__.py -------------------------------------------------------------------------------- /bbpose/ingestion/ingest.py: -------------------------------------------------------------------------------- 1 | import math 2 | import os 3 | import shutil 4 | 5 | import pandas as pd 6 | import tensorflow as tf 7 | tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.WARN) 8 | 9 | from utils import ( 10 | DATASETS, 11 | call_bash, 12 | force_update, 13 | get_data_dir, 14 | get_frozen_params, 15 | get_params, 16 | is_dir, 17 | ) 18 | 19 | 20 | ANNOTATIONS = [ 21 | 'img_paths', 22 | 'people_index', 23 | 'joint_self', 24 | 'objpos', 25 | 'scale_provided', 26 | ] 27 | ID_COLUMNS = [ 28 | 'file_name', 29 | 'person_N', 30 | ] 31 | JNT_COLUMNS = [ 32 | 'r_ankle_X', 'r_ankle_Y', 'r_knee_X', 'r_knee_Y', 'r_hip_X', 'r_hip_Y', 33 | 'l_hip_X', 'l_hip_Y', 'l_knee_X', 'l_knee_Y', 'l_ankle_X', 'l_ankle_Y', 34 | 'pelvis_X', 'pelvis_Y', 'thorax_X', 'thorax_Y', 'neck_X', 'neck_Y', 35 | 'head_X', 'head_Y', 'r_wrist_X', 'r_wrist_Y', 'r_elbow_X', 'r_elbow_Y', 36 | 'r_shoulder_X', 'r_shoulder_Y', 'l_shoulder_X', 'l_shoulder_Y', 'l_elbow_X', 37 | 'l_elbow_Y', 'l_wrist_X', 'l_wrist_Y', 38 | ] 39 | VIS_COLUMNS = [ 40 | 'r_ankle_vis', 'r_knee_vis', 'r_hip_vis', 'l_hip_vis', 'l_knee_vis', 41 | 'l_ankle_vis', 'pelvis_vis', 'thorax_vis', 'neck_vis', 'head_vis', 42 | 'r_wrist_vis', 'r_elbow_vis', 'r_shoulder_vis', 'l_shoulder_vis', 43 | 'l_elbow_vis', 'l_wrist_vis', 44 | ] 45 | CS_COLUMNS = [ 46 | 'center_X', 47 | 'center_Y', 48 | 'scale', 49 | ] 50 | EXCLUDE = [ 51 | ('012545809.jpg', 2), 52 | ] 53 | SEEN_EVERY = 100 54 | 55 | 56 | class 
DataGenerator(): 57 | def __init__(self, setup=False): 58 | params = get_params('Ingestion') 59 | if params['switch']: 60 | self.params = get_frozen_params( 61 | 'Ingestion', 62 | version=params['version'], 63 | ) 64 | else: 65 | self.params = params 66 | self.dataset_dir = get_data_dir(self.params['dataset']) 67 | if self.params['toy_set']: 68 | self.data_dir = self.dataset_dir + 'data/toy/' 69 | self.tfds_dir = self.dataset_dir + 'tfds/toy/' 70 | self.records_dir = self.dataset_dir + 'records/toy/' 71 | else: 72 | self.data_dir = self.dataset_dir + 'data/full/' 73 | self.tfds_dir = self.dataset_dir + 'tfds/full/' 74 | self.records_dir = self.dataset_dir + 'records/full/' 75 | 76 | self.script_dir = ( 77 | os.path.dirname(os.path.realpath(__file__)) + '/scripts/' 78 | ) 79 | if setup: 80 | self.get_data() 81 | 82 | def get_data(self): 83 | print('Downloading data files...') 84 | data_url = DATASETS[self.params['dataset']]['data'] 85 | annotations_url = DATASETS[self.params['dataset']]['annotations'] 86 | detections_url = DATASETS[self.params['dataset']]['detections'] 87 | 88 | status = call_bash( 89 | command = ( 90 | f"'{self.script_dir + 'data.sh'}' " 91 | f"-D '{self.data_dir}' " 92 | f"-T '{self.tfds_dir}' " 93 | f"-R '{self.records_dir}' " 94 | f"-j {data_url} " 95 | f"-a {annotations_url} " 96 | f"-m {detections_url} " 97 | f"-I '{self.params['img_dir']}' " 98 | f"-r {self.params['use_records']} " 99 | f"-d {self.params['dataset']} " 100 | f"-i {self.params['download_images']} " 101 | ), 102 | message='Data files downloaded.\n', 103 | ) 104 | if status: 105 | for s in status.decode('utf-8').split("\n"): 106 | print(s) 107 | 108 | def generate(self): 109 | print("Generating 'train' csv file...") 110 | train_df = self._json_to_pandas('train') 111 | self._pandas_to_csv(train_df, split='train') 112 | print("Generating 'val' csv file...") 113 | val_df = self._json_to_pandas('val') 114 | self._pandas_to_csv(val_df, split='val') 115 | 116 | train_dataset = 
self._csv_to_tfds('train') 117 | val_dataset = self._csv_to_tfds('val') 118 | if self.params['use_records']: 119 | print(f'Generating Tensorflow Records...') 120 | self._create_records(train_dataset, 'train') 121 | self._create_records(val_dataset, 'val') 122 | 123 | shutil.rmtree(self.dataset_dir + 'tfds/') 124 | else: 125 | print(f'Generating Tensorflow Datasets...') 126 | self.save_dataset(train_dataset, 'train') 127 | self.save_dataset(val_dataset, 'val') 128 | 129 | def _json_to_pandas(self, split): 130 | return self._get_dataframe( 131 | self._get_annos(split), 132 | ) 133 | 134 | def _get_annos(self, split): 135 | annos = pd.read_json(self.data_dir + 'annotations.json') 136 | if split == 'train': 137 | annos = annos[annos.isValidation == 0] 138 | elif split == 'val': 139 | annos = annos[annos.isValidation == 1] 140 | annos = annos.drop( 141 | 'isValidation', 142 | axis=1 143 | ).reset_index(drop=True) 144 | return annos.loc[:, ANNOTATIONS] 145 | 146 | def _get_dataframe(self, dataframe): 147 | df = pd.DataFrame( 148 | columns=ID_COLUMNS + JNT_COLUMNS + VIS_COLUMNS + CS_COLUMNS 149 | ) 150 | total = dataframe.shape[0] 151 | for idx, annos in dataframe.iterrows(): 152 | if annos[0] == EXCLUDE[0][0] and annos[1] == EXCLUDE[0][1]: 153 | continue 154 | 155 | ids = [0 for _ in range(2)] 156 | jnts = [] 157 | vis = [] 158 | cs = [0 for _ in range(3)] 159 | for a in annos[2]: 160 | jnts.append(a[0]) 161 | jnts.append(a[1]) 162 | vis.append(a[2]) 163 | 164 | ids = pd.DataFrame([ids], columns=ID_COLUMNS) 165 | jnts = pd.DataFrame([jnts], columns=JNT_COLUMNS) 166 | vis = pd.DataFrame([vis], columns=VIS_COLUMNS) 167 | cs = pd.DataFrame([cs], columns=CS_COLUMNS) 168 | 169 | ids.file_name = annos[0] 170 | ids.person_N = annos[1] 171 | 172 | if annos[3][0] != -1: 173 | annos[3][1] = annos[3][1] + 15 * annos[4] 174 | annos[4] = annos[4] * 1.25 175 | cs.center_X = annos[3][0] 176 | cs.center_Y = annos[3][1] 177 | cs.scale = annos[4] 178 | 179 | temp = pd.concat( 180 | [ids, 
jnts, vis, cs], 181 | axis=1, 182 | ) 183 | df = pd.concat([df, temp]) 184 | 185 | if idx % SEEN_EVERY == 0: 186 | print(f'{idx}/{total}') 187 | if self.params['toy_set']: 188 | if idx == self.params['toy_samples'] - 1: 189 | break 190 | return df.reset_index(drop=True) 191 | 192 | def _pandas_to_csv(self, dataframe, split): 193 | print(f"Saving '{split}' csv file to: {self.data_dir}\n") 194 | dataframe.to_csv( 195 | self.data_dir + split + '.csv', 196 | index=False, 197 | ) 198 | 199 | def _csv_to_tfds(self, split): 200 | return self._map_dataset( 201 | self._get_dataset(split) 202 | ) 203 | 204 | def _get_dataset(self, split): 205 | return tf.data.experimental.CsvDataset( 206 | self.data_dir + split + '.csv', 207 | DataGenerator._get_component_dtypes(), 208 | header=True 209 | ) 210 | 211 | @staticmethod 212 | def _get_component_dtypes(): 213 | field_dtypes = [tf.string for _ in range(2)] 214 | return field_dtypes + [tf.float32 for _ in range(51)] 215 | 216 | def _map_dataset(self, dataset): 217 | return dataset.map( 218 | DataGenerator._get_components, 219 | num_parallel_calls=tf.data.experimental.AUTOTUNE, 220 | deterministic=True, 221 | ) 222 | 223 | @staticmethod 224 | def _get_components(*elements): 225 | file_name = elements[0] 226 | person_N = elements[1] 227 | identifier = tf.strings.join( 228 | inputs=[ 229 | file_name, 230 | person_N, 231 | ], 232 | separator='-', 233 | ) 234 | joints = tf.reshape( 235 | tf.stack(elements[2:34]), 236 | shape=[16,2], 237 | ) 238 | weights = tf.stack(elements[34:50]) 239 | center = tf.stack(elements[50:52]) 240 | scale = elements[52] 241 | return identifier, joints, weights, center, scale 242 | 243 | def _create_records(self, dataset, split): 244 | dataset_size = self._get_dataset_size(split) 245 | total_records = math.ceil( 246 | dataset_size / self.params['examples_per_record'] 247 | ) 248 | for record_num in range(total_records): 249 | skip = record_num * self.params['examples_per_record'] 250 | take = 
self.params['examples_per_record'] 251 | 252 | self._write_to_record( 253 | dataset.skip(skip).take(take), 254 | split, 255 | record_num, 256 | ) 257 | print(f"{record_num}/{total_records - 1}") 258 | print(f"Saving '{split}' records to: {self.records_dir}") 259 | 260 | def _write_to_record( 261 | self, 262 | dataset, 263 | split, 264 | record_num, 265 | ): 266 | is_dir(self.records_dir + split) 267 | record = self.records_dir + split + f'/{split}_{record_num}.tfrecord' 268 | with tf.io.TFRecordWriter(record) as writer: 269 | for identifier, joints, weights, center, scale in dataset: 270 | filename = ( 271 | identifier.numpy().decode().split('.jpg')[0] + '.jpg' 272 | ) 273 | path = self.params['img_dir'] + filename 274 | raw_image = open(path, 'rb').read() 275 | 276 | example = self._serialize_example( 277 | raw_image, 278 | joints, 279 | weights, 280 | center, 281 | scale, 282 | ) 283 | writer.write(example) 284 | 285 | def _serialize_example( 286 | self, 287 | raw_image, 288 | joints, 289 | weights, 290 | center, 291 | scale, 292 | ): 293 | feature = { 294 | 'raw_image':DataGenerator._bytes_feature(raw_image), 295 | 'joints':DataGenerator._bytes_feature( 296 | tf.io.serialize_tensor(joints), 297 | ), 298 | 'weights':DataGenerator._bytes_feature( 299 | tf.io.serialize_tensor(weights), 300 | ), 301 | 'center':DataGenerator._bytes_feature( 302 | tf.io.serialize_tensor(center), 303 | ), 304 | 'scale':DataGenerator._float_feature(scale), 305 | } 306 | example = tf.train.Example( 307 | features=tf.train.Features(feature=feature) 308 | ) 309 | return example.SerializeToString() 310 | 311 | @staticmethod 312 | def _bytes_feature(value): 313 | if isinstance(value, type(tf.constant(0))): 314 | value = value.numpy() 315 | return tf.train.Feature( 316 | bytes_list=tf.train.BytesList(value=[value]), 317 | ) 318 | 319 | @staticmethod 320 | def _float_feature(value): 321 | return tf.train.Feature( 322 | float_list=tf.train.FloatList(value=[value]), 323 | ) 324 | 325 | def 
_get_dataset_size(self, split): 326 | csv_file = self.data_dir + split + '.csv' 327 | dataset_size = call_bash( 328 | command = f"wc -l < {csv_file}", 329 | ) 330 | dataset_size = int(dataset_size.decode('utf-8').strip("\n")) - 1 331 | force_update( 332 | {f"{split}_size":dataset_size}, 333 | ) 334 | return dataset_size 335 | 336 | def save_dataset(self, dataset, split): 337 | print(f"Saving '{split}' dataset to: {self.tfds_dir}") 338 | if os.path.isdir(self.tfds_dir + split): 339 | shutil.rmtree(self.tfds_dir + split) 340 | 341 | tf.data.experimental.save( 342 | dataset, 343 | self.tfds_dir + split, 344 | ) 345 | 346 | def load_datasets(self): 347 | train_dataset = self._load_dataset('train') 348 | val_dataset = self._load_dataset('val') 349 | return train_dataset, val_dataset 350 | 351 | def load_test_dataset(self): 352 | return self._load_dataset('val') 353 | 354 | def _load_dataset(self, split): 355 | return tf.data.experimental.load( 356 | path=self.tfds_dir + split, 357 | element_spec=( 358 | tf.TensorSpec(shape=(), dtype=tf.string), 359 | tf.TensorSpec(shape=(16, 2), dtype=tf.float32), 360 | tf.TensorSpec(shape=(16,), dtype=tf.float32), 361 | tf.TensorSpec(shape=(2,), dtype=tf.float32), 362 | tf.TensorSpec(shape=(), dtype=tf.float32), 363 | ), 364 | ) 365 | 366 | def load_records(self): 367 | train_records = self._load_records('train') 368 | val_records = self._load_records('val') 369 | return train_records, val_records 370 | 371 | def load_test_records(self): 372 | return self._load_records('test') 373 | 374 | def _load_records(self, split): 375 | records = self._get_record_names(split) 376 | if split == 'test': 377 | records = sorted( 378 | records, 379 | key=lambda r: int(r.split('val_')[1].split('.tf')[0]), 380 | ) 381 | return tf.data.TFRecordDataset(records) 382 | else: 383 | ds = tf.data.Dataset.from_tensor_slices(records) 384 | ds = ds.shuffle( 385 | buffer_size=tf.cast( 386 | tf.shape(records)[0], 387 | tf.int64, 388 | ), 389 | ) 390 | ds = 
ds.repeat() 391 | return ds.interleave( 392 | tf.data.TFRecordDataset, 393 | cycle_length=self.params['interleave_cycle'], 394 | block_length=self.params['interleave_block'], 395 | num_parallel_calls=tf.data.experimental.AUTOTUNE, 396 | deterministic=False, 397 | ) 398 | 399 | def _get_record_names(self, split): 400 | if split == 'test': 401 | split = 'val' 402 | if self.params['use_cloud']: 403 | records = tf.io.gfile.glob(( 404 | self.params['gcs_records'].rstrip('/') 405 | + '/' 406 | + split 407 | + f'/{split}_*.tfrecord' 408 | )) 409 | else: 410 | records = tf.io.gfile.glob( 411 | self.records_dir + split + f'/{split}*.tfrecord' 412 | ) 413 | return records 414 | 415 | 416 | 417 | 418 | 419 | 420 | 421 | 422 | 423 | 424 | 425 | -------------------------------------------------------------------------------- /bbpose/ingestion/scripts/data.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | while getopts 'r:I:D:T:R:j:a:m:d:i:' OPTION; do 4 | case "${OPTION}" in 5 | r) 6 | use_records="${OPTARG}" ;; 7 | I) 8 | IMG_DIR="${OPTARG}" ;; 9 | D) 10 | DATA_DIR="${OPTARG}" ;; 11 | T) 12 | TFDS_DIR="${OPTARG}" ;; 13 | R) 14 | RECORDS_DIR="${OPTARG}" ;; 15 | j) 16 | data_url="${OPTARG}" ;; 17 | a) 18 | annotations_url="${OPTARG}" ;; 19 | m) 20 | detections_url="${OPTARG}" ;; 21 | d) 22 | dataset="${OPTARG}" ;; 23 | i) 24 | download_images="${OPTARG}" ;; 25 | esac 26 | done 27 | 28 | mkdir -p "${DATA_DIR}" "${TFDS_DIR}" 29 | 30 | if [ "${use_records}" == "True" ]; then 31 | mkdir -p "${RECORDS_DIR}" 32 | fi 33 | 34 | rm -rf "${DATA_DIR}"* "${TFDS_DIR}"* "${RECORDS_DIR}"* 35 | 36 | if [ "${dataset}" == "mpii" ]; then 37 | wget -q -O "${DATA_DIR}annotations.json" \ 38 | "${annotations_url}" 39 | 40 | wget -q -O "${DATA_DIR}detections.mat" \ 41 | "${detections_url}" 42 | 43 | if [ "${download_images}" == "True" ]; then 44 | wget -q -O "${IMG_DIR}{dataset}_images.tar.gz" \ 45 | "${data_url}" 46 | 47 | tar -xzvf 
"${IMG_DIR}{dataset}_images.tar.gz" \ 48 | -C "${IMG_DIR}" 49 | 50 | mv "${IMG_DIR}images/" "${IMG_DIR}${dataset}/" 51 | 52 | rm -rf "${IMG_DIR}{dataset}_images.tar.gz" 53 | fi 54 | fi 55 | 56 | 57 | 58 | 59 | 60 | 61 | -------------------------------------------------------------------------------- /bbpose/main.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import os 3 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' 4 | 5 | from ingestion.ingest import DataGenerator 6 | from test import test 7 | from train import train 8 | from utils import ( 9 | force_update, 10 | reload_modules, 11 | reset_default_params, 12 | set_media_directory, 13 | set_model_version, 14 | update_params, 15 | ) 16 | 17 | 18 | def str_to_bool(v): 19 | if isinstance(v, bool): 20 | return v 21 | elif v.lower() in ['true', 't', '1']: 22 | return True 23 | elif v.lower() in ['false', 'f', '0']: 24 | return False 25 | else: 26 | raise argparse.ArgumentTypeError('Boolean value expected.') 27 | 28 | def main(): 29 | parser = argparse.ArgumentParser( 30 | description="Human pose estimation model." 
31 | ) 32 | subparsers = parser.add_subparsers( 33 | dest='subparser_name', 34 | help='sub-command help', 35 | ) 36 | 37 | update_parser = subparsers.add_parser( 38 | 'update', 39 | help="updates the model parameters", 40 | ) 41 | update_parser.add_argument( 42 | '--version', type=int, required=True, 43 | help="model version number", 44 | ) 45 | update_parser.add_argument( 46 | '--img_dir', type=str, default='/PATH/TO/IMG/DIR', 47 | help="directory where image files can be found", 48 | ) 49 | update_parser.add_argument( 50 | '--dataset', type=str, default='mpii', 51 | help="currently, only 'mpii' is supported", 52 | ) 53 | update_parser.add_argument( 54 | '--use_records', type=str_to_bool, default=False, 55 | help="whether to generate and use tfrecords or not", 56 | ) 57 | update_parser.add_argument( 58 | '--use_cloud', type=str_to_bool, default=False, 59 | help="whether to retrieve data from a remote GCS location or not", 60 | ) 61 | update_parser.add_argument( 62 | '--img_size', type=int, default=256, 63 | help="one of 128, 192, 256, 320, 384, 448, or 512 (spatial length)", 64 | ) 65 | update_parser.add_argument( 66 | '--hm_size', type=int, default=64, 67 | help="one of 32, 48, 64, 80, 96, 112, or 128 (spatial length)", 68 | ) 69 | update_parser.add_argument( 70 | '--num_stacks', type=int, default=4, 71 | help="one of 2, 4, or 8 (number of hourglass stacks)", 72 | ) 73 | update_parser.add_argument( 74 | '--batch_per_replica', type=int, default=24, 75 | help="number of samples passed to each replica", 76 | ) 77 | update_parser.add_argument( 78 | '--download_images', type=str_to_bool, default=False, 79 | help="whether to download image dataset or not", 80 | ) 81 | update_parser.add_argument( 82 | '--toy_set', type=str_to_bool, default=False, 83 | help="whether to use a smaller dataset to experiment with or not", 84 | ) 85 | update_parser.add_argument( 86 | '--toy_samples', type=int, default=250, 87 | help="number of samples per toy dataset", 88 | ) 89 | 
update_parser.add_argument( 90 | '--examples_per_record', type=int, default=250, 91 | help="number of examples per tfrecord file", 92 | ) 93 | update_parser.add_argument( 94 | '--interleave_num', type=int, default=2, 95 | help="number of records to simultaneously interleave", 96 | ) 97 | update_parser.add_argument( 98 | '--interleave_block', type=int, default=1, 99 | help="number of consecutive elements from each record", 100 | ) 101 | update_parser.add_argument( 102 | '--gcs_data', type=str, default='', 103 | help="remote GCS location where tfrecords may be found", 104 | ) 105 | update_parser.add_argument( 106 | '--mean', type=float, nargs=3, 107 | default=[0.4624228775501251, 0.44416481256484985, 0.4025438725948334], 108 | help="mean values calculated (from dataset) for each channel (rgb)", 109 | ) 110 | update_parser.add_argument( 111 | '--sigma', type=int, default=1, 112 | help="std used for gaussian kernel", 113 | ) 114 | update_parser.add_argument( 115 | '--arch', type=str, default='softgate', 116 | help="one of 'softgate' or 'hourglass' (network architectures)") 117 | update_parser.add_argument( 118 | '--num_filters', type=int, default=144, 119 | help="any multiple of 8 (number of channels used at each layer)", 120 | ) 121 | update_parser.add_argument( 122 | '--initializer', type=str, default='glorot_uniform', 123 | help=( 124 | "one of 'glorot_normal', 'glorot_uniform', 'he_normal', or " 125 | "'he_uniform' (used in each layer)" 126 | ), 127 | ) 128 | update_parser.add_argument( 129 | '--momentum', type=float, default=0.9, 130 | help="momentum for moving average used in each batch norm layer", 131 | ) 132 | update_parser.add_argument( 133 | '--epsilon', type=float, default=0.001, 134 | help="value added to batch norm's variance to avoid division by zero", 135 | ) 136 | update_parser.add_argument( 137 | '--dropout_rate', type=float, default=0.2, 138 | help="dropout performed after each hourglass", 139 | ) 140 | update_parser.add_argument( 141 | '--is_eager', 
type=str_to_bool, default=False, 142 | help="whether to run tf.functions in eager mode or not", 143 | ) 144 | update_parser.add_argument( 145 | '--strategy', type=str, default='default', 146 | help="one of 'default', 'mirrored', or 'tpu' (distributed training)", 147 | ) 148 | update_parser.add_argument( 149 | '--tpu_address', type=str, default='', 150 | help="remote location of TPU device(s)" 151 | ) 152 | update_parser.add_argument( 153 | '--gcs_results', type=str, default='', 154 | help="remote GCS location where savedmodels may be found", 155 | ) 156 | update_parser.add_argument( 157 | '--mixed_precision', type=str_to_bool, default=False, 158 | help="whether to use both 16 and 32 bit floating-point types or not", 159 | ) 160 | update_parser.add_argument( 161 | '--num_epochs', type=int, default=200, 162 | help="number of epochs to train for", 163 | ) 164 | update_parser.add_argument( 165 | '--steps_per_execution', type=int, default=1, 166 | help="number of batches to run through each tf.function", 167 | ) 168 | update_parser.add_argument( 169 | '--track_every', type=int, default=32, 170 | help="how often to save current iteration/step for use with LR schedule", 171 | ) 172 | update_parser.add_argument( 173 | '--threshold', type=float, default=0.5, 174 | help="pck threshold", 175 | ) 176 | update_parser.add_argument( 177 | '--decay_epochs', type=int, nargs='*', default=[75, 100, 150], 178 | help="epochs at which to apply the decay factor", 179 | ) 180 | update_parser.add_argument( 181 | '--decay_factor', type=float, default=0.2, 182 | help="factor to decay the learning rate by", 183 | ) 184 | update_parser.add_argument( 185 | '--learning_rate', type=float, default=0.00025, 186 | help="initial learning rate", 187 | ) 188 | update_parser.add_argument( 189 | '--scale', type=float, default=0.0, 190 | help="learning rate scaler used in distributed training", 191 | ) 192 | update_parser.add_argument( 193 | '--schedule_per_step', type=str_to_bool, default=False, 194 | 
help="whether to use a per step schedule (exponential decay) or not", 195 | ) 196 | update_parser.add_argument( 197 | '--decay_rate', type=float, default=0.96, 198 | help="decay rate used in exponential schedule", 199 | ) 200 | update_parser.add_argument( 201 | '--decay_steps', type=int, default=2000, 202 | help="decay frequency used in exponential schedule", 203 | ) 204 | 205 | force_parser = subparsers.add_parser( 206 | 'force', 207 | help="updates hidden params", 208 | ) 209 | force_parser.add_argument( 210 | '--train_size', type=int, default=0, 211 | help="number of elements in the train dataset", 212 | ) 213 | force_parser.add_argument( 214 | '--val_size', type=int, default=0, 215 | help="number of elements in the validation dataset", 216 | ) 217 | 218 | reset_parser = subparsers.add_parser( 219 | 'reset', 220 | help="resets params to default values", 221 | ) 222 | reset_parser.add_argument( 223 | '--defaults', type=str, default='bulat', 224 | help="one of 'bulat' or 'newell' (set of default params)", 225 | ) 226 | 227 | ingest_parser = subparsers.add_parser( 228 | 'ingest', 229 | help="retrieves data and generates datasets from scratch", 230 | ) 231 | ingest_parser.add_argument( 232 | '--setup', action='store_true', 233 | help="whether to run setup script or not", 234 | ) 235 | ingest_parser.add_argument( 236 | '--generate', action='store_true', 237 | help="whether to generate the datasets or not" 238 | ) 239 | 240 | train_parser = subparsers.add_parser( 241 | 'train', 242 | help="trains the model", 243 | ) 244 | train_parser.add_argument( 245 | '--restore', action='store_true', 246 | help="whether to restore a saved model or not", 247 | ) 248 | 249 | test_parser = subparsers.add_parser( 250 | 'test', 251 | help="evaluates a trained model", 252 | ) 253 | 254 | args = vars(parser.parse_args()) 255 | 256 | if args['subparser_name'] == 'reset': 257 | reset_default_params(args['defaults']) 258 | print('Params reset to default values.') 259 | 260 | elif 
args['subparser_name'] == 'force': 261 | args.pop('subparser_name') 262 | force_update(args) 263 | 264 | elif args['subparser_name'] == 'update': 265 | args.pop('subparser_name') 266 | set_model_version(args.pop('version')) 267 | set_media_directory(args.pop('img_dir')) 268 | update_params(args) 269 | print('Update complete.') 270 | 271 | elif args['subparser_name'] == 'ingest': 272 | if args['setup']: 273 | generator = DataGenerator(setup=True) 274 | else: 275 | generator = DataGenerator() 276 | if args['generate']: 277 | generator.generate() 278 | else: 279 | pass 280 | 281 | elif args['subparser_name'] == 'train': 282 | reload_modules(train) 283 | train.run(restore=args['restore']) 284 | 285 | elif args['subparser_name'] == 'test': 286 | reload_modules(test) 287 | test.run() 288 | 289 | if __name__ == '__main__': 290 | main() 291 | 292 | 293 | 294 | 295 | 296 | 297 | 298 | 299 | 300 | 301 | 302 | 303 | 304 | 305 | 306 | 307 | 308 | 309 | 310 | 311 | 312 | 313 | -------------------------------------------------------------------------------- /bbpose/models/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/salinasJJ/BBpose/4be635ead0f99b1788160ca0b1c7b3947ba05526/bbpose/models/__init__.py -------------------------------------------------------------------------------- /bbpose/models/hourglass.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | from tensorflow.keras import layers 3 | 4 | from utils import get_frozen_params, get_params 5 | 6 | 7 | PARAMS = get_params('Models') 8 | if PARAMS['switch']: 9 | PARAMS = get_frozen_params( 10 | 'Models', 11 | version=PARAMS['version'], 12 | ) 13 | 14 | 15 | def residual_block(inputs, filters): 16 | x = layers.BatchNormalization( 17 | momentum=PARAMS['momentum'], 18 | epsilon=PARAMS['epsilon'], 19 | scale=False, 20 | )(inputs) 21 | x = layers.ReLU()(x) 22 | 23 | x = layers.Conv2D( 24 
| filters=int(filters/2), 25 | kernel_size=(1,1), 26 | strides=(1,1), 27 | kernel_initializer=PARAMS['initializer'], 28 | )(x) 29 | x = layers.BatchNormalization( 30 | momentum=PARAMS['momentum'], 31 | epsilon=PARAMS['epsilon'], 32 | scale=False, 33 | )(x) 34 | x = layers.ReLU()(x) 35 | 36 | x = layers.Conv2D( 37 | filters=int(filters/2), 38 | kernel_size=(3,3), 39 | strides=(1,1), 40 | padding='same', 41 | kernel_initializer=PARAMS['initializer'], 42 | )(x) 43 | x = layers.BatchNormalization( 44 | momentum=PARAMS['momentum'], 45 | epsilon=PARAMS['epsilon'], 46 | scale=False, 47 | )(x) 48 | x = layers.ReLU()(x) 49 | 50 | x = layers.Conv2D( 51 | filters=int(filters), 52 | kernel_size=(1,1), 53 | strides=(1,1), 54 | kernel_initializer=PARAMS['initializer'], 55 | )(x) 56 | # skip connection 57 | if inputs.shape[3] == x.shape[3]: 58 | skip = inputs 59 | else: 60 | skip = layers.Conv2D( 61 | filters=int(filters), 62 | kernel_size=(1,1), 63 | strides=(1,1), 64 | kernel_initializer=PARAMS['initializer'], 65 | )(inputs) 66 | return tf.math.add_n([x, skip]) 67 | 68 | def encoder_to_decoder(inputs, depth): 69 | filters = tf.cast(PARAMS['num_filters'], tf.int32) 70 | 71 | skip = residual_block(inputs, filters) 72 | x = layers.MaxPool2D(pool_size=(2,2))(inputs) 73 | x = residual_block(x, filters) 74 | if depth > 1: 75 | x = encoder_to_decoder(x, depth-1) 76 | else: 77 | x = residual_block(x, filters) 78 | x = residual_block(x, filters) 79 | x = layers.UpSampling2D(size=(2,2))(x) 80 | return layers.Add()([skip, x]) 81 | 82 | 83 | 84 | -------------------------------------------------------------------------------- /bbpose/models/model.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | from tensorflow.keras import layers 3 | 4 | from models import hourglass, softgate 5 | from utils import get_frozen_params, get_params, reload_modules 6 | 7 | 8 | PARAMS = get_params('Models') 9 | if PARAMS['switch']: 10 | PARAMS = 
get_frozen_params( 11 | 'Models', 12 | version=PARAMS['version'], 13 | ) 14 | 15 | NUM_JOINTS = 16 16 | DEPTH = 4 17 | 18 | 19 | def block(inputs, filters): 20 | if PARAMS['arch'] == 'hourglass': 21 | return hourglass.residual_block(inputs, filters) 22 | elif PARAMS['arch'] == 'softgate': 23 | return softgate.skip_block(inputs, filters) 24 | 25 | def bottleneck(inputs, final=tf.constant(0)): 26 | if PARAMS['arch'] == 'hourglass': 27 | x = hourglass.encoder_to_decoder(inputs, tf.constant(DEPTH)) 28 | elif PARAMS['arch'] == 'softgate': 29 | e_out, skip1, skip2, skip3, skip4 = softgate.encoder(inputs) 30 | x = softgate.decoder(e_out, skip1, skip2, skip3, skip4) 31 | 32 | x = layers.Dropout(rate=PARAMS['dropout_rate'])(x) 33 | x = block(x, tf.cast(PARAMS['num_filters'], tf.int32)) 34 | 35 | x = layers.Conv2D( 36 | filters=PARAMS['num_filters'], 37 | kernel_size=(1,1), 38 | strides=(1,1), 39 | kernel_initializer=PARAMS['initializer'], 40 | )(x) 41 | x = layers.BatchNormalization( 42 | momentum=PARAMS['momentum'], 43 | epsilon=PARAMS['epsilon'], 44 | scale=False, 45 | )(x) 46 | x = layers.ReLU()(x) 47 | 48 | scores = layers.Conv2D( 49 | filters=NUM_JOINTS, 50 | kernel_size=(1,1), 51 | strides=(1,1), 52 | kernel_initializer=PARAMS['initializer'], 53 | )(x) 54 | if final == 0: 55 | x_lower = layers.Conv2D( 56 | filters=PARAMS['num_filters'], 57 | kernel_size=(1,1), 58 | strides=(1,1), 59 | kernel_initializer=PARAMS['initializer'], 60 | )(scores) 61 | x_upper = layers.Conv2D( 62 | filters=PARAMS['num_filters'], 63 | kernel_size=(1,1), 64 | strides=(1,1), 65 | kernel_initializer=PARAMS['initializer'], 66 | )(x) 67 | x = layers.Add()([ 68 | x_lower, 69 | x_upper, 70 | inputs, 71 | ]) 72 | return scores, x 73 | else: 74 | return scores 75 | 76 | def network(inputs): 77 | x = layers.ZeroPadding2D(padding=(3,3))(inputs) 78 | x = layers.Conv2D( 79 | filters=PARAMS['num_filters']//4, 80 | kernel_size=(7,7), 81 | strides=(2,2), 82 | kernel_initializer=PARAMS['initializer'], 83 | 
)(x) 84 | x = layers.BatchNormalization( 85 | momentum=PARAMS['momentum'], 86 | epsilon=PARAMS['epsilon'], 87 | scale=False, 88 | )(x) 89 | x = layers.ReLU()(x) 90 | 91 | x = block(x, tf.cast(PARAMS['num_filters'] / 2, tf.int32)) 92 | x = layers.MaxPool2D( 93 | pool_size=(2,2), 94 | strides=(2,2), 95 | )(x) 96 | x = block(x, tf.cast(PARAMS['num_filters'], tf.int32)) 97 | x = block(x, tf.cast(PARAMS['num_filters'],tf.int32)) 98 | 99 | hm1, x = bottleneck(x) 100 | if PARAMS['num_stacks'] == 8: 101 | hm2, x = bottleneck(x) 102 | hm3, x = bottleneck(x) 103 | hm4, x = bottleneck(x) 104 | hm5, x = bottleneck(x) 105 | hm6, x = bottleneck(x) 106 | hm7, x = bottleneck(x) 107 | hm8 = bottleneck(x, final=tf.constant(1)) 108 | return tf.stack( 109 | [hm1, hm2, hm3, hm4, hm5, hm6, hm7, hm8], 110 | axis=1, 111 | ) 112 | elif PARAMS['num_stacks'] == 4: 113 | hm2, x = bottleneck(x) 114 | hm3, x = bottleneck(x) 115 | hm4 = bottleneck(x, final=tf.constant(1)) 116 | return tf.stack( 117 | [hm1, hm2, hm3, hm4], 118 | axis=1, 119 | ) 120 | else: 121 | hm2 = bottleneck(x, final=tf.constant(1)) 122 | return tf.stack( 123 | [hm1, hm2], 124 | axis=1, 125 | ) 126 | 127 | def get_model(): 128 | if PARAMS['arch'] == 'hourglass': 129 | reload_modules(hourglass) 130 | elif PARAMS['arch'] == 'softgate': 131 | reload_modules(softgate) 132 | 133 | inputs = tf.keras.layers.Input([ 134 | PARAMS['img_size'], 135 | PARAMS['img_size'], 136 | 3, 137 | ]) 138 | outputs = network(inputs) 139 | outputs = layers.Activation( 140 | 'linear', 141 | dtype='float32', 142 | )(outputs) 143 | return tf.keras.Model(inputs, outputs) 144 | 145 | 146 | 147 | 148 | 149 | 150 | 151 | -------------------------------------------------------------------------------- /bbpose/models/softgate.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | from tensorflow.keras import layers 3 | 4 | from utils import get_frozen_params, get_params 5 | 6 | 7 | PARAMS = 
get_params('Models') 8 | if PARAMS['switch']: 9 | PARAMS = get_frozen_params( 10 | 'Models', 11 | version=PARAMS['version'], 12 | ) 13 | 14 | 15 | def skip_block(inputs, filters): 16 | x1 = layers.BatchNormalization( 17 | momentum=PARAMS['momentum'], 18 | epsilon=PARAMS['epsilon'], 19 | scale=False, 20 | )(inputs) 21 | x1 = layers.ReLU()(x1) 22 | x1 = layers.Conv2D( 23 | filters=int(filters/2), 24 | kernel_size=(3,3), 25 | strides=(1,1), 26 | padding='same', 27 | kernel_initializer=PARAMS['initializer'], 28 | )(x1) 29 | 30 | x2 = layers.BatchNormalization( 31 | momentum=PARAMS['momentum'], 32 | epsilon=PARAMS['epsilon'], 33 | scale=False, 34 | )(x1) 35 | x2 = layers.ReLU()(x2) 36 | x2 = layers.Conv2D( 37 | filters=int(filters/4), 38 | kernel_size=(3,3), 39 | strides=(1,1), 40 | padding='same', 41 | kernel_initializer=PARAMS['initializer'], 42 | )(x2) 43 | 44 | x3 = layers.BatchNormalization( 45 | momentum=PARAMS['momentum'], 46 | epsilon=PARAMS['epsilon'], 47 | scale=False, 48 | )(x2) 49 | x3 = layers.ReLU()(x3) 50 | x3 = layers.Conv2D( 51 | filters=int(filters/4), 52 | kernel_size=(3,3), 53 | strides=(1,1), 54 | padding='same', 55 | kernel_initializer=PARAMS['initializer'], 56 | )(x3) 57 | x = layers.concatenate( 58 | [x1, x2, x3], 59 | axis=3, 60 | ) 61 | 62 | if inputs.shape[3] == x.shape[3]: 63 | skip = layers.DepthwiseConv2D( 64 | kernel_size=1, 65 | strides=(1,1), 66 | depthwise_initializer=PARAMS['initializer'], 67 | )(inputs) 68 | else: 69 | skip = layers.Conv2D( 70 | filters=int(filters), 71 | kernel_size=(1,1), 72 | strides=(1,1), 73 | kernel_initializer=PARAMS['initializer'], 74 | )(inputs) 75 | return tf.math.add_n([x, skip]) 76 | 77 | def encoder(inputs): 78 | skip1 = skip_block(inputs, filters=inputs.shape[3]) 79 | e1 = layers.MaxPool2D(pool_size=(2,2))(inputs) 80 | e1 = skip_block(e1, filters=e1.shape[3]) 81 | 82 | skip2 = skip_block(e1, filters=e1.shape[3]) 83 | e2 = layers.MaxPool2D(pool_size=(2,2))(e1) 84 | e2 = skip_block(e2, 
filters=e2.shape[3]//2) 85 | 86 | skip3 = skip_block(e2, filters=e2.shape[3]) 87 | e3 = layers.MaxPool2D(pool_size=(2,2))(e2) 88 | e3 = skip_block(e3, filters=e3.shape[3]) 89 | 90 | skip4 = skip_block(e3, filters=e3.shape[3]) 91 | e4 = layers.MaxPool2D(pool_size=(2,2))(e3) 92 | e4 = skip_block(e4, filters=e4.shape[3]) 93 | 94 | return e4, skip1, skip2, skip3, skip4 95 | 96 | def decoder(e_out, skip1, skip2, skip3, skip4): 97 | d4 = layers.UpSampling2D(size=(2,2))(e_out) 98 | d4 = layers.concatenate( 99 | [skip4, d4], 100 | axis=3, 101 | ) 102 | d4 = layers.Conv2D( 103 | filters=int(d4.shape[3]/2), 104 | kernel_size=(3,3), 105 | strides=(1,1), 106 | padding='same', 107 | kernel_initializer=PARAMS['initializer'], 108 | )(d4) 109 | d4 = skip_block(d4, filters=d4.shape[3]) 110 | 111 | d3 = layers.UpSampling2D(size=(2,2))(d4) 112 | d3 = layers.concatenate( 113 | [skip3, d3], 114 | axis=3, 115 | ) 116 | d3 = layers.Conv2D( 117 | filters=int(d3.shape[3]/2), 118 | kernel_size=(3,3), 119 | strides=(1,1), 120 | padding='same', 121 | kernel_initializer=PARAMS['initializer'], 122 | )(d3) 123 | d3 = skip_block(d3, filters=d3.shape[3]*2) 124 | 125 | d2 = layers.UpSampling2D(size=(2,2))(d3) 126 | d2 = layers.concatenate( 127 | [skip2, d2], 128 | axis=3, 129 | ) 130 | d2 = layers.Conv2D( 131 | filters=int(d2.shape[3]/2), 132 | kernel_size=(3,3), 133 | strides=(1,1), 134 | padding='same', 135 | kernel_initializer=PARAMS['initializer'], 136 | )(d2) 137 | d2 = skip_block(d2, filters=d2.shape[3]) 138 | 139 | d1 = layers.UpSampling2D(size=(2,2))(d2) 140 | d1 = layers.concatenate( 141 | [skip1, d1], 142 | axis=3, 143 | ) 144 | d1 = layers.Conv2D( 145 | filters=int(d1.shape[3]/2), 146 | kernel_size=(3,3), 147 | strides=(1,1), 148 | padding='same', 149 | kernel_initializer=PARAMS['initializer'], 150 | )(d1) 151 | return skip_block(d1, filters=d1.shape[3]) 152 | 153 | 154 | -------------------------------------------------------------------------------- /bbpose/preprocess/__init__.py: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/salinasJJ/BBpose/4be635ead0f99b1788160ca0b1c7b3947ba05526/bbpose/preprocess/__init__.py -------------------------------------------------------------------------------- /bbpose/preprocess/preprocess.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | from tensorflow.keras.layers.experimental.preprocessing import Rescaling 3 | import tensorflow_addons as tfa 4 | 5 | from preprocess.utils import coord_transform 6 | from utils import get_frozen_params, get_params, update_config 7 | 8 | 9 | NUM_JOINTS = 16 10 | PAIRED_JOINTS = [ 11 | [0,5], [1,4], [2,3], [10,15], [11,14], [12,13] 12 | ] 13 | SINGLE_JOINTS = [6, 7, 8, 9] 14 | PI = 3.14159265 15 | 16 | 17 | class DataPreprocessor(): 18 | def __init__(self): 19 | params = get_params('Preprocess') 20 | if params['switch']: 21 | self.params = get_frozen_params( 22 | 'Preprocess', 23 | version=params['version'], 24 | ) 25 | else: 26 | self.params = params 27 | 28 | self.batch_size = tf.cast( 29 | self.params['batch_per_replica'] * self.params['num_replicas'], 30 | tf.int64, 31 | ) 32 | self.random_generator = tf.random.Generator.from_seed(1) 33 | 34 | def read_records(self, train_records, val_records): 35 | train_record_dataset = self._map_records(train_records) 36 | val_record_dataset = self._map_records(val_records) 37 | return train_record_dataset, val_record_dataset 38 | 39 | def read_test_records(self, test_records): 40 | return self._map_records(test_records) 41 | 42 | def _map_records(self, dataset): 43 | ds = dataset.map( 44 | self._parse_example, 45 | num_parallel_calls=tf.data.experimental.AUTOTUNE, 46 | deterministic=False, 47 | ) 48 | ds = ds.map( 49 | self._parse_features, 50 | num_parallel_calls=tf.data.experimental.AUTOTUNE, 51 | deterministic=False, 52 | ) 53 | return ds 54 | 55 | def _parse_example(self, example): 56 | 
feature_description = { 57 | 'raw_image':tf.io.FixedLenFeature([], tf.string), 58 | 'joints':tf.io.FixedLenFeature([], tf.string), 59 | 'weights':tf.io.FixedLenFeature([], tf.string), 60 | 'center':tf.io.FixedLenFeature([], tf.string), 61 | 'scale':tf.io.FixedLenFeature([], tf.float32) 62 | } 63 | features = tf.io.parse_single_example(example, feature_description) 64 | return features 65 | 66 | def _parse_features(self, features): 67 | raw_image = features['raw_image'] 68 | joints = tf.io.parse_tensor(features['joints'], tf.float32) 69 | weights = tf.io.parse_tensor(features['weights'], tf.float32) 70 | center = tf.io.parse_tensor(features['center'], tf.float32) 71 | scale = tf.cast(features['scale'], tf.float32) 72 | return raw_image, joints, weights, center, scale 73 | 74 | def get_datasets(self, train_table, val_table): 75 | train_dataset = self._train_process(train_table) 76 | validation_dataset = self._validation_process(val_table) 77 | return train_dataset, validation_dataset 78 | 79 | def get_test_dataset(self, test_table): 80 | return self._test_process(test_table) 81 | 82 | def _train_process(self, dataset): 83 | if self.params['use_records']: 84 | buffer_size = self.params['train_size'] 85 | else: 86 | buffer_size = dataset.cardinality() 87 | 88 | ds = dataset.shuffle( 89 | tf.cast(buffer_size + 1, tf.int64) 90 | ) 91 | ds = ds.map( 92 | self._parsing_function, 93 | num_parallel_calls=tf.data.experimental.AUTOTUNE, 94 | deterministic=False, 95 | ) 96 | ds = ds.map( 97 | self._augmentation_function, 98 | num_parallel_calls=tf.data.experimental.AUTOTUNE, 99 | deterministic=False, 100 | ) 101 | ds = ds.map( 102 | self._generating_function, 103 | num_parallel_calls=tf.data.experimental.AUTOTUNE, 104 | deterministic=False, 105 | ) 106 | ds = ds.map( 107 | lambda im, hm, wt, c, s: (im, hm, wt), 108 | num_parallel_calls=tf.data.experimental.AUTOTUNE, 109 | deterministic=False, 110 | ) 111 | ds = ds.map( 112 | lambda im, hm, wt: (Rescaling(scale=1./255)(im), hm, 
wt), 113 | num_parallel_calls=tf.data.experimental.AUTOTUNE, 114 | deterministic=False, 115 | ) 116 | ds = ds.map( 117 | lambda im, hm, wt: (im - self.params['mean'], hm, wt), 118 | num_parallel_calls=tf.data.experimental.AUTOTUNE, 119 | deterministic=False, 120 | ) 121 | ds = ds.map( 122 | self._expansion_function, 123 | num_parallel_calls=tf.data.experimental.AUTOTUNE, 124 | deterministic=False, 125 | ) 126 | ds = ds.batch( 127 | self.batch_size, 128 | drop_remainder=False if self.params['use_records'] else True, 129 | ) 130 | ds = ds.prefetch(1) 131 | return ds 132 | 133 | def _validation_process(self, dataset): 134 | if self.params['use_records']: 135 | buffer_size = self.params['val_size'] 136 | else: 137 | buffer_size = dataset.cardinality() 138 | 139 | ds = dataset.shuffle( 140 | tf.cast(buffer_size + 1, tf.int64) 141 | ) 142 | ds = ds.map( 143 | self._parsing_function, 144 | num_parallel_calls=tf.data.experimental.AUTOTUNE, 145 | deterministic=False, 146 | ) 147 | ds = ds.map( 148 | lambda im, j, wt, c, s: (im, j, wt, c, s, tf.zeros([])), 149 | num_parallel_calls=tf.data.experimental.AUTOTUNE, 150 | deterministic=False, 151 | ) 152 | ds = ds.map( 153 | self._generating_function, 154 | num_parallel_calls=tf.data.experimental.AUTOTUNE, 155 | deterministic=False, 156 | ) 157 | ds = ds.map( 158 | lambda im, hm, wt, c, s: (im, hm, wt), 159 | num_parallel_calls=tf.data.experimental.AUTOTUNE, 160 | deterministic=False, 161 | ) 162 | ds = ds.map( 163 | lambda im, hm, wt: (Rescaling(scale=1./255)(im), hm, wt), 164 | num_parallel_calls=tf.data.experimental.AUTOTUNE, 165 | deterministic=False, 166 | ) 167 | ds = ds.map( 168 | lambda im, hm, wt: (im - self.params['mean'], hm, wt), 169 | num_parallel_calls=tf.data.experimental.AUTOTUNE, 170 | deterministic=False, 171 | ) 172 | ds = ds.map( 173 | self._expansion_function, 174 | num_parallel_calls=tf.data.experimental.AUTOTUNE, 175 | deterministic=False, 176 | ) 177 | ds = ds.batch( 178 | self.batch_size, 179 | 
drop_remainder=False if self.params['use_records'] else True, 180 | ) 181 | ds = ds.prefetch(1) 182 | return ds 183 | 184 | def _test_process(self, dataset): 185 | ds = dataset.map( 186 | self._parsing_function, 187 | num_parallel_calls=tf.data.experimental.AUTOTUNE, 188 | deterministic=True, 189 | ) 190 | ds = ds.map( 191 | lambda im, j, wt, c, s: (im, j, wt, c, s, tf.zeros([])), 192 | num_parallel_calls=tf.data.experimental.AUTOTUNE, 193 | deterministic=True, 194 | ) 195 | ds = ds.map( 196 | self._generating_function, 197 | num_parallel_calls=tf.data.experimental.AUTOTUNE, 198 | deterministic=True, 199 | ) 200 | ds = ds.map( 201 | lambda im, hm, wt, c, s: (im, c, s), 202 | num_parallel_calls=tf.data.experimental.AUTOTUNE, 203 | deterministic=True, 204 | ) 205 | ds = ds.map( 206 | lambda im, c, s: (Rescaling(scale=1./255)(im), c, s), 207 | num_parallel_calls=tf.data.experimental.AUTOTUNE, 208 | deterministic=True, 209 | ) 210 | ds = ds.map( 211 | lambda im, c, s: (im - self.params['mean'], c, s), 212 | num_parallel_calls=tf.data.experimental.AUTOTUNE, 213 | deterministic=True, 214 | ) 215 | ds = ds.batch(self.batch_size) 216 | ds = ds.prefetch(1) 217 | return ds 218 | 219 | @tf.function 220 | def _parsing_function( 221 | self, 222 | source, 223 | joints, 224 | weights, 225 | center, 226 | scale, 227 | ): 228 | image = self._get_image(source) 229 | return image, joints, weights, center, scale 230 | 231 | def _get_image(self, source): 232 | if self.params['use_records']: 233 | image = source 234 | else: 235 | source = tf.strings.substr( 236 | source, 237 | pos=0, 238 | len=13, 239 | ) 240 | path = tf.strings.join([ 241 | self.params['img_dir'], 242 | source, 243 | ]) 244 | image = tf.io.read_file(path) 245 | return tf.io.decode_jpeg(image) 246 | 247 | @tf.function 248 | def _augmentation_function( 249 | self, 250 | image, 251 | joints, 252 | weights, 253 | center, 254 | scale, 255 | ): 256 | image, joints, weights, center = self._random_flipping( 257 | image, 258 | 
joints, 259 | weights, 260 | center, 261 | ) 262 | image = self._random_color_jittering(image) 263 | image = tf.clip_by_value( 264 | image, 265 | clip_value_min=0, 266 | clip_value_max=255, 267 | ) 268 | scale = self._random_scaling(scale) 269 | rotation = self._random_rotating() 270 | 271 | return image, joints, weights, center, scale, rotation 272 | 273 | def _random_flipping( 274 | self, 275 | image, 276 | joints, 277 | weights, 278 | center, 279 | ): 280 | random_chance = self.random_generator.uniform( 281 | shape=[], 282 | minval=0, 283 | maxval=2, 284 | dtype=tf.int32, 285 | ) 286 | if random_chance == 0: 287 | image = tf.image.flip_left_right(image) 288 | width = tf.cast(tf.shape(image)[1], tf.float32) 289 | joint_weights = tf.TensorArray( 290 | tf.float32, 291 | size=NUM_JOINTS, 292 | dynamic_size=False, 293 | clear_after_read=False, 294 | ) 295 | for j in tf.range(NUM_JOINTS): 296 | joint_weights = joint_weights.write( 297 | index=j, 298 | value=[ 299 | width - joints[j,0], 300 | joints[j,1], 301 | weights[j], 302 | ], 303 | ) 304 | paired_joints = tf.constant(PAIRED_JOINTS) 305 | single_joints = tf.constant(SINGLE_JOINTS) 306 | 307 | joint_weights_flipped = tf.TensorArray( 308 | tf.float32, 309 | size=NUM_JOINTS, 310 | dynamic_size=False, 311 | ) 312 | for sj in single_joints: 313 | joint_weights_flipped = joint_weights_flipped.write( 314 | index=sj, 315 | value=joint_weights.read(sj), 316 | ) 317 | for pj in paired_joints: 318 | temp = joint_weights.read(pj[0]) 319 | joint_weights_flipped = joint_weights_flipped.write( 320 | index=pj[0], 321 | value=joint_weights.read(pj[1]), 322 | ) 323 | joint_weights_flipped = joint_weights_flipped.write( 324 | index=pj[1], 325 | value=temp, 326 | ) 327 | joint_weights_flipped = joint_weights_flipped.stack() 328 | joints = joint_weights_flipped[:,:2] 329 | weights = joint_weights_flipped[:,2] 330 | 331 | center_flipped = tf.TensorArray( 332 | tf.float32, 333 | size=1, 334 | dynamic_size=False, 335 | ) 336 | 
center_flipped = center_flipped.write( 337 | index=0, 338 | value=[ 339 | width - center[0], 340 | center[1] 341 | ], 342 | ) 343 | center = tf.squeeze(center_flipped.stack()) 344 | return image, joints, weights, center 345 | else: 346 | return image, joints, weights, center 347 | 348 | def _random_color_jittering(self, image): 349 | image = tf.image.random_brightness(image, max_delta=0.2) 350 | image = tf.image.random_contrast( 351 | image, 352 | lower=0.5, 353 | upper=1.5, 354 | ) 355 | # Option 2: Remove random contrast. 356 | return image 357 | 358 | def _random_scaling(self, scale): 359 | random_number = self.random_generator.normal([1]) 360 | scale = ( 361 | scale * 362 | tf.clip_by_value( 363 | 0.25 * random_number + 1, 364 | clip_value_min=0.75, 365 | clip_value_max=1.25, 366 | ) 367 | ) 368 | return scale[0] 369 | 370 | def _random_rotating(self): 371 | random_chance = self.random_generator.uniform( 372 | shape=[], 373 | minval=0, 374 | maxval=3, 375 | dtype=tf.int32, 376 | ) 377 | random_number = self.random_generator.normal([1]) 378 | 379 | if random_chance == 0: 380 | rotation = tf.abs(tf.clip_by_value( 381 | random_number * 15, 382 | clip_value_min=-30, 383 | clip_value_max=30, 384 | )) 385 | elif random_chance == 1: 386 | rotation = ( 387 | 360.0 - 388 | tf.abs(tf.clip_by_value( 389 | random_number * 15, 390 | clip_value_min=-30, 391 | clip_value_max=30, 392 | )) 393 | ) 394 | else: 395 | rotation = tf.zeros([1]) 396 | # Option 2: Change (random_number * 15) to (random_number * 30). 
397 | return rotation[0] 398 | 399 | @tf.function 400 | def _generating_function( 401 | self, 402 | image, 403 | joints, 404 | weights, 405 | center, 406 | scale, 407 | rotation, 408 | ): 409 | image = self._get_cropped_image( 410 | image, 411 | center, 412 | scale, 413 | rotation, 414 | ) 415 | heatmap, weights = self._get_heatmap( 416 | joints, 417 | weights, 418 | center, 419 | scale, 420 | rotation, 421 | ) 422 | image = tf.reshape( 423 | image, 424 | shape=[ 425 | self.params['img_size'], 426 | self.params['img_size'], 427 | 3, 428 | ], 429 | ) 430 | heatmap = tf.reshape( 431 | heatmap, 432 | shape=[ 433 | self.params['hm_size'], 434 | self.params['hm_size'], 435 | NUM_JOINTS, 436 | ], 437 | ) 438 | return image, heatmap, weights, center, scale 439 | 440 | def _get_cropped_image( 441 | self, 442 | image, 443 | center, 444 | scale, 445 | rotation, 446 | ): 447 | height = tf.cast(tf.shape(image)[0], tf.float32) 448 | width = tf.cast(tf.shape(image)[1], tf.float32) 449 | sf = scale * 200.0 / tf.cast(self.params['img_size'], tf.float32) 450 | if sf < 2: 451 | return self._crop_image( 452 | image, 453 | center, 454 | scale, 455 | rotation, 456 | ) 457 | else: 458 | max_size = tf.cast( 459 | tf.floor(tf.maximum(height, width) / sf), 460 | tf.int32, 461 | ) 462 | new_height = tf.cast(tf.floor(height / sf), tf.int32) 463 | new_width = tf.cast(tf.floor(width / sf), tf.int32) 464 | if max_size < 2: 465 | return tf.zeros([ 466 | self.params['img_size'], 467 | self.params['img_size'], 468 | tf.shape(image)[2], 469 | ]) 470 | else: 471 | image = tf.image.resize(image, size=[new_height, new_width]) 472 | center = center / sf 473 | scale = scale / sf 474 | return self._crop_image( 475 | image, 476 | center, 477 | scale, 478 | rotation, 479 | ) 480 | 481 | def _crop_image( 482 | self, 483 | image, 484 | center, 485 | scale, 486 | rotation, 487 | ): 488 | ul_coords = coord_transform( 489 | tf.zeros([2]), 490 | center, 491 | scale, 492 | rotation=tf.zeros([]), 493 | 
invert=tf.ones([], tf.int32), 494 | size=tf.cast(self.params['img_size'], tf.float32), 495 | ) 496 | br_coords = coord_transform( 497 | tf.cast( 498 | [self.params['img_size'], self.params['img_size']], 499 | tf.float32, 500 | ), 501 | center, 502 | scale, 503 | rotation=tf.zeros([]), 504 | invert=tf.ones([], tf.int32), 505 | size=tf.cast(self.params['img_size'], tf.float32), 506 | ) 507 | pad = tf.norm( 508 | tf.cast(br_coords - ul_coords, tf.float32) 509 | ) 510 | pad = pad / 2 511 | pad = pad - tf.cast(br_coords[1] - ul_coords[1], tf.float32) / 2 512 | pad = tf.cast(pad, tf.int32) 513 | 514 | if rotation != 0.: 515 | ul_coords = ul_coords - pad 516 | br_coords = br_coords + pad 517 | 518 | x_min = tf.maximum(0, ul_coords[0]) 519 | x_max = tf.minimum(tf.shape(image)[1], br_coords[0]) 520 | y_min = tf.maximum(0, ul_coords[1]) 521 | y_max = tf.minimum(tf.shape(image)[0], br_coords[1]) 522 | x_min_margin = tf.maximum(0, -ul_coords[0]) 523 | x_max_margin = ( 524 | tf.minimum(br_coords[0], tf.shape(image)[1]) 525 | - ul_coords[0] 526 | ) 527 | y_min_margin = tf.maximum(0, -ul_coords[1]) 528 | y_max_margin = ( 529 | tf.minimum(br_coords[1], tf.shape(image)[0]) 530 | - ul_coords[1] 531 | ) 532 | 533 | if x_max_margin < x_min_margin: 534 | temp = x_max_margin 535 | x_max_margin = x_min_margin 536 | x_min_margin = temp 537 | temp = x_min 538 | x_min = x_max 539 | x_max = temp 540 | 541 | top = y_min_margin 542 | bottom = (br_coords[1] - ul_coords[1]) - y_max_margin 543 | left = x_min_margin 544 | right = (br_coords[0] - ul_coords[0]) - x_max_margin 545 | 546 | cropped_image = image[y_min:y_max, x_min:x_max] 547 | cropped_image = tf.pad( 548 | cropped_image, 549 | paddings=[ 550 | [top, bottom], 551 | [left, right], 552 | [0, 0], 553 | ], 554 | ) 555 | if rotation != 0.: 556 | cropped_image = tfa.image.rotate( 557 | cropped_image, 558 | angles=rotation * (PI / 180), 559 | ) 560 | cropped_image = cropped_image[pad:-pad, pad:-pad] 561 | return tf.image.resize( 562 | 
cropped_image, 563 | size=[ 564 | self.params['img_size'], 565 | self.params['img_size'], 566 | ], 567 | ) 568 | 569 | def _get_heatmap( 570 | self, 571 | joints, 572 | weights, 573 | center, 574 | scale, 575 | rotation, 576 | ): 577 | hm_size = tf.cast(self.params['hm_size'], tf.float32) 578 | hm = tf.TensorArray( 579 | tf.float32, 580 | size=NUM_JOINTS, 581 | dynamic_size=False, 582 | ) 583 | w = tf.TensorArray( 584 | tf.float32, 585 | size=NUM_JOINTS, 586 | dynamic_size=False, 587 | ) 588 | for j in tf.range(NUM_JOINTS): 589 | if joints[j, 1] > 0: 590 | joints_transformed = coord_transform( 591 | joints[j,:] + 1, 592 | center, 593 | scale, 594 | rotation, 595 | tf.zeros([], tf.int32), 596 | hm_size, 597 | ) 598 | heatmap, visible = self._draw_heatmap(joints_transformed - 1) 599 | hm = hm.write(j, value=heatmap) 600 | w = w.write(j, value=weights[j] * visible) 601 | else: 602 | hm = hm.write(j, value=tf.zeros([hm_size, hm_size])) 603 | w = w.write(j, value=weights[j]) 604 | heatmap = tf.transpose(hm.stack(), perm=[1,2,0]) 605 | weights = w.stack() 606 | return heatmap, weights 607 | 608 | def _draw_heatmap(self, joints): 609 | coords = tf.TensorArray( 610 | tf.int32, 611 | size=2, 612 | dynamic_size=False, 613 | ) 614 | coords = coords.write( 615 | index=0, 616 | value=[ 617 | tf.cast(joints[0] - 3 * self.params['sigma'], tf.int32), 618 | tf.cast(joints[1] - 3 * self.params['sigma'], tf.int32), 619 | ], 620 | ) 621 | coords = coords.write( 622 | index=1, 623 | value=[ 624 | tf.cast(joints[0] + 3 * self.params['sigma'] + 1, tf.int32), 625 | tf.cast(joints[1] + 3 * self.params['sigma'] + 1, tf.int32,), 626 | ], 627 | ) 628 | coords = coords.stack() 629 | if ( 630 | coords[0,0] >= self.params['hm_size'] 631 | or coords[0,1] >= self.params['hm_size'] 632 | or coords[1,0] < 0 633 | or coords[1,1] < 0 634 | ): 635 | heatmap = tf.zeros([ 636 | self.params['hm_size'], 637 | self.params['hm_size'], 638 | ]) 639 | visible = tf.zeros([]) 640 | return heatmap, visible 641 | 
642 | gaussian = self._get_gaussian(coords) 643 | padding = self._get_padding(coords) 644 | heatmap = tf.pad(gaussian, padding) 645 | visible = tf.ones([]) 646 | return heatmap, visible 647 | 648 | def _get_gaussian(self, coords): 649 | size = tf.constant(6. * self.params['sigma'] + 1.) 650 | x = tf.range( 651 | start=0., 652 | limit=size, 653 | delta=1, 654 | ) 655 | y = tf.expand_dims(x, 1) 656 | x0 = y0 = tf.floor(size / 2) 657 | gaussian = tf.exp( 658 | - ((x - x0) ** 2 + (y - y0) ** 2) / (2 * self.params['sigma'] ** 2) 659 | ) 660 | 661 | x_min = tf.maximum(0, -coords[0,0]) 662 | x_max = tf.minimum(coords[1,0], self.params['hm_size']) - coords[0,0] 663 | y_min = tf.maximum(0, -coords[0,1]) 664 | y_max = tf.minimum(coords[1,1], self.params['hm_size']) - coords[0,1] 665 | 666 | return gaussian[y_min:y_max, x_min:x_max] 667 | 668 | def _get_padding(self, coords): 669 | x_min = tf.maximum(0, coords[0,0]) 670 | x_max = tf.minimum(coords[1,0], self.params['hm_size']) 671 | y_min = tf.maximum(0, coords[0,1]) 672 | y_max = tf.minimum(coords[1,1], self.params['hm_size']) 673 | 674 | top = y_min if y_min != 0 else 0 675 | bottom = self.params['hm_size'] - y_max 676 | left = x_min if x_min != 0 else 0 677 | right = self.params['hm_size'] - x_max 678 | 679 | padding = tf.TensorArray( 680 | tf.int32, 681 | size=2, 682 | dynamic_size=False, 683 | ) 684 | padding = padding.write(0, value=[top, bottom]) 685 | padding = padding.write(1, value=[left, right]) 686 | return padding.stack() 687 | 688 | @tf.function 689 | def _expansion_function( 690 | self, 691 | image, 692 | heatmap, 693 | weights, 694 | ): 695 | weights = tf.expand_dims(weights, axis=0) 696 | weights = tf.expand_dims(weights, axis=0) 697 | weights = tf.tile( 698 | weights, 699 | multiples=[ 700 | self.params['hm_size'], 701 | self.params['hm_size'], 702 | 1, 703 | ], 704 | ) 705 | return image, heatmap, weights 706 | 707 | 708 | 709 | 710 | 711 | 712 | 713 | 714 | 
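The heatmap drawing in `_draw_heatmap`, `_get_gaussian` and `_get_padding` above can be hard to follow through the `tf.TensorArray` bookkeeping. The following is a minimal pure-Python sketch of the same idea, not the repository's implementation: build a `(6 * sigma + 1)` square Gaussian patch, then paste the part that overlaps the heatmap. The names `gaussian_patch` and `draw_heatmap` are hypothetical, and `sigma` is assumed to be a positive integer.

```python
import math


def gaussian_patch(sigma):
    """Return a (6*sigma + 1) square patch whose centre value is 1.0."""
    size = 6 * sigma + 1
    c = size // 2  # centre index, matching floor(size / 2) in the source
    return [
        [
            math.exp(-((x - c) ** 2 + (y - c) ** 2) / (2 * sigma ** 2))
            for x in range(size)
        ]
        for y in range(size)
    ]


def draw_heatmap(joint_x, joint_y, hm_size, sigma):
    """Paste a Gaussian centred on the joint into an hm_size x hm_size grid.

    Returns (heatmap, visible); visible is 0.0 when the patch lies
    entirely outside the grid, mirroring the early return in the source.
    """
    # Upper-left and bottom-right corners of the patch (truncated to int).
    ul_x, ul_y = int(joint_x - 3 * sigma), int(joint_y - 3 * sigma)
    br_x, br_y = int(joint_x + 3 * sigma + 1), int(joint_y + 3 * sigma + 1)

    heatmap = [[0.0] * hm_size for _ in range(hm_size)]
    if ul_x >= hm_size or ul_y >= hm_size or br_x < 0 or br_y < 0:
        return heatmap, 0.0

    # Copy only the patch cells that fall inside the heatmap bounds.
    for gy, row in enumerate(gaussian_patch(sigma)):
        for gx, value in enumerate(row):
            hx, hy = ul_x + gx, ul_y + gy
            if 0 <= hx < hm_size and 0 <= hy < hm_size:
                heatmap[hy][hx] = value
    return heatmap, 1.0
```

The TensorFlow version gets the same result by slicing the patch and padding it with zeros, presumably because item assignment on tensors is not available inside a `tf.function`.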
-------------------------------------------------------------------------------- /bbpose/preprocess/utils.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | 3 | 4 | PI = 3.14159265 5 | 6 | 7 | def coord_transform( 8 | coords, 9 | center, 10 | scale, 11 | rotation, 12 | invert, 13 | size, 14 | ): 15 | transformation_matrix = get_transformation_matrix( 16 | center, 17 | scale, 18 | rotation, 19 | invert, 20 | size, 21 | ) 22 | new_coords = tf.TensorArray( 23 | tf.float32, 24 | size=3, 25 | dynamic_size=False, 26 | ) 27 | new_coords = new_coords.write(0, value=coords[0] - 1.) 28 | new_coords = new_coords.write(1, value=coords[1] - 1.) 29 | new_coords = new_coords.write(2, value=1.) 30 | new_coords = new_coords.stack() 31 | new_coords = tf.tensordot( 32 | transformation_matrix, 33 | new_coords, 34 | axes=1, 35 | ) 36 | return tf.cast(new_coords[:2], tf.int32) + 1 37 | 38 | def get_transformation_matrix( 39 | center, 40 | scale, 41 | rotation, 42 | invert, 43 | size, 44 | ): 45 | transformation_matrix = tf.TensorArray( 46 | tf.float32, 47 | size=3, 48 | dynamic_size=False, 49 | ) 50 | transformation_matrix = transformation_matrix.write( 51 | index=0, 52 | value=[ 53 | size / (scale * 200.), 54 | 0., 55 | size * (-center[0] / (scale * 200.) + 0.5), 56 | ], 57 | ) 58 | transformation_matrix = transformation_matrix.write( 59 | index=1, 60 | value=[ 61 | 0., 62 | size / (scale * 200.), 63 | size * (-center[1] / (scale * 200.) 
+ 0.5), 64 | ], 65 | ) 66 | transformation_matrix = transformation_matrix.write(2, value=[0., 0., 1.]) 67 | transformation_matrix = transformation_matrix.stack() 68 | 69 | if rotation != 0.: 70 | transformation_matrix = rotate_transformation_matrix( 71 | transformation_matrix, 72 | rotation, 73 | size, 74 | ) 75 | if invert == 1: 76 | transformation_matrix = tf.linalg.inv(transformation_matrix) 77 | return transformation_matrix 78 | 79 | def rotate_transformation_matrix( 80 | transformation_matrix, 81 | rotation, 82 | size, 83 | ): 84 | rotation = -rotation 85 | sn = tf.sin(rotation * (PI / 180)) 86 | csn = tf.cos(rotation * (PI / 180)) 87 | 88 | rot_matrix = tf.TensorArray( 89 | tf.float32, 90 | size=3, 91 | dynamic_size=False, 92 | ) 93 | rot_matrix = rot_matrix.write(0, value=[csn, -sn, 0]) 94 | rot_matrix = rot_matrix.write(1, value=[sn, csn, 0]) 95 | rot_matrix = rot_matrix.write(2, value=[0, 0, 1]) 96 | rot_matrix = rot_matrix.stack() 97 | 98 | tr_matrix = tf.TensorArray( 99 | tf.float32, 100 | size=3, 101 | dynamic_size=False, 102 | ) 103 | tr_matrix = tr_matrix.write(0, value=[1, 0, -size / 2]) 104 | tr_matrix = tr_matrix.write(1, value=[0, 1, -size / 2]) 105 | tr_matrix = tr_matrix.write(2, value=[0, 0, 1]) 106 | tr_matrix = tr_matrix.stack() 107 | 108 | inv_matrix = tf.TensorArray( 109 | tf.float32, 110 | size=3, 111 | dynamic_size=False, 112 | ) 113 | inv_matrix = inv_matrix.write(0, value=[1, 0, size / 2]) 114 | inv_matrix = inv_matrix.write(1, value=[0, 1, size / 2]) 115 | inv_matrix = inv_matrix.write(2, value=[0, 0, 1]) 116 | inv_matrix = inv_matrix.stack() 117 | 118 | transformation_matrix = tf.tensordot( 119 | tr_matrix, 120 | transformation_matrix, 121 | axes=1, 122 | ) 123 | transformation_matrix = tf.tensordot( 124 | rot_matrix, 125 | transformation_matrix, 126 | axes=1, 127 | ) 128 | transformation_matrix = tf.tensordot( 129 | inv_matrix, 130 | transformation_matrix, 131 | axes=1, 132 | ) 133 | return transformation_matrix 134 | 135 | 136 | 
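As a rough illustration of what `get_transformation_matrix` builds when `rotation` is zero, the sketch below writes the same 3x3 affine matrix in plain Python and applies it to a point. It is a simplified reading of the code above, not a drop-in replacement: the helper names `crop_matrix` and `transform_point` are hypothetical, and the 1-pixel offsets and integer truncation that `coord_transform` applies are left out on purpose.

```python
def crop_matrix(center, scale, size):
    """3x3 affine matrix mapping image coordinates into a size x size crop.

    The person box is assumed to span scale * 200 pixels, as in the source.
    """
    box = scale * 200.0
    s = size / box
    return [
        [s, 0.0, size * (-center[0] / box + 0.5)],
        [0.0, s, size * (-center[1] / box + 0.5)],
        [0.0, 0.0, 1.0],
    ]


def transform_point(point, center, scale, size):
    """Apply the crop matrix to a single (x, y) point in homogeneous form."""
    m = crop_matrix(center, scale, size)
    x, y = point
    return (
        m[0][0] * x + m[0][1] * y + m[0][2],
        m[1][0] * x + m[1][1] * y + m[1][2],
    )
```

With this form, the crop centre always lands in the middle of the output: `transform_point(center, center, scale, size)` comes out at approximately `(size / 2, size / 2)` regardless of `scale`.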
-------------------------------------------------------------------------------- /bbpose/requirements.txt: -------------------------------------------------------------------------------- 1 | matplotlib==3.2.2 2 | numpy==1.19.5 3 | pandas==1.1.5 4 | scipy==1.4.1 5 | tensorflow==2.4.1 6 | tensorflow-addons==0.13.0 -------------------------------------------------------------------------------- /bbpose/test/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/salinasJJ/BBpose/4be635ead0f99b1788160ca0b1c7b3947ba05526/bbpose/test/__init__.py -------------------------------------------------------------------------------- /bbpose/test/metrics.py: -------------------------------------------------------------------------------- 1 | import json 2 | 3 | import tensorflow as tf 4 | import scipy.io 5 | 6 | from preprocess.utils import coord_transform 7 | from train.utils import get_max_indices 8 | from utils import ( 9 | get_data_dir, 10 | get_frozen_params, 11 | get_params, 12 | get_results_dir, 13 | is_file, 14 | ) 15 | 16 | 17 | NUM_JOINTS = 16 18 | PRECISION = 0.6 19 | 20 | 21 | class PCKh(): 22 | def __init__(self): 23 | self.params = get_frozen_params( 24 | 'Test', 25 | version=get_params('Test')['version'], 26 | ) 27 | self.results_dir = get_results_dir(self.params['dataset']) 28 | self.dataset_dir = get_data_dir(self.params['dataset']) 29 | if self.params['toy_set']: 30 | self.data_dir = self.dataset_dir + 'data/toy/' 31 | else: 32 | self.data_dir = self.dataset_dir + 'data/full/' 33 | 34 | self.num_joints = NUM_JOINTS 35 | 36 | @tf.function 37 | def get_final_predictions( 38 | self, 39 | preds, 40 | center, 41 | scale, 42 | ): 43 | coords = self._get_coords(preds) 44 | 45 | batch_size = tf.shape(preds)[0] 46 | predictions = tf.TensorArray( 47 | tf.float32, 48 | size=batch_size, 49 | dynamic_size=False, 50 | ) 51 | for n in tf.range(batch_size): 52 | predictions = predictions.write( 53 | 
index=n, 54 | value=self._transform_coords( 55 | coords[n], 56 | center[n], 57 | scale[n]) 58 | ) 59 | return predictions.stack() 60 | 61 | def _get_coords(self, preds): 62 | max_coords = get_max_indices(preds) 63 | batch_size = tf.shape(preds)[0] 64 | example_coords = tf.TensorArray( 65 | tf.float32, 66 | size=batch_size, 67 | dynamic_size=False, 68 | ) 69 | for n in tf.range(batch_size): 70 | joint_coords = tf.TensorArray( 71 | tf.float32, 72 | size=self.num_joints, 73 | dynamic_size=False, 74 | ) 75 | for j in tf.range(self.num_joints): 76 | hm = preds[n, :, :, j] 77 | max_x = tf.cast(tf.floor(max_coords[n, 0, j]), tf.int32) 78 | max_y = tf.cast(tf.floor(max_coords[n, 1, j]), tf.int32) 79 | if ( 80 | max_x > 1 81 | and max_x < self.params['hm_size'] 82 | and max_y > 1 83 | and max_y < self.params['hm_size'] 84 | ): 85 | diff = tf.TensorArray( 86 | tf.float32, 87 | size=1, 88 | dynamic_size=False, 89 | ) 90 | diff = diff.write( 91 | index=0, 92 | value=[ 93 | hm[max_y - 1][max_x] - hm[max_y - 1][max_x - 2], 94 | hm[max_y][max_x - 1] - hm[max_y - 2][max_x - 1], 95 | ], 96 | ) 97 | diff = diff.stack() 98 | diff = tf.squeeze(diff) 99 | joint_coords = joint_coords.write( 100 | index=j, 101 | value=max_coords[n,:,j] + tf.sign(diff) * 0.25, 102 | ) 103 | else: 104 | joint_coords = joint_coords.write(j, max_coords[n,:,j]) 105 | joint_coords = joint_coords.stack() 106 | example_coords = example_coords.write(n, joint_coords) 107 | coords = example_coords.stack() + 0.5 108 | return tf.transpose(coords, perm=[0,2,1]) 109 | 110 | def _transform_coords( 111 | self, 112 | coords, 113 | center, 114 | scale, 115 | ): 116 | coords_transformed = tf.TensorArray( 117 | tf.float32, 118 | size=self.num_joints, 119 | dynamic_size=False, 120 | ) 121 | for j in tf.range(self.num_joints): 122 | temp = tf.cast( 123 | coord_transform( 124 | coords[0:2, j], 125 | center, 126 | scale, 127 | rotation=tf.zeros([]), 128 | invert=tf.ones([], tf.int32), 129 | size=tf.cast(self.params['hm_size'], 
tf.float32), 130 | ), 131 | tf.float32, 132 | ) 133 | coords_transformed = coords_transformed.write(j, temp) 134 | coords_transformed = coords_transformed.stack() 135 | return tf.transpose(coords_transformed, [1,0]) 136 | 137 | def get_results(self, y_pred): 138 | y_true = scipy.io.loadmat(self.data_dir + 'detections.mat') 139 | joints_missing = tf.cast(y_true['jnt_missing'], tf.float32) 140 | hm_true = tf.cast(y_true['pos_gt_src'], tf.float32) 141 | headboxes = tf.cast(y_true['headboxes_src'], tf.float32) 142 | if self.params['toy_set']: 143 | joints_missing = joints_missing[...,:y_pred.shape[0]] 144 | hm_true = hm_true[...,:y_pred.shape[0]] 145 | headboxes = headboxes[...,:y_pred.shape[0]] 146 | 147 | hm_pred = tf.transpose(y_pred, perm=[2, 1, 0]) 148 | 149 | head_norm = tf.norm( 150 | headboxes[1, :, :] - headboxes[0, :, :], 151 | axis=0, 152 | ) 153 | visible = 1 - joints_missing 154 | distances = tf.norm(hm_pred - hm_true, axis=1) / (head_norm * PRECISION) 155 | distances = distances * visible 156 | 157 | below_threshold = tf.less(distances, self.params['threshold']) 158 | below_threshold = tf.cast(below_threshold, tf.float32) * visible 159 | 160 | total_visible = tf.reduce_sum(visible, axis=1) 161 | results = tf.reduce_sum(below_threshold, axis=1) / total_visible 162 | return self._get_results_dict(results * 100.) 
163 | 164 | def _get_results_dict(self, results): 165 | return { 166 | 'head':float( 167 | results[9].numpy() 168 | ), 169 | 'shoulder':float( 170 | (results[12] + results[13]).numpy() / 2 171 | ), 172 | 'elbow':float( 173 | (results[11] + results[14]).numpy() / 2 174 | ), 175 | 'wrist':float( 176 | (results[10] + results[15]).numpy() / 2 177 | ), 178 | 'hip':float( 179 | (results[2] + results[3]).numpy() / 2 180 | ), 181 | 'knee':float( 182 | (results[1] + results[4]).numpy() / 2 183 | ), 184 | 'ankle':float( 185 | (results[0] + results[5]).numpy() / 2 186 | ), 187 | 'mean':float( 188 | tf.reduce_mean(tf.concat( 189 | values=[ 190 | results[:6], 191 | results[8:], 192 | ], 193 | axis=0, 194 | )).numpy() 195 | ), 196 | } 197 | 198 | def save_results(self, results_dict): 199 | is_file(self.results_dir, 'results.json') 200 | json_file = self.results_dir + 'results.json' 201 | print(f"\nSaving results to {json_file}") 202 | 203 | results = {} 204 | results[f"v{self.params['version']}"] = results_dict 205 | with open(json_file, 'w', encoding='utf-8') as f: 206 | f.write(json.dumps( 207 | results, 208 | ensure_ascii=False, 209 | indent=4, 210 | )) 211 | 212 | def load_results(self): 213 | with open(self.results_dir + 'results.json', 'r') as f: 214 | results = json.load(f) 215 | try: 216 | return results[f"v{self.params['version']}"] 217 | except KeyError: 218 | return ( 219 | f"No test results were found for version " 220 | f"'{self.params['version']}'." 
221 | ) 222 | 223 | 224 | 225 | 226 | 227 | 228 | 229 | 230 | 231 | -------------------------------------------------------------------------------- /bbpose/test/test.py: -------------------------------------------------------------------------------- 1 | import math 2 | import os 3 | 4 | import tensorflow as tf 5 | tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.ERROR) 6 | 7 | from ingestion.ingest import DataGenerator 8 | from preprocess.preprocess import DataPreprocessor 9 | from test.metrics import PCKh 10 | from train.train import get_custom_objects 11 | from utils import ( 12 | force_update, 13 | get_frozen_params, 14 | get_params, 15 | get_results_dir, 16 | ) 17 | 18 | 19 | SEEN_EVERY = 20 20 | 21 | 22 | def run(): 23 | force_update({'switch':True}) 24 | 25 | version=get_params('Test')['version'] 26 | params = get_frozen_params( 27 | 'Test', 28 | version=version, 29 | ) 30 | 31 | tf.config.run_functions_eagerly(params['is_eager']) 32 | 33 | if params['strategy'] == 'default': 34 | strategy = tf.distribute.get_strategy() 35 | elif params['strategy'] == 'mirrored': 36 | strategy = tf.distribute.MirroredStrategy() 37 | elif params['strategy'] == 'tpu': 38 | resolver = tf.distribute.cluster_resolver.TPUClusterResolver( 39 | tpu=params['tpu_address'] 40 | ) 41 | tf.config.experimental_connect_to_cluster(resolver) 42 | tf.tpu.experimental.initialize_tpu_system(resolver) 43 | strategy = tf.distribute.TPUStrategy(resolver) 44 | 45 | force_update({'num_replicas':strategy.num_replicas_in_sync}) 46 | 47 | if params['mixed_precision']: 48 | if params['strategy'] != 'tpu': 49 | tf.keras.mixed_precision.set_global_policy('mixed_float16') 50 | 51 | assert params['img_size'] == 4 * params['hm_size'], \ 52 | "'hm_size' must be 4 times smaller than 'img_size'" 53 | 54 | generator = DataGenerator() 55 | preprocessor = DataPreprocessor() 56 | if params['use_records']: 57 | test_records = generator.load_test_records() 58 | test_tables = 
preprocessor.read_test_records(test_records) 59 | else: 60 | assert params['strategy'] != 'tpu', \ 61 | "TPUStrategy only supports TFRecords as input" 62 | if os.path.isdir(params['img_dir']): 63 | test_tables = generator.load_test_dataset() 64 | else: 65 | print(f"{params['img_dir']} does not exist") 66 | 67 | test_dataset = preprocessor.get_test_dataset(test_tables) 68 | 69 | if params['use_records']: 70 | test_steps = int(math.ceil( 71 | params['val_size'] / params['batch_per_replica'] 72 | )) 73 | else: 74 | test_steps = test_dataset.cardinality().numpy() 75 | 76 | if params['use_cloud']: 77 | savemodel_dir = ( 78 | params['gcs_results'].rstrip('/') + f'/{str(version)}' 79 | ) 80 | else: 81 | savemodel_dir = ( 82 | get_results_dir(params['dataset']) 83 | + f'savemodel/{str(version)}' 84 | ) 85 | 86 | if tf.io.gfile.isdir(savemodel_dir): 87 | with strategy.scope(): 88 | print('Loading model weights...') 89 | model = tf.keras.models.load_model( 90 | savemodel_dir, 91 | custom_objects=get_custom_objects( 92 | params['schedule_per_step'], 93 | ), 94 | ) 95 | pckh = PCKh() 96 | 97 | print("Evaluating...") 98 | for enum, elements in test_dataset.enumerate(): 99 | images, center, scale = elements 100 | if enum % SEEN_EVERY == 0: 101 | print(f'{enum.numpy()} / {test_steps}') 102 | preds = model.predict(images) 103 | preds = preds[:, -1, :, :, :] 104 | 105 | if enum == 0: 106 | predictions = pckh.get_final_predictions( 107 | preds, 108 | center, 109 | scale, 110 | ) 111 | else: 112 | temp = pckh.get_final_predictions( 113 | preds, 114 | center, 115 | scale, 116 | ) 117 | predictions = tf.concat([predictions, temp], axis=0) 118 | 119 | results = pckh.get_results(predictions) 120 | print('\nResults...') 121 | for k, v in results.items(): 122 | print(f"{k}:{v}") 123 | pckh.save_results(results) 124 | 125 | force_update({'switch':False,}) 126 | else: 127 | force_update({'switch':False,}) 128 | print('No SaveModel was found.') 129 | 130 | 131 | 132 | 133 | 134 | 135 | 136 | 
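The evaluation loop above hands its predictions to `PCKh.get_results` in `test/metrics.py`. The sketch below restates that metric in plain Python for readers who want the arithmetic without the tensor transposes. It is an approximation under stated assumptions, not the repository's code: the function name `pckh`, the nested-list input layout, and the default `threshold=0.5` are all hypothetical (the source reads the threshold from `params['threshold']`), while `HEAD_FRACTION` mirrors the `PRECISION = 0.6` constant in `test/metrics.py`.

```python
import math

HEAD_FRACTION = 0.6  # mirrors PRECISION in test/metrics.py


def pckh(pred, true, head_boxes, visible, threshold=0.5):
    """Per-joint percentage of predictions within threshold * head size.

    pred, true: [sample][joint] -> (x, y)
    head_boxes: [sample] -> ((x1, y1), (x2, y2))
    visible:    [sample][joint] -> bool
    threshold:  assumed default; the source takes it from the params file
    """
    num_joints = len(true[0])
    hits = [0] * num_joints
    counts = [0] * num_joints
    for p, t, box, vis in zip(pred, true, head_boxes, visible):
        (x1, y1), (x2, y2) = box
        # Normalise by the head-box diagonal scaled by HEAD_FRACTION.
        head_size = math.hypot(x2 - x1, y2 - y1) * HEAD_FRACTION
        for j in range(num_joints):
            if not vis[j]:
                continue  # joints missing from the annotation are skipped
            counts[j] += 1
            dist = math.hypot(p[j][0] - t[j][0], p[j][1] - t[j][1])
            if dist / head_size < threshold:
                hits[j] += 1
    return [
        100.0 * h / c if c else float('nan')
        for h, c in zip(hits, counts)
    ]
```

A joint that is never visible yields `nan` here rather than a division error; how the TensorFlow version behaves in that case depends on the data and is not shown in the source.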
137 | 138 | 139 | 140 | 141 | 142 | 143 | -------------------------------------------------------------------------------- /bbpose/train/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/salinasJJ/BBpose/4be635ead0f99b1788160ca0b1c7b3947ba05526/bbpose/train/__init__.py -------------------------------------------------------------------------------- /bbpose/train/callbacks.py: -------------------------------------------------------------------------------- 1 | import bisect 2 | import json 3 | 4 | import tensorflow as tf 5 | from tensorflow.keras.callbacks import ( 6 | CSVLogger, 7 | LearningRateScheduler, 8 | ModelCheckpoint, 9 | ) 10 | from tensorflow.keras.optimizers.schedules import ExponentialDecay 11 | 12 | from utils import get_frozen_params, get_params, get_results_dir, is_file 13 | 14 | 15 | PARAMS = get_params('Train') 16 | VERSION = PARAMS['version'] 17 | if PARAMS['switch']: 18 | PARAMS = get_frozen_params( 19 | 'Train', 20 | version=VERSION, 21 | ) 22 | 23 | 24 | def savemodel(): 25 | if PARAMS['use_cloud']: 26 | savemodel_dir = ( 27 | PARAMS['gcs_results'].rstrip('/') + f'/{str(VERSION)}' 28 | ) 29 | else: 30 | savemodel_dir = ( 31 | get_results_dir(PARAMS['dataset']) 32 | + f'savemodel/{str(VERSION)}' 33 | ) 34 | return Checkpoint(savemodel_dir) 35 | 36 | def csvlogger(restore=False): 37 | logs_dir = get_results_dir(PARAMS['dataset']) + 'logs/' 38 | is_file( 39 | logs_dir, 40 | filename=f'logs_v{VERSION}.csv', 41 | restore=restore, 42 | ) 43 | return CSVLogger( 44 | filename=logs_dir + f'logs_v{VERSION}.csv', 45 | append=True, 46 | ) 47 | 48 | def lr_schedule_per_step(): 49 | return LRSchedulerPerStep(schedule_per_step) 50 | 51 | def lr_schedule(): 52 | return LRScheduler(schedule) 53 | 54 | def schedule_per_step(step, learning_rate): 55 | if PARAMS['num_replicas'] > 1: 56 | if PARAMS['scale'] == 0.: 57 | lr_per_replica = PARAMS['learning_rate'] 58 | else: 59 | scale = 
PARAMS['num_replicas'] * PARAMS['scale'] 60 | lr_per_replica = PARAMS['learning_rate'] * scale 61 | else: 62 | lr_per_replica = PARAMS['learning_rate'] 63 | return ExponentialDecay( 64 | initial_learning_rate=lr_per_replica, 65 | decay_steps=PARAMS['decay_steps'], 66 | decay_rate=PARAMS['decay_rate'], 67 | staircase=True, 68 | )(step) 69 | 70 | def schedule(epoch, learning_rate): 71 | if PARAMS['num_replicas'] > 1: 72 | if PARAMS['scale'] == 0.: 73 | lr_per_replica = PARAMS['learning_rate'] 74 | else: 75 | scale = PARAMS['num_replicas'] * PARAMS['scale'] 76 | lr_per_replica = PARAMS['learning_rate'] * scale 77 | else: 78 | lr_per_replica = PARAMS['learning_rate'] 79 | 80 | decay_epochs = sorted(PARAMS['decay_epochs']) 81 | idx = bisect.bisect_right(decay_epochs, epoch) 82 | return lr_per_replica * PARAMS['decay_factor']**idx 83 | 84 | 85 | class LRSchedulerPerStep(LearningRateScheduler): 86 | def __init__( 87 | self, 88 | schedule, 89 | verbose=0, 90 | ): 91 | super(LRSchedulerPerStep, self).__init__(schedule, verbose) 92 | 93 | params = get_params('Train') 94 | self.version = params['version'] 95 | if params['switch']: 96 | self.params = get_frozen_params( 97 | 'Train', 98 | version=self.version, 99 | ) 100 | else: 101 | self.params = params 102 | self.track_every = self.params['track_every'] 103 | self.steps_per_execution = self.params['steps_per_execution'] 104 | self.strategy = self.params['strategy'] 105 | 106 | self.step = 1 107 | results_dir = get_results_dir(self.params['dataset']) 108 | self.trackers_file = results_dir + 'trackers.json' 109 | 110 | def on_epoch_begin(self, epoch, logs=None): 111 | pass 112 | 113 | def on_epoch_end(self, epoch, logs=None): 114 | super(LRSchedulerPerStep, self).on_epoch_end(epoch, logs) 115 | 116 | with open(self.trackers_file, 'r') as f: 117 | trackers = json.load(f) 118 | trackers[f"v{self.version}"]["epoch"] = epoch + 1 119 | with open(self.trackers_file, 'w', encoding='utf-8') as f: 120 | f.write(json.dumps( 121 | 
trackers, 122 | ensure_ascii=False, 123 | indent=4, 124 | )) 125 | 126 | def on_train_batch_begin(self, batch, logs=None): 127 | super(LRSchedulerPerStep, self).on_epoch_begin(self.step, logs) 128 | 129 | def on_train_batch_end(self, batch, logs=None): 130 | if (self.step - 1) % self.track_every == 0: 131 | with open(self.trackers_file, 'r') as f: 132 | trackers = json.load(f) 133 | trackers[f"v{self.version}"]["step"] = self.step 134 | with open(self.trackers_file, 'w', encoding='utf-8') as f: 135 | f.write(json.dumps( 136 | trackers, 137 | ensure_ascii=False, 138 | indent=4, 139 | )) 140 | if self.strategy != 'tpu': 141 | self.step += self.steps_per_execution 142 | else: 143 | self.step += 1 144 | 145 | class LRScheduler(LearningRateScheduler): 146 | def __init__(self, scheduler, verbose=0): 147 | super(LRScheduler, self).__init__(scheduler, verbose) 148 | 149 | params = get_params('Train') 150 | self.version = params['version'] 151 | if params['switch']: 152 | self.params = get_frozen_params( 153 | 'Train', 154 | version=self.version, 155 | ) 156 | else: 157 | self.params = params 158 | 159 | results_dir = get_results_dir(self.params['dataset']) 160 | self.trackers_file = results_dir + 'trackers.json' 161 | 162 | def on_epoch_end(self, epoch, logs=None): 163 | super(LRScheduler, self).on_epoch_end(epoch, logs) 164 | 165 | with open(self.trackers_file, 'r') as f: 166 | trackers = json.load(f) 167 | trackers[f"v{self.version}"]["epoch"] = epoch + 1 168 | with open(self.trackers_file, 'w', encoding='utf-8') as f: 169 | f.write(json.dumps( 170 | trackers, 171 | ensure_ascii=False, 172 | indent=4, 173 | )) 174 | 175 | class Checkpoint(ModelCheckpoint): 176 | def __init__( 177 | self, 178 | filepath, 179 | monitor='val_pck', 180 | save_best_only=True, 181 | mode='max', 182 | save_weights_only=False, 183 | verbose=0, 184 | ): 185 | super(Checkpoint, self).__init__( 186 | filepath, 187 | monitor=monitor, 188 | save_best_only=save_best_only, 189 | mode=mode, 190 | 
save_weights_only=save_weights_only, 191 | verbose=verbose, 192 | ) 193 | params = get_params('Train') 194 | self.version = params['version'] 195 | if params['switch']: 196 | self.params = get_frozen_params( 197 | 'Train', 198 | version=self.version, 199 | ) 200 | else: 201 | self.params = params 202 | 203 | self.best_pck = -1. 204 | results_dir = get_results_dir(self.params['dataset']) 205 | self.trackers_file = results_dir + 'trackers.json' 206 | 207 | def on_epoch_end(self, epoch, logs=None): 208 | super(Checkpoint, self).on_epoch_end(epoch, logs) 209 | if self.best_pck < self.best: 210 | self.best_pck = self.best 211 | with open(self.trackers_file, 'r') as f: 212 | trackers = json.load(f) 213 | trackers[f"v{self.version}"]["best"] = self.best_pck 214 | with open(self.trackers_file, 'w', encoding='utf-8') as f: 215 | f.write(json.dumps( 216 | trackers, 217 | ensure_ascii=False, 218 | indent=4, 219 | )) 220 | 221 | 222 | 223 | 224 | 225 | -------------------------------------------------------------------------------- /bbpose/train/losses.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | from tensorflow.keras import losses 3 | 4 | from utils import get_frozen_params, get_params 5 | 6 | 7 | NUM_JOINTS = 16 8 | 9 | 10 | class JointsMSE(losses.Loss): 11 | def __init__( 12 | self, 13 | reduction=tf.keras.losses.Reduction.SUM_OVER_BATCH_SIZE, 14 | name='joints_mse', 15 | **kwargs 16 | ): 17 | super(JointsMSE, self).__init__(reduction=reduction, name=name) 18 | 19 | params = get_params('Train') 20 | if params['switch']: 21 | self.params = get_frozen_params( 22 | 'Train', 23 | version=params['version'], 24 | ) 25 | else: 26 | self.params = params 27 | 28 | self.batch_size = self.params['batch_per_replica'] 29 | self.num_stacks = self.params['num_stacks'] 30 | self.hm_size = self.params['hm_size'] 31 | self.num_joints = NUM_JOINTS 32 | 33 | @classmethod 34 | def from_config(cls, config): 35 | return 
cls(**config) 36 | 37 | def get_config(self): 38 | config = { 39 | 'num_stacks':self.num_stacks, 40 | 'hm_size':self.hm_size, 41 | 'batch_size':self.batch_size, 42 | 'num_joints':self.num_joints, 43 | } 44 | base_config = super(JointsMSE, self).get_config() 45 | return dict( 46 | list(base_config.items()) + list(config.items()) 47 | ) 48 | 49 | @tf.function 50 | def __call__( 51 | self, 52 | y_true, 53 | y_pred, 54 | sample_weight=None, 55 | ): 56 | hm_pred = tf.reshape( 57 | y_pred, 58 | shape=[ 59 | self.batch_size, 60 | self.num_stacks, 61 | tf.square(self.hm_size), 62 | self.num_joints, 63 | ], 64 | ) 65 | hm_true = tf.reshape( 66 | y_true, 67 | shape=[ 68 | self.batch_size, 69 | tf.square(self.hm_size), 70 | self.num_joints, 71 | ], 72 | ) 73 | weights = tf.reshape( 74 | sample_weight, 75 | shape=[ 76 | self.batch_size, 77 | tf.square(self.hm_size), 78 | self.num_joints, 79 | ], 80 | ) 81 | 82 | loss = tf.zeros([self.batch_size]) 83 | for s in tf.range(self.num_stacks): 84 | j_loss = tf.zeros([self.batch_size]) 85 | for j in tf.range(self.num_joints): 86 | weighted_pred = weights[:,:,j] * hm_pred[:,s,:,j] 87 | weighted_true = weights[:,:,j] * hm_true[:,:,j] 88 | j_loss = ( 89 | j_loss 90 | + 0.5 91 | * tf.keras.losses.MSE(weighted_true, weighted_pred) 92 | ) 93 | loss = loss + j_loss / self.num_joints 94 | return loss 95 | 96 | 97 | 98 | 99 | -------------------------------------------------------------------------------- /bbpose/train/metrics.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | from tensorflow.keras import metrics 3 | 4 | from train.utils import get_max_indices 5 | from utils import get_frozen_params, get_params 6 | 7 | 8 | NUM_JOINTS = 16 9 | EXCLUDE = 6 10 | PRECISION = 0.1 11 | 12 | 13 | class PercentageOfCorrectKeypoints(metrics.Metric): 14 | def __init__(self, name='pck', **kwargs): 15 | super(PercentageOfCorrectKeypoints, self).__init__(name=name) 16 | 17 | params = 
get_params('Train') 18 | if params['switch']: 19 | self.params = get_frozen_params( 20 | 'Train', 21 | version=params['version'], 22 | ) 23 | else: 24 | self.params = params 25 | 26 | self.num_stacks = self.params['num_stacks'] 27 | self.hm_size = self.params['hm_size'] 28 | self.threshold = self.params['threshold'] 29 | self.num_joints = NUM_JOINTS - EXCLUDE 30 | 31 | self.accuracy = tf.Variable(0.0, trainable=False) 32 | self.num_batches = tf.Variable(0.0, trainable=False) 33 | 34 | @classmethod 35 | def from_config(cls, config): 36 | return cls(**config) 37 | 38 | def get_config(self): 39 | config = { 40 | 'num_stacks':self.num_stacks, 41 | 'hm_size':self.hm_size, 42 | 'threshold':self.threshold, 43 | 'num_joints':self.num_joints, 44 | } 45 | base_config = super(PercentageOfCorrectKeypoints, self).get_config() 46 | return dict( 47 | list(base_config.items()) + list(config.items()) 48 | ) 49 | 50 | @tf.function 51 | def update_state( 52 | self, 53 | y_true, 54 | y_pred, 55 | sample_weight=None, 56 | ): 57 | tf.debugging.assert_equal( 58 | sample_weight, 59 | None, 60 | message='sample_weight is not required', 61 | ) 62 | hm_pred = tf.concat( 63 | values=[ 64 | y_pred[...,:6], 65 | y_pred[...,10:12], 66 | y_pred[...,14:], 67 | ], 68 | axis=4, 69 | ) 70 | hm_pred = hm_pred[:, self.num_stacks-1, :, :, :] 71 | hm_true = tf.concat( 72 | values=[ 73 | y_true[...,:6], 74 | y_true[...,10:12], 75 | y_true[...,14:], 76 | ], 77 | axis=3, 78 | ) 79 | 80 | batch_acc = self._get_batch_acc(hm_pred, hm_true) 81 | self.accuracy.assign_add(batch_acc) 82 | self.num_batches.assign_add(1.) 83 | 84 | def result(self): 85 | return self.accuracy / self.num_batches 86 | 87 | def reset_state(self): 88 | self.accuracy.assign(0.) 89 | self.num_batches.assign(0.) 
90 | 91 | def _get_batch_acc(self, hm_pred, hm_true): 92 | pred_max = get_max_indices(hm_pred) 93 | true_max = get_max_indices(hm_true) 94 | l2_distances = self._get_l2_distances(pred_max, true_max) 95 | 96 | batch_acc = tf.zeros([]) 97 | visible_joints = tf.zeros([]) 98 | 99 | joint_acc = tf.TensorArray( 100 | tf.float32, 101 | size=self.num_joints+1, 102 | dynamic_size=False, 103 | clear_after_read=False, 104 | ) 105 | for j in tf.range(self.num_joints): 106 | joint_acc = joint_acc.write( 107 | index=j, 108 | value=self._get_joint_acc(l2_distances[:,j]), 109 | ) 110 | if joint_acc.read(j) >= 0.: 111 | batch_acc = batch_acc + joint_acc.read(j) 112 | visible_joints = visible_joints + 1. 113 | 114 | if visible_joints != 0.: 115 | joint_acc = joint_acc.write( 116 | index=self.num_joints, 117 | value=batch_acc / visible_joints, 118 | ) 119 | joint_acc = joint_acc.stack() 120 | return joint_acc[-1] 121 | else: 122 | return tf.zeros([]) 123 | 124 | def _get_l2_distances(self, pred_max, true_max): 125 | batch_size = tf.shape(true_max)[0] 126 | distances = tf.TensorArray( 127 | tf.float32, 128 | size=batch_size, 129 | dynamic_size=False, 130 | ) 131 | for n in tf.range(batch_size): 132 | dist = tf.TensorArray( 133 | tf.float32, 134 | size=self.num_joints, 135 | dynamic_size=False, 136 | ) 137 | for j in tf.range(self.num_joints): 138 | if true_max[n, 0, j] > 1 and true_max[n, 1, j] > 1: 139 | dist = dist.write( 140 | index=j, 141 | value=( 142 | tf.norm(pred_max[n, :, j] - true_max[n, :, j]) 143 | / (self.hm_size * PRECISION) 144 | ) 145 | ) 146 | else: 147 | dist = dist.write(j, -1.) 
148 | dist = dist.stack() 149 | distances = distances.write(n, dist) 150 | return distances.stack() 151 | 152 | def _get_joint_acc(self, joint_dists): 153 | visible = joint_dists[tf.not_equal(joint_dists, -1.)] 154 | if tf.size(visible) > 0: 155 | below_thr = tf.less(visible, self.threshold) 156 | below_thr = tf.reduce_sum( 157 | tf.cast(below_thr, tf.float32) 158 | ) 159 | return below_thr / tf.cast(tf.size(visible), tf.float32) 160 | else: 161 | return -tf.ones([]) 162 | 163 | 164 | 165 | 166 | 167 | 168 | 169 | 170 | 171 | 172 | 173 | 174 | 175 | 176 | 177 | -------------------------------------------------------------------------------- /bbpose/train/train.py: -------------------------------------------------------------------------------- 1 | import json 2 | import math 3 | import os 4 | 5 | import tensorflow as tf 6 | tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.ERROR) 7 | 8 | from ingestion.ingest import DataGenerator 9 | from models import model as cnn 10 | from preprocess.preprocess import DataPreprocessor 11 | from train import callbacks 12 | from train.losses import JointsMSE 13 | from train.metrics import PercentageOfCorrectKeypoints 14 | from utils import ( 15 | force_update, 16 | freeze_cfg, 17 | get_frozen_params, 18 | get_params, 19 | get_results_dir, 20 | is_file, 21 | reload_modules, 22 | ) 23 | 24 | 25 | def run(restore=False): 26 | force_update({'switch':False}) 27 | reload_modules(cnn, callbacks) 28 | 29 | params = get_params('Train') 30 | version = params['version'] 31 | if restore: 32 | force_update({'switch':True}) 33 | reload_modules(cnn, callbacks) 34 | 35 | params = get_frozen_params( 36 | 'Train', 37 | version=version, 38 | ) 39 | 40 | results_dir = get_results_dir(params['dataset']) 41 | else: 42 | freeze_cfg(version=version) 43 | results_dir = get_results_dir(params['dataset']) 44 | 45 | skeleton = {} 46 | skeleton[f'v{version}'] = { 47 | 'best': 0.0, 48 | 'epoch': 0, 49 | 'step': 0, 50 | } 51 | is_file(results_dir, 
'trackers.json') 52 | with open(results_dir + 'trackers.json', 'w', encoding='utf-8') as f: 53 | f.write(json.dumps( 54 | skeleton, 55 | ensure_ascii=False, 56 | indent=4, 57 | )) 58 | 59 | tf.config.run_functions_eagerly(params['is_eager']) 60 | 61 | if params['strategy'] == 'default': 62 | strategy = tf.distribute.get_strategy() 63 | elif params['strategy'] == 'mirrored': 64 | strategy = tf.distribute.MirroredStrategy() 65 | elif params['strategy'] == 'tpu': 66 | resolver = tf.distribute.cluster_resolver.TPUClusterResolver( 67 | tpu=params['tpu_address'] 68 | ) 69 | tf.config.experimental_connect_to_cluster(resolver) 70 | tf.tpu.experimental.initialize_tpu_system(resolver) 71 | strategy = tf.distribute.TPUStrategy(resolver) 72 | 73 | force_update({'num_replicas':strategy.num_replicas_in_sync}) 74 | reload_modules(callbacks) 75 | 76 | if params['strategy'] != 'tpu': 77 | if params['mixed_precision']: 78 | tf.keras.mixed_precision.set_global_policy('mixed_float16') 79 | 80 | steps_per_execution = params['steps_per_execution'] 81 | assert params['track_every'] % steps_per_execution == 0, \ 82 | "'track_every' must be a multiple of 'steps_per_execution'" 83 | else: 84 | steps_per_execution = None 85 | 86 | assert params['img_size'] == 4 * params['hm_size'], \ 87 | "'hm_size' must be 4 times smaller than 'img_size'" 88 | 89 | generator = DataGenerator() 90 | preprocessor = DataPreprocessor() 91 | if params['use_records']: 92 | train_records, val_records = generator.load_records() 93 | train_tables, val_tables = preprocessor.read_records( 94 | train_records, 95 | val_records, 96 | ) 97 | else: 98 | assert params['strategy'] != 'tpu', \ 99 | "TPUStrategy only supports TFRecords as input" 100 | if os.path.isdir(params['img_dir']): 101 | train_tables, val_tables = generator.load_datasets() 102 | else: 103 | raise FileNotFoundError(f"{params['img_dir']} does not exist") 104 | 105 | train_dataset, val_dataset = preprocessor.get_datasets( 106 | train_tables, 107 | val_tables, 108 | ) 109 |
110 | if params['schedule_per_step']: 111 | lr_callback = callbacks.lr_schedule_per_step() 112 | lr_object = callbacks.LRSchedulerPerStep 113 | else: 114 | lr_callback = callbacks.lr_schedule() 115 | lr_object = callbacks.LRScheduler 116 | 117 | callbacks_list = [ 118 | lr_callback, 119 | callbacks.savemodel(), 120 | callbacks.csvlogger(restore), 121 | ] 122 | 123 | if params['use_records']: 124 | steps_per_epoch = params['train_size'] // params['batch_per_replica'] 125 | validation_steps = int(math.ceil( 126 | params['val_size'] / params['batch_per_replica'] 127 | )) 128 | else: 129 | steps_per_epoch = train_dataset.cardinality().numpy() 130 | validation_steps = val_dataset.cardinality().numpy() 131 | 132 | if restore: 133 | if params['use_cloud']: 134 | savemodel_dir = ( 135 | params['gcs_results'].rstrip('/') + f'/{str(version)}' 136 | ) 137 | else: 138 | savemodel_dir = results_dir + f'savemodel/{str(version)}' 139 | 140 | if tf.io.gfile.isdir(savemodel_dir): 141 | print('Restoring...') 142 | with open(results_dir + 'trackers.json', 'r') as f: 143 | trackers = json.load(f) 144 | 145 | callbacks_list[1].best = trackers[f"v{version}"]["best"] 146 | initial_epoch = int(trackers[f"v{version}"]["epoch"]) 147 | if params['schedule_per_step']: 148 | callbacks_list[0].step = int(trackers[f"v{version}"]["step"]) 149 | 150 | with strategy.scope(): 151 | model = tf.keras.models.load_model( 152 | savemodel_dir, 153 | custom_objects=get_custom_objects( 154 | params['schedule_per_step'], 155 | ), 156 | ) 157 | model = get_compiled_model(model, steps_per_execution) 158 | else: 159 | print('No SaveModel found, creating a new model from ' \ 160 | 'scratch...') 161 | initial_epoch = 0 162 | with strategy.scope(): 163 | model = cnn.get_model() 164 | model = get_compiled_model(model, steps_per_execution) 165 | else: 166 | print('Creating a new model...') 167 | initial_epoch = 0 168 | with strategy.scope(): 169 | model = cnn.get_model() 170 | model = get_compiled_model(model, 
steps_per_execution) 171 | 172 | force_update({'switch':False}) 173 | 174 | model.fit( 175 | train_dataset, 176 | validation_data=val_dataset, 177 | initial_epoch=initial_epoch, 178 | epochs=params['num_epochs'], 179 | callbacks=callbacks_list, 180 | steps_per_epoch=steps_per_epoch, 181 | validation_steps=validation_steps, 182 | ) 183 | return model 184 | 185 | def get_compiled_model(model, steps_per_execution): 186 | model.compile( 187 | optimizer=tf.keras.optimizers.RMSprop(), 188 | loss=JointsMSE(), 189 | metrics=[PercentageOfCorrectKeypoints()], 190 | steps_per_execution=steps_per_execution, 191 | ) 192 | return model 193 | 194 | def get_custom_objects(schedule_per_step): 195 | if schedule_per_step: 196 | lr_object = callbacks.LRSchedulerPerStep 197 | else: 198 | lr_object = callbacks.LRScheduler 199 | 200 | return { 201 | "JointsMSE":JointsMSE, 202 | "PercentageOfCorrectKeypoints":PercentageOfCorrectKeypoints, 203 | "LRScheduler":lr_object, 204 | "Checkpoint":callbacks.Checkpoint, 205 | } 206 | 207 | 208 | 209 | 210 | 211 | 212 | -------------------------------------------------------------------------------- /bbpose/train/utils.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | 3 | 4 | def get_max_indices(heatmaps): 5 | batch_size = tf.shape(heatmaps)[0] 6 | hm_size = tf.cast(tf.shape(heatmaps)[2], tf.float32) 7 | num_joints = tf.shape(heatmaps)[3] 8 | tf.debugging.assert_equal( 9 | tf.rank(heatmaps), 10 | 4, 11 | message='heatmaps should be of rank 4', 12 | ) 13 | 14 | hm_reshaped = tf.reshape(heatmaps, [batch_size, -1, num_joints]) 15 | max_vals = tf.reduce_max(hm_reshaped, axis=1) 16 | max_vals = tf.reshape(max_vals, [batch_size, 1, num_joints]) 17 | 18 | max_idx = tf.argmax(hm_reshaped, axis=1) 19 | max_idx = tf.reshape(max_idx, [batch_size, 1, num_joints]) 20 | max_idx = tf.cast( 21 | tf.tile( 22 | max_idx + 1, 23 | multiples=[1, 2, 1], 24 | ), 25 | tf.float32, 26 | ) 27 | 28 | max_indices 
= tf.TensorArray( 29 | tf.float32, 30 | size=2, 31 | dynamic_size=False, 32 | ) 33 | max_indices = max_indices.write( 34 | index=0, 35 | value=(max_idx[:, 0, :] - 1) % hm_size + 1, 36 | ) 37 | max_indices = max_indices.write( 38 | index=1, 39 | value=tf.floor((max_idx[:, 1, :] - 1) / hm_size) + 1, 40 | ) 41 | max_indices = max_indices.stack() 42 | max_indices = tf.transpose(max_indices, perm=[1,0,2]) 43 | 44 | mask = tf.cast( 45 | tf.greater(max_vals, 0.0), 46 | tf.float32, 47 | ) 48 | mask = tf.tile(mask, multiples=[1,2,1]) 49 | return max_indices * mask 50 | 51 | 52 | 53 | 54 | 55 | 56 | 57 | -------------------------------------------------------------------------------- /bbpose/utils.py: -------------------------------------------------------------------------------- 1 | from configparser import ConfigParser, ExtendedInterpolation 2 | from importlib import reload 3 | import os 4 | import shutil 5 | import subprocess 6 | import sys 7 | 8 | 9 | BB_DIR = os.path.dirname(os.path.realpath(__file__)) + '/' 10 | CFG = BB_DIR + 'configs/' 11 | MAIN = 'bulat' 12 | MPII = { 13 | 'data':'https://datasets.d2.mpi-inf.mpg.de/andriluka14cvpr/', 14 | 'annos':'https://github.com/bearpaw/pytorch-pose/raw/master/', 15 | } 16 | DATASETS = { 17 | 'mpii':{ 18 | 'data':MPII['data']+'mpii_human_pose_v1.tar.gz', 19 | 'annotations':MPII['annos']+'data/mpii/mpii_annotations.json', 20 | 'detections':MPII['annos']+'evaluation/data/detections_our_format.mat', 21 | }, 22 | } 23 | SECTIONS = [ 24 | 'Ingestion', 25 | 'Preprocess', 26 | 'Models', 27 | 'Train', 28 | 'Test' 29 | ] 30 | REQUIRED = [ 31 | 'version', 32 | 'img_dir', 33 | ] 34 | TYPES = [ 35 | 'string', 36 | 'integer', 37 | 'float', 38 | 'boolean', 39 | 'list', 40 | ] 41 | DEFAULTS = { 42 | 'newell':{ 43 | 'string':{ 44 | 'img_dir':'/PATH/TO/IMG/DIR', 45 | 'dataset':'mpii', 46 | 'gcs_data':'', 47 | 'arch':'hourglass', 48 | 'initializer':'glorot_normal', 49 | 'strategy':'default', 50 | 'tpu_address':'', 51 | 'gcs_results':'', 52 | }, 
53 | 'integer':{ 54 | 'version':0, 55 | 'img_size':256, 56 | 'hm_size':64, 57 | 'num_stacks': 8, 58 | 'batch_per_replica':8, 59 | 'num_replicas':1, 60 | 'train_size':0, 61 | 'val_size':0, 62 | 'toy_samples':250, 63 | 'examples_per_record':250, 64 | 'interleave_num':-1, 65 | 'interleave_block':1, 66 | 'sigma':1, 67 | 'num_filters':256, 68 | 'num_epochs':100, 69 | 'steps_per_execution':1, 70 | 'track_every':32, 71 | 'decay_steps':2000, 72 | }, 73 | 'float':{ 74 | 'momentum':0.9, 75 | 'epsilon':0.001, 76 | 'dropout_rate':0.2, 77 | 'threshold':0.5, 78 | 'decay_factor':0.1, 79 | 'learning_rate':0.00025, 80 | 'scale':0.0, 81 | 'decay_rate':0.96, 82 | }, 83 | 'boolean':{ 84 | 'use_records':False, 85 | 'use_cloud':False, 86 | 'switch':False, 87 | 'download_images':False, 88 | 'toy_set':False, 89 | 'is_eager':False, 90 | 'mixed_precision':False, 91 | 'schedule_per_step':False, 92 | }, 93 | 'list':{ 94 | 'mean':[ 95 | 0.4624228775501251, 96 | 0.44416481256484985, 97 | 0.4025438725948334 98 | ], 99 | 'decay_epochs':[60, 90], 100 | } 101 | }, 102 | 'bulat':{ 103 | 'string':{ 104 | 'img_dir':'/PATH/TO/IMG/DIR', 105 | 'dataset':'mpii', 106 | 'gcs_records':'', 107 | 'arch':'softgate', 108 | 'initializer':'glorot_uniform', 109 | 'strategy':'default', 110 | 'tpu_address':'', 111 | 'gcs_results':'', 112 | }, 113 | 'integer':{ 114 | 'version':0, 115 | 'img_size':256, 116 | 'hm_size':64, 117 | 'num_stacks': 4, 118 | 'batch_per_replica':24, 119 | 'num_replicas':1, 120 | 'train_size':0, 121 | 'val_size':0, 122 | 'toy_samples':250, 123 | 'examples_per_record':250, 124 | 'interleave_num':2, 125 | 'interleave_block':1, 126 | 'sigma':1, 127 | 'num_filters':144, 128 | 'num_epochs':200, 129 | 'steps_per_execution':1, 130 | 'track_every':32, 131 | 'decay_steps':2000, 132 | }, 133 | 'float':{ 134 | 'momentum':0.9, 135 | 'epsilon':0.001, 136 | 'dropout_rate':0.2, 137 | 'threshold':0.5, 138 | 'decay_factor':0.2, 139 | 'learning_rate':0.00025, 140 | 'scale':0.0, 141 | 'decay_rate':0.96, 142 | }, 
143 | 'boolean':{ 144 | 'use_records':False, 145 | 'use_cloud':False, 146 | 'switch':False, 147 | 'download_images':False, 148 | 'toy_set':False, 149 | 'is_eager':False, 150 | 'mixed_precision':False, 151 | 'schedule_per_step':False, 152 | }, 153 | 'list':{ 154 | 'mean':[ 155 | 0.4624228775501251, 156 | 0.44416481256484985, 157 | 0.4025438725948334 158 | ], 159 | 'decay_epochs':[75, 100, 150], 160 | } 161 | } 162 | } 163 | IGNORE = [ 164 | 'switch', 165 | 'num_replicas', 166 | 'train_size', 167 | 'val_size', 168 | ] 169 | DUPLICATES = { 170 | 'DEFAULT':[], 171 | 'Ingestion':[], 172 | 'Preprocess':[], 173 | 'Models':[], 174 | 'Train':[], 175 | 'Test':[ 176 | 'toy_set', 177 | 'is_eager', 178 | 'strategy', 179 | 'tpu_address', 180 | 'gcs_results', 181 | 'mixed_precision', 182 | 'threshold', 183 | 'schedule_per_step', 184 | ], 185 | } 186 | FILTERS = { 187 | 'strategy':[ 188 | 'default', 189 | 'mirrored', 190 | 'tpu', 191 | ], 192 | 'dataset':[ 193 | 'mpii', 194 | ], 195 | 'img_size':[128, 192, 256, 320, 384, 448, 512], 196 | 'hm_size':[32, 48, 64, 80, 96, 112, 128], 197 | 'num_stacks':[2, 4, 8], 198 | 'arch':[ 199 | 'hourglass', 200 | 'softgate', 201 | ], 202 | 'initializer':[ 203 | 'glorot_normal', 204 | 'glorot_uniform', 205 | 'he_normal', 206 | 'he_uniform', 207 | ], 208 | } 209 | 210 | 211 | def get_params(*args, cfg=CFG+'config.cfg'): 212 | config = ConfigParser( 213 | interpolation=ExtendedInterpolation(), 214 | ) 215 | config.read(cfg) 216 | params = {} 217 | for section in args: 218 | assert isinstance(section, str), f"'{section}' must be a string" 219 | if section.capitalize() in SECTIONS: 220 | for option in config.options(section): 221 | params[option] = eval(config.get( 222 | section, 223 | option, 224 | )) 225 | else: 226 | print(( 227 | f"Section '{section}' was not found in the configuration file." 
228 | )) 229 | print(f"Available sections: {SECTIONS}") 230 | return params 231 | 232 | def update_config(*args, cfg=CFG+'config.cfg'): 233 | config = ConfigParser( 234 | interpolation=ExtendedInterpolation(), 235 | ) 236 | config.read(cfg) 237 | for arg in args: 238 | config.set( 239 | arg[0], 240 | arg[1], 241 | arg[2], 242 | ) 243 | with open(cfg, 'w') as f: 244 | config.write(f) 245 | 246 | def set_model_version(version): 247 | assert isinstance(version, int), f"'{version}' must be an integer" 248 | 249 | update_params({ 250 | REQUIRED[0]:version, 251 | }) 252 | 253 | def set_media_directory(media_dir): 254 | assert isinstance(media_dir, str), f"'{media_dir}' must be a string" 255 | 256 | update_config(( 257 | 'DEFAULT', 258 | REQUIRED[1], 259 | '\'' + media_dir.rstrip('/') + '/\'', 260 | )) 261 | 262 | def update_params(param_dict, cfg=CFG+'config.cfg'): 263 | assert isinstance(param_dict, dict), f"'{param_dict}' must be a dictionary" 264 | assert isinstance(cfg, str), f"'{cfg}' must be a string" 265 | 266 | force_update({'switch':False}) 267 | 268 | config = ConfigParser( 269 | interpolation=ExtendedInterpolation() 270 | ) 271 | config.read(cfg) 272 | 273 | param_dict = filter_params(param_dict) 274 | for param, value in param_dict.items(): 275 | if param in config.defaults().keys(): 276 | conditional_update( 277 | 'DEFAULT', 278 | param, 279 | value, 280 | cfg=cfg, 281 | ) 282 | elif config.has_option(SECTIONS[0], param): 283 | conditional_update( 284 | SECTIONS[0], 285 | param, 286 | value, 287 | cfg=cfg, 288 | ) 289 | elif config.has_option(SECTIONS[1], param): 290 | conditional_update( 291 | SECTIONS[1], 292 | param, 293 | value, 294 | cfg=cfg, 295 | ) 296 | elif config.has_option(SECTIONS[2], param): 297 | conditional_update( 298 | SECTIONS[2], 299 | param, 300 | value, 301 | cfg=cfg, 302 | ) 303 | elif config.has_option(SECTIONS[3], param): 304 | conditional_update( 305 | SECTIONS[3], 306 | param, 307 | value, 308 | cfg=cfg, 309 | ) 310 | elif
config.has_option(SECTIONS[4], param): 311 | conditional_update( 312 | SECTIONS[4], 313 | param, 314 | value, 315 | cfg=cfg, 316 | ) 317 | 318 | def filter_params(param_dict): 319 | param_list = get_param_list() 320 | 321 | for param, value in param_dict.copy().items(): 322 | if param not in param_list: 323 | print(f"Unknown param: '{param}'") 324 | param_dict.pop(param) 325 | continue 326 | 327 | if not type_check(param, value): 328 | type_message(param, value) 329 | param_dict.pop(param) 330 | continue 331 | 332 | if param in IGNORE: 333 | print(f"'{param}' is not an adjustable parameter.\n") 334 | param_dict.pop(param) 335 | elif param == 'initializer': 336 | if value not in FILTERS[param]: 337 | print( 338 | f"Recommended '{param}' values include: {FILTERS[param]}." 339 | ) 340 | print(( 341 | "Please check with the tensorflow documentation first to " 342 | "determine if your requested initializer is supported. If " 343 | "not, you may receive an error at runtime." 344 | )) 345 | print(( 346 | 'https://www.tensorflow.org/api_docs/python/tf/keras/' 347 | 'initializers\n' 348 | )) 349 | elif param == 'num_filters': 350 | if value % 8 != 0: 351 | print(( 352 | f"Supported '{param}' values include only multiples of 8. " 353 | )) 354 | print(f"'{param}' was not updated.\n") 355 | param_dict.pop(param) 356 | elif param == 'scale': 357 | if value > 1.0: 358 | print(( 359 | "Warning: It is not recommended to scale the learning rate " 360 | "with a value greater than one." 
361 | )) 362 | elif param == 'strategy': 363 | generic_filter(param_dict, param) 364 | elif param == 'dataset': 365 | generic_filter(param_dict, param) 366 | elif param == 'img_size': 367 | generic_filter(param_dict, param) 368 | elif param == 'hm_size': 369 | generic_filter(param_dict, param) 370 | elif param == 'num_stacks': 371 | generic_filter(param_dict, param) 372 | elif param == 'arch': 373 | generic_filter(param_dict, param) 374 | return param_dict 375 | 376 | def conditional_update( 377 | section, 378 | param, 379 | value, 380 | cfg=CFG+'config.cfg', 381 | ): 382 | if param in DUPLICATES[section]: 383 | pass 384 | elif param in DEFAULTS[MAIN]['string']: 385 | update_config( 386 | (section, f'{param}', f"'{value}'"), 387 | cfg=cfg, 388 | ) 389 | else: 390 | update_config( 391 | (section, f'{param}', f'{value}'), 392 | cfg=cfg, 393 | ) 394 | 395 | def get_param_list(): 396 | param_list = [] 397 | for t in TYPES: 398 | param_list += [k for k in DEFAULTS[MAIN][t]] 399 | return param_list 400 | 401 | def type_check( 402 | param, 403 | value, 404 | ): 405 | if isinstance(value, bool): 406 | if param in DEFAULTS[MAIN]['boolean']: 407 | return True 408 | elif isinstance(value, list): 409 | if param in DEFAULTS[MAIN]['list']: 410 | list_type = type( 411 | DEFAULTS[MAIN]['list'][param][0] 412 | ) 413 | if all(isinstance(elem, list_type) for elem in value): 414 | return True 415 | elif isinstance(value, int): 416 | if param in DEFAULTS[MAIN]['integer']: 417 | return True 418 | elif isinstance(value, float): 419 | if param in DEFAULTS[MAIN]['float']: 420 | return True 421 | elif isinstance(value, str): 422 | if param in DEFAULTS[MAIN]['string']: 423 | return True 424 | return False 425 | 426 | def type_message(param, value): 427 | if isinstance(value, list): 428 | list_type = type( 429 | DEFAULTS[MAIN]['list'][param][0] 430 | ).__name__ 431 | print(( 432 | f"'{param}' must be of type 'list' containing elements of type " 433 | f"'{list_type}'" 434 | )) 435 | 
print(f"'{param}' was not updated.\n") 436 | else: 437 | param_type = get_type(param) 438 | print(f"'{param}' must be of type '{param_type}'") 439 | print(f"'{param}' was not updated.\n") 440 | 441 | def generic_filter(param_dict, param): 442 | if param_dict[param] not in FILTERS[param]: 443 | print(f"Supported '{param}' values include: {FILTERS[param]}.") 444 | print(f"'{param}' was not updated.\n") 445 | param_dict.pop(param) 446 | 447 | def get_type(param): 448 | for k, v in DEFAULTS[MAIN].items(): 449 | if param in v: 450 | return k 451 | 452 | def force_update(param_dict): 453 | config = ConfigParser( 454 | interpolation=ExtendedInterpolation(), 455 | ) 456 | config.read(CFG + 'config.cfg') 457 | 458 | for param in param_dict: 459 | if param in config.defaults().keys(): 460 | if param in IGNORE: 461 | update_config(( 462 | 'DEFAULT', 463 | f'{param}', 464 | f'{param_dict[param]}', 465 | )) 466 | 467 | def view_config(version='current'): 468 | if version == 'current': 469 | params = get_params(*SECTIONS) 470 | elif isinstance(version, int): 471 | try: 472 | params = get_frozen_params( 473 | *SECTIONS, 474 | version=version, 475 | ) 476 | except Exception: 477 | return f"No config file for version '{version}' exists." 478 | else: 479 | return 'Invalid entry.'
480 | 481 | for ignore in IGNORE: 482 | params.pop(ignore) 483 | return params 484 | 485 | def reset_default_params(defaults=MAIN): 486 | default_dict = {} 487 | ignore_dict = {} 488 | for k, v in DEFAULTS[defaults].items(): 489 | for nested_k, nested_v in v.items(): 490 | if nested_k not in IGNORE: 491 | default_dict[nested_k] = nested_v 492 | else: 493 | ignore_dict[nested_k] = nested_v 494 | update_params(default_dict) 495 | force_update(ignore_dict) 496 | 497 | def freeze_cfg(version): 498 | shutil.copyfile( 499 | CFG + 'config.cfg', 500 | CFG + f'freeze_v{version}.cfg', 501 | ) 502 | 503 | def get_frozen_params(*section, version=0): 504 | return get_params( 505 | *section, 506 | cfg=CFG + f'freeze_v{version}.cfg', 507 | ) 508 | 509 | def update_frozen_params(param_dict, version=0): 510 | assert isinstance(param_dict, dict), f"'{param_dict}' must be a dictionary" 511 | assert isinstance(version, int), f"'{version}' must be an integer" 512 | 513 | update_params( 514 | param_dict, 515 | cfg=CFG + f'freeze_v{version}.cfg', 516 | ) 517 | 518 | def reload_modules(*args): 519 | for arg in args: 520 | reload(arg) 521 | 522 | def is_file( 523 | path, 524 | filename, 525 | restore=False, 526 | ): 527 | if os.path.isdir(path): 528 | if os.path.isfile(path + filename): 529 | if restore == False: 530 | open(path + filename, 'w').close() 531 | else: 532 | pass 533 | else: 534 | open(path + filename, 'w').close() 535 | else: 536 | os.mkdir(path) 537 | open(path + filename, 'w').close() 538 | 539 | def is_dir(path): 540 | if os.path.isdir(path): 541 | pass 542 | else: 543 | os.mkdir(path) 544 | 545 | def get_results_dir(dataset_name): 546 | return get_module_dir('results/', dataset_name) 547 | 548 | def get_data_dir(dataset_name): 549 | return get_module_dir('data/', dataset_name) 550 | 551 | def get_module_dir(module, dataset_name): 552 | is_dir(BB_DIR + module) 553 | module_dir = os.path.abspath(os.path.join( 554 | BB_DIR, 555 | module, 556 | dataset_name, 557 | )) 558 | 
is_dir(module_dir) 559 | return module_dir + '/' 560 | 561 | def call_bash(command, message=None): 562 | try: 563 | output = subprocess.check_output( 564 | command, 565 | shell=True, 566 | stderr=subprocess.STDOUT, 567 | ) 568 | if message is not None: 569 | print(message) 570 | return output 571 | except subprocess.CalledProcessError as e: 572 | print(e.output) 573 | print("Exiting program...") 574 | sys.exit() 575 | 576 | 577 | 578 | 579 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | import setuptools 2 | 3 | setuptools.setup( 4 | name='bbpose', 5 | version='0.1.0', 6 | packages=setuptools.find_packages() 7 | ) --------------------------------------------------------------------------------
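The `get_params` helper in `bbpose/utils.py` decodes every option with `eval`, so any expression placed in the config file is executed when parameters are read. Below is a minimal sketch of the same parsing done with `ast.literal_eval`, which accepts only Python literals. `SAMPLE_CFG` and `read_params` are hypothetical names used for illustration and are not part of the repository; treat this as one possible alternative under those assumptions, not as the project's implementation.

```python
import ast
from configparser import ConfigParser, ExtendedInterpolation

# Hypothetical config text. Option names mirror the defaults in utils.py,
# and values are stored as Python literals (strings keep their quotes).
SAMPLE_CFG = """
[DEFAULT]
version = 0
img_dir = '/tmp/images/'
switch = False

[Train]
num_epochs = 200
learning_rate = 0.00025
decay_epochs = [75, 100, 150]
strategy = 'default'
"""


def read_params(text, *sections):
    """Parse the given sections, decoding each value as a Python literal."""
    config = ConfigParser(interpolation=ExtendedInterpolation())
    config.read_string(text)
    params = {}
    for section in sections:
        if not config.has_section(section):
            raise KeyError(f"Section '{section}' was not found")
        # options() also yields the keys inherited from [DEFAULT].
        for option in config.options(section):
            # literal_eval only accepts literals, so an arbitrary
            # expression in the file raises instead of being executed.
            params[option] = ast.literal_eval(config.get(section, option))
    return params


params = read_params(SAMPLE_CFG, 'Train')
print(params['decay_epochs'])
print(params['img_dir'])
```

Because options from `[DEFAULT]` are inherited by every section, `params` here also contains `version`, `img_dir`, and `switch`, which matches how the repository's `get_params('Train')` exposes `version` and `switch` to callers.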