├── LICENSE
├── README.md
├── data_generator.py
├── inference.py
├── main_script.py
├── requirements.txt
└── train.py


/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2023 Sabarinathan

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Person Attribute Recognition with Deep Learning
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) ![By](https://img.shields.io/static/v1?label=By&message=PyTorch&color=red)

## Star History

[![Star History Chart](https://api.star-history.com/svg?repos=dsabarinathan/attribute-recognition&type=Date)](https://star-history.com/#dsabarinathan/attribute-recognition&Date)

## Overview

This repository contains a PyTorch implementation of a person attribute recognition model. The model is a multi-label classifier: for each person image it predicts 40 binary attributes covering age, gender, hair length, upper body features, lower body features, and accessories.

## Pre-trained model

| Model | ROC AUC | F1 Score | Framework |
|------------|-------------|--------------|--------------|
| [ResNet 18](https://drive.google.com/file/d/1lxdNB2Ix8bOOTxFeVVz2VcgPQMIuMQCZ/view?usp=sharing) | 0.9221371 | 0.910283516 | PyTorch |
| [ResNet 30](https://drive.google.com/file/d/1hQZQDu0x7ugBLm_bjisJO-oqHhMlaGEo/view?usp=sharing) | 0.94394 | 0.943229 | PyTorch |

## Model Details

The model is trained to recognize the following attributes:

- Age
  - Young
  - Adult
  - Old

- Gender
  - Female

- Hair Length
  - Short
  - Long
  - Bald

- Upper Body Features
  - Length
    - Short
  - Color
    - Black
    - Blue
    - Brown
    - Green
    - Grey
    - Orange
    - Pink
    - Purple
    - Red
    - White
    - Yellow
    - Other

- Lower Body Features
  - Length
    - Short
  - Color
    - Black
    - Blue
    - Brown
    - Green
    - Grey
    - Orange
    - Pink
    - Purple
    - Red
    - White
    - Yellow
    - Other
  - Type
    - Trousers & Shorts
    - Skirt & Dress

- Accessories
  - Backpack
  - Bag
  - Glasses
    - Normal
    - Sun
  - Hat

## Usage

### 1. Installation

Clone the repository and navigate to the project directory:

```bash
git clone https://github.com/dsabarinathan/attribute-recognition.git
cd attribute-recognition
```

### 2. Download Pre-trained Model

Download the pre-trained model weights from Google Drive ([Download Model](https://drive.google.com/file/d/1lxdNB2Ix8bOOTxFeVVz2VcgPQMIuMQCZ/view?usp=sharing)) or from the releases section of this repository, and place the file in the `models/` directory. By default, `inference.py` loads `./models/ResNet18_best_model.pth`; pass `--model_path` to point at a different file.

### 3. Install Dependencies

Install the required Python packages:

```bash
pip install -r requirements.txt
```

### 4. Run Inference

Use the provided script to perform attribute recognition on an input image:

```bash
python inference.py --image_path path/to/your/image.jpg
```

Replace `path/to/your/image.jpg` with the path to the image you want to analyze.

### 5. Sample Results

#### Input image

![0028_c3s1_002001_02](https://github.com/dsabarinathan/attribute-recognition/assets/40907627/3b39e073-d39a-4174-8dca-ab152c0d10d9)

#### Output

```
Predicted results: {'labels': array(['Age-Adult', 'Gender-Female', 'LowerBody-Color-Black',
       'LowerBody-Type-Trousers&Shorts'], dtype='
```

--------------------------------------------------------------------------------
/inference.py:
--------------------------------------------------------------------------------
# ...
    # Keep every attribute whose probability exceeds 0.5
    predicted_results = predicted_probs[0] > 0.5

    pos = np.where(predicted_results == 1)[0]

    return {"labels": label_col[pos], "prob": predicted_probs[0][pos]}

def get_label_from_index(index):
    return label_col[index]

import cv2
import numpy as np
import matplotlib.pyplot as plt

# ...
# (previous code remains unchanged)

def perform_inference_with_visualization(model, image_path, output_path):
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)
    model.eval()

    # Create an empty white 256x256 canvas
    white_image = np.ones((256, 256, 3), dtype=np.uint8) * 255

    # Load the person image (OpenCV reads BGR; convert to RGB for display)
    person_image = cv2.imread(image_path)
    if person_image is None:
        raise FileNotFoundError(f"Could not read image: {image_path}")
    person_image = cv2.cvtColor(person_image, cv2.COLOR_BGR2RGB)

    # Resize the person image to fit within the canvas.
    # cv2.resize expects (width, height), so the thumbnail is 128 px wide and 64 px high.
    person_image = cv2.resize(person_image, (128, 64))

    # Calculate the position to center the person image on the canvas
    y_offset = (256 - person_image.shape[0]) // 2
    x_offset = (256 - person_image.shape[1]) // 2

    # Place the person image on the canvas
    white_image[y_offset:y_offset + person_image.shape[0], x_offset:x_offset + person_image.shape[1]] = person_image

    # Preprocess the image and add a batch dimension
    normalized_image = preprocess_image(image_path)
    normalized_image_tensor = normalized_image.to(device).unsqueeze(0)

    with torch.no_grad():
        output = model(normalized_image_tensor)

    # Convert logits to per-attribute probabilities
    predicted_probs = output.cpu().numpy().astype(float)
    predicted_probs = sigmoid(predicted_probs)

    # Keep every attribute whose probability exceeds 0.5
    predicted_results = predicted_probs[0] > 0.5
    pos = np.where(predicted_results == 1)[0]

    labels = label_col[pos]
    probs = predicted_probs[0][pos]

    # Draw each label on its own row, in a small font, so the text does not overlap
    for row, (label, prob) in enumerate(zip(labels, probs)):
        text = f"{label}: {prob:.2f}"
        cv2.putText(white_image, text, (5, 15 + 15 * row), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (255, 0, 0), 1, cv2.LINE_AA)

    # Display the result image
    plt.imshow(white_image)
    plt.axis('off')
    plt.show()

    # Save the result image
    cv2.imwrite(output_path, cv2.cvtColor(white_image, cv2.COLOR_RGB2BGR))

    return {"labels": labels, "prob": probs}

# ... (main function remains unchanged)


def main():
    parser = argparse.ArgumentParser(description='Perform inference on an image using a trained PyTorch model.')
    parser.add_argument('--model_path', type=str, default='./models/ResNet18_best_model.pth', help='Path to the trained PyTorch model file')
    parser.add_argument('--image_path', type=str, required=True, help='Path to the input image for inference')
    args = parser.parse_args()

    print(args.model_path)
    print(args.image_path)

    # Load the model. This expects a fully pickled model object; the checkpoints
    # written by train.py are state_dicts and must be loaded into a model instance instead.
    # map_location lets a checkpoint saved on a GPU load on a CPU-only machine.
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model_ft = torch.load(args.model_path, map_location=device)

    # Perform inference on the input image
    results = perform_inference(model_ft, args.image_path)

    print("Predicted results:", results)

if __name__ == "__main__":
    main()

--------------------------------------------------------------------------------
/main_script.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-
"""
Created on Sun Dec 3 22:14:04 2023

@author: SABARI
"""
import pandas as pd
import torch
from data_generator import ClassificationDataset  # Import your dataset module
from model import setup_model  # Import your model module (model.py is not in the file tree above)
from train import train_batch, evaluate_batch  # Import your training functions
import numpy as np
from sklearn import metrics

# Set device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Set hyperparameters
batch_size = 32  # note: the DataLoaders below are created with batch_size=16
learning_rate = 0.0001
num_epochs = 100

# Load CSV file containing image names and labels
train_data_csv = pd.read_csv('/content/drive/My Drive/WACV2023/combined_data.csv')

# Split image names for training and validation
train_label_img_name = train_data_csv['image_name'][0:97669].values
valid_label_img_name = train_data_csv['image_name'][97669:].values

# Define the labels and their corresponding columns
label_col = ['Age-Young', 'Age-Adult', 'Age-Old', 'Gender-Female',
             'Hair-Length-Short', 'Hair-Length-Long', 'Hair-Length-Bald',
             'UpperBody-Length-Short', 'UpperBody-Color-Black',
             'UpperBody-Color-Blue', 'UpperBody-Color-Brown',
             'UpperBody-Color-Green', 'UpperBody-Color-Grey',
             'UpperBody-Color-Orange', 'UpperBody-Color-Pink',
             'UpperBody-Color-Purple', 'UpperBody-Color-Red',
             'UpperBody-Color-White', 'UpperBody-Color-Yellow',
             'UpperBody-Color-Other', 'LowerBody-Length-Short',
             'LowerBody-Color-Black', 'LowerBody-Color-Blue',
             'LowerBody-Color-Brown', 'LowerBody-Color-Green',
             'LowerBody-Color-Grey', 'LowerBody-Color-Orange',
             'LowerBody-Color-Pink', 'LowerBody-Color-Purple', 'LowerBody-Color-Red',
             'LowerBody-Color-White', 'LowerBody-Color-Yellow',
             'LowerBody-Color-Other', 'LowerBody-Type-Trousers&Shorts',
             'LowerBody-Type-Skirt&Dress', 'Accessory-Backpack', 'Accessory-Bag',
             'Accessory-Glasses-Normal', 'Accessory-Glasses-Sun', 'Accessory-Hat']

# Extract labels for training and validation sets
train_label = train_data_csv[label_col][0:97669].values
valid_label = train_data_csv[label_col][97669:].values

# Construct full paths for images in training and validation sets
train_label_names = ["/content/new_dataset/copied_files/" + name for name in train_label_img_name]
valid_label_names = ["/content/new_dataset/copied_files/" + name for name in valid_label_img_name]

# Create instances of the dataset and dataloaders
train_dataset = ClassificationDataset(image_paths=train_label_names,
                                      targets=train_label,
                                      resize=(128, 64),
                                      augmentations=None,
                                      )

valid_dataset = ClassificationDataset(image_paths=valid_label_names,
                                      targets=valid_label,
                                      resize=(128, 64),
                                      augmentations=None,
                                      )

train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=16, shuffle=True, num_workers=4)

valid_loader = torch.utils.data.DataLoader(valid_dataset, batch_size=16, shuffle=False, num_workers=4)

# Initialize the model, criterion, and optimizer
model, criterion, optimizer = setup_model(num_classes=len(train_dataset.classes), device=device)

best_val_loss = float('inf')

# Training loop
for epoch in range(num_epochs):
    print(f"Epoch {epoch + 1}/{num_epochs}")

    # Training
    train_batch(train_loader, model, optimizer, device)

    # Evaluation
    predictions, valid_targets, best_val_loss, val_loss = evaluate_batch(
        valid_loader, model, best_val_loss, device=device
    )
    #roc_auc = metrics.roc_auc_score(valid_targets, predictions)
    # Calculate accuracy. evaluate_batch returns raw logits, so threshold at 0,
    # which is equivalent to a sigmoid probability above 0.5.
    valid_targets = np.array(valid_targets).flatten()
    predictions = np.array(predictions).flatten()
    predictions = np.uint8(predictions > 0)
    accuracy = metrics.accuracy_score(valid_targets, predictions)

    # Calculate F1 score
    #f1 = metrics.f1_score(valid_targets, predictions, average='macro')
    f1 = metrics.f1_score(valid_targets, predictions, average='weighted')
    print(f"Validation Loss: {val_loss} | Accuracy: {accuracy:.4f} | F1: {f1:.4f}")

# Save the final model
torch.save(model.state_dict(), "/content/drive/My Drive/WACV2023/final_model.pth")

--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
--extra-index-url https://download.pytorch.org/whl/cu118
torch==2.1.0+cu118
opencv-python==4.8.0.*
torchvision==0.16.0+cu118
numpy==1.23.5
scikit-learn==1.2.2
tqdm
pandas
matplotlib
albumentations

--------------------------------------------------------------------------------
/train.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-
"""
Created on Sun Dec 3 22:12:11 2023

@author: SABARI
"""
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import models
import numpy as np
from sklearn import metrics
import albumentations
from torch.utils.data import Dataset
# Define your ClassificationDataset class and other necessary classes/functions here

def train_batch(data_loader, model, optimizer, device):
    """Run one training epoch over data_loader."""
    model.train()
    # Multi-label loss that takes raw logits (applies the sigmoid internally)
    criterion = nn.BCEWithLogitsLoss()
    for data in data_loader:
        inputs = data["image"].to(device, dtype=torch.float)
        targets = data["targets"].to(device, dtype=torch.float)
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, targets)
        loss.backward()
        optimizer.step()

def evaluate_batch(data_loader, model, best_val_loss, device):
    """Evaluate on data_loader and save a checkpoint when the mean loss improves.

    Returns the raw logits, the targets, the updated best loss, and the current mean loss.
    """
    checkpoint_filepath = "/content/drive/My Drive/WACV2023/"
    model.eval()
    criterion = nn.BCEWithLogitsLoss()
    final_targets = []
    final_outputs = []
    val_loss = 0

    with torch.no_grad():
        for data in data_loader:
            inputs = data["image"].to(device, dtype=torch.float)
            targets = data["targets"].to(device, dtype=torch.float)
            output = model(inputs)
            cur_valid_loss = criterion(output, targets)
            val_loss += cur_valid_loss.item()

            # Collect per-sample targets and logits as plain Python lists
            final_targets.extend(targets.detach().cpu().numpy().tolist())
            final_outputs.extend(output.detach().cpu().numpy().tolist())

    # Mean loss per batch
    val_loss = val_loss / len(data_loader)

    if val_loss < best_val_loss:
        best_val_loss = val_loss
        torch.save(model.state_dict(), checkpoint_filepath + "DeepMAR_ResNet18_best_model.pth")
        print("Model saved!: " + checkpoint_filepath + "DeepMAR_ResNet18_best_model.pth")

    return final_outputs, final_targets, best_val_loss, val_loss
--------------------------------------------------------------------------------
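
The post-processing step used in `inference.py` (sigmoid over the model outputs, then a 0.5 threshold to pick the predicted labels) can be sketched without PyTorch or NumPy. This is a minimal illustration under stated assumptions, not the repository's code: `decode_attributes`, `LABELS`, and the sample logits are hypothetical names and values introduced here, and `LABELS` holds only the first four entries of the repository's `label_col`.

```python
import math

# Hypothetical subset: the first four names from the repository's label_col list.
LABELS = ["Age-Young", "Age-Adult", "Age-Old", "Gender-Female"]


def sigmoid(x: float) -> float:
    """Logistic function, written to avoid overflow for large negative inputs."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def decode_attributes(logits, labels, threshold=0.5):
    """Return (label, probability) pairs whose probability exceeds the threshold."""
    if len(logits) != len(labels):
        raise ValueError("expected exactly one logit per label")
    probs = [sigmoid(value) for value in logits]
    return [(name, p) for name, p in zip(labels, probs) if p > threshold]


if __name__ == "__main__":
    # Made-up logits for illustration only; a real model would produce 40 values.
    sample_logits = [-2.0, 3.0, -4.0, 0.6]
    for name, prob in decode_attributes(sample_logits, LABELS):
        print(f"{name}: {prob:.2f}")
```

Because the sigmoid maps a logit of 0 to a probability of exactly 0.5, thresholding probabilities at 0.5 selects the same labels as thresholding raw logits at 0. A logit of 0.5 corresponds to a probability of about 0.62, so comparing raw logits against 0.5 is a stricter test than it looks.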