├── .gitignore ├── .gitmodules ├── README.md ├── data_analysis ├── bdd100k_dataset_distribution_analysis.ipynb ├── feature_extraction.py ├── kitti_dataset_distribution_analysis.ipynb ├── launcher.sh ├── nuscenes_dataset_distribution_analysis.ipynb ├── plot_functions.ipynb ├── plots │ ├── bdd100k_dataset_likelihood.png │ ├── bdd100k_training_clusters.gif │ ├── bdd100k_training_clusters.png │ ├── dequity_loss.png │ ├── kitti_dataset_likelihood.png │ ├── kitti_training_clusters.gif │ ├── kitti_training_clusters.png │ ├── nuscenes_dataset_likelihood.png │ ├── nuscenes_training_clusters.gif │ ├── nuscenes_training_clusters.png │ ├── tusimple_dataset_likelihood.png │ ├── tusimple_training_clusters.gif │ ├── tusimple_training_clusters.png │ ├── waymo_dataset_likelihood.png │ ├── waymo_training_clusters.gif │ └── waymo_training_clusters.png ├── tusimple_dataset_distribution_analysis.ipynb ├── visualize_kitti3d_pred.ipynb └── waymo_dataset_distribution_analysis.ipynb ├── docs ├── bevformer.md ├── bevfusion.md ├── dd3d.md ├── fig │ └── teaser.png └── getting_started.md └── utils └── kitti_object_dataset_downloader.sh /.gitignore: -------------------------------------------------------------------------------- 1 | # pickle files 2 | *.pkl 3 | 4 | # cache files 5 | *__pycache__ 6 | *.ipynb_checkpoints 7 | -------------------------------------------------------------------------------- /.gitmodules: -------------------------------------------------------------------------------- 1 | [submodule "BEVFormer"] 2 | path = BEVFormer 3 | url = git@github.com:towardsautonomy/BEVFormer.git 4 | [submodule "dd3d"] 5 | path = dd3d 6 | url = git@github.com:towardsautonomy/dd3d.git 7 | [submodule "BevFusion"] 8 | path = BevFusion 9 | url = git@github.com:alchemz/bevfusion.git 10 | 11 | [submodule "bevfusion"] 12 | path = bevfusion 13 | url = https://github.com/alchemz/bevfusion.git 14 | -------------------------------------------------------------------------------- /README.md: 
-------------------------------------------------------------------------------- 1 | # DatasetEquity: Are All Samples Created Equal? In The Quest For Equity Within Datasets 2 | 3 | *This is the official implementation of the ICCV 2023 Workshop paper: [DatasetEquity: Are All Samples Created Equal? In The Quest For Equity Within Datasets](https://arxiv.org/abs/2308.09878).* 4 | 5 | This paper presents a novel method for addressing data imbalance in machine learning. The method computes sample likelihoods based on image appearance using deep perceptual embeddings and clustering. It then uses these likelihoods to weight samples differently during training with a proposed **Generalized Focal Loss** function. This loss can be easily integrated with deep learning algorithms. Experiments validate the method's effectiveness across autonomous driving vision datasets including KITTI and nuScenes. The loss function improves state-of-the-art 3D object detection methods, achieving over 200% AP gains on under-represented classes (Cyclist) in the KITTI dataset. The results demonstrate the method is generalizable, complements existing techniques, and is particularly beneficial for smaller datasets and rare classes. 6 | 7 | ![Teaser](docs/fig/teaser.png) 8 | 9 | ## [Getting Started](docs/getting_started.md) 10 | 11 | ## TL;DR 12 | 13 | The concept of this paper is simple: (1) quantify the likelihood of occurrence for each sample in the training dataset, (2) compute the **Generalized Focal Loss** based on these likelihoods, and (3) train the model with the new weighted loss function. 14 | 15 | **Generalized Focal Loss** requires a loss weight for each sample, called the *Dequity Weight*, which can be computed as follows: 16 | 17 | ```python 18 | def dequity_loss_weight(self, p: float, 19 | eta: float=1.0, 20 | gamma: float=5.0 21 | ) -> float: 22 | """Calculate the Dequity Weight. 23 | Args: 24 | p (float): The probability of the sample. 25 | eta (float): Offset that sets the minimum weight, eta / (eta + 1), reached at p = 1. 
26 | gamma (float): Exponent controlling how quickly the weight decays as p increases. 27 | Returns: 28 | float: The dequity loss weight. 29 | """ 30 | return (eta + (1 - p) ** gamma) / (eta + 1) 31 | ``` 32 | 33 | The pseudo-code for the training algorithm is as follows: 34 | 35 | ```python 36 | for sample in dataloader: 37 | # retrieve the sample likelihood 38 | p = get_sample_likelihood(sample) 39 | # compute the Dequity Weight 40 | w = dequity_loss_weight(p) 41 | # forward pass 42 | y_hat = model(sample) 43 | # compute the loss 44 | loss = loss_fn(y_hat, sample) 45 | # compute the weighted loss 46 | weighted_loss = w * loss # <-- Generalized Focal Loss (Our Contribution) 47 | # backward pass 48 | weighted_loss.backward() 49 | ``` 50 | 51 | Sample likelihoods are computed beforehand. Please refer to the [Getting Started](docs/getting_started.md) guide for more details. 52 | 53 | ## Citation 54 | 55 | If you find this work useful for your research, please cite our paper: 56 | 57 | ``` 58 | @inproceedings{shrivastava2023datasetequity, 59 | title={DatasetEquity: Are All Samples Created Equal? 
In The Quest For Equity Within Datasets}, 60 | author={Shrivastava, Shubham and Zhang, Xianling and Nagesh, Sushruth and Parchami, Armin}, 61 | booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops}, 62 | year={2023} 63 | } 64 | ``` 65 | -------------------------------------------------------------------------------- /data_analysis/feature_extraction.py: -------------------------------------------------------------------------------- 1 | # import clustering method 2 | from sklearn.manifold import TSNE 3 | from sklearn.cluster import DBSCAN 4 | from sklearn import metrics 5 | from tqdm.notebook import tqdm 6 | from PIL import Image 7 | 8 | # import torch, cv2 and other dependencies 9 | from torchvision import datasets, transforms 10 | import torch 11 | import torch.nn as nn 12 | from torchvision import transforms 13 | from torchvision.models import resnet101 14 | 15 | import os 16 | import cv2 17 | import pickle 18 | import logging 19 | import numpy as np 20 | import pathlib 21 | import matplotlib.pyplot as plt 22 | import math 23 | import time 24 | import argparse 25 | 26 | # set random seed 27 | import random 28 | random.seed(33) 29 | np.random.seed(33) 30 | 31 | class BDD100kDataset(torch.utils.data.Dataset): 32 | def __init__( 33 | self, 34 | data_root, 35 | image_set="train", 36 | transform=transforms.Compose([transforms.Resize((384, 384)), 37 | transforms.ToTensor(), 38 | transforms.Normalize(0.5, 0.5), 39 | ]), 40 | is_test=False, 41 | keep_difficult=False, 42 | label_file=None, 43 | version='100k', 44 | split='train' 45 | ): 46 | """Dataset for BDD100k data. 
47 | Args: 48 | data_root: the root of the BDD100k dataset. 49 | """ 50 | self.data_root = data_root 51 | self.transform = transform 52 | self.version = version 53 | self.split = split 54 | self.filenames, self.tokens = [], [] 55 | 56 | if self.version in ['10k', '100k']: 57 | self.data_root = os.path.join(self.data_root, self.version, self.split) 58 | else: 59 | # fail fast on an unsupported dataset version 60 | raise ValueError('Please use either the 10k samples version or the 100k full dataset') 61 | 62 | # collect every image path under the split directory; 63 | # the enumeration index doubles as the sample token 64 | img_list = os.listdir(self.data_root) 65 | for idx, filename in enumerate(tqdm(img_list)): 66 | img_path = os.path.join(self.data_root, filename) 67 | self.filenames.append(img_path) 68 | self.tokens.append(idx) 69 | 70 | print('Number of data samples in the set: {}'.format(len(self.filenames))) 71 | 72 | # method to get length of data 73 | def __len__(self): 74 | return len(self.filenames) 75 | 76 | # method to get a sample 77 | def __getitem__(self, idx): 78 | # get image 79 | image_filename = self.filenames[idx] 80 | image = Image.open(image_filename).convert('RGB') 81 | # transform image 82 | image = self.transform(image) 83 | # get token 84 | token = self.tokens[idx] 85 | # return image and token 86 | return {'image': image, 'token': token} 87 | 88 | 89 | 90 | if __name__ == "__main__": 91 | parser = argparse.ArgumentParser() 92 | parser.add_argument('--data_root', type=str, help="path to load the data") 93 | parser.add_argument('--results_dir', type=str, help="folder to save the embeddings") 94 | parser.add_argument('--feature_extactor', type=str, help="model to extract the embeddings") 95 | parser.add_argument('--data_type', type=str, help="dataset to process (only 'bdd100k' is handled)") 96 | 97 | 98 | args = parser.parse_args() 99 | 100 | results_dir_timestamp = os.path.join(args.results_dir, time.strftime("%Y%m%d_%H%M%S")) 101 | if not os.path.exists(results_dir_timestamp): 102 | 
os.makedirs(results_dir_timestamp) 103 | print('INFO: {} created to store retrieval results \n'.format(results_dir_timestamp)) 104 | 105 | if args.data_type == 'bdd100k': 106 | dataset = BDD100kDataset(data_root=args.data_root, version='100k', split='train') 107 | else: 108 | raise ValueError(f'Unsupported data_type: {args.data_type}') 109 | 110 | 111 | dataset_str='bdd100k' 112 | train_features_fname = os.path.join(args.results_dir, f'{dataset_str}_training_features.pkl') 113 | train_cluster_info_fname = os.path.join(args.results_dir, f'{dataset_str}_training_cluster_info.pkl') 114 | 115 | 116 | if args.feature_extactor == 'resnet101': 117 | # use a pre-trained model to get the feature vectors for all the images 118 | # in the dataset 119 | model = resnet101(pretrained=True, progress=True) 120 | # remove the final fully connected layer, keeping the pre-trained backbone weights 121 | model = torch.nn.Sequential(*list(model.children())[:-1]) 122 | 123 | model.eval() 124 | model.cuda() 125 | train_tokens, train_features = [], [] 126 | with torch.no_grad(): 127 | for i in tqdm(range(len(dataset))): 128 | sample = dataset[i] # load and transform each image only once 129 | img = sample['image'].unsqueeze(0).cuda() 130 | feature = model(img).flatten().cpu().numpy() 131 | train_features.append(feature) 132 | train_tokens.append(sample['token']) 133 | train_features = np.array(train_features) 134 | # save the features to disk as a pickle file 135 | with open(train_features_fname, 'wb') as f: 136 | pickle.dump({'tokens': train_tokens, 'features': train_features}, f) 137 | else: 138 | raise ValueError(f'Unsupported feature_extactor: {args.feature_extactor}') 139 | 140 | 141 | -------------------------------------------------------------------------------- /data_analysis/launcher.sh: -------------------------------------------------------------------------------- 1 | export BDD100K_DIR=/s/dat/UserFolders/xzhan258/LaneDetection/BDD100K/bdd100k_images/bdd100k/images 2 | export OUT_DIR=/s/dat/UserFolders/xzhan258/private_dev/DatasetEquity/data_analysis/bdd100k 3 | export ROOT=/s/dat/UserFolders/xzhan258/private_dev/DatasetEquity/data_analysis 4 | 5 | python 
$ROOT/feature_extraction.py \ 6 | --data_root $BDD100K_DIR \ 7 | --results_dir $OUT_DIR \ 8 | --feature_extactor resnet101 \ 9 | --data_type bdd100k 10 | 11 | 12 | # cd /s/dat/UserFolders/xzhan258/private_dev/DatasetEquity/data_analysis && runpytorch -J feature_extraction -NGPUS 1 -i harbor.hpc.ford.com/xzhan258/torch:1.10_cuda11.4_yolop -x launcher.sh 13 | -------------------------------------------------------------------------------- /data_analysis/plot_functions.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "id": "45e31442-613b-4731-85f4-f8fae202e6fa", 7 | "metadata": {}, 8 | "outputs": [], 9 | "source": [ 10 | "# import packages\n", 11 | "import numpy as np\n", 12 | "import matplotlib.pyplot as plt" 13 | ] 14 | }, 15 | { 16 | "cell_type": "code", 17 | "execution_count": 2, 18 | "id": "473eb0c2-9e89-480e-8522-fb1eb3c76343", 19 | "metadata": {}, 20 | "outputs": [], 21 | "source": [ 22 | "def dequity_loss(eta: float, p: float, gamma: float) -> float:\n", 23 | " return (eta + (1 - p) ** gamma) / (eta + 1)" 24 | ] 25 | }, 26 | { 27 | "cell_type": "code", 28 | "execution_count": 4, 29 | "id": "9f849b29-7707-4be1-bcce-c187ab881c00", 30 | "metadata": {}, 31 | "outputs": [ 32 | { 33 | "data": { 34 | "image/png": "\n", 35 | "text/plain": [ 36 | "
" 37 | ] 38 | }, 39 | "metadata": { 40 | "needs_background": "light" 41 | }, 42 | "output_type": "display_data" 43 | } 44 | ], 45 | "source": [ 46 | "# plot functions for various values of eta and gamma\n", 47 | "fig, ax = plt.subplots(figsize=(10, 8))\n", 48 | "ax.set_xlim(0, 1)\n", 49 | "ax.set_ylim(0, 1)\n", 50 | "ax.set_xlabel('Sample Likelihood', fontsize=16)\n", 51 | "ax.set_ylabel('Loss Weighing Factor', fontsize=16)\n", 52 | "ax.set_title('Generalized Focal Loss', fontsize=20)\n", 53 | "ax.grid(alpha=0.5, linestyle='--', linewidth=1)\n", 54 | "\n", 55 | "eta = 0.0\n", 56 | "p = np.linspace(0, 1, 100)\n", 57 | "gamma = 2\n", 58 | "ax.plot(p, dequity_loss(eta, p, gamma), label=r'$\\eta$ = 0.0, $\\gamma$ = 2', linewidth=4)\n", 59 | "\n", 60 | "eta = 0.0\n", 61 | "p = np.linspace(0, 1, 100)\n", 62 | "gamma = 5\n", 63 | "ax.plot(p, dequity_loss(eta, p, gamma), label=r'$\\eta$ = 0.0, $\\gamma$ = 5', linewidth=4)\n", 64 | "\n", 65 | "eta = 0.3\n", 66 | "p = np.linspace(0, 1, 100)\n", 67 | "gamma = 2\n", 68 | "ax.plot(p, dequity_loss(eta, p, gamma), label=r'$\\eta$ = 0.3, $\\gamma$ = 2', linewidth=4)\n", 69 | "\n", 70 | "eta = 0.3\n", 71 | "p = np.linspace(0, 1, 100)\n", 72 | "gamma = 5\n", 73 | "ax.plot(p, dequity_loss(eta, p, gamma), label=r'$\\eta$ = 0.3, $\\gamma$ = 5', linewidth=4)\n", 74 | "\n", 75 | "eta = 1.0\n", 76 | "p = np.linspace(0, 1, 100)\n", 77 | "gamma = 2\n", 78 | "ax.plot(p, dequity_loss(eta, p, gamma), label=r'$\\eta$ = 1.0 $\\gamma$ = 2', linewidth=4)\n", 79 | "\n", 80 | "eta = 1.0\n", 81 | "p = np.linspace(0, 1, 100)\n", 82 | "gamma = 5\n", 83 | "ax.plot(p, dequity_loss(eta, p, gamma), label=r'$\\eta$ = 1.0, $\\gamma$ = 5', linewidth=4)\n", 84 | "\n", 85 | "eta = 3\n", 86 | "p = np.linspace(0, 1, 100)\n", 87 | "gamma = 2\n", 88 | "ax.plot(p, dequity_loss(eta, p, gamma), label=r'$\\eta$ = 3.0, $\\gamma$ = 2', linewidth=4)\n", 89 | "\n", 90 | "eta = 3\n", 91 | "p = np.linspace(0, 1, 100)\n", 92 | "gamma = 5\n", 93 | "ax.plot(p, dequity_loss(eta, 
p, gamma), label=r'$\\eta$ = 3.0, $\\gamma$ = 5', linewidth=4)\n", 94 | "\n", 95 | "# eta = 0\n", 96 | "# p = np.linspace(0, 1, 100)\n", 97 | "# gamma = 1\n", 98 | "# ax.plot(p, dequity_loss(eta, p, gamma), label='eta = 0.0, gamma = 1')\n", 99 | "\n", 100 | "# fancy legend\n", 101 | "plt.legend(loc='upper right', fancybox=True, framealpha=0.8, fontsize=14, \n", 102 | " ncol=2, facecolor='white', edgecolor='black', shadow=True)\n", 103 | "# shadow\n", 104 | "ax.spines['top'].set_linewidth(0.5)\n", 105 | "ax.spines['right'].set_linewidth(0.5)\n", 106 | "ax.spines['bottom'].set_linewidth(1.5)\n", 107 | "ax.spines['left'].set_linewidth(1.5)\n", 108 | "plt.savefig('plots/dequity_loss.png', dpi=300, bbox_inches='tight')\n", 109 | "plt.show()" 110 | ] 111 | }, 112 | { 113 | "cell_type": "code", 114 | "execution_count": null, 115 | "id": "103cf83e-2f33-4860-a6a7-54d1e70fa112", 116 | "metadata": {}, 117 | "outputs": [], 118 | "source": [] 119 | } 120 | ], 121 | "metadata": { 122 | "kernelspec": { 123 | "display_name": "cdt3d", 124 | "language": "python", 125 | "name": "cdt3d" 126 | }, 127 | "language_info": { 128 | "codemirror_mode": { 129 | "name": "ipython", 130 | "version": 3 131 | }, 132 | "file_extension": ".py", 133 | "mimetype": "text/x-python", 134 | "name": "python", 135 | "nbconvert_exporter": "python", 136 | "pygments_lexer": "ipython3", 137 | "version": "3.7.0" 138 | } 139 | }, 140 | "nbformat": 4, 141 | "nbformat_minor": 5 142 | } 143 | -------------------------------------------------------------------------------- /data_analysis/plots/bdd100k_dataset_likelihood.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/towardsautonomy/DatasetEquity/d6898ad2370a9ea27d6d087c4098620e8e476b59/data_analysis/plots/bdd100k_dataset_likelihood.png -------------------------------------------------------------------------------- /data_analysis/plots/bdd100k_training_clusters.gif: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/towardsautonomy/DatasetEquity/d6898ad2370a9ea27d6d087c4098620e8e476b59/data_analysis/plots/bdd100k_training_clusters.gif -------------------------------------------------------------------------------- /data_analysis/plots/bdd100k_training_clusters.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/towardsautonomy/DatasetEquity/d6898ad2370a9ea27d6d087c4098620e8e476b59/data_analysis/plots/bdd100k_training_clusters.png -------------------------------------------------------------------------------- /data_analysis/plots/dequity_loss.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/towardsautonomy/DatasetEquity/d6898ad2370a9ea27d6d087c4098620e8e476b59/data_analysis/plots/dequity_loss.png -------------------------------------------------------------------------------- /data_analysis/plots/kitti_dataset_likelihood.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/towardsautonomy/DatasetEquity/d6898ad2370a9ea27d6d087c4098620e8e476b59/data_analysis/plots/kitti_dataset_likelihood.png -------------------------------------------------------------------------------- /data_analysis/plots/kitti_training_clusters.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/towardsautonomy/DatasetEquity/d6898ad2370a9ea27d6d087c4098620e8e476b59/data_analysis/plots/kitti_training_clusters.gif -------------------------------------------------------------------------------- /data_analysis/plots/kitti_training_clusters.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/towardsautonomy/DatasetEquity/d6898ad2370a9ea27d6d087c4098620e8e476b59/data_analysis/plots/kitti_training_clusters.png -------------------------------------------------------------------------------- /data_analysis/plots/nuscenes_dataset_likelihood.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/towardsautonomy/DatasetEquity/d6898ad2370a9ea27d6d087c4098620e8e476b59/data_analysis/plots/nuscenes_dataset_likelihood.png -------------------------------------------------------------------------------- /data_analysis/plots/nuscenes_training_clusters.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/towardsautonomy/DatasetEquity/d6898ad2370a9ea27d6d087c4098620e8e476b59/data_analysis/plots/nuscenes_training_clusters.gif -------------------------------------------------------------------------------- /data_analysis/plots/nuscenes_training_clusters.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/towardsautonomy/DatasetEquity/d6898ad2370a9ea27d6d087c4098620e8e476b59/data_analysis/plots/nuscenes_training_clusters.png -------------------------------------------------------------------------------- /data_analysis/plots/tusimple_dataset_likelihood.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/towardsautonomy/DatasetEquity/d6898ad2370a9ea27d6d087c4098620e8e476b59/data_analysis/plots/tusimple_dataset_likelihood.png -------------------------------------------------------------------------------- /data_analysis/plots/tusimple_training_clusters.gif: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/towardsautonomy/DatasetEquity/d6898ad2370a9ea27d6d087c4098620e8e476b59/data_analysis/plots/tusimple_training_clusters.gif -------------------------------------------------------------------------------- /data_analysis/plots/tusimple_training_clusters.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/towardsautonomy/DatasetEquity/d6898ad2370a9ea27d6d087c4098620e8e476b59/data_analysis/plots/tusimple_training_clusters.png -------------------------------------------------------------------------------- /data_analysis/plots/waymo_dataset_likelihood.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/towardsautonomy/DatasetEquity/d6898ad2370a9ea27d6d087c4098620e8e476b59/data_analysis/plots/waymo_dataset_likelihood.png -------------------------------------------------------------------------------- /data_analysis/plots/waymo_training_clusters.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/towardsautonomy/DatasetEquity/d6898ad2370a9ea27d6d087c4098620e8e476b59/data_analysis/plots/waymo_training_clusters.gif -------------------------------------------------------------------------------- /data_analysis/plots/waymo_training_clusters.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/towardsautonomy/DatasetEquity/d6898ad2370a9ea27d6d087c4098620e8e476b59/data_analysis/plots/waymo_training_clusters.png -------------------------------------------------------------------------------- /docs/bevformer.md: -------------------------------------------------------------------------------- 1 | ## Training BEVFormer 2 | 3 | Run the docker container in interactive mode. 4 | 5 | ```bash 6 | cd BEVFormer/docker 7 | sh run-docker.sh 8 | ``` 9 | 10 | Prepare the pre-trained model. 
11 | 12 | ```bash 13 | cd BEVFormer 14 | mkdir ckpts 15 | 16 | cd ckpts && wget https://github.com/zhiqi-li/storage/releases/download/v1.0/r101_dcn_fcos3d_pretrain.pth 17 | ``` 18 | 19 | Prepare the nuScenes dataset. 20 | 21 | Download `Full dataset (v1.0)` and `CAN bus expansion` from [here](https://www.nuscenes.org/download) and unzip. 22 | 23 | ```bash 24 | mkdir data 25 | cd data 26 | # create symlink to the dataset 27 | ln -s /path/to/nuScenes/data/sets/nuscenes nuscenes 28 | # create symlink to the CAN bus data 29 | ln -s /path/to/nuScenes/data/sets/can_bus can_bus 30 | cd .. 31 | # prepare dataset 32 | python tools/create_data.py nuscenes --root-path ./data/nuscenes --out-dir ./data/nuscenes --extra-tag nuscenes --version v1.0 --canbus ./data 33 | # You should see the following structure 34 | BEVFormer 35 | ├── projects/ 36 | ├── tools/ 37 | ├── configs/ 38 | ├── ckpts/ 39 | │ ├── r101_dcn_fcos3d_pretrain.pth 40 | ├── data/ 41 | │ ├── can_bus/ 42 | │ ├── nuscenes/ 43 | │ │ ├── maps/ 44 | │ │ ├── samples/ 45 | │ │ ├── sweeps/ 46 | │ │ ├── v1.0-test/ 47 | │ │ ├── v1.0-trainval/ 48 | │ │ ├── nuscenes_infos_temporal_train.pkl 49 | │ │ ├── nuscenes_infos_temporal_val.pkl 50 | ``` 51 | 52 | Train the baseline model (`variant={base|small|tiny}`). 53 | 54 | ```bash 55 | python tools/train.py projects/configs/bevformer/bevformer_{variant}.py --deterministic 56 | ``` 57 | 58 | If using a launcher for distributed training, use the following command. 59 | 60 | ```bash 61 | python tools/train.py projects/configs/bevformer/bevformer_{variant}.py --deterministic --launcher pytorch 62 | ``` 63 | 64 | To train the model with dataset equity configurations, first set the parameters `model.pts_bbox_head.dequity_eta` and `model.pts_bbox_head.dequity_gamma` in the config file (`projects/configs/bevformer/bevformer_{variant}_de.py`) to the desired values. 
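As a rough illustration of that edit, the fragment below shows where the two keys sit in an mmcv-style Python config. Only the key names `dequity_eta` and `dequity_gamma` come from this guide. The values (the README defaults of 1.0 and 5.0) are placeholders, every other key of the real config file is omitted, and `dequity_loss_weight` is copied from the README formula purely to show the range of weights these settings would produce.

```python
# Hypothetical excerpt of projects/configs/bevformer/bevformer_{variant}_de.py.
# Assumption: the real file defines many more keys; only the two dequity_* entries are shown.
model = dict(
    pts_bbox_head=dict(
        dequity_eta=1.0,    # placeholder value (README default)
        dequity_gamma=5.0,  # placeholder value (README default)
    ),
)


def dequity_loss_weight(p: float, eta: float, gamma: float) -> float:
    """Dequity Weight from the README: (eta + (1 - p) ** gamma) / (eta + 1)."""
    return (eta + (1 - p) ** gamma) / (eta + 1)


head = model["pts_bbox_head"]
# Weight for the rarest possible sample (p = 0) and the most common one (p = 1).
w_rare = dequity_loss_weight(0.0, head["dequity_eta"], head["dequity_gamma"])
w_common = dequity_loss_weight(1.0, head["dequity_eta"], head["dequity_gamma"])
print(w_rare, w_common)  # 1.0 0.5
```

With these values, per-sample loss weights fall between 0.5 and 1.0. A larger `dequity_eta` narrows that range, and a larger `dequity_gamma` concentrates the high weights on the least likely samples.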
65 | 66 | ```bash 67 | python tools/train.py projects/configs/bevformer/bevformer_{variant}_de.py --deterministic --launcher pytorch 68 | ``` -------------------------------------------------------------------------------- /docs/bevfusion.md: -------------------------------------------------------------------------------- 1 | Build the docker using the following command: 2 | 3 | ```bash 4 | cd bevfusion/docker && docker build . -t bevfusion 5 | ``` 6 | 7 | We can then run the docker with the following command: 8 | 9 | ```bash 10 | nvidia-docker run -it -v `pwd`/../data:/dataset --shm-size 16g bevfusion /bin/bash 11 | ``` 12 | Then clone the bevfusion submodule and install it: 13 | 14 | ```bash 15 | cd home && git clone https://github.com/alchemz/bevfusion && cd bevfusion 16 | python setup.py develop 17 | ``` 18 | 19 | You can then create a symbolic link `data` to the `/dataset` directory in the docker. 20 | 21 | ### Data Preparation 22 | 23 | #### nuScenes 24 | 25 | Please follow the instructions from [here](https://github.com/open-mmlab/mmdetection3d/blob/master/docs/en/datasets/nuscenes_det.md) to download and preprocess the nuScenes dataset. Please remember to download both detection dataset and the map extension (for BEV map segmentation). 
After data preparation, you will be able to see the following directory structure (as is indicated in mmdetection3d): 26 | 27 | ``` 28 | mmdetection3d 29 | ├── mmdet3d 30 | ├── tools 31 | ├── configs 32 | ├── data 33 | │ ├── nuscenes 34 | │ │ ├── maps 35 | │ │ ├── samples 36 | │ │ ├── sweeps 37 | │ │ ├── v1.0-test 38 | | | ├── v1.0-trainval 39 | │ │ ├── nuscenes_database 40 | │ │ ├── nuscenes_infos_train.pkl 41 | │ │ ├── nuscenes_infos_val.pkl 42 | │ │ ├── nuscenes_infos_test.pkl 43 | │ │ ├── nuscenes_dbinfos_train.pkl 44 | ``` 45 | ### Training 46 | 47 | To train the camera-only variant for object detection with cbgs dataset, please run: 48 | 49 | ```bash 50 | torchpack dist-run -np 8 python tools/train.py configs/nuscenes/det/centerhead/lssfpn/camera/256x704/swint/default.yaml --model.encoders.camera.backbone.init_cfg.checkpoint pretrained/swint-nuimages-pretrained.pth 51 | ``` 52 | 53 | To train the camera-only variant for object detection without cbgs dataset, please set ```type``` to ```NoCBGSDataset``` [here](https://github.com/alchemz/bevfusion/blob/d08fd86cccefdacc0f19669cf6c3176a1a51d491/configs/nuscenes/default.yaml#L266). Use the same command as above to train the model. 54 | 55 | Use ```dequity_eta``` and ```dequity_gamma``` parameters in [config](https://github.com/alchemz/bevfusion/blob/main/configs/nuscenes/det/centerhead/default.yaml) to change data equity configurations for performing experiments. 56 | -------------------------------------------------------------------------------- /docs/dd3d.md: -------------------------------------------------------------------------------- 1 | ## Training DD3D 2 | 3 | Prepare KITTI dataset. 
4 | 5 | ```bash 6 | mkdir -p data/datasets/KITTI3D 7 | cd data/datasets/KITTI3D 8 | # make symlinks to the KITTI dataset 9 | ln -s /path/to/kitti/training training 10 | ln -s /path/to/kitti/testing testing 11 | # download a standard splits subset of KITTI 12 | curl -s https://tri-ml-public.s3.amazonaws.com/github/dd3d/mv3d_kitti_splits.tar | sudo tar xv -C ./ 13 | cd ../../../.. 14 | ``` 15 | 16 | Run the docker container in interactive mode. 17 | 18 | ```bash 19 | cd dd3d/docker 20 | sh run-docker.sh 21 | ``` 22 | 23 | Train the model with various dequity loss configs. 24 | ```bash 25 | ./train_scripts/train_dd3d_kitti_{dequity_config}.sh 26 | ``` 27 | 28 | Options for `dequity_config` are: 29 | ```bash 30 | baseline 31 | eta0.3_gamma2.0 32 | eta0.5_gamma5.0 33 | ... etc. 34 | ``` -------------------------------------------------------------------------------- /docs/fig/teaser.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/towardsautonomy/DatasetEquity/d6898ad2370a9ea27d6d087c4098620e8e476b59/docs/fig/teaser.png -------------------------------------------------------------------------------- /docs/getting_started.md: -------------------------------------------------------------------------------- 1 | # Getting Started 2 | 3 | ## Installation 4 | 5 | Clone the repository and submodules. 6 | 7 | ```bash 8 | git clone https://github.com/towardsautonomy/DatasetEquity.git --recursive 9 | cd DatasetEquity/BEVFormer 10 | git checkout dataset_equity 11 | cd .. 12 | cd dd3d 13 | git checkout dataset_equity 14 | cd .. 15 | ``` 16 | 17 | Build docker images for `BEVFormer` and `dd3d`. 18 | 19 | ```bash 20 | cd BEVFormer/docker 21 | sh build-docker.sh 22 | cd ../.. 23 | cd dd3d/docker 24 | sh build-docker.sh 25 | cd ../.. 26 | ``` 27 | 28 | **Note**: These docker images are already built and available on [Docker Hub](https://hub.docker.com/u/towardsautonomy). 
You can skip this step if you don't want to build the docker images yourself. Running the docker images will automatically pull the images from Docker Hub. 29 | 30 | ## Dataset Analysis 31 | 32 | Perform dataset analysis on the KITTI dataset by opening [this notebook](/data_analysis/kitti_dataset_distribution_analysis.ipynb) and running the cells. It will generate a file `data_analysis/kitti_training_cluster_info.pkl`, which contains the cluster information needed for training. By default, this file should be copied over to `dd3d/data/datasets/KITTI3D/`. 33 | 34 | For the nuScenes dataset, use [this notebook](/data_analysis/nuscenes_dataset_distribution_analysis.ipynb). It will generate a file `data_analysis/nuscenes_training_cluster_info.pkl`, which should be copied over to `BEVFormer/data/nuscenes/`. These paths can be changed in the project config files. 35 | 36 | 37 | ## [Training BEVFormer](bevformer.md) 38 | 39 | ## [Training DD3D](dd3d.md) 40 | 41 | ## [Training BEVFusion](bevfusion.md) 42 | -------------------------------------------------------------------------------- /utils/kitti_object_dataset_downloader.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | # 4 | # Please note the license conditions of this software / dataset! 5 | # 6 | 7 | # left images 8 | wget https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_image_2.zip 9 | # right images 10 | wget https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_image_3.zip 11 | # calibration 12 | wget https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_calib.zip 13 | # labels 14 | wget https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_label_2.zip 15 | # unzip all 16 | for z in *.zip; do unzip "$z"; done --------------------------------------------------------------------------------