├── Dockerfile ├── README.md ├── deep_sort ├── .gitignore ├── LICENSE ├── README.md ├── configs │ ├── deep_sort.yaml │ ├── yolov3.yaml │ └── yolov3_tiny.yaml ├── deep_sort │ ├── .deep_sort.py.swp │ ├── README.md │ ├── __init__.py │ ├── deep │ │ ├── __init__.py │ │ ├── checkpoint │ │ │ └── .gitkeep │ │ ├── evaluate.py │ │ ├── feature_extractor.py │ │ ├── model.py │ │ ├── original_model.py │ │ ├── test.py │ │ ├── train.jpg │ │ └── train.py │ ├── deep_sort.py │ └── sort │ │ ├── __init__.py │ │ ├── detection.py │ │ ├── iou_matching.py │ │ ├── kalman_filter.py │ │ ├── linear_assignment.py │ │ ├── nn_matching.py │ │ ├── preprocessing.py │ │ ├── track.py │ │ └── tracker.py ├── demo │ ├── 1.jpg │ ├── 2.jpg │ └── demo.gif ├── detector │ ├── YOLOv3 │ │ ├── README.md │ │ ├── __init__.py │ │ ├── cfg.py │ │ ├── cfg │ │ │ ├── coco.data │ │ │ ├── coco.names │ │ │ ├── darknet19_448.cfg │ │ │ ├── tiny-yolo-voc.cfg │ │ │ ├── tiny-yolo.cfg │ │ │ ├── voc.data │ │ │ ├── voc.names │ │ │ ├── voc_gaotie.data │ │ │ ├── yolo-voc.cfg │ │ │ ├── yolo.cfg │ │ │ ├── yolo_v3.cfg │ │ │ └── yolov3-tiny.cfg │ │ ├── darknet.py │ │ ├── demo │ │ │ ├── 004545.jpg │ │ │ └── results │ │ │ │ └── 004545.jpg │ │ ├── detect.py │ │ ├── detector.py │ │ ├── nms │ │ │ ├── __init__.py │ │ │ ├── build.sh │ │ │ ├── ext │ │ │ │ ├── __init__.py │ │ │ │ ├── build.py │ │ │ │ ├── cpu │ │ │ │ │ ├── nms_cpu.cpp │ │ │ │ │ └── vision.h │ │ │ │ ├── cuda │ │ │ │ │ ├── nms.cu │ │ │ │ │ └── vision.h │ │ │ │ ├── nms.h │ │ │ │ └── vision.cpp │ │ │ ├── nms.py │ │ │ └── python_nms.py │ │ ├── region_layer.py │ │ ├── weight │ │ │ └── .gitkeep │ │ ├── yolo_layer.py │ │ └── yolo_utils.py │ └── __init__.py ├── ped_det_server.py ├── scripts │ ├── yolov3_deepsort.sh │ └── yolov3_tiny_deepsort.sh ├── utils │ ├── __init__.py │ ├── asserts.py │ ├── draw.py │ ├── evaluation.py │ ├── io.py │ ├── json_logger.py │ ├── log.py │ ├── parser.py │ └── tools.py ├── webserver │ ├── .env │ ├── __init__.py │ ├── config │ │ └── config.py │ ├── images │ │ 
├── Thumbs.db │ │ ├── arc.png │ │ └── request.png │ ├── readme.md │ ├── rtsp_threaded_tracker.py │ ├── rtsp_webserver.py │ ├── server_cfg.py │ └── templates │ │ └── index.html ├── yolov3_deepsort.py └── yolov3_deepsort_eval.py ├── label_split.py ├── main.py ├── nba.mp4 ├── nba_inf.gif ├── nba_inf.mp4 ├── requirements.txt ├── requirements_yum.txt ├── track.py └── yolov5 ├── .dockerignore ├── .gitattributes ├── .gitignore ├── LICENSE ├── README.md ├── data ├── get_coco2017.sh └── get_voc.sh ├── detect.py ├── hubconf.py ├── models ├── __init__.py ├── common.py ├── experimental.py ├── export.py ├── hub │ ├── yolov3-spp.yaml │ ├── yolov5-fpn.yaml │ └── yolov5-panet.yaml ├── yolo.py ├── yolov5l.yaml ├── yolov5m.yaml ├── yolov5s.yaml └── yolov5x.yaml ├── test.py ├── train.py ├── tutorial.ipynb ├── utils ├── __init__.py ├── activations.py ├── datasets.py ├── general.py ├── google_utils.py └── torch_utils.py └── weights └── download_weights.sh /Dockerfile: -------------------------------------------------------------------------------- 1 | # Base image 2 | FROM nvidia/cuda:10.1-cudnn7-devel-centos7 3 | 4 | 5 | # Install packages specified in requirements_yum.txt and requirements.txt 6 | COPY requirements.txt ./ 7 | COPY requirements_yum.txt ./ 8 | COPY Yolov5_DeepSort_Pytorch ./Yolov5_DeepSort_Pytorch 9 | RUN yum install -y $(cat requirements_yum.txt) 10 | RUN pip3 install --upgrade pip 11 | RUN pip3 install torch==1.6.0+cu101 torchvision==0.7.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html 12 | RUN pip3 install -r requirements.txt 13 | 14 | # Make port available outside this container 15 | EXPOSE 5000 16 | 17 | # Working directory for container (must match the COPY destination above) 18 | WORKDIR Yolov5_DeepSort_Pytorch 19 | 20 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Multi-class Yolov5 + Deep Sort with PyTorch 2 | 3 | 4 | ![](nba_inf.gif) 5 | 6 | 
## Introduction 7 | 8 | This repository is modified from mikel-brostrom/Yolov5_DeepSort_Pytorch (https://github.com/mikel-brostrom/Yolov5_DeepSort_Pytorch). I fixed some bugs and extended it to a multi-class version. It contains YOLOv5 (https://github.com/ultralytics/yolov5) and Deep Sort (https://github.com/ZQPei/deep_sort_pytorch). The deep sort model in this repository was trained only on pedestrians. 9 | 10 | ## Requirements 11 | 12 | Python 3.6 or later with all requirements.txt dependencies installed, including torch>=1.6. To install, run: 13 | 14 | `pip install -U -r requirements.txt` 15 | 16 | All dependencies are included in the associated docker images. Docker requirements are: 17 | - `nvidia-docker` 18 | - Nvidia Driver Version >= 440.44 19 | 20 | Alternatively, you can build a docker image from the Dockerfile supplied here if you use CentOS 7. 21 | - `sudo docker pull nvidia/cuda:10.1-cudnn7-devel-centos7` 22 | - `sudo docker build -t [image_name] .` 23 | - `sudo docker run --runtime=nvidia --name [container_name] --shm-size [8G] -t -i [image_name:tag] /bin/bash` 24 | 25 | ## Download Weights 26 | 27 | - Yolov5 pedestrian weights from https://drive.google.com/file/d/1BsWywxaQtuz2Tq3i0M3qFsscZC6a18u8/view?usp=sharing. Place the downloaded `.pt` file under `yolov5/weights/` 28 | - Yolov5 nba weights from https://drive.google.com/file/d/12qDKovSi9PRdY-77zFJx_7gi41zE3BJY/view?usp=sharing. Place the downloaded `.pt` file under `yolov5/weights/`. It was trained on a very small dataset. 29 | - Deep sort weights from https://drive.google.com/file/d/18qIFaoPWu4OFiH1kO2JiJ2Lq2D3lhXYY/view?usp=sharing. Place the ckpt.t7 file under `deep_sort/deep/checkpoint/` 30 | 31 | ## Download Sample Video 32 | 33 | - Sample nba video from https://drive.google.com/file/d/19ESDqwvO5LRQQ5nQqgjxwsD85eFsW0pp/view?usp=sharing 34 | 35 | ## Tracking 36 | 37 | Tracking can be run on most video formats. Results are saved to `./inference/output`. 
38 | 39 | ```bash 40 | python3 track.py --source nba.mp4 --weights nba.pt --device ... 41 | ``` 42 | 43 | - Video: `--source file.mp4` 44 | - Webcam: `--source 0` 45 | - RTSP stream: `--source rtsp://170.93.143.139/rtplive/470011e600ef003a004ee33696235daa` 46 | - HTTP stream: `--source http://wmccpinetop.axiscam.net/mjpg/video.mjpg` 47 | 48 | ## Train Yolov5 49 | - Put your images in dataset/images and annotations (in PASCAL VOC format) in dataset/annotations. 50 | - Modify yolov5/data/data.yaml 51 | - `python3 label_split.py` 52 | - `cd yolov5` 53 | - `CUDA_VISIBLE_DEVICES=... python3 train.py --img 640 --batch 16 --epochs 500 --data ./data/data.yaml --cfg ./models/yolov5s.yaml --weights weights/nba.pt` 54 | 55 | ## Reference 56 | 57 | For more details, you can check the original repositories. 58 | - Simple Online and Realtime Tracking with a Deep Association Metric 59 | https://arxiv.org/abs/1703.07402 60 | - YOLOv4: Optimal Speed and Accuracy of Object Detection 61 | https://arxiv.org/pdf/2004.10934.pdf 62 | -------------------------------------------------------------------------------- /deep_sort/.gitignore: -------------------------------------------------------------------------------- 1 | # Folders 2 | __pycache__/ 3 | build/ 4 | *.egg-info 5 | 6 | 7 | # Files 8 | *.weights 9 | *.t7 10 | *.mp4 11 | *.avi 12 | *.so 13 | *.txt 14 | -------------------------------------------------------------------------------- /deep_sort/LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2020 Ziqiang 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | 
furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. -------------------------------------------------------------------------------- /deep_sort/README.md: -------------------------------------------------------------------------------- 1 | # Deep Sort with PyTorch 2 | 3 | ![](demo/demo.gif) 4 | 5 | ## Update(1-1-2020) 6 | Changes 7 | - fix bugs 8 | - refactor code 9 | - accelerate detection by adding NMS on GPU 10 | 11 | ## Latest Update(07-22) 12 | Changes 13 | - bug fix (Thanks @JieChen91 and @yingsen1 for bug reporting). 14 | - using batches for feature extraction in each frame, which leads to a small speed-up. 15 | - code improvement. 16 | 17 | Further improvement directions 18 | - Train the detector on a specific dataset rather than the official one. 19 | - Retrain the REID model on a pedestrian dataset for better performance. 20 | - Replace the YOLOv3 detector with more advanced ones. 21 | 22 | **Any contributions to this repository are welcome!** 23 | 24 | 25 | ## Introduction 26 | This is an implementation of the MOT tracking algorithm Deep SORT. Deep SORT is basically the same as SORT, but adds a CNN model to extract appearance features from the image regions of people bounded by a detector. 
This CNN model is indeed a RE-ID model, and the detector used in the [PAPER](https://arxiv.org/abs/1703.07402) is Faster R-CNN; the original source code is [HERE](https://github.com/nwojke/deep_sort). 27 | However, in the original code the CNN model is implemented with TensorFlow, which I'm not familiar with. So I re-implemented the CNN feature extraction model with PyTorch, and changed the CNN model a little bit. Also, I use **YOLOv3** to generate bboxes instead of Faster R-CNN. 28 | 29 | ## Dependencies 30 | - python 3 (python2 not sure) 31 | - numpy 32 | - scipy 33 | - opencv-python 34 | - sklearn 35 | - torch >= 0.4 36 | - torchvision >= 0.1 37 | - pillow 38 | - vizer 39 | - edict 40 | 41 | ## Quick Start 42 | 0. Check that all dependencies are installed 43 | ```bash 44 | pip install -r requirements.txt 45 | ``` 46 | For users in China, you can specify a PyPI mirror to accelerate installation: 47 | ```bash 48 | pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple 49 | ``` 50 | 51 | 1. Clone this repository 52 | ``` 53 | git clone git@github.com:ZQPei/deep_sort_pytorch.git 54 | ``` 55 | 56 | 2. Download YOLOv3 parameters 57 | ``` 58 | cd detector/YOLOv3/weight/ 59 | wget https://pjreddie.com/media/files/yolov3.weights 60 | wget https://pjreddie.com/media/files/yolov3-tiny.weights 61 | cd ../../../ 62 | ``` 63 | 64 | 3. Download deepsort parameters ckpt.t7 65 | ``` 66 | cd deep_sort/deep/checkpoint 67 | # download ckpt.t7 from 68 | https://drive.google.com/drive/folders/1xhG0kRH1EX5B9_Iz8gQJb7UNnn_riXi6 to this folder 69 | cd ../../../ 70 | ``` 71 | 72 | 4. Compile the nms module 73 | ```bash 74 | cd detector/YOLOv3/nms 75 | sh build.sh 76 | cd ../../.. 77 | ``` 78 | 79 | Notice: 80 | If compiling fails, the simplest way out is to **upgrade your PyTorch to >= 1.1 and torchvision to >= 0.3**, which avoids the troublesome compiling problems that are most likely caused by either a `gcc` version that is too low or missing libraries. 81 | 82 | 5. 
Run the demo 83 | ``` 84 | usage: python yolov3_deepsort.py VIDEO_PATH 85 | [--help] 86 | [--frame_interval FRAME_INTERVAL] 87 | [--config_detection CONFIG_DETECTION] 88 | [--config_deepsort CONFIG_DEEPSORT] 89 | [--display] 90 | [--display_width DISPLAY_WIDTH] 91 | [--display_height DISPLAY_HEIGHT] 92 | [--save_path SAVE_PATH] 93 | [--cpu] 94 | 95 | # yolov3 + deepsort 96 | python yolov3_deepsort.py [VIDEO_PATH] 97 | 98 | # yolov3_tiny + deepsort 99 | python yolov3_deepsort.py [VIDEO_PATH] --config_detection ./configs/yolov3_tiny.yaml 100 | 101 | # yolov3 + deepsort on webcam 102 | python3 yolov3_deepsort.py /dev/video0 --camera 0 103 | 104 | # yolov3_tiny + deepsort on webcam 105 | python3 yolov3_deepsort.py /dev/video0 --config_detection ./configs/yolov3_tiny.yaml --camera 0 106 | ``` 107 | Use `--display` to enable the display. 108 | Results will be saved to `./output/results.avi` and `./output/results.txt`. 109 | 110 | All files above can also be accessed from BaiduDisk! 111 | link: [BaiduDisk](https://pan.baidu.com/s/1YJ1iPpdFTlUyLFoonYvozg) 112 | password: fbuw 113 | 114 | ## Training the RE-ID model 115 | The original model used in the paper is in original_model.py, and its parameters are here: [original_ckpt.t7](https://drive.google.com/drive/folders/1xhG0kRH1EX5B9_Iz8gQJb7UNnn_riXi6). 116 | 117 | To train the model, you first need to download the [Market1501](http://www.liangzheng.com.cn/Project/project_reid.html) dataset or the [Mars](http://www.liangzheng.com.cn/Project/project_mars.html) dataset. 118 | 119 | Then you can try [train.py](deep_sort/deep/train.py) to train your own parameters and evaluate them using [test.py](deep_sort/deep/test.py) and [evaluate.py](deep_sort/deep/evaluate.py). 
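The checkpoint that train.py writes is a plain dict with the keys `net_dict`, `acc`, and `epoch`, and feature_extractor.py later reloads only `net_dict`. A minimal sketch of that round trip — a tiny stand-in module and a throwaway filename (`ckpt_demo.t7`) replace the real Net and ckpt.t7:

```python
import torch
import torch.nn as nn

# Tiny stand-in for the RE-ID Net; the real checkpoint stores Net's state_dict.
net = nn.Linear(4, 2)

# train.py saves exactly this dict layout to checkpoint/ckpt.t7.
checkpoint = {'net_dict': net.state_dict(), 'acc': 92.5, 'epoch': 40}
torch.save(checkpoint, 'ckpt_demo.t7')

# Extractor reloads only the weights, mapping them to CPU storage first.
state_dict = torch.load('ckpt_demo.t7', map_location=lambda storage, loc: storage)['net_dict']
net2 = nn.Linear(4, 2)
net2.load_state_dict(state_dict)
```

Because only `net_dict` is consumed at inference time, `acc` and `epoch` exist solely so that `--resume` training can pick up where it left off.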
120 | ![train.jpg](deep_sort/deep/train.jpg) 121 | 122 | ## Demo videos and images 123 | [demo.avi](https://drive.google.com/drive/folders/1xhG0kRH1EX5B9_Iz8gQJb7UNnn_riXi6) 124 | [demo2.avi](https://drive.google.com/drive/folders/1xhG0kRH1EX5B9_Iz8gQJb7UNnn_riXi6) 125 | 126 | ![1.jpg](demo/1.jpg) 127 | ![2.jpg](demo/2.jpg) 128 | 129 | 130 | ## References 131 | - paper: [Simple Online and Realtime Tracking with a Deep Association Metric](https://arxiv.org/abs/1703.07402) 132 | 133 | - code: [nwojke/deep_sort](https://github.com/nwojke/deep_sort) 134 | 135 | - paper: [YOLOv3](https://pjreddie.com/media/files/papers/YOLOv3.pdf) 136 | 137 | - code: [Joseph Redmon/yolov3](https://pjreddie.com/darknet/yolo/) 138 | -------------------------------------------------------------------------------- /deep_sort/configs/deep_sort.yaml: -------------------------------------------------------------------------------- 1 | DEEPSORT: 2 | REID_CKPT: "deep_sort/deep_sort/deep/checkpoint/ckpt.t7" 3 | MAX_DIST: 0.2 4 | MIN_CONFIDENCE: 0.4 5 | NMS_MAX_OVERLAP: 0.75 6 | MAX_IOU_DISTANCE: 0.7 7 | MAX_AGE: 70 8 | N_INIT: 3 9 | NN_BUDGET: 100 10 | 11 | -------------------------------------------------------------------------------- /deep_sort/configs/yolov3.yaml: -------------------------------------------------------------------------------- 1 | YOLOV3: 2 | CFG: "./detector/YOLOv3/cfg/yolo_v3.cfg" 3 | WEIGHT: "./detector/YOLOv3/weight/yolov3.weights" 4 | CLASS_NAMES: "./detector/YOLOv3/cfg/coco.names" 5 | 6 | SCORE_THRESH: 0.5 7 | NMS_THRESH: 0.4 8 | -------------------------------------------------------------------------------- /deep_sort/configs/yolov3_tiny.yaml: -------------------------------------------------------------------------------- 1 | YOLOV3: 2 | CFG: "./detector/YOLOv3/cfg/yolov3-tiny.cfg" 3 | WEIGHT: "./detector/YOLOv3/weight/yolov3-tiny.weights" 4 | CLASS_NAMES: "./detector/YOLOv3/cfg/coco.names" 5 | 6 | SCORE_THRESH: 0.5 7 | NMS_THRESH: 0.4 
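The YAML configs above are consumed through attribute-style access (`cfg.DEEPSORT.MAX_AGE`, `cfg.YOLOV3.CFG`) — the repository's `utils/parser.py` provides the real loader. The converter below is only a hypothetical stand-in that shows the shape `build_tracker` expects, with the mapping inlined from `configs/deep_sort.yaml`:

```python
# Hypothetical minimal stand-in for utils/parser.py: wrap a nested mapping so that
# nested keys can be read as attributes, the way build_tracker reads them.
class AttrDict(dict):
    def __getattr__(self, name):
        value = self[name]
        return AttrDict(value) if isinstance(value, dict) else value

# Values inlined from configs/deep_sort.yaml (the real loader parses the YAML file).
cfg = AttrDict({
    "DEEPSORT": {
        "REID_CKPT": "deep_sort/deep_sort/deep/checkpoint/ckpt.t7",
        "MAX_DIST": 0.2, "MIN_CONFIDENCE": 0.4, "NMS_MAX_OVERLAP": 0.75,
        "MAX_IOU_DISTANCE": 0.7, "MAX_AGE": 70, "N_INIT": 3, "NN_BUDGET": 100,
    }
})
print(cfg.DEEPSORT.MAX_AGE)  # 70
```

With a config shaped like this, `build_tracker(cfg, use_cuda=True)` can forward each `cfg.DEEPSORT.*` value straight into the DeepSort constructor.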
-------------------------------------------------------------------------------- /deep_sort/deep_sort/.deep_sort.py.swp: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/WuPedin/Multi-class_Yolov5_DeepSort_Pytorch/507e0cec465fa2e01d88827abfa88708c7250392/deep_sort/deep_sort/.deep_sort.py.swp -------------------------------------------------------------------------------- /deep_sort/deep_sort/README.md: -------------------------------------------------------------------------------- 1 | # Deep Sort 2 | 3 | This is the implementation of Deep SORT with PyTorch. -------------------------------------------------------------------------------- /deep_sort/deep_sort/__init__.py: -------------------------------------------------------------------------------- 1 | from .deep_sort import DeepSort 2 | 3 | 4 | __all__ = ['DeepSort', 'build_tracker'] 5 | 6 | 7 | def build_tracker(cfg, use_cuda): 8 | return DeepSort(cfg.DEEPSORT.REID_CKPT, 9 | max_dist=cfg.DEEPSORT.MAX_DIST, min_confidence=cfg.DEEPSORT.MIN_CONFIDENCE, 10 | nms_max_overlap=cfg.DEEPSORT.NMS_MAX_OVERLAP, max_iou_distance=cfg.DEEPSORT.MAX_IOU_DISTANCE, 11 | max_age=cfg.DEEPSORT.MAX_AGE, n_init=cfg.DEEPSORT.N_INIT, nn_budget=cfg.DEEPSORT.NN_BUDGET, use_cuda=use_cuda) 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | -------------------------------------------------------------------------------- /deep_sort/deep_sort/deep/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/WuPedin/Multi-class_Yolov5_DeepSort_Pytorch/507e0cec465fa2e01d88827abfa88708c7250392/deep_sort/deep_sort/deep/__init__.py -------------------------------------------------------------------------------- /deep_sort/deep_sort/deep/checkpoint/.gitkeep: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/WuPedin/Multi-class_Yolov5_DeepSort_Pytorch/507e0cec465fa2e01d88827abfa88708c7250392/deep_sort/deep_sort/deep/checkpoint/.gitkeep -------------------------------------------------------------------------------- /deep_sort/deep_sort/deep/evaluate.py: -------------------------------------------------------------------------------- 1 | import torch 2 | 3 | features = torch.load("features.pth") 4 | qf = features["qf"] 5 | ql = features["ql"] 6 | gf = features["gf"] 7 | gl = features["gl"] 8 | 9 | scores = qf.mm(gf.t()) 10 | res = scores.topk(5, dim=1)[1][:,0] 11 | top1correct = gl[res].eq(ql).sum().item() 12 | 13 | print("Acc top1:{:.3f}".format(top1correct/ql.size(0))) 14 | 15 | 16 | -------------------------------------------------------------------------------- /deep_sort/deep_sort/deep/feature_extractor.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torchvision.transforms as transforms 3 | import numpy as np 4 | import cv2 5 | import logging 6 | 7 | from .model import Net 8 | 9 | class Extractor(object): 10 | def __init__(self, model_path, use_cuda=True): 11 | self.net = Net(reid=True) 12 | self.device = "cuda" if torch.cuda.is_available() and use_cuda else "cpu" 13 | state_dict = torch.load(model_path, map_location=lambda storage, loc: storage)['net_dict'] 14 | self.net.load_state_dict(state_dict) 15 | logger = logging.getLogger("root.tracker") 16 | logger.info("Loading weights from {}... Done!".format(model_path)) 17 | self.net.to(self.device) 18 | self.size = (64, 128) 19 | self.norm = transforms.Compose([ 20 | transforms.ToTensor(), 21 | transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]), 22 | ]) 23 | 24 | 25 | 26 | def _preprocess(self, im_crops): 27 | """ 28 | TODO: 29 | 1. to float with scale from 0 to 1 30 | 2. resize to (64, 128) as Market1501 dataset did 31 | 3. concatenate to a numpy array 32 | 4. to torch Tensor 33 | 5. 
normalize 34 | """ 35 | def _resize(im, size): 36 | return cv2.resize(im.astype(np.float32)/255., size) 37 | 38 | im_batch = torch.cat([self.norm(_resize(im, self.size)).unsqueeze(0) for im in im_crops], dim=0).float() 39 | return im_batch 40 | 41 | 42 | def __call__(self, im_crops): 43 | im_batch = self._preprocess(im_crops) 44 | with torch.no_grad(): 45 | im_batch = im_batch.to(self.device) 46 | features = self.net(im_batch) 47 | return features.cpu().numpy() 48 | 49 | 50 | if __name__ == '__main__': 51 | img = cv2.imread("demo.jpg")[:,:,(2,1,0)] 52 | extr = Extractor("checkpoint/ckpt.t7") 53 | feature = extr(img) 54 | print(feature.shape) 55 | 56 | -------------------------------------------------------------------------------- /deep_sort/deep_sort/deep/model.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | 5 | class BasicBlock(nn.Module): 6 | def __init__(self, c_in, c_out,is_downsample=False): 7 | super(BasicBlock,self).__init__() 8 | self.is_downsample = is_downsample 9 | if is_downsample: 10 | self.conv1 = nn.Conv2d(c_in, c_out, 3, stride=2, padding=1, bias=False) 11 | else: 12 | self.conv1 = nn.Conv2d(c_in, c_out, 3, stride=1, padding=1, bias=False) 13 | self.bn1 = nn.BatchNorm2d(c_out) 14 | self.relu = nn.ReLU(True) 15 | self.conv2 = nn.Conv2d(c_out,c_out,3,stride=1,padding=1, bias=False) 16 | self.bn2 = nn.BatchNorm2d(c_out) 17 | if is_downsample: 18 | self.downsample = nn.Sequential( 19 | nn.Conv2d(c_in, c_out, 1, stride=2, bias=False), 20 | nn.BatchNorm2d(c_out) 21 | ) 22 | elif c_in != c_out: 23 | self.downsample = nn.Sequential( 24 | nn.Conv2d(c_in, c_out, 1, stride=1, bias=False), 25 | nn.BatchNorm2d(c_out) 26 | ) 27 | self.is_downsample = True 28 | 29 | def forward(self,x): 30 | y = self.conv1(x) 31 | y = self.bn1(y) 32 | y = self.relu(y) 33 | y = self.conv2(y) 34 | y = self.bn2(y) 35 | if self.is_downsample: 36 | x = 
self.downsample(x) 37 | return F.relu(x.add(y),True) 38 | 39 | def make_layers(c_in,c_out,repeat_times, is_downsample=False): 40 | blocks = [] 41 | for i in range(repeat_times): 42 | if i ==0: 43 | blocks += [BasicBlock(c_in,c_out, is_downsample=is_downsample),] 44 | else: 45 | blocks += [BasicBlock(c_out,c_out),] 46 | return nn.Sequential(*blocks) 47 | 48 | class Net(nn.Module): 49 | def __init__(self, num_classes=751 ,reid=False): 50 | super(Net,self).__init__() 51 | # 3 128 64 52 | self.conv = nn.Sequential( 53 | nn.Conv2d(3,64,3,stride=1,padding=1), 54 | nn.BatchNorm2d(64), 55 | nn.ReLU(inplace=True), 56 | # nn.Conv2d(32,32,3,stride=1,padding=1), 57 | # nn.BatchNorm2d(32), 58 | # nn.ReLU(inplace=True), 59 | nn.MaxPool2d(3,2,padding=1), 60 | ) 61 | # 64 64 32 62 | self.layer1 = make_layers(64,64,2,False) 63 | # 64 64 32 64 | self.layer2 = make_layers(64,128,2,True) 65 | # 128 32 16 66 | self.layer3 = make_layers(128,256,2,True) 67 | # 256 16 8 68 | self.layer4 = make_layers(256,512,2,True) 69 | # 512 8 4 70 | self.avgpool = nn.AvgPool2d((8,4),1) 71 | # 512 1 1 72 | self.reid = reid 73 | self.classifier = nn.Sequential( 74 | nn.Linear(512, 256), 75 | nn.BatchNorm1d(256), 76 | nn.ReLU(inplace=True), 77 | nn.Dropout(), 78 | nn.Linear(256, num_classes), 79 | ) 80 | 81 | def forward(self, x): 82 | x = self.conv(x) 83 | x = self.layer1(x) 84 | x = self.layer2(x) 85 | x = self.layer3(x) 86 | x = self.layer4(x) 87 | x = self.avgpool(x) 88 | x = x.view(x.size(0),-1) 89 | # B x 512 90 | if self.reid: 91 | x = x.div(x.norm(p=2,dim=1,keepdim=True)) 92 | return x 93 | # classifier 94 | x = self.classifier(x) 95 | return x 96 | 97 | 98 | if __name__ == '__main__': 99 | net = Net() 100 | x = torch.randn(4,3,128,64) 101 | y = net(x) 102 | import ipdb; ipdb.set_trace() 103 | 104 | 105 | -------------------------------------------------------------------------------- /deep_sort/deep_sort/deep/original_model.py: 
-------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | 5 | class BasicBlock(nn.Module): 6 | def __init__(self, c_in, c_out,is_downsample=False): 7 | super(BasicBlock,self).__init__() 8 | self.is_downsample = is_downsample 9 | if is_downsample: 10 | self.conv1 = nn.Conv2d(c_in, c_out, 3, stride=2, padding=1, bias=False) 11 | else: 12 | self.conv1 = nn.Conv2d(c_in, c_out, 3, stride=1, padding=1, bias=False) 13 | self.bn1 = nn.BatchNorm2d(c_out) 14 | self.relu = nn.ReLU(True) 15 | self.conv2 = nn.Conv2d(c_out,c_out,3,stride=1,padding=1, bias=False) 16 | self.bn2 = nn.BatchNorm2d(c_out) 17 | if is_downsample: 18 | self.downsample = nn.Sequential( 19 | nn.Conv2d(c_in, c_out, 1, stride=2, bias=False), 20 | nn.BatchNorm2d(c_out) 21 | ) 22 | elif c_in != c_out: 23 | self.downsample = nn.Sequential( 24 | nn.Conv2d(c_in, c_out, 1, stride=1, bias=False), 25 | nn.BatchNorm2d(c_out) 26 | ) 27 | self.is_downsample = True 28 | 29 | def forward(self,x): 30 | y = self.conv1(x) 31 | y = self.bn1(y) 32 | y = self.relu(y) 33 | y = self.conv2(y) 34 | y = self.bn2(y) 35 | if self.is_downsample: 36 | x = self.downsample(x) 37 | return F.relu(x.add(y),True) 38 | 39 | def make_layers(c_in,c_out,repeat_times, is_downsample=False): 40 | blocks = [] 41 | for i in range(repeat_times): 42 | if i ==0: 43 | blocks += [BasicBlock(c_in,c_out, is_downsample=is_downsample),] 44 | else: 45 | blocks += [BasicBlock(c_out,c_out),] 46 | return nn.Sequential(*blocks) 47 | 48 | class Net(nn.Module): 49 | def __init__(self, num_classes=625 ,reid=False): 50 | super(Net,self).__init__() 51 | # 3 128 64 52 | self.conv = nn.Sequential( 53 | nn.Conv2d(3,32,3,stride=1,padding=1), 54 | nn.BatchNorm2d(32), 55 | nn.ELU(inplace=True), 56 | nn.Conv2d(32,32,3,stride=1,padding=1), 57 | nn.BatchNorm2d(32), 58 | nn.ELU(inplace=True), 59 | nn.MaxPool2d(3,2,padding=1), 60 | ) 61 | # 32 64 32 62 | self.layer1 = 
make_layers(32,32,2,False) 63 | # 32 64 32 64 | self.layer2 = make_layers(32,64,2,True) 65 | # 64 32 16 66 | self.layer3 = make_layers(64,128,2,True) 67 | # 128 16 8 68 | self.dense = nn.Sequential( 69 | nn.Dropout(p=0.6), 70 | nn.Linear(128*16*8, 128), 71 | nn.BatchNorm1d(128), 72 | nn.ELU(inplace=True) 73 | ) 74 | # 256 1 1 75 | self.reid = reid 76 | self.batch_norm = nn.BatchNorm1d(128) 77 | self.classifier = nn.Sequential( 78 | nn.Linear(128, num_classes), 79 | ) 80 | 81 | def forward(self, x): 82 | x = self.conv(x) 83 | x = self.layer1(x) 84 | x = self.layer2(x) 85 | x = self.layer3(x) 86 | 87 | x = x.view(x.size(0),-1) 88 | if self.reid: 89 | x = self.dense[0](x) 90 | x = self.dense[1](x) 91 | x = x.div(x.norm(p=2,dim=1,keepdim=True)) 92 | return x 93 | x = self.dense(x) 94 | # B x 128 95 | # classifier 96 | x = self.classifier(x) 97 | return x 98 | 99 | 100 | if __name__ == '__main__': 101 | net = Net(reid=True) 102 | x = torch.randn(4,3,128,64) 103 | y = net(x) 104 | import ipdb; ipdb.set_trace() 105 | 106 | 107 | -------------------------------------------------------------------------------- /deep_sort/deep_sort/deep/test.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.backends.cudnn as cudnn 3 | import torchvision 4 | 5 | import argparse 6 | import os 7 | 8 | from model import Net 9 | 10 | parser = argparse.ArgumentParser(description="Train on market1501") 11 | parser.add_argument("--data-dir",default='data',type=str) 12 | parser.add_argument("--no-cuda",action="store_true") 13 | parser.add_argument("--gpu-id",default=0,type=int) 14 | args = parser.parse_args() 15 | 16 | # device 17 | device = "cuda:{}".format(args.gpu_id) if torch.cuda.is_available() and not args.no_cuda else "cpu" 18 | if torch.cuda.is_available() and not args.no_cuda: 19 | cudnn.benchmark = True 20 | 21 | # data loader 22 | root = args.data_dir 23 | query_dir = os.path.join(root,"query") 24 | gallery_dir = 
os.path.join(root,"gallery") 25 | transform = torchvision.transforms.Compose([ 26 | torchvision.transforms.Resize((128,64)), 27 | torchvision.transforms.ToTensor(), 28 | torchvision.transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]) 29 | ]) 30 | queryloader = torch.utils.data.DataLoader( 31 | torchvision.datasets.ImageFolder(query_dir, transform=transform), 32 | batch_size=64, shuffle=False 33 | ) 34 | galleryloader = torch.utils.data.DataLoader( 35 | torchvision.datasets.ImageFolder(gallery_dir, transform=transform), 36 | batch_size=64, shuffle=False 37 | ) 38 | 39 | # net definition 40 | net = Net(reid=True) 41 | assert os.path.isfile("./checkpoint/ckpt.t7"), "Error: no checkpoint file found!" 42 | print('Loading from checkpoint/ckpt.t7') 43 | checkpoint = torch.load("./checkpoint/ckpt.t7") 44 | net_dict = checkpoint['net_dict'] 45 | net.load_state_dict(net_dict, strict=False) 46 | net.eval() 47 | net.to(device) 48 | 49 | # compute features 50 | query_features = torch.tensor([]).float() 51 | query_labels = torch.tensor([]).long() 52 | gallery_features = torch.tensor([]).float() 53 | gallery_labels = torch.tensor([]).long() 54 | 55 | with torch.no_grad(): 56 | for idx,(inputs,labels) in enumerate(queryloader): 57 | inputs = inputs.to(device) 58 | features = net(inputs).cpu() 59 | query_features = torch.cat((query_features, features), dim=0) 60 | query_labels = torch.cat((query_labels, labels)) 61 | 62 | for idx,(inputs,labels) in enumerate(galleryloader): 63 | inputs = inputs.to(device) 64 | features = net(inputs).cpu() 65 | gallery_features = torch.cat((gallery_features, features), dim=0) 66 | gallery_labels = torch.cat((gallery_labels, labels)) 67 | 68 | gallery_labels -= 2 69 | 70 | # save features 71 | features = { 72 | "qf": query_features, 73 | "ql": query_labels, 74 | "gf": gallery_features, 75 | "gl": gallery_labels 76 | } 77 | torch.save(features,"features.pth") 
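test.py above ends by saving `features.pth`, which evaluate.py then scores against the gallery. A toy version of that scoring step — small hand-made (hypothetical) feature tensors stand in for the saved file:

```python
import torch

# Hand-made L2-normalized features replace the saved features.pth.
qf = torch.eye(3)                          # 3 query features
ql = torch.tensor([0, 1, 2])               # query labels
gf = torch.tensor([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0],
                   [0.9, 0.1, 0.0]])       # 4 gallery features
gl = torch.tensor([0, 1, 2, 0])            # gallery labels

# Same idea as evaluate.py: cosine similarity via a dot product (features are
# already normalized), then take the label of the best-matching gallery entry.
scores = qf.mm(gf.t())
res = scores.topk(1, dim=1)[1][:, 0]
top1correct = gl[res].eq(ql).sum().item()
print("Acc top1:{:.3f}".format(top1correct / ql.size(0)))  # Acc top1:1.000
```

evaluate.py uses `topk(5, dim=1)[1][:, 0]`, which also picks only the single best index; `topk(1)` is used here because the toy gallery has fewer than five entries.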
-------------------------------------------------------------------------------- /deep_sort/deep_sort/deep/train.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/WuPedin/Multi-class_Yolov5_DeepSort_Pytorch/507e0cec465fa2e01d88827abfa88708c7250392/deep_sort/deep_sort/deep/train.jpg -------------------------------------------------------------------------------- /deep_sort/deep_sort/deep/train.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import os 3 | import time 4 | 5 | import numpy as np 6 | import matplotlib.pyplot as plt 7 | import torch 8 | import torch.backends.cudnn as cudnn 9 | import torchvision 10 | 11 | from model import Net 12 | 13 | parser = argparse.ArgumentParser(description="Train on market1501") 14 | parser.add_argument("--data-dir",default='data',type=str) 15 | parser.add_argument("--no-cuda",action="store_true") 16 | parser.add_argument("--gpu-id",default=0,type=int) 17 | parser.add_argument("--lr",default=0.1, type=float) 18 | parser.add_argument("--interval",'-i',default=20,type=int) 19 | parser.add_argument('--resume', '-r',action='store_true') 20 | args = parser.parse_args() 21 | 22 | # device 23 | device = "cuda:{}".format(args.gpu_id) if torch.cuda.is_available() and not args.no_cuda else "cpu" 24 | if torch.cuda.is_available() and not args.no_cuda: 25 | cudnn.benchmark = True 26 | 27 | # data loading 28 | root = args.data_dir 29 | train_dir = os.path.join(root,"train") 30 | test_dir = os.path.join(root,"test") 31 | transform_train = torchvision.transforms.Compose([ 32 | torchvision.transforms.RandomCrop((128,64),padding=4), 33 | torchvision.transforms.RandomHorizontalFlip(), 34 | torchvision.transforms.ToTensor(), 35 | torchvision.transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]) 36 | ]) 37 | transform_test = torchvision.transforms.Compose([ 38 | torchvision.transforms.Resize((128,64)), 39 | 
torchvision.transforms.ToTensor(), 40 | torchvision.transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]) 41 | ]) 42 | trainloader = torch.utils.data.DataLoader( 43 | torchvision.datasets.ImageFolder(train_dir, transform=transform_train), 44 | batch_size=64,shuffle=True 45 | ) 46 | testloader = torch.utils.data.DataLoader( 47 | torchvision.datasets.ImageFolder(test_dir, transform=transform_test), 48 | batch_size=64,shuffle=True 49 | ) 50 | num_classes = max(len(trainloader.dataset.classes), len(testloader.dataset.classes)) 51 | 52 | # net definition 53 | start_epoch = 0 54 | net = Net(num_classes=num_classes) 55 | if args.resume: 56 | assert os.path.isfile("./checkpoint/ckpt.t7"), "Error: no checkpoint file found!" 57 | print('Loading from checkpoint/ckpt.t7') 58 | checkpoint = torch.load("./checkpoint/ckpt.t7") 59 | # import ipdb; ipdb.set_trace() 60 | net_dict = checkpoint['net_dict'] 61 | net.load_state_dict(net_dict) 62 | best_acc = checkpoint['acc'] 63 | start_epoch = checkpoint['epoch'] 64 | net.to(device) 65 | 66 | # loss and optimizer 67 | criterion = torch.nn.CrossEntropyLoss() 68 | optimizer = torch.optim.SGD(net.parameters(), args.lr, momentum=0.9, weight_decay=5e-4) 69 | best_acc = best_acc if args.resume else 0.  # don't clobber the accuracy restored from the checkpoint 70 | 71 | # train function for each epoch 72 | def train(epoch): 73 | print("\nEpoch : %d"%(epoch+1)) 74 | net.train() 75 | training_loss = 0. 76 | train_loss = 0. 
77 | correct = 0 78 | total = 0 79 | interval = args.interval 80 | start = time.time() 81 | for idx, (inputs, labels) in enumerate(trainloader): 82 | # forward 83 | inputs,labels = inputs.to(device),labels.to(device) 84 | outputs = net(inputs) 85 | loss = criterion(outputs, labels) 86 | 87 | # backward 88 | optimizer.zero_grad() 89 | loss.backward() 90 | optimizer.step() 91 | 92 | # accumulating 93 | training_loss += loss.item() 94 | train_loss += loss.item() 95 | correct += outputs.max(dim=1)[1].eq(labels).sum().item() 96 | total += labels.size(0) 97 | 98 | # print 99 | if (idx+1)%interval == 0: 100 | end = time.time() 101 | print("[progress:{:.1f}%]time:{:.2f}s Loss:{:.5f} Correct:{}/{} Acc:{:.3f}%".format( 102 | 100.*(idx+1)/len(trainloader), end-start, training_loss/interval, correct, total, 100.*correct/total 103 | )) 104 | training_loss = 0. 105 | start = time.time() 106 | 107 | return train_loss/len(trainloader), 1.- correct/total 108 | 109 | def test(epoch): 110 | global best_acc 111 | net.eval() 112 | test_loss = 0.
113 | correct = 0 114 | total = 0 115 | start = time.time() 116 | with torch.no_grad(): 117 | for idx, (inputs, labels) in enumerate(testloader): 118 | inputs, labels = inputs.to(device), labels.to(device) 119 | outputs = net(inputs) 120 | loss = criterion(outputs, labels) 121 | 122 | test_loss += loss.item() 123 | correct += outputs.max(dim=1)[1].eq(labels).sum().item() 124 | total += labels.size(0) 125 | 126 | print("Testing ...") 127 | end = time.time() 128 | print("[progress:{:.1f}%]time:{:.2f}s Loss:{:.5f} Correct:{}/{} Acc:{:.3f}%".format( 129 | 100.*(idx+1)/len(testloader), end-start, test_loss/len(testloader), correct, total, 100.*correct/total 130 | )) 131 | 132 | # saving checkpoint 133 | acc = 100.*correct/total 134 | if acc > best_acc: 135 | best_acc = acc 136 | print("Saving parameters to checkpoint/ckpt.t7") 137 | checkpoint = { 138 | 'net_dict':net.state_dict(), 139 | 'acc':acc, 140 | 'epoch':epoch, 141 | } 142 | if not os.path.isdir('checkpoint'): 143 | os.mkdir('checkpoint') 144 | torch.save(checkpoint, './checkpoint/ckpt.t7') 145 | 146 | return test_loss/len(testloader), 1.- correct/total 147 | 148 | # plot figure 149 | x_epoch = [] 150 | record = {'train_loss':[], 'train_err':[], 'test_loss':[], 'test_err':[]} 151 | fig = plt.figure() 152 | ax0 = fig.add_subplot(121, title="loss") 153 | ax1 = fig.add_subplot(122, title="top1err") 154 | def draw_curve(epoch, train_loss, train_err, test_loss, test_err): 155 | global record 156 | record['train_loss'].append(train_loss) 157 | record['train_err'].append(train_err) 158 | record['test_loss'].append(test_loss) 159 | record['test_err'].append(test_err) 160 | 161 | x_epoch.append(epoch) 162 | ax0.plot(x_epoch, record['train_loss'], 'bo-', label='train') 163 | ax0.plot(x_epoch, record['test_loss'], 'ro-', label='val') 164 | ax1.plot(x_epoch, record['train_err'], 'bo-', label='train') 165 | ax1.plot(x_epoch, record['test_err'], 'ro-', label='val') 166 | if epoch == 0: 167 | ax0.legend() 168 | ax1.legend() 
169 | fig.savefig("train.jpg") 170 | 171 | # lr decay 172 | def lr_decay(): 173 | global optimizer 174 | for params in optimizer.param_groups: 175 | params['lr'] *= 0.1 176 | lr = params['lr'] 177 | print("Learning rate adjusted to {}".format(lr)) 178 | 179 | def main(): 180 | for epoch in range(start_epoch, start_epoch+40): 181 | train_loss, train_err = train(epoch) 182 | test_loss, test_err = test(epoch) 183 | draw_curve(epoch, train_loss, train_err, test_loss, test_err) 184 | if (epoch+1)%20==0: 185 | lr_decay() 186 | 187 | 188 | if __name__ == '__main__': 189 | main() 190 | -------------------------------------------------------------------------------- /deep_sort/deep_sort/deep_sort.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import torch 3 | 4 | from .deep.feature_extractor import Extractor 5 | from .sort.nn_matching import NearestNeighborDistanceMetric 6 | from .sort.preprocessing import non_max_suppression 7 | from .sort.detection import Detection 8 | from .sort.tracker import Tracker 9 | 10 | import datetime 11 | 12 | 13 | __all__ = ['DeepSort'] 14 | 15 | 16 | class DeepSort(object): 17 | def __init__(self, model_path, max_dist=0.2, min_confidence=0.3, nms_max_overlap=1.0, max_iou_distance=0.7, max_age=70, n_init=3, nn_budget=100, use_cuda=True): 18 | self.min_confidence = min_confidence 19 | self.nms_max_overlap = nms_max_overlap 20 | 21 | self.extractor = Extractor(model_path, use_cuda=use_cuda) 22 | 23 | max_cosine_distance = max_dist 24 | # use the nn_budget passed to the constructor; it was previously shadowed by a hard-coded 100 25 | metric = NearestNeighborDistanceMetric("cosine", max_cosine_distance, nn_budget) 26 | self.tracker = Tracker(metric, max_iou_distance=max_iou_distance, max_age=max_age, n_init=n_init) 27 | 28 | def update(self, bbox_xywh, confidences, clses, ori_img): 29 | self.height, self.width = ori_img.shape[:2] 30 | # generate detections 31 | features = self._get_features(bbox_xywh, ori_img) 32 | bbox_tlwh = self._xywh_to_tlwh(bbox_xywh) 33 |
detections = [Detection(bbox_tlwh[i], conf, features[i], clses[i]) for i,conf in enumerate(confidences) if conf>self.min_confidence] 34 | 35 | # run non-maximum suppression 36 | boxes = np.array([d.tlwh for d in detections]) 37 | scores = np.array([d.confidence for d in detections]) 38 | indices = non_max_suppression(boxes, self.nms_max_overlap, scores) 39 | detections = [detections[i] for i in indices] 40 | 41 | # update tracker 42 | self.tracker.predict() 43 | self.tracker.update(detections) 44 | 45 | # output bbox identities 46 | outputs = [] 47 | now_time = datetime.datetime.now() 48 | for now_line, track in enumerate(self.tracker.tracks): 49 | if not track.is_confirmed() or track.time_since_update > 1: 50 | continue 51 | box = track.to_tlwh() 52 | x1,y1,x2,y2 = self._tlwh_to_xyxy(box) 53 | track_id = track.track_id 54 | cls2 = track.cls 55 | score2 = int(track.score*100) 56 | start_time = track.start_time 57 | stay = int((now_time - start_time).total_seconds()) 58 | outputs.append(np.array([x1,y1,x2,y2,track_id, cls2, score2, stay], dtype=int)) 59 | if len(outputs) > 0: 60 | outputs = np.stack(outputs,axis=0) 61 | return outputs 62 | 63 | 64 | """ 65 | TODO: 66 | Convert bbox from xc_yc_w_h to xtl_ytl_w_h 67 | Thanks JieChen91@github.com for reporting this bug! 68 | """ 69 | @staticmethod 70 | def _xywh_to_tlwh(bbox_xywh): 71 | if isinstance(bbox_xywh, np.ndarray): 72 | bbox_tlwh = bbox_xywh.copy() 73 | elif isinstance(bbox_xywh, torch.Tensor): 74 | bbox_tlwh = bbox_xywh.clone() 75 | bbox_tlwh[:,0] = bbox_xywh[:,0] - bbox_xywh[:,2]/2. 76 | bbox_tlwh[:,1] = bbox_xywh[:,1] - bbox_xywh[:,3]/2.
77 | return bbox_tlwh 78 | 79 | 80 | def _xywh_to_xyxy(self, bbox_xywh): 81 | x,y,w,h = bbox_xywh 82 | x1 = max(int(x-w/2),0) 83 | x2 = min(int(x+w/2),self.width-1) 84 | y1 = max(int(y-h/2),0) 85 | y2 = min(int(y+h/2),self.height-1) 86 | return x1,y1,x2,y2 87 | 88 | def _tlwh_to_xyxy(self, bbox_tlwh): 89 | """ 90 | TODO: 91 | Convert bbox from xtl_ytl_w_h to xmin_ymin_xmax_ymax 92 | Thanks JieChen91@github.com for reporting this bug! 93 | """ 94 | x,y,w,h = bbox_tlwh 95 | x1 = max(int(x),0) 96 | x2 = min(int(x+w),self.width-1) 97 | y1 = max(int(y),0) 98 | y2 = min(int(y+h),self.height-1) 99 | return x1,y1,x2,y2 100 | 101 | def _xyxy_to_tlwh(self, bbox_xyxy): 102 | x1,y1,x2,y2 = bbox_xyxy 103 | 104 | t = x1 105 | l = y1 106 | w = int(x2-x1) 107 | h = int(y2-y1) 108 | return t,l,w,h 109 | 110 | def _get_features(self, bbox_xywh, ori_img): 111 | im_crops = [] 112 | for box in bbox_xywh: 113 | x1,y1,x2,y2 = self._xywh_to_xyxy(box) 114 | im = ori_img[y1:y2,x1:x2] 115 | im_crops.append(im) 116 | if im_crops: 117 | features = self.extractor(im_crops) 118 | else: 119 | features = np.array([]) 120 | return features 121 | 122 | 123 | -------------------------------------------------------------------------------- /deep_sort/deep_sort/sort/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/WuPedin/Multi-class_Yolov5_DeepSort_Pytorch/507e0cec465fa2e01d88827abfa88708c7250392/deep_sort/deep_sort/sort/__init__.py -------------------------------------------------------------------------------- /deep_sort/deep_sort/sort/detection.py: -------------------------------------------------------------------------------- 1 | # vim: expandtab:ts=4:sw=4 2 | import numpy as np 3 | 4 | 5 | class Detection(object): 6 | """ 7 | This class represents a bounding box detection in a single image. 8 | 9 | Parameters 10 | ---------- 11 | tlwh : array_like 12 | Bounding box in format `(x, y, w, h)`.
13 | confidence : float 14 | Detector confidence score. 15 | feature : array_like 16 | A feature vector that describes the object contained in this image. 17 | 18 | Attributes 19 | ---------- 20 | tlwh : ndarray 21 | Bounding box in format `(top left x, top left y, width, height)`. 22 | confidence : float 23 | Detector confidence score. 24 | feature : ndarray | NoneType 25 | A feature vector that describes the object contained in this image. 26 | 27 | """ 28 | 29 | def __init__(self, tlwh, confidence, feature, clses): 30 | self.tlwh = np.asarray(tlwh, dtype=float) 31 | self.confidence = float(confidence) 32 | self.feature = np.asarray(feature, dtype=np.float32) 33 | #self.clses = np.asarray(clses, dtype=int) 34 | self.clses = int(clses) 35 | 36 | def to_tlbr(self): 37 | """Convert bounding box to format `(min x, min y, max x, max y)`, i.e., 38 | `(top left, bottom right)`. 39 | """ 40 | ret = self.tlwh.copy() 41 | ret[2:] += ret[:2] 42 | return ret 43 | 44 | def to_xyah(self): 45 | """Convert bounding box to format `(center x, center y, aspect ratio, 46 | height)`, where the aspect ratio is `width / height`. 47 | """ 48 | ret = self.tlwh.copy() 49 | ret[:2] += ret[2:] / 2 50 | ret[2] /= ret[3] 51 | return ret 52 | -------------------------------------------------------------------------------- /deep_sort/deep_sort/sort/iou_matching.py: -------------------------------------------------------------------------------- 1 | # vim: expandtab:ts=4:sw=4 2 | from __future__ import absolute_import 3 | import numpy as np 4 | from . import linear_assignment 5 | 6 | 7 | def iou(bbox, candidates): 8 | """Compute intersection over union. 9 | 10 | Parameters 11 | ---------- 12 | bbox : ndarray 13 | A bounding box in format `(top left x, top left y, width, height)`. 14 | candidates : ndarray 15 | A matrix of candidate bounding boxes (one per row) in the same format 16 | as `bbox`.
17 | 18 | Returns 19 | ------- 20 | ndarray 21 | The intersection over union in [0, 1] between the `bbox` and each 22 | candidate. A higher score means a larger fraction of the `bbox` is 23 | occluded by the candidate. 24 | 25 | """ 26 | bbox_tl, bbox_br = bbox[:2], bbox[:2] + bbox[2:] 27 | candidates_tl = candidates[:, :2] 28 | candidates_br = candidates[:, :2] + candidates[:, 2:] 29 | 30 | tl = np.c_[np.maximum(bbox_tl[0], candidates_tl[:, 0])[:, np.newaxis], 31 | np.maximum(bbox_tl[1], candidates_tl[:, 1])[:, np.newaxis]] 32 | br = np.c_[np.minimum(bbox_br[0], candidates_br[:, 0])[:, np.newaxis], 33 | np.minimum(bbox_br[1], candidates_br[:, 1])[:, np.newaxis]] 34 | wh = np.maximum(0., br - tl) 35 | 36 | area_intersection = wh.prod(axis=1) 37 | area_bbox = bbox[2:].prod() 38 | area_candidates = candidates[:, 2:].prod(axis=1) 39 | return area_intersection / (area_bbox + area_candidates - area_intersection) 40 | 41 | 42 | def iou_cost(tracks, detections, track_indices=None, 43 | detection_indices=None): 44 | """An intersection over union distance metric. 45 | 46 | Parameters 47 | ---------- 48 | tracks : List[deep_sort.track.Track] 49 | A list of tracks. 50 | detections : List[deep_sort.detection.Detection] 51 | A list of detections. 52 | track_indices : Optional[List[int]] 53 | A list of indices to tracks that should be matched. Defaults to 54 | all `tracks`. 55 | detection_indices : Optional[List[int]] 56 | A list of indices to detections that should be matched. Defaults 57 | to all `detections`. 58 | 59 | Returns 60 | ------- 61 | ndarray 62 | Returns a cost matrix of shape 63 | len(track_indices), len(detection_indices) where entry (i, j) is 64 | `1 - iou(tracks[track_indices[i]], detections[detection_indices[j]])`. 
65 | 66 | """ 67 | if track_indices is None: 68 | track_indices = np.arange(len(tracks)) 69 | if detection_indices is None: 70 | detection_indices = np.arange(len(detections)) 71 | 72 | cost_matrix = np.zeros((len(track_indices), len(detection_indices))) 73 | for row, track_idx in enumerate(track_indices): 74 | if tracks[track_idx].time_since_update > 1: 75 | cost_matrix[row, :] = linear_assignment.INFTY_COST 76 | continue 77 | 78 | bbox = tracks[track_idx].to_tlwh() 79 | candidates = np.asarray([detections[i].tlwh for i in detection_indices]) 80 | cost_matrix[row, :] = 1. - iou(bbox, candidates) 81 | return cost_matrix 82 | -------------------------------------------------------------------------------- /deep_sort/deep_sort/sort/nn_matching.py: -------------------------------------------------------------------------------- 1 | # vim: expandtab:ts=4:sw=4 2 | import numpy as np 3 | 4 | 5 | def _pdist(a, b): 6 | """Compute pair-wise squared distance between points in `a` and `b`. 7 | 8 | Parameters 9 | ---------- 10 | a : array_like 11 | An NxM matrix of N samples of dimensionality M. 12 | b : array_like 13 | An LxM matrix of L samples of dimensionality M. 14 | 15 | Returns 16 | ------- 17 | ndarray 18 | Returns a matrix of size len(a), len(b) such that element (i, j) 19 | contains the squared distance between `a[i]` and `b[j]`. 20 | 21 | """ 22 | a, b = np.asarray(a), np.asarray(b) 23 | if len(a) == 0 or len(b) == 0: 24 | return np.zeros((len(a), len(b))) 25 | a2, b2 = np.square(a).sum(axis=1), np.square(b).sum(axis=1) 26 | r2 = -2. * np.dot(a, b.T) + a2[:, None] + b2[None, :] 27 | r2 = np.clip(r2, 0., float(np.inf)) 28 | return r2 29 | 30 | 31 | def _cosine_distance(a, b, data_is_normalized=False): 32 | """Compute pair-wise cosine distance between points in `a` and `b`. 33 | 34 | Parameters 35 | ---------- 36 | a : array_like 37 | An NxM matrix of N samples of dimensionality M. 38 | b : array_like 39 | An LxM matrix of L samples of dimensionality M.
40 | data_is_normalized : Optional[bool] 41 | If True, assumes rows in a and b are unit length vectors. 42 | Otherwise, a and b are explicitly normalized to length 1. 43 | 44 | Returns 45 | ------- 46 | ndarray 47 | Returns a matrix of size len(a), len(b) such that element (i, j) 48 | contains the cosine distance between `a[i]` and `b[j]`. 49 | 50 | """ 51 | if not data_is_normalized: 52 | a = np.asarray(a) / np.linalg.norm(a, axis=1, keepdims=True) 53 | b = np.asarray(b) / np.linalg.norm(b, axis=1, keepdims=True) 54 | return 1. - np.dot(a, b.T) 55 | 56 | 57 | def _nn_euclidean_distance(x, y): 58 | """ Helper function for nearest neighbor distance metric (Euclidean). 59 | 60 | Parameters 61 | ---------- 62 | x : ndarray 63 | A matrix of N row-vectors (sample points). 64 | y : ndarray 65 | A matrix of M row-vectors (query points). 66 | 67 | Returns 68 | ------- 69 | ndarray 70 | A vector of length M that contains for each entry in `y` the 71 | smallest Euclidean distance to a sample in `x`. 72 | 73 | """ 74 | distances = _pdist(x, y) 75 | return np.maximum(0.0, distances.min(axis=0)) 76 | 77 | 78 | def _nn_cosine_distance(x, y): 79 | """ Helper function for nearest neighbor distance metric (cosine). 80 | 81 | Parameters 82 | ---------- 83 | x : ndarray 84 | A matrix of N row-vectors (sample points). 85 | y : ndarray 86 | A matrix of M row-vectors (query points). 87 | 88 | Returns 89 | ------- 90 | ndarray 91 | A vector of length M that contains for each entry in `y` the 92 | smallest cosine distance to a sample in `x`. 93 | 94 | """ 95 | distances = _cosine_distance(x, y) 96 | return distances.min(axis=0) 97 | 98 | 99 | class NearestNeighborDistanceMetric(object): 100 | """ 101 | A nearest neighbor distance metric that, for each target, returns 102 | the closest distance to any sample that has been observed so far. 103 | 104 | Parameters 105 | ---------- 106 | metric : str 107 | Either "euclidean" or "cosine".
108 | matching_threshold: float 109 | The matching threshold. Samples with larger distance are considered an 110 | invalid match. 111 | budget : Optional[int] 112 | If not None, fix samples per class to at most this number. Removes 113 | the oldest samples when the budget is reached. 114 | 115 | Attributes 116 | ---------- 117 | samples : Dict[int -> List[ndarray]] 118 | A dictionary that maps from target identities to the list of samples 119 | that have been observed so far. 120 | 121 | """ 122 | 123 | def __init__(self, metric, matching_threshold, budget=None): 124 | 125 | 126 | if metric == "euclidean": 127 | self._metric = _nn_euclidean_distance 128 | elif metric == "cosine": 129 | self._metric = _nn_cosine_distance 130 | else: 131 | raise ValueError( 132 | "Invalid metric; must be either 'euclidean' or 'cosine'") 133 | self.matching_threshold = matching_threshold 134 | self.budget = budget 135 | self.samples = {} 136 | 137 | def partial_fit(self, features, targets, active_targets): 138 | """Update the distance metric with new data. 139 | 140 | Parameters 141 | ---------- 142 | features : ndarray 143 | An NxM matrix of N features of dimensionality M. 144 | targets : ndarray 145 | An integer array of associated target identities. 146 | active_targets : List[int] 147 | A list of targets that are currently present in the scene. 148 | 149 | """ 150 | for feature, target in zip(features, targets): 151 | self.samples.setdefault(target, []).append(feature) 152 | if self.budget is not None: 153 | self.samples[target] = self.samples[target][-self.budget:] 154 | self.samples = {k: self.samples[k] for k in active_targets} 155 | 156 | def distance(self, features, targets): 157 | """Compute distance between features and targets. 158 | 159 | Parameters 160 | ---------- 161 | features : ndarray 162 | An NxM matrix of N features of dimensionality M. 163 | targets : List[int] 164 | A list of targets to match the given `features` against. 
165 | 166 | Returns 167 | ------- 168 | ndarray 169 | Returns a cost matrix of shape len(targets), len(features), where 170 | element (i, j) contains the smallest distance between 171 | `targets[i]` and `features[j]`. 172 | 173 | """ 174 | cost_matrix = np.zeros((len(targets), len(features))) 175 | for i, target in enumerate(targets): 176 | cost_matrix[i, :] = self._metric(self.samples[target], features) 177 | return cost_matrix 178 | -------------------------------------------------------------------------------- /deep_sort/deep_sort/sort/preprocessing.py: -------------------------------------------------------------------------------- 1 | # vim: expandtab:ts=4:sw=4 2 | import numpy as np 3 | import cv2 4 | 5 | 6 | def non_max_suppression(boxes, max_bbox_overlap, scores=None): 7 | """Suppress overlapping detections. 8 | 9 | Original code from [1]_ has been adapted to include confidence score. 10 | 11 | .. [1] http://www.pyimagesearch.com/2015/02/16/ 12 | faster-non-maximum-suppression-python/ 13 | 14 | Examples 15 | -------- 16 | 17 | >>> boxes = [d.roi for d in detections] 18 | >>> scores = [d.confidence for d in detections] 19 | >>> indices = non_max_suppression(boxes, max_bbox_overlap, scores) 20 | >>> detections = [detections[i] for i in indices] 21 | 22 | Parameters 23 | ---------- 24 | boxes : ndarray 25 | Array of ROIs (x, y, width, height). 26 | max_bbox_overlap : float 27 | ROIs that overlap more than this value are suppressed. 28 | scores : Optional[array_like] 29 | Detector confidence score. 30 | 31 | Returns 32 | ------- 33 | List[int] 34 | Returns indices of detections that have survived non-maxima suppression.
35 | 36 | """ 37 | if len(boxes) == 0: 38 | return [] 39 | 40 | boxes = boxes.astype(float) 41 | pick = [] 42 | 43 | x1 = boxes[:, 0] 44 | y1 = boxes[:, 1] 45 | x2 = boxes[:, 2] + boxes[:, 0] 46 | y2 = boxes[:, 3] + boxes[:, 1] 47 | 48 | area = (x2 - x1 + 1) * (y2 - y1 + 1) 49 | if scores is not None: 50 | idxs = np.argsort(scores) 51 | else: 52 | idxs = np.argsort(y2) 53 | 54 | while len(idxs) > 0: 55 | last = len(idxs) - 1 56 | i = idxs[last] 57 | pick.append(i) 58 | 59 | xx1 = np.maximum(x1[i], x1[idxs[:last]]) 60 | yy1 = np.maximum(y1[i], y1[idxs[:last]]) 61 | xx2 = np.minimum(x2[i], x2[idxs[:last]]) 62 | yy2 = np.minimum(y2[i], y2[idxs[:last]]) 63 | 64 | w = np.maximum(0, xx2 - xx1 + 1) 65 | h = np.maximum(0, yy2 - yy1 + 1) 66 | 67 | overlap = (w * h) / area[idxs[:last]] 68 | 69 | idxs = np.delete( 70 | idxs, np.concatenate( 71 | ([last], np.where(overlap > max_bbox_overlap)[0]))) 72 | 73 | return pick 74 | -------------------------------------------------------------------------------- /deep_sort/deep_sort/sort/track.py: -------------------------------------------------------------------------------- 1 | # vim: expandtab:ts=4:sw=4 2 | 3 | import datetime 4 | 5 | class TrackState: 6 | """ 7 | Enumeration type for the single target track state. Newly created tracks are 8 | classified as `tentative` until enough evidence has been collected. Then, 9 | the track state is changed to `confirmed`. Tracks that are no longer alive 10 | are classified as `deleted` to mark them for removal from the set of active 11 | tracks. 12 | 13 | """ 14 | 15 | Tentative = 1 16 | Confirmed = 2 17 | Deleted = 3 18 | 19 | 20 | class Track: 21 | """ 22 | A single target track with state space `(x, y, a, h)` and associated 23 | velocities, where `(x, y)` is the center of the bounding box, `a` is the 24 | aspect ratio and `h` is the height. 25 | 26 | Parameters 27 | ---------- 28 | mean : ndarray 29 | Mean vector of the initial state distribution.
30 | covariance : ndarray 31 | Covariance matrix of the initial state distribution. 32 | track_id : int 33 | A unique track identifier. 34 | n_init : int 35 | Number of consecutive detections before the track is confirmed. The 36 | track state is set to `Deleted` if a miss occurs within the first 37 | `n_init` frames. 38 | max_age : int 39 | The maximum number of consecutive misses before the track state is 40 | set to `Deleted`. 41 | feature : Optional[ndarray] 42 | Feature vector of the detection this track originates from. If not None, 43 | this feature is added to the `features` cache. 44 | 45 | Attributes 46 | ---------- 47 | mean : ndarray 48 | Mean vector of the initial state distribution. 49 | covariance : ndarray 50 | Covariance matrix of the initial state distribution. 51 | track_id : int 52 | A unique track identifier. 53 | hits : int 54 | Total number of measurement updates. 55 | age : int 56 | Total number of frames since first occurrence. 57 | time_since_update : int 58 | Total number of frames since last measurement update. 59 | state : TrackState 60 | The current track state. 61 | features : List[ndarray] 62 | A cache of features. On each measurement update, the associated feature 63 | vector is added to this list. 64 | 65 | """ 66 | 67 | def __init__(self, mean, covariance, track_id, n_init, max_age, cls, confidence, 68 | feature=None): 69 | self.mean = mean 70 | self.covariance = covariance 71 | self.track_id = track_id 72 | self.hits = 1 73 | self.age = 1 74 | self.time_since_update = 0 75 | self.cls = cls 76 | self.score = confidence 77 | self.start_time = datetime.datetime.now() 78 | 79 | self.state = TrackState.Tentative 80 | self.features = [] 81 | if feature is not None: 82 | self.features.append(feature) 83 | 84 | self._n_init = n_init 85 | self._max_age = max_age 86 | 87 | def to_tlwh(self): 88 | """Get current position in bounding box format `(top left x, top left y, 89 | width, height)`.
90 | 91 | Returns 92 | ------- 93 | ndarray 94 | The bounding box. 95 | 96 | """ 97 | ret = self.mean[:4].copy() 98 | ret[2] *= ret[3] 99 | ret[:2] -= ret[2:] / 2 100 | return ret 101 | 102 | def to_tlbr(self): 103 | """Get current position in bounding box format `(min x, min y, max x, 104 | max y)`. 105 | 106 | Returns 107 | ------- 108 | ndarray 109 | The bounding box. 110 | 111 | """ 112 | ret = self.to_tlwh() 113 | ret[2:] = ret[:2] + ret[2:] 114 | return ret 115 | 116 | def predict(self, kf): 117 | """Propagate the state distribution to the current time step using a 118 | Kalman filter prediction step. 119 | 120 | Parameters 121 | ---------- 122 | kf : kalman_filter.KalmanFilter 123 | The Kalman filter. 124 | 125 | """ 126 | self.mean, self.covariance = kf.predict(self.mean, self.covariance) 127 | self.age += 1 128 | self.time_since_update += 1 129 | 130 | def update(self, kf, detection): 131 | """Perform Kalman filter measurement update step and update the feature 132 | cache. 133 | 134 | Parameters 135 | ---------- 136 | kf : kalman_filter.KalmanFilter 137 | The Kalman filter. 138 | detection : Detection 139 | The associated detection. 140 | 141 | """ 142 | self.mean, self.covariance = kf.update( 143 | self.mean, self.covariance, detection.to_xyah()) 144 | self.features.append(detection.feature) 145 | self.cls = detection.clses 146 | self.score = detection.confidence 147 | 148 | self.hits += 1 149 | self.time_since_update = 0 150 | if self.state == TrackState.Tentative and self.hits >= self._n_init: 151 | self.state = TrackState.Confirmed 152 | 153 | def mark_missed(self): 154 | """Mark this track as missed (no association at the current time step). 155 | """ 156 | if self.state == TrackState.Tentative: 157 | self.state = TrackState.Deleted 158 | elif self.time_since_update > self._max_age: 159 | self.state = TrackState.Deleted 160 | 161 | def is_tentative(self): 162 | """Returns True if this track is tentative (unconfirmed.
163 | """ 164 | return self.state == TrackState.Tentative 165 | 166 | def is_confirmed(self): 167 | """Returns True if this track is confirmed.""" 168 | return self.state == TrackState.Confirmed 169 | 170 | def is_deleted(self): 171 | """Returns True if this track is dead and should be deleted.""" 172 | return self.state == TrackState.Deleted 173 | -------------------------------------------------------------------------------- /deep_sort/deep_sort/sort/tracker.py: -------------------------------------------------------------------------------- 1 | # vim: expandtab:ts=4:sw=4 2 | from __future__ import absolute_import 3 | import numpy as np 4 | from . import kalman_filter 5 | from . import linear_assignment 6 | from . import iou_matching 7 | from .track import Track 8 | 9 | 10 | class Tracker: 11 | """ 12 | This is the multi-target tracker. 13 | 14 | Parameters 15 | ---------- 16 | metric : nn_matching.NearestNeighborDistanceMetric 17 | A distance metric for measurement-to-track association. 18 | max_age : int 19 | Maximum number of consecutive misses before a track is deleted. 20 | n_init : int 21 | Number of consecutive detections before the track is confirmed. The 22 | track state is set to `Deleted` if a miss occurs within the first 23 | `n_init` frames. 24 | 25 | Attributes 26 | ---------- 27 | metric : nn_matching.NearestNeighborDistanceMetric 28 | The distance metric used for measurement to track association. 29 | max_age : int 30 | Maximum number of consecutive misses before a track is deleted. 31 | n_init : int 32 | Number of frames that a track remains in initialization phase. 33 | kf : kalman_filter.KalmanFilter 34 | A Kalman filter to filter target trajectories in image space. 35 | tracks : List[Track] 36 | The list of active tracks at the current time step.
37 | 38 | """ 39 | 40 | def __init__(self, metric, max_iou_distance=0.7, max_age=70, n_init=3): 41 | self.metric = metric 42 | self.max_iou_distance = max_iou_distance 43 | self.max_age = max_age 44 | self.n_init = n_init 45 | 46 | self.kf = kalman_filter.KalmanFilter() 47 | self.tracks = [] 48 | self._next_id = 1 49 | 50 | def predict(self): 51 | """Propagate track state distributions one time step forward. 52 | 53 | This function should be called once every time step, before `update`. 54 | """ 55 | for track in self.tracks: 56 | track.predict(self.kf) 57 | 58 | def update(self, detections): 59 | """Perform measurement update and track management. 60 | 61 | Parameters 62 | ---------- 63 | detections : List[deep_sort.detection.Detection] 64 | A list of detections at the current time step. 65 | 66 | """ 67 | # Run matching cascade. 68 | matches, unmatched_tracks, unmatched_detections = \ 69 | self._match(detections) 70 | 71 | # Update track set. 72 | for track_idx, detection_idx in matches: 73 | self.tracks[track_idx].update( 74 | self.kf, detections[detection_idx]) 75 | for track_idx in unmatched_tracks: 76 | self.tracks[track_idx].mark_missed() 77 | for detection_idx in unmatched_detections: 78 | # initiate each unmatched detection exactly once; a duplicated call here created two tracks per detection 79 | self._initiate_track(detections[detection_idx]) 80 | self.tracks = [t for t in self.tracks if not t.is_deleted()] 81 | 82 | # Update distance metric.
83 | active_targets = [t.track_id for t in self.tracks if t.is_confirmed()] 84 | features, targets = [], [] 85 | for track in self.tracks: 86 | if not track.is_confirmed(): 87 | continue 88 | features += track.features 89 | targets += [track.track_id for _ in track.features] 90 | track.features = [] 91 | self.metric.partial_fit( 92 | np.asarray(features), np.asarray(targets), active_targets) 93 | 94 | def _match(self, detections): 95 | 96 | def gated_metric(tracks, dets, track_indices, detection_indices): 97 | features = np.array([dets[i].feature for i in detection_indices]) 98 | targets = np.array([tracks[i].track_id for i in track_indices]) 99 | cost_matrix = self.metric.distance(features, targets) 100 | cost_matrix = linear_assignment.gate_cost_matrix( 101 | self.kf, cost_matrix, tracks, dets, track_indices, 102 | detection_indices) 103 | 104 | return cost_matrix 105 | 106 | # Split track set into confirmed and unconfirmed tracks. 107 | confirmed_tracks = [ 108 | i for i, t in enumerate(self.tracks) if t.is_confirmed()] 109 | unconfirmed_tracks = [ 110 | i for i, t in enumerate(self.tracks) if not t.is_confirmed()] 111 | 112 | # Associate confirmed tracks using appearance features. 113 | matches_a, unmatched_tracks_a, unmatched_detections = \ 114 | linear_assignment.matching_cascade( 115 | gated_metric, self.metric.matching_threshold, self.max_age, 116 | self.tracks, detections, confirmed_tracks) 117 | 118 | # Associate remaining tracks together with unconfirmed tracks using IOU. 
119 | iou_track_candidates = unconfirmed_tracks + [ 120 | k for k in unmatched_tracks_a if 121 | self.tracks[k].time_since_update == 1] 122 | unmatched_tracks_a = [ 123 | k for k in unmatched_tracks_a if 124 | self.tracks[k].time_since_update != 1] 125 | matches_b, unmatched_tracks_b, unmatched_detections = \ 126 | linear_assignment.min_cost_matching( 127 | iou_matching.iou_cost, self.max_iou_distance, self.tracks, 128 | detections, iou_track_candidates, unmatched_detections) 129 | 130 | matches = matches_a + matches_b 131 | unmatched_tracks = list(set(unmatched_tracks_a + unmatched_tracks_b)) 132 | return matches, unmatched_tracks, unmatched_detections 133 | 134 | def _initiate_track(self, detection): 135 | mean, covariance = self.kf.initiate(detection.to_xyah()) 136 | cls = detection.clses 137 | score = detection.confidence 138 | self.tracks.append(Track( 139 | mean, covariance, self._next_id, self.n_init, self.max_age, cls, score, 140 | detection.feature)) 141 | self._next_id += 1 142 | return (self._next_id-1) 143 | -------------------------------------------------------------------------------- /deep_sort/demo/1.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/WuPedin/Multi-class_Yolov5_DeepSort_Pytorch/507e0cec465fa2e01d88827abfa88708c7250392/deep_sort/demo/1.jpg -------------------------------------------------------------------------------- /deep_sort/demo/2.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/WuPedin/Multi-class_Yolov5_DeepSort_Pytorch/507e0cec465fa2e01d88827abfa88708c7250392/deep_sort/demo/2.jpg -------------------------------------------------------------------------------- /deep_sort/demo/demo.gif: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/WuPedin/Multi-class_Yolov5_DeepSort_Pytorch/507e0cec465fa2e01d88827abfa88708c7250392/deep_sort/demo/demo.gif -------------------------------------------------------------------------------- /deep_sort/detector/YOLOv3/README.md: -------------------------------------------------------------------------------- 1 | # YOLOv3 for detection 2 | 3 | This is an implementation of YOLOv3 with only the forward (inference) pass. 4 | 5 | If you want to train YOLOv3 on your custom dataset, please search for `YOLOv3` on GitHub. 6 | 7 | ## Quick forward 8 | ```bash 9 | cd YOLOv3 10 | python 11 | ``` -------------------------------------------------------------------------------- /deep_sort/detector/YOLOv3/__init__.py: -------------------------------------------------------------------------------- 1 | import sys 2 | sys.path.append("detector/YOLOv3") 3 | 4 | 5 | from .detector import YOLOv3 6 | __all__ = ['YOLOv3'] 7 | 8 | 9 | 10 | -------------------------------------------------------------------------------- /deep_sort/detector/YOLOv3/cfg/coco.data: -------------------------------------------------------------------------------- 1 | train = coco_train.txt 2 | valid = coco_test.txt 3 | names = data/coco.names 4 | backup = backup 5 | gpus = 0,1,2,3 6 | -------------------------------------------------------------------------------- /deep_sort/detector/YOLOv3/cfg/coco.names: -------------------------------------------------------------------------------- 1 | person 2 | bicycle 3 | car 4 | motorbike 5 | aeroplane 6 | bus 7 | train 8 | truck 9 | boat 10 | traffic light 11 | fire hydrant 12 | stop sign 13 | parking meter 14 | bench 15 | bird 16 | cat 17 | dog 18 | horse 19 | sheep 20 | cow 21 | elephant 22 | bear 23 | zebra 24 | giraffe 25 | backpack 26 | umbrella 27 | handbag 28 | tie 29 | suitcase 30 | frisbee 31 | skis 32 | snowboard 33 | sports ball 34 | kite 35 | baseball bat 36 | baseball glove 37 | skateboard 38 | surfboard 39 | tennis racket 40 | bottle 41 |
wine glass 42 | cup 43 | fork 44 | knife 45 | spoon 46 | bowl 47 | banana 48 | apple 49 | sandwich 50 | orange 51 | broccoli 52 | carrot 53 | hot dog 54 | pizza 55 | donut 56 | cake 57 | chair 58 | sofa 59 | pottedplant 60 | bed 61 | diningtable 62 | toilet 63 | tvmonitor 64 | laptop 65 | mouse 66 | remote 67 | keyboard 68 | cell phone 69 | microwave 70 | oven 71 | toaster 72 | sink 73 | refrigerator 74 | book 75 | clock 76 | vase 77 | scissors 78 | teddy bear 79 | hair drier 80 | toothbrush 81 | -------------------------------------------------------------------------------- /deep_sort/detector/YOLOv3/cfg/darknet19_448.cfg: -------------------------------------------------------------------------------- 1 | [net] 2 | batch=128 3 | subdivisions=4 4 | height=448 5 | width=448 6 | max_crop=512 7 | channels=3 8 | momentum=0.9 9 | decay=0.0005 10 | 11 | learning_rate=0.001 12 | policy=poly 13 | power=4 14 | max_batches=100000 15 | 16 | angle=7 17 | hue = .1 18 | saturation=.75 19 | exposure=.75 20 | aspect=.75 21 | 22 | [convolutional] 23 | batch_normalize=1 24 | filters=32 25 | size=3 26 | stride=1 27 | pad=1 28 | activation=leaky 29 | 30 | [maxpool] 31 | size=2 32 | stride=2 33 | 34 | [convolutional] 35 | batch_normalize=1 36 | filters=64 37 | size=3 38 | stride=1 39 | pad=1 40 | activation=leaky 41 | 42 | [maxpool] 43 | size=2 44 | stride=2 45 | 46 | [convolutional] 47 | batch_normalize=1 48 | filters=128 49 | size=3 50 | stride=1 51 | pad=1 52 | activation=leaky 53 | 54 | [convolutional] 55 | batch_normalize=1 56 | filters=64 57 | size=1 58 | stride=1 59 | pad=1 60 | activation=leaky 61 | 62 | [convolutional] 63 | batch_normalize=1 64 | filters=128 65 | size=3 66 | stride=1 67 | pad=1 68 | activation=leaky 69 | 70 | [maxpool] 71 | size=2 72 | stride=2 73 | 74 | [convolutional] 75 | batch_normalize=1 76 | filters=256 77 | size=3 78 | stride=1 79 | pad=1 80 | activation=leaky 81 | 82 | [convolutional] 83 | batch_normalize=1 84 | filters=128 85 | size=1 86 | stride=1 
87 | pad=1 88 | activation=leaky 89 | 90 | [convolutional] 91 | batch_normalize=1 92 | filters=256 93 | size=3 94 | stride=1 95 | pad=1 96 | activation=leaky 97 | 98 | [maxpool] 99 | size=2 100 | stride=2 101 | 102 | [convolutional] 103 | batch_normalize=1 104 | filters=512 105 | size=3 106 | stride=1 107 | pad=1 108 | activation=leaky 109 | 110 | [convolutional] 111 | batch_normalize=1 112 | filters=256 113 | size=1 114 | stride=1 115 | pad=1 116 | activation=leaky 117 | 118 | [convolutional] 119 | batch_normalize=1 120 | filters=512 121 | size=3 122 | stride=1 123 | pad=1 124 | activation=leaky 125 | 126 | [convolutional] 127 | batch_normalize=1 128 | filters=256 129 | size=1 130 | stride=1 131 | pad=1 132 | activation=leaky 133 | 134 | [convolutional] 135 | batch_normalize=1 136 | filters=512 137 | size=3 138 | stride=1 139 | pad=1 140 | activation=leaky 141 | 142 | [maxpool] 143 | size=2 144 | stride=2 145 | 146 | [convolutional] 147 | batch_normalize=1 148 | filters=1024 149 | size=3 150 | stride=1 151 | pad=1 152 | activation=leaky 153 | 154 | [convolutional] 155 | batch_normalize=1 156 | filters=512 157 | size=1 158 | stride=1 159 | pad=1 160 | activation=leaky 161 | 162 | [convolutional] 163 | batch_normalize=1 164 | filters=1024 165 | size=3 166 | stride=1 167 | pad=1 168 | activation=leaky 169 | 170 | [convolutional] 171 | batch_normalize=1 172 | filters=512 173 | size=1 174 | stride=1 175 | pad=1 176 | activation=leaky 177 | 178 | [convolutional] 179 | batch_normalize=1 180 | filters=1024 181 | size=3 182 | stride=1 183 | pad=1 184 | activation=leaky 185 | 186 | [convolutional] 187 | filters=1000 188 | size=1 189 | stride=1 190 | pad=1 191 | activation=linear 192 | 193 | [avgpool] 194 | 195 | [softmax] 196 | groups=1 197 | 198 | [cost] 199 | type=sse 200 | 201 | -------------------------------------------------------------------------------- /deep_sort/detector/YOLOv3/cfg/tiny-yolo-voc.cfg: 
-------------------------------------------------------------------------------- 1 | [net] 2 | batch=64 3 | subdivisions=8 4 | width=416 5 | height=416 6 | channels=3 7 | momentum=0.9 8 | decay=0.0005 9 | angle=0 10 | saturation = 1.5 11 | exposure = 1.5 12 | hue=.1 13 | 14 | learning_rate=0.001 15 | max_batches = 40200 16 | policy=steps 17 | steps=-1,100,20000,30000 18 | scales=.1,10,.1,.1 19 | 20 | [convolutional] 21 | batch_normalize=1 22 | filters=16 23 | size=3 24 | stride=1 25 | pad=1 26 | activation=leaky 27 | 28 | [maxpool] 29 | size=2 30 | stride=2 31 | 32 | [convolutional] 33 | batch_normalize=1 34 | filters=32 35 | size=3 36 | stride=1 37 | pad=1 38 | activation=leaky 39 | 40 | [maxpool] 41 | size=2 42 | stride=2 43 | 44 | [convolutional] 45 | batch_normalize=1 46 | filters=64 47 | size=3 48 | stride=1 49 | pad=1 50 | activation=leaky 51 | 52 | [maxpool] 53 | size=2 54 | stride=2 55 | 56 | [convolutional] 57 | batch_normalize=1 58 | filters=128 59 | size=3 60 | stride=1 61 | pad=1 62 | activation=leaky 63 | 64 | [maxpool] 65 | size=2 66 | stride=2 67 | 68 | [convolutional] 69 | batch_normalize=1 70 | filters=256 71 | size=3 72 | stride=1 73 | pad=1 74 | activation=leaky 75 | 76 | [maxpool] 77 | size=2 78 | stride=2 79 | 80 | [convolutional] 81 | batch_normalize=1 82 | filters=512 83 | size=3 84 | stride=1 85 | pad=1 86 | activation=leaky 87 | 88 | [maxpool] 89 | size=2 90 | stride=1 91 | 92 | [convolutional] 93 | batch_normalize=1 94 | filters=1024 95 | size=3 96 | stride=1 97 | pad=1 98 | activation=leaky 99 | 100 | ########### 101 | 102 | [convolutional] 103 | batch_normalize=1 104 | size=3 105 | stride=1 106 | pad=1 107 | filters=1024 108 | activation=leaky 109 | 110 | [convolutional] 111 | size=1 112 | stride=1 113 | pad=1 114 | filters=125 115 | activation=linear 116 | 117 | [region] 118 | anchors = 1.08,1.19, 3.42,4.41, 6.63,11.38, 9.42,5.11, 16.62,10.52 119 | bias_match=1 120 | classes=20 121 | coords=4 122 | num=5 123 | softmax=1 124 | jitter=.2 
125 | rescore=1 126 | 127 | object_scale=5 128 | noobject_scale=1 129 | class_scale=1 130 | coord_scale=1 131 | 132 | absolute=1 133 | thresh = .6 134 | random=1 135 | -------------------------------------------------------------------------------- /deep_sort/detector/YOLOv3/cfg/tiny-yolo.cfg: -------------------------------------------------------------------------------- 1 | [net] 2 | # Training 3 | # batch=64 4 | # subdivisions=2 5 | # Testing 6 | batch=1 7 | subdivisions=1 8 | width=416 9 | height=416 10 | channels=3 11 | momentum=0.9 12 | decay=0.0005 13 | angle=0 14 | saturation = 1.5 15 | exposure = 1.5 16 | hue=.1 17 | 18 | learning_rate=0.001 19 | burn_in=1000 20 | max_batches = 500200 21 | policy=steps 22 | steps=400000,450000 23 | scales=.1,.1 24 | 25 | [convolutional] 26 | batch_normalize=1 27 | filters=16 28 | size=3 29 | stride=1 30 | pad=1 31 | activation=leaky 32 | 33 | [maxpool] 34 | size=2 35 | stride=2 36 | 37 | [convolutional] 38 | batch_normalize=1 39 | filters=32 40 | size=3 41 | stride=1 42 | pad=1 43 | activation=leaky 44 | 45 | [maxpool] 46 | size=2 47 | stride=2 48 | 49 | [convolutional] 50 | batch_normalize=1 51 | filters=64 52 | size=3 53 | stride=1 54 | pad=1 55 | activation=leaky 56 | 57 | [maxpool] 58 | size=2 59 | stride=2 60 | 61 | [convolutional] 62 | batch_normalize=1 63 | filters=128 64 | size=3 65 | stride=1 66 | pad=1 67 | activation=leaky 68 | 69 | [maxpool] 70 | size=2 71 | stride=2 72 | 73 | [convolutional] 74 | batch_normalize=1 75 | filters=256 76 | size=3 77 | stride=1 78 | pad=1 79 | activation=leaky 80 | 81 | [maxpool] 82 | size=2 83 | stride=2 84 | 85 | [convolutional] 86 | batch_normalize=1 87 | filters=512 88 | size=3 89 | stride=1 90 | pad=1 91 | activation=leaky 92 | 93 | [maxpool] 94 | size=2 95 | stride=1 96 | 97 | [convolutional] 98 | batch_normalize=1 99 | filters=1024 100 | size=3 101 | stride=1 102 | pad=1 103 | activation=leaky 104 | 105 | ########### 106 | 107 | [convolutional] 108 | batch_normalize=1 109 | 
size=3 110 | stride=1 111 | pad=1 112 | filters=512 113 | activation=leaky 114 | 115 | [convolutional] 116 | size=1 117 | stride=1 118 | pad=1 119 | filters=425 120 | activation=linear 121 | 122 | [region] 123 | anchors = 0.57273, 0.677385, 1.87446, 2.06253, 3.33843, 5.47434, 7.88282, 3.52778, 9.77052, 9.16828 124 | bias_match=1 125 | classes=80 126 | coords=4 127 | num=5 128 | softmax=1 129 | jitter=.2 130 | rescore=0 131 | 132 | object_scale=5 133 | noobject_scale=1 134 | class_scale=1 135 | coord_scale=1 136 | 137 | absolute=1 138 | thresh = .6 139 | random=1 140 | 141 | -------------------------------------------------------------------------------- /deep_sort/detector/YOLOv3/cfg/voc.data: -------------------------------------------------------------------------------- 1 | train = data/voc_train.txt 2 | valid = data/2007_test.txt 3 | names = data/voc.names 4 | backup = backup 5 | gpus = 3 6 | -------------------------------------------------------------------------------- /deep_sort/detector/YOLOv3/cfg/voc.names: -------------------------------------------------------------------------------- 1 | aeroplane 2 | bicycle 3 | bird 4 | boat 5 | bottle 6 | bus 7 | car 8 | cat 9 | chair 10 | cow 11 | diningtable 12 | dog 13 | horse 14 | motorbike 15 | person 16 | pottedplant 17 | sheep 18 | sofa 19 | train 20 | tvmonitor 21 | -------------------------------------------------------------------------------- /deep_sort/detector/YOLOv3/cfg/voc_gaotie.data: -------------------------------------------------------------------------------- 1 | train = data/gaotie_trainval.txt 2 | valid = data/gaotie_test.txt 3 | names = data/voc.names 4 | backup = backup 5 | gpus = 3 -------------------------------------------------------------------------------- /deep_sort/detector/YOLOv3/cfg/yolo-voc.cfg: -------------------------------------------------------------------------------- 1 | [net] 2 | # Testing 3 | batch=64 4 | subdivisions=8 5 | # Training 6 | # batch=64 7 | # subdivisions=8 
8 | height=416 9 | width=416 10 | channels=3 11 | momentum=0.9 12 | decay=0.0005 13 | angle=0 14 | saturation = 1.5 15 | exposure = 1.5 16 | hue=.1 17 | 18 | learning_rate=0.001 19 | burn_in=1000 20 | max_batches = 80200 21 | policy=steps 22 | steps=-1,500,40000,60000 23 | scales=0.1,10,.1,.1 24 | 25 | [convolutional] 26 | batch_normalize=1 27 | filters=32 28 | size=3 29 | stride=1 30 | pad=1 31 | activation=leaky 32 | 33 | [maxpool] 34 | size=2 35 | stride=2 36 | 37 | [convolutional] 38 | batch_normalize=1 39 | filters=64 40 | size=3 41 | stride=1 42 | pad=1 43 | activation=leaky 44 | 45 | [maxpool] 46 | size=2 47 | stride=2 48 | 49 | [convolutional] 50 | batch_normalize=1 51 | filters=128 52 | size=3 53 | stride=1 54 | pad=1 55 | activation=leaky 56 | 57 | [convolutional] 58 | batch_normalize=1 59 | filters=64 60 | size=1 61 | stride=1 62 | pad=1 63 | activation=leaky 64 | 65 | [convolutional] 66 | batch_normalize=1 67 | filters=128 68 | size=3 69 | stride=1 70 | pad=1 71 | activation=leaky 72 | 73 | [maxpool] 74 | size=2 75 | stride=2 76 | 77 | [convolutional] 78 | batch_normalize=1 79 | filters=256 80 | size=3 81 | stride=1 82 | pad=1 83 | activation=leaky 84 | 85 | [convolutional] 86 | batch_normalize=1 87 | filters=128 88 | size=1 89 | stride=1 90 | pad=1 91 | activation=leaky 92 | 93 | [convolutional] 94 | batch_normalize=1 95 | filters=256 96 | size=3 97 | stride=1 98 | pad=1 99 | activation=leaky 100 | 101 | [maxpool] 102 | size=2 103 | stride=2 104 | 105 | [convolutional] 106 | batch_normalize=1 107 | filters=512 108 | size=3 109 | stride=1 110 | pad=1 111 | activation=leaky 112 | 113 | [convolutional] 114 | batch_normalize=1 115 | filters=256 116 | size=1 117 | stride=1 118 | pad=1 119 | activation=leaky 120 | 121 | [convolutional] 122 | batch_normalize=1 123 | filters=512 124 | size=3 125 | stride=1 126 | pad=1 127 | activation=leaky 128 | 129 | [convolutional] 130 | batch_normalize=1 131 | filters=256 132 | size=1 133 | stride=1 134 | pad=1 135 | 
activation=leaky 136 | 137 | [convolutional] 138 | batch_normalize=1 139 | filters=512 140 | size=3 141 | stride=1 142 | pad=1 143 | activation=leaky 144 | 145 | [maxpool] 146 | size=2 147 | stride=2 148 | 149 | [convolutional] 150 | batch_normalize=1 151 | filters=1024 152 | size=3 153 | stride=1 154 | pad=1 155 | activation=leaky 156 | 157 | [convolutional] 158 | batch_normalize=1 159 | filters=512 160 | size=1 161 | stride=1 162 | pad=1 163 | activation=leaky 164 | 165 | [convolutional] 166 | batch_normalize=1 167 | filters=1024 168 | size=3 169 | stride=1 170 | pad=1 171 | activation=leaky 172 | 173 | [convolutional] 174 | batch_normalize=1 175 | filters=512 176 | size=1 177 | stride=1 178 | pad=1 179 | activation=leaky 180 | 181 | [convolutional] 182 | batch_normalize=1 183 | filters=1024 184 | size=3 185 | stride=1 186 | pad=1 187 | activation=leaky 188 | 189 | 190 | ####### 191 | 192 | [convolutional] 193 | batch_normalize=1 194 | size=3 195 | stride=1 196 | pad=1 197 | filters=1024 198 | activation=leaky 199 | 200 | [convolutional] 201 | batch_normalize=1 202 | size=3 203 | stride=1 204 | pad=1 205 | filters=1024 206 | activation=leaky 207 | 208 | [route] 209 | layers=-9 210 | 211 | [convolutional] 212 | batch_normalize=1 213 | size=1 214 | stride=1 215 | pad=1 216 | filters=64 217 | activation=leaky 218 | 219 | [reorg] 220 | stride=2 221 | 222 | [route] 223 | layers=-1,-4 224 | 225 | [convolutional] 226 | batch_normalize=1 227 | size=3 228 | stride=1 229 | pad=1 230 | filters=1024 231 | activation=leaky 232 | 233 | [convolutional] 234 | size=1 235 | stride=1 236 | pad=1 237 | filters=125 238 | activation=linear 239 | 240 | 241 | [region] 242 | anchors = 1.3221, 1.73145, 3.19275, 4.00944, 5.05587, 8.09892, 9.47112, 4.84053, 11.2364, 10.0071 243 | bias_match=1 244 | classes=20 245 | coords=4 246 | num=5 247 | softmax=1 248 | jitter=.3 249 | rescore=1 250 | 251 | object_scale=5 252 | noobject_scale=1 253 | class_scale=1 254 | coord_scale=1 255 | 256 | 
absolute=1 257 | thresh = .6 258 | random=1 259 | -------------------------------------------------------------------------------- /deep_sort/detector/YOLOv3/cfg/yolo.cfg: -------------------------------------------------------------------------------- 1 | [net] 2 | # Testing 3 | batch=1 4 | subdivisions=1 5 | # Training 6 | # batch=64 7 | # subdivisions=8 8 | width=416 9 | height=416 10 | channels=3 11 | momentum=0.9 12 | decay=0.0005 13 | angle=0 14 | saturation = 1.5 15 | exposure = 1.5 16 | hue=.1 17 | 18 | learning_rate=0.001 19 | burn_in=1000 20 | max_batches = 500200 21 | policy=steps 22 | steps=400000,450000 23 | scales=.1,.1 24 | 25 | [convolutional] 26 | batch_normalize=1 27 | filters=32 28 | size=3 29 | stride=1 30 | pad=1 31 | activation=leaky 32 | 33 | [maxpool] 34 | size=2 35 | stride=2 36 | 37 | [convolutional] 38 | batch_normalize=1 39 | filters=64 40 | size=3 41 | stride=1 42 | pad=1 43 | activation=leaky 44 | 45 | [maxpool] 46 | size=2 47 | stride=2 48 | 49 | [convolutional] 50 | batch_normalize=1 51 | filters=128 52 | size=3 53 | stride=1 54 | pad=1 55 | activation=leaky 56 | 57 | [convolutional] 58 | batch_normalize=1 59 | filters=64 60 | size=1 61 | stride=1 62 | pad=1 63 | activation=leaky 64 | 65 | [convolutional] 66 | batch_normalize=1 67 | filters=128 68 | size=3 69 | stride=1 70 | pad=1 71 | activation=leaky 72 | 73 | [maxpool] 74 | size=2 75 | stride=2 76 | 77 | [convolutional] 78 | batch_normalize=1 79 | filters=256 80 | size=3 81 | stride=1 82 | pad=1 83 | activation=leaky 84 | 85 | [convolutional] 86 | batch_normalize=1 87 | filters=128 88 | size=1 89 | stride=1 90 | pad=1 91 | activation=leaky 92 | 93 | [convolutional] 94 | batch_normalize=1 95 | filters=256 96 | size=3 97 | stride=1 98 | pad=1 99 | activation=leaky 100 | 101 | [maxpool] 102 | size=2 103 | stride=2 104 | 105 | [convolutional] 106 | batch_normalize=1 107 | filters=512 108 | size=3 109 | stride=1 110 | pad=1 111 | activation=leaky 112 | 113 | [convolutional] 114 | 
batch_normalize=1 115 | filters=256 116 | size=1 117 | stride=1 118 | pad=1 119 | activation=leaky 120 | 121 | [convolutional] 122 | batch_normalize=1 123 | filters=512 124 | size=3 125 | stride=1 126 | pad=1 127 | activation=leaky 128 | 129 | [convolutional] 130 | batch_normalize=1 131 | filters=256 132 | size=1 133 | stride=1 134 | pad=1 135 | activation=leaky 136 | 137 | [convolutional] 138 | batch_normalize=1 139 | filters=512 140 | size=3 141 | stride=1 142 | pad=1 143 | activation=leaky 144 | 145 | [maxpool] 146 | size=2 147 | stride=2 148 | 149 | [convolutional] 150 | batch_normalize=1 151 | filters=1024 152 | size=3 153 | stride=1 154 | pad=1 155 | activation=leaky 156 | 157 | [convolutional] 158 | batch_normalize=1 159 | filters=512 160 | size=1 161 | stride=1 162 | pad=1 163 | activation=leaky 164 | 165 | [convolutional] 166 | batch_normalize=1 167 | filters=1024 168 | size=3 169 | stride=1 170 | pad=1 171 | activation=leaky 172 | 173 | [convolutional] 174 | batch_normalize=1 175 | filters=512 176 | size=1 177 | stride=1 178 | pad=1 179 | activation=leaky 180 | 181 | [convolutional] 182 | batch_normalize=1 183 | filters=1024 184 | size=3 185 | stride=1 186 | pad=1 187 | activation=leaky 188 | 189 | 190 | ####### 191 | 192 | [convolutional] 193 | batch_normalize=1 194 | size=3 195 | stride=1 196 | pad=1 197 | filters=1024 198 | activation=leaky 199 | 200 | [convolutional] 201 | batch_normalize=1 202 | size=3 203 | stride=1 204 | pad=1 205 | filters=1024 206 | activation=leaky 207 | 208 | [route] 209 | layers=-9 210 | 211 | [convolutional] 212 | batch_normalize=1 213 | size=1 214 | stride=1 215 | pad=1 216 | filters=64 217 | activation=leaky 218 | 219 | [reorg] 220 | stride=2 221 | 222 | [route] 223 | layers=-1,-4 224 | 225 | [convolutional] 226 | batch_normalize=1 227 | size=3 228 | stride=1 229 | pad=1 230 | filters=1024 231 | activation=leaky 232 | 233 | [convolutional] 234 | size=1 235 | stride=1 236 | pad=1 237 | filters=425 238 | activation=linear 239 
| 240 | 241 | [region] 242 | anchors = 0.57273, 0.677385, 1.87446, 2.06253, 3.33843, 5.47434, 7.88282, 3.52778, 9.77052, 9.16828 243 | bias_match=1 244 | classes=80 245 | coords=4 246 | num=5 247 | softmax=1 248 | jitter=.3 249 | rescore=1 250 | 251 | object_scale=5 252 | noobject_scale=1 253 | class_scale=1 254 | coord_scale=1 255 | 256 | absolute=1 257 | thresh = .6 258 | random=1 259 | -------------------------------------------------------------------------------- /deep_sort/detector/YOLOv3/cfg/yolov3-tiny.cfg: -------------------------------------------------------------------------------- 1 | [net] 2 | # Testing 3 | batch=1 4 | subdivisions=1 5 | # Training 6 | # batch=64 7 | # subdivisions=2 8 | width=416 9 | height=416 10 | channels=3 11 | momentum=0.9 12 | decay=0.0005 13 | angle=0 14 | saturation = 1.5 15 | exposure = 1.5 16 | hue=.1 17 | 18 | learning_rate=0.001 19 | burn_in=1000 20 | max_batches = 500200 21 | policy=steps 22 | steps=400000,450000 23 | scales=.1,.1 24 | 25 | [convolutional] 26 | batch_normalize=1 27 | filters=16 28 | size=3 29 | stride=1 30 | pad=1 31 | activation=leaky 32 | 33 | [maxpool] 34 | size=2 35 | stride=2 36 | 37 | [convolutional] 38 | batch_normalize=1 39 | filters=32 40 | size=3 41 | stride=1 42 | pad=1 43 | activation=leaky 44 | 45 | [maxpool] 46 | size=2 47 | stride=2 48 | 49 | [convolutional] 50 | batch_normalize=1 51 | filters=64 52 | size=3 53 | stride=1 54 | pad=1 55 | activation=leaky 56 | 57 | [maxpool] 58 | size=2 59 | stride=2 60 | 61 | [convolutional] 62 | batch_normalize=1 63 | filters=128 64 | size=3 65 | stride=1 66 | pad=1 67 | activation=leaky 68 | 69 | [maxpool] 70 | size=2 71 | stride=2 72 | 73 | [convolutional] 74 | batch_normalize=1 75 | filters=256 76 | size=3 77 | stride=1 78 | pad=1 79 | activation=leaky 80 | 81 | [maxpool] 82 | size=2 83 | stride=2 84 | 85 | [convolutional] 86 | batch_normalize=1 87 | filters=512 88 | size=3 89 | stride=1 90 | pad=1 91 | activation=leaky 92 | 93 | [maxpool] 94 | size=2 
95 | stride=1 96 | 97 | [convolutional] 98 | batch_normalize=1 99 | filters=1024 100 | size=3 101 | stride=1 102 | pad=1 103 | activation=leaky 104 | 105 | ########### 106 | 107 | [convolutional] 108 | batch_normalize=1 109 | filters=256 110 | size=1 111 | stride=1 112 | pad=1 113 | activation=leaky 114 | 115 | [convolutional] 116 | batch_normalize=1 117 | filters=512 118 | size=3 119 | stride=1 120 | pad=1 121 | activation=leaky 122 | 123 | [convolutional] 124 | size=1 125 | stride=1 126 | pad=1 127 | filters=255 128 | activation=linear 129 | 130 | 131 | 132 | [yolo] 133 | mask = 3,4,5 134 | anchors = 10,14, 23,27, 37,58, 81,82, 135,169, 344,319 135 | classes=80 136 | num=6 137 | jitter=.3 138 | ignore_thresh = .7 139 | truth_thresh = 1 140 | random=1 141 | 142 | [route] 143 | layers = -4 144 | 145 | [convolutional] 146 | batch_normalize=1 147 | filters=128 148 | size=1 149 | stride=1 150 | pad=1 151 | activation=leaky 152 | 153 | [upsample] 154 | stride=2 155 | 156 | [route] 157 | layers = -1, 8 158 | 159 | [convolutional] 160 | batch_normalize=1 161 | filters=256 162 | size=3 163 | stride=1 164 | pad=1 165 | activation=leaky 166 | 167 | [convolutional] 168 | size=1 169 | stride=1 170 | pad=1 171 | filters=255 172 | activation=linear 173 | 174 | [yolo] 175 | mask = 0,1,2 176 | anchors = 10,14, 23,27, 37,58, 81,82, 135,169, 344,319 177 | classes=80 178 | num=6 179 | jitter=.3 180 | ignore_thresh = .7 181 | truth_thresh = 1 182 | random=1 183 | -------------------------------------------------------------------------------- /deep_sort/detector/YOLOv3/demo/004545.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/WuPedin/Multi-class_Yolov5_DeepSort_Pytorch/507e0cec465fa2e01d88827abfa88708c7250392/deep_sort/detector/YOLOv3/demo/004545.jpg -------------------------------------------------------------------------------- /deep_sort/detector/YOLOv3/demo/results/004545.jpg: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/WuPedin/Multi-class_Yolov5_DeepSort_Pytorch/507e0cec465fa2e01d88827abfa88708c7250392/deep_sort/detector/YOLOv3/demo/results/004545.jpg -------------------------------------------------------------------------------- /deep_sort/detector/YOLOv3/detect.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import time 3 | from PIL import Image, ImageDraw 4 | #from models.tiny_yolo import TinyYoloNet 5 | from yolo_utils import * 6 | from darknet import Darknet 7 | 8 | import cv2 9 | 10 | namesfile=None 11 | def detect(cfgfile, weightfile, imgfolder): 12 | m = Darknet(cfgfile) 13 | 14 | #m.print_network() 15 | m.load_weights(weightfile) 16 | print('Loading weights from %s... Done!' % (weightfile)) 17 | 18 | # if m.num_classes == 20: 19 | # namesfile = 'data/voc.names' 20 | # elif m.num_classes == 80: 21 | # namesfile = 'data/coco.names' 22 | # else: 23 | # namesfile = 'data/names' 24 | 25 | use_cuda = True 26 | if use_cuda: 27 | m.cuda() 28 | 29 | imgfiles = [x for x in os.listdir(imgfolder) if x[-4:] == '.jpg'] 30 | imgfiles.sort() 31 | for imgname in imgfiles: 32 | imgfile = os.path.join(imgfolder,imgname) 33 | 34 | img = Image.open(imgfile).convert('RGB') 35 | sized = img.resize((m.width, m.height)) 36 | 37 | #for i in range(2): 38 | start = time.time() 39 | boxes = do_detect(m, sized, 0.5, 0.4, use_cuda) 40 | finish = time.time() 41 | #if i == 1: 42 | print('%s: Predicted in %f seconds.' 
% (imgfile, (finish-start))) 43 | 44 | class_names = load_class_names(namesfile) 45 | img = plot_boxes(img, boxes, 'result/{}'.format(os.path.basename(imgfile)), class_names) 46 | img = np.array(img) 47 | cv2.imshow('{}'.format(os.path.basename(imgfolder)), img) 48 | cv2.resizeWindow('{}'.format(os.path.basename(imgfolder)), 1000,800) 49 | cv2.waitKey(1000) 50 | 51 | def detect_cv2(cfgfile, weightfile, imgfile): 52 | import cv2 53 | m = Darknet(cfgfile) 54 | 55 | m.print_network() 56 | m.load_weights(weightfile) 57 | print('Loading weights from %s... Done!' % (weightfile)) 58 | 59 | if m.num_classes == 20: 60 | namesfile = 'data/voc.names' 61 | elif m.num_classes == 80: 62 | namesfile = 'data/coco.names' 63 | else: 64 | namesfile = 'data/names' 65 | 66 | use_cuda = True 67 | if use_cuda: 68 | m.cuda() 69 | 70 | img = cv2.imread(imgfile) 71 | sized = cv2.resize(img, (m.width, m.height)) 72 | sized = cv2.cvtColor(sized, cv2.COLOR_BGR2RGB) 73 | 74 | for i in range(2): 75 | start = time.time() 76 | boxes = do_detect(m, sized, 0.5, 0.4, use_cuda) 77 | finish = time.time() 78 | if i == 1: 79 | print('%s: Predicted in %f seconds.' % (imgfile, (finish-start))) 80 | 81 | class_names = load_class_names(namesfile) 82 | plot_boxes_cv2(img, boxes, savename='predictions.jpg', class_names=class_names) 83 | 84 | def detect_skimage(cfgfile, weightfile, imgfile): 85 | from skimage import io 86 | from skimage.transform import resize 87 | m = Darknet(cfgfile) 88 | 89 | m.print_network() 90 | m.load_weights(weightfile) 91 | print('Loading weights from %s... Done!' 
% (weightfile)) 92 | 93 | if m.num_classes == 20: 94 | namesfile = 'data/voc.names' 95 | elif m.num_classes == 80: 96 | namesfile = 'data/coco.names' 97 | else: 98 | namesfile = 'data/names' 99 | 100 | use_cuda = True 101 | if use_cuda: 102 | m.cuda() 103 | 104 | img = io.imread(imgfile) 105 | sized = resize(img, (m.width, m.height)) * 255 106 | 107 | for i in range(2): 108 | start = time.time() 109 | boxes = do_detect(m, sized, 0.5, 0.4, use_cuda) 110 | finish = time.time() 111 | if i == 1: 112 | print('%s: Predicted in %f seconds.' % (imgfile, (finish-start))) 113 | 114 | class_names = load_class_names(namesfile) 115 | plot_boxes_cv2(img, boxes, savename='predictions.jpg', class_names=class_names) 116 | 117 | if __name__ == '__main__': 118 | if len(sys.argv) == 5: 119 | cfgfile = sys.argv[1] 120 | weightfile = sys.argv[2] 121 | imgfolder = sys.argv[3] 122 | cv2.namedWindow('{}'.format(os.path.basename(imgfolder)), cv2.WINDOW_NORMAL ) 123 | cv2.resizeWindow('{}'.format(os.path.basename(imgfolder)), 1000,800) 124 | globals()["namesfile"] = sys.argv[4] 125 | detect(cfgfile, weightfile, imgfolder) 126 | #detect_cv2(cfgfile, weightfile, imgfile) 127 | #detect_skimage(cfgfile, weightfile, imgfile) 128 | else: 129 | print('Usage: ') 130 | print(' python detect.py cfgfile weightfile imgfolder names') 131 | #detect('cfg/tiny-yolo-voc.cfg', 'tiny-yolo-voc.weights', 'data/person.jpg', version=1) 132 | -------------------------------------------------------------------------------- /deep_sort/detector/YOLOv3/detector.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import logging 3 | import numpy as np 4 | import cv2 5 | 6 | from .darknet import Darknet 7 | from .yolo_utils import get_all_boxes, nms, post_process, xywh_to_xyxy, xyxy_to_xywh 8 | from .nms import boxes_nms 9 | 10 | 11 | class YOLOv3(object): 12 | def __init__(self, cfgfile, weightfile, namesfile, score_thresh=0.7, conf_thresh=0.01, nms_thresh=0.45, 13 | 
is_xywh=False, use_cuda=True): 14 | # net definition 15 | self.net = Darknet(cfgfile) 16 | self.net.load_weights(weightfile) 17 | logger = logging.getLogger("root.detector") 18 | logger.info('Loading weights from %s... Done!' % (weightfile)) 19 | self.device = "cuda" if use_cuda else "cpu" 20 | self.net.eval() 21 | self.net.to(self.device) 22 | 23 | # constants 24 | self.size = self.net.width, self.net.height 25 | self.score_thresh = score_thresh 26 | self.conf_thresh = conf_thresh 27 | self.nms_thresh = nms_thresh 28 | self.use_cuda = use_cuda 29 | self.is_xywh = is_xywh 30 | self.num_classes = self.net.num_classes 31 | self.class_names = self.load_class_names(namesfile) 32 | 33 | def __call__(self, ori_img): 34 | # img to tensor 35 | assert isinstance(ori_img, np.ndarray), "input must be a numpy array!" 36 | img = ori_img.astype(np.float) / 255. 37 | 38 | img = cv2.resize(img, self.size) 39 | img = torch.from_numpy(img).float().permute(2, 0, 1).unsqueeze(0) 40 | 41 | # forward 42 | with torch.no_grad(): 43 | img = img.to(self.device) 44 | out_boxes = self.net(img) 45 | boxes = get_all_boxes(out_boxes, self.conf_thresh, self.num_classes, 46 | use_cuda=self.use_cuda) # batch size is 1 47 | # boxes = nms(boxes, self.nms_thresh) 48 | 49 | boxes = post_process(boxes, self.net.num_classes, self.conf_thresh, self.nms_thresh)[0].cpu() 50 | boxes = boxes[boxes[:, -2] > self.score_thresh, :] # bbox xmin ymin xmax ymax 51 | 52 | if len(boxes) == 0: 53 | bbox = torch.FloatTensor([]).reshape([0, 4]) 54 | cls_conf = torch.FloatTensor([]) 55 | cls_ids = torch.LongTensor([]) 56 | else: 57 | height, width = ori_img.shape[:2] 58 | bbox = boxes[:, :4] 59 | if self.is_xywh: 60 | # bbox x y w h 61 | bbox = xyxy_to_xywh(bbox) 62 | 63 | bbox *= torch.FloatTensor([[width, height, width, height]]) 64 | cls_conf = boxes[:, 5] 65 | cls_ids = boxes[:, 6].long() 66 | return bbox.numpy(), cls_conf.numpy(), cls_ids.numpy() 67 | 68 | def load_class_names(self, namesfile): 69 | with 
open(namesfile, 'r', encoding='utf8') as fp: 70 | class_names = [line.strip() for line in fp.readlines()] 71 | return class_names 72 | 73 | 74 | def demo(): 75 | import os 76 | from vizer.draw import draw_boxes 77 | 78 | yolo = YOLOv3("cfg/yolo_v3.cfg", "weight/yolov3.weights", "cfg/coco.names") 79 | print("yolo.size =", yolo.size) 80 | root = "./demo" 81 | resdir = os.path.join(root, "results") 82 | os.makedirs(resdir, exist_ok=True) 83 | files = [os.path.join(root, file) for file in os.listdir(root) if file.endswith('.jpg')] 84 | files.sort() 85 | for filename in files: 86 | img = cv2.imread(filename) 87 | img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) 88 | bbox, cls_conf, cls_ids = yolo(img) 89 | 90 | if bbox is not None: 91 | img = draw_boxes(img, bbox, cls_ids, cls_conf, class_name_map=yolo.class_names) 92 | # save results 93 | cv2.imwrite(os.path.join(resdir, os.path.basename(filename)), img[:, :, (2, 1, 0)]) 94 | # imshow 95 | # cv2.namedWindow("yolo", cv2.WINDOW_NORMAL) 96 | # cv2.resizeWindow("yolo", 600,600) 97 | # cv2.imshow("yolo",res[:,:,(2,1,0)]) 98 | # cv2.waitKey(0) 99 | 100 | 101 | if __name__ == "__main__": 102 | demo() 103 | -------------------------------------------------------------------------------- /deep_sort/detector/YOLOv3/nms/__init__.py: -------------------------------------------------------------------------------- 1 | from .nms import boxes_nms -------------------------------------------------------------------------------- /deep_sort/detector/YOLOv3/nms/build.sh: -------------------------------------------------------------------------------- 1 | cd ext 2 | 3 | python build.py build_ext develop 4 | 5 | cd .. 
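`build.sh` above compiles the optional C++/CUDA NMS extension; `python_nms.py` serves as the pure-Python fallback. Both implement the same greedy suppression as the `nms_cpu_kernel` further below: sort boxes by score, keep the best, and drop any remaining box whose IoU with a kept box exceeds a threshold. A minimal dependency-free sketch (illustrative names, and like the CPU kernel it uses `(x2-x1)*(y2-y1)` areas with no `+1`):

```python
def greedy_nms(boxes, scores, iou_threshold):
    """Greedy non-maximum suppression.
    boxes: list of (x1, y1, x2, y2) tuples; returns indices of kept boxes,
    ordered by descending score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        i = order.pop(0)
        keep.append(i)
        remaining = []
        for j in order:
            # intersection rectangle between box i and box j
            xx1 = max(boxes[i][0], boxes[j][0])
            yy1 = max(boxes[i][1], boxes[j][1])
            xx2 = min(boxes[i][2], boxes[j][2])
            yy2 = min(boxes[i][3], boxes[j][3])
            inter = max(0.0, xx2 - xx1) * max(0.0, yy2 - yy1)
            area_i = (boxes[i][2] - boxes[i][0]) * (boxes[i][3] - boxes[i][1])
            area_j = (boxes[j][2] - boxes[j][0]) * (boxes[j][3] - boxes[j][1])
            iou = inter / (area_i + area_j - inter)
            if iou <= iou_threshold:
                remaining.append(j)  # survives; compare against later keepers
        order = remaining
    return keep
```

The compiled extension exists purely for speed: the quadratic pairwise-IoU loop dominates when there are thousands of candidate boxes per frame, which is exactly the regime the CUDA kernel targets.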
6 | -------------------------------------------------------------------------------- /deep_sort/detector/YOLOv3/nms/ext/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/WuPedin/Multi-class_Yolov5_DeepSort_Pytorch/507e0cec465fa2e01d88827abfa88708c7250392/deep_sort/detector/YOLOv3/nms/ext/__init__.py -------------------------------------------------------------------------------- /deep_sort/detector/YOLOv3/nms/ext/build.py: -------------------------------------------------------------------------------- 1 | import glob 2 | import os 3 | 4 | import torch 5 | from setuptools import setup 6 | from torch.utils.cpp_extension import CUDA_HOME 7 | from torch.utils.cpp_extension import CppExtension 8 | from torch.utils.cpp_extension import CUDAExtension 9 | 10 | requirements = ["torch"] 11 | 12 | 13 | def get_extensions(): 14 | extensions_dir = os.path.dirname(os.path.abspath(__file__)) 15 | 16 | main_file = glob.glob(os.path.join(extensions_dir, "*.cpp")) 17 | source_cpu = glob.glob(os.path.join(extensions_dir, "cpu", "*.cpp")) 18 | source_cuda = glob.glob(os.path.join(extensions_dir, "cuda", "*.cu")) 19 | 20 | sources = main_file + source_cpu 21 | extension = CppExtension 22 | 23 | extra_compile_args = {"cxx": []} 24 | define_macros = [] 25 | 26 | if torch.cuda.is_available() and CUDA_HOME is not None: 27 | extension = CUDAExtension 28 | sources += source_cuda 29 | define_macros += [("WITH_CUDA", None)] 30 | extra_compile_args["nvcc"] = [ 31 | "-DCUDA_HAS_FP16=1", 32 | "-D__CUDA_NO_HALF_OPERATORS__", 33 | "-D__CUDA_NO_HALF_CONVERSIONS__", 34 | "-D__CUDA_NO_HALF2_OPERATORS__", 35 | ] 36 | 37 | sources = [os.path.join(extensions_dir, s) for s in sources] 38 | 39 | include_dirs = [extensions_dir] 40 | 41 | ext_modules = [ 42 | extension( 43 | "torch_extension", 44 | sources, 45 | include_dirs=include_dirs, 46 | define_macros=define_macros, 47 | extra_compile_args=extra_compile_args, 48 | ) 49 | ] 
50 | 
51 |     return ext_modules
52 | 
53 | 
54 | setup(
55 |     name="torch_extension",
56 |     version="0.1",
57 |     ext_modules=get_extensions(),
58 |     cmdclass={"build_ext": torch.utils.cpp_extension.BuildExtension})
59 | 
--------------------------------------------------------------------------------
/deep_sort/detector/YOLOv3/nms/ext/cpu/nms_cpu.cpp:
--------------------------------------------------------------------------------
1 | // Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
2 | #include "cpu/vision.h"
3 | 
4 | 
5 | template <typename scalar_t>
6 | at::Tensor nms_cpu_kernel(const at::Tensor& dets,
7 |                           const at::Tensor& scores,
8 |                           const float threshold) {
9 |   AT_ASSERTM(!dets.type().is_cuda(), "dets must be a CPU tensor");
10 |   AT_ASSERTM(!scores.type().is_cuda(), "scores must be a CPU tensor");
11 |   AT_ASSERTM(dets.type() == scores.type(), "dets should have the same type as scores");
12 | 
13 |   if (dets.numel() == 0) {
14 |     return at::empty({0}, dets.options().dtype(at::kLong).device(at::kCPU));
15 |   }
16 | 
17 |   auto x1_t = dets.select(1, 0).contiguous();
18 |   auto y1_t = dets.select(1, 1).contiguous();
19 |   auto x2_t = dets.select(1, 2).contiguous();
20 |   auto y2_t = dets.select(1, 3).contiguous();
21 | 
22 |   at::Tensor areas_t = (x2_t - x1_t) * (y2_t - y1_t);
23 | 
24 |   auto order_t = std::get<1>(scores.sort(0, /* descending=*/true));
25 | 
26 |   auto ndets = dets.size(0);
27 |   at::Tensor suppressed_t = at::zeros({ndets}, dets.options().dtype(at::kByte).device(at::kCPU));
28 | 
29 |   auto suppressed = suppressed_t.data<uint8_t>();
30 |   auto order = order_t.data<int64_t>();
31 |   auto x1 = x1_t.data<scalar_t>();
32 |   auto y1 = y1_t.data<scalar_t>();
33 |   auto x2 = x2_t.data<scalar_t>();
34 |   auto y2 = y2_t.data<scalar_t>();
35 |   auto areas = areas_t.data<scalar_t>();
36 | 
37 |   for (int64_t _i = 0; _i < ndets; _i++) {
38 |     auto i = order[_i];
39 |     if (suppressed[i] == 1)
40 |       continue;
41 |     auto ix1 = x1[i];
42 |     auto iy1 = y1[i];
43 |     auto ix2 = x2[i];
44 |     auto iy2 = y2[i];
45 |     auto iarea = areas[i];
46 | 
47 |     for (int64_t _j = _i + 1; _j < ndets; _j++) {
48 |       auto j = order[_j];
49 |       if (suppressed[j] == 1)
50 |         continue;
51 |       auto xx1 = std::max(ix1, x1[j]);
52 |       auto yy1 = std::max(iy1, y1[j]);
53 |       auto xx2 = std::min(ix2, x2[j]);
54 |       auto yy2 = std::min(iy2, y2[j]);
55 | 
56 |       auto w = std::max(static_cast<scalar_t>(0), xx2 - xx1);
57 |       auto h = std::max(static_cast<scalar_t>(0), yy2 - yy1);
58 |       auto inter = w * h;
59 |       auto ovr = inter / (iarea + areas[j] - inter);
60 |       if (ovr >= threshold)
61 |         suppressed[j] = 1;
62 |     }
63 |   }
64 |   return at::nonzero(suppressed_t == 0).squeeze(1);
65 | }
66 | 
67 | at::Tensor nms_cpu(const at::Tensor& dets,
68 |                    const at::Tensor& scores,
69 |                    const float threshold) {
70 |   at::Tensor result;
71 |   AT_DISPATCH_FLOATING_TYPES(dets.type(), "nms", [&] {
72 |     result = nms_cpu_kernel<scalar_t>(dets, scores, threshold);
73 |   });
74 |   return result;
75 | }
--------------------------------------------------------------------------------
/deep_sort/detector/YOLOv3/nms/ext/cpu/vision.h:
--------------------------------------------------------------------------------
1 | // Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
2 | #pragma once
3 | #include <torch/extension.h>
4 | 
5 | at::Tensor nms_cpu(const at::Tensor& dets,
6 |                    const at::Tensor& scores,
7 |                    const float threshold);
8 | 
--------------------------------------------------------------------------------
/deep_sort/detector/YOLOv3/nms/ext/cuda/nms.cu:
--------------------------------------------------------------------------------
1 | // Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
2 | #include <ATen/ATen.h>
3 | #include <ATen/cuda/CUDAContext.h>
4 | 
5 | #include <THC/THC.h>
6 | #include <THC/THCDeviceUtils.cuh>
7 | 
8 | #include <vector>
9 | #include <iostream>
10 | 
11 | int const threadsPerBlock = sizeof(unsigned long long) * 8;
12 | 
13 | __device__ inline float devIoU(float const * const a, float const * const b) {
14 |   float left = max(a[0], b[0]), right = min(a[2], b[2]);
15 |   float top = max(a[1], b[1]), bottom = min(a[3], b[3]);
16 |   float width = max(right - left, 0.f), height = max(bottom - top, 0.f);
17 |   float interS = width * height;
18 |   float Sa = (a[2] - a[0]) * (a[3] - a[1]);
19 |   float Sb = (b[2] - b[0]) * (b[3] - b[1]);
20 |   return interS / (Sa + Sb - interS);
21 | }
22 | 
23 | __global__ void nms_kernel(const int n_boxes, const float nms_overlap_thresh,
24 |                            const float *dev_boxes, unsigned long long *dev_mask) {
25 |   const int row_start = blockIdx.y;
26 |   const int col_start = blockIdx.x;
27 | 
28 |   // if (row_start > col_start) return;
29 | 
30 |   const int row_size =
31 |         min(n_boxes - row_start * threadsPerBlock, threadsPerBlock);
32 |   const int col_size =
33 |         min(n_boxes - col_start * threadsPerBlock, threadsPerBlock);
34 | 
35 |   __shared__ float block_boxes[threadsPerBlock * 5];
36 |   if (threadIdx.x < col_size) {
37 |     block_boxes[threadIdx.x * 5 + 0] =
38 |         dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 0];
39 |     block_boxes[threadIdx.x * 5 + 1] =
40 |         dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 1];
41 |     block_boxes[threadIdx.x * 5 + 2] =
42 |         dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 2];
43 |     block_boxes[threadIdx.x * 5 + 3] =
44 |         dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 3];
45 |     block_boxes[threadIdx.x * 5 + 4] =
46 |         dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 4];
47 |   }
48 |   __syncthreads();
49 | 
50 |   if (threadIdx.x < row_size) {
51 |     const int cur_box_idx = threadsPerBlock * row_start + threadIdx.x;
52 |     const float *cur_box = dev_boxes + cur_box_idx * 5;
53 |     int i = 0;
54 |     unsigned long long t = 0;
55 |     int start = 0;
56 |     if (row_start == col_start) {
57 |       start = threadIdx.x + 1;
58 |     }
59 |     for (i = start; i < col_size; i++) {
60 |       if (devIoU(cur_box, block_boxes + i * 5) > nms_overlap_thresh) {
61 |         t |= 1ULL << i;
62 |       }
63 |     }
64 |     const int col_blocks = THCCeilDiv(n_boxes, threadsPerBlock);
65 |     dev_mask[cur_box_idx * col_blocks + col_start] = t;
66 |   }
67 | }
68 | 
69 | // boxes is a N x 5 tensor
70 | at::Tensor nms_cuda(const at::Tensor boxes, float nms_overlap_thresh) {
71 |   using scalar_t = float;
72 |   AT_ASSERTM(boxes.type().is_cuda(), "boxes must be a CUDA tensor");
73 |   auto scores = boxes.select(1, 4);
74 |   auto order_t = std::get<1>(scores.sort(0, /* descending=*/true));
75 |   auto boxes_sorted = boxes.index_select(0, order_t);
76 | 
77 |   int boxes_num = boxes.size(0);
78 | 
79 |   const int col_blocks = THCCeilDiv(boxes_num, threadsPerBlock);
80 | 
81 |   scalar_t* boxes_dev = boxes_sorted.data<scalar_t>();
82 | 
83 |   THCState *state = at::globalContext().lazyInitCUDA(); // TODO replace with getTHCState
84 | 
85 |   unsigned long long* mask_dev = NULL;
86 |   //THCudaCheck(THCudaMalloc(state, (void**) &mask_dev,
87 |   //                      boxes_num * col_blocks * sizeof(unsigned long long)));
88 | 
89 |   mask_dev = (unsigned long long*) THCudaMalloc(state, boxes_num * col_blocks * sizeof(unsigned long long));
90 | 
91 |   dim3 blocks(THCCeilDiv(boxes_num, threadsPerBlock),
92 |               THCCeilDiv(boxes_num, threadsPerBlock));
93 |   dim3 threads(threadsPerBlock);
94 |   nms_kernel<<<blocks, threads>>>(boxes_num,
95 |                                   nms_overlap_thresh,
96 |                                   boxes_dev,
97 |                                   mask_dev);
98 | 
99 |   std::vector<unsigned long long> mask_host(boxes_num * col_blocks);
100 |   THCudaCheck(cudaMemcpy(&mask_host[0],
101 |                         mask_dev,
102 |                         sizeof(unsigned long long) * boxes_num * col_blocks,
103 |                         cudaMemcpyDeviceToHost));
104 | 
105 |   std::vector<unsigned long long> remv(col_blocks);
106 |   memset(&remv[0], 0, sizeof(unsigned long long) * col_blocks);
107 | 
108 |   at::Tensor keep = at::empty({boxes_num}, boxes.options().dtype(at::kLong).device(at::kCPU));
109 |   int64_t* keep_out = keep.data<int64_t>();
110 | 
111 |   int num_to_keep = 0;
112 |   for (int i = 0; i < boxes_num; i++) {
113 |     int nblock = i / threadsPerBlock;
114 |     int inblock = i % threadsPerBlock;
115 | 
116 |     if (!(remv[nblock] & (1ULL << inblock))) {
117 |       keep_out[num_to_keep++] = i;
118 |       unsigned long long *p = &mask_host[0] + i * col_blocks;
119 |       for (int j = nblock; j < col_blocks; j++) {
120 |         remv[j] |= p[j];
121 |       }
122 |     }
123 |   }
124 | 
125 |   THCudaFree(state, mask_dev);
126 |   // TODO improve this part
127 |   return std::get<0>(order_t.index({
128 |                keep.narrow(/*dim=*/0, /*start=*/0, /*length=*/num_to_keep).to(
129 |                    order_t.device(), keep.scalar_type())
130 |              }).sort(0, false));
131 | }
--------------------------------------------------------------------------------
/deep_sort/detector/YOLOv3/nms/ext/cuda/vision.h:
--------------------------------------------------------------------------------
1 | // Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
2 | #pragma once
3 | #include <ATen/ATen.h>
4 | 
5 | at::Tensor nms_cuda(const at::Tensor boxes, float nms_overlap_thresh);
6 | 
7 | 
8 | 
--------------------------------------------------------------------------------
/deep_sort/detector/YOLOv3/nms/ext/nms.h:
--------------------------------------------------------------------------------
1 | // Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
2 | #pragma once
3 | #include "cpu/vision.h"
4 | 
5 | #ifdef WITH_CUDA
6 | #include "cuda/vision.h"
7 | #endif
8 | 
9 | 
10 | at::Tensor nms(const at::Tensor& dets,
11 |                const at::Tensor& scores,
12 |                const float threshold) {
13 | 
14 |   if (dets.type().is_cuda()) {
15 | #ifdef WITH_CUDA
16 |     // TODO raise error if not compiled with CUDA
17 |     if (dets.numel() == 0)
18 |       return at::empty({0}, dets.options().dtype(at::kLong).device(at::kCPU));
19 |     auto b = at::cat({dets, scores.unsqueeze(1)}, 1);
20 |     return nms_cuda(b, threshold);
21 | #else
22 |     AT_ERROR("Not compiled with GPU support");
23 | #endif
24 |   }
25 | 
26 |   at::Tensor result = nms_cpu(dets, scores, threshold);
27 |   return result;
28 | }
29 | 
--------------------------------------------------------------------------------
/deep_sort/detector/YOLOv3/nms/ext/vision.cpp:
--------------------------------------------------------------------------------
1 | // Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
2 | #include "nms.h"
3 | 
4 | 
5 | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
6 |   m.def("nms", &nms, "non-maximum suppression");
7 | }
8 | 
--------------------------------------------------------------------------------
/deep_sort/detector/YOLOv3/nms/nms.py:
--------------------------------------------------------------------------------
1 | import warnings
2 | import torchvision
3 | 
4 | try:
5 |     import torch
6 |     import torch_extension
7 | 
8 |     _nms = torch_extension.nms
9 | except ImportError:
10 |     if tuple(int(v) for v in torchvision.__version__.split('+')[0].split('.')[:2]) >= (0, 3):  # compare numerically; string comparison fails for e.g. '0.10.0'
11 |         _nms = torchvision.ops.nms
12 |     else:
13 |         from .python_nms import python_nms
14 | 
15 |         _nms = python_nms
16 |         warnings.warn('You are using the pure-Python NMS, which is very slow. Try compiling the C++ NMS '
17 |                       'with `cd ext && python build.py build_ext develop`')
18 | 
19 | 
20 | def boxes_nms(boxes, scores, nms_thresh, max_count=-1):
21 |     """ Performs non-maximum suppression, run on GPU or CPU according to
22 |         the boxes' device.
23 |     Args:
24 |         boxes (Tensor): `xyxy` boxes in absolute (or relative) coordinates, shape (n, 4)
25 |         scores (Tensor): scores, shape (n,)
26 |         nms_thresh (float): IoU threshold for suppression
27 |         max_count (int): if > 0, only the top max_count boxes are kept after non-maximum suppression
28 |     Returns:
29 |         indices of the boxes to keep.
30 |     """
31 |     keep = _nms(boxes, scores, nms_thresh)
32 |     if max_count > 0:
33 |         keep = keep[:max_count]
34 |     return keep
35 | 
--------------------------------------------------------------------------------
/deep_sort/detector/YOLOv3/nms/python_nms.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import numpy as np
3 | 
4 | 
5 | def python_nms(boxes, scores, nms_thresh):
6 |     """ Performs non-maximum suppression using numpy
7 |     Args:
8 |         boxes (Tensor): `xyxy` boxes in absolute coordinates (relative coordinates are not supported),
9 |             shape (n, 4)
10 |         scores (Tensor): scores, shape (n,)
11 |         nms_thresh (float): IoU threshold for suppression
12 |     Returns:
13 |         indices of the boxes to keep.
14 |     """
15 |     if boxes.numel() == 0:
16 |         return torch.empty((0,), dtype=torch.long)
17 |     # Use numpy to run nms. Running nms in PyTorch code on CPU is really slow.
18 |     origin_device = boxes.device
19 |     cpu_device = torch.device('cpu')
20 |     boxes = boxes.to(cpu_device).numpy()
21 |     scores = scores.to(cpu_device).numpy()
22 | 
23 |     x1 = boxes[:, 0]
24 |     y1 = boxes[:, 1]
25 |     x2 = boxes[:, 2]
26 |     y2 = boxes[:, 3]
27 |     areas = (x2 - x1) * (y2 - y1)
28 |     order = np.argsort(scores)[::-1]
29 |     num_detections = boxes.shape[0]
30 |     suppressed = np.zeros((num_detections,), dtype=bool)  # the np.bool alias was removed in NumPy 1.24; use the builtin bool
31 |     for _i in range(num_detections):
32 |         i = order[_i]
33 |         if suppressed[i]:
34 |             continue
35 |         ix1 = x1[i]
36 |         iy1 = y1[i]
37 |         ix2 = x2[i]
38 |         iy2 = y2[i]
39 |         iarea = areas[i]
40 | 
41 |         for _j in range(_i + 1, num_detections):
42 |             j = order[_j]
43 |             if suppressed[j]:
44 |                 continue
45 | 
46 |             xx1 = max(ix1, x1[j])
47 |             yy1 = max(iy1, y1[j])
48 |             xx2 = min(ix2, x2[j])
49 |             yy2 = min(iy2, y2[j])
50 |             w = max(0, xx2 - xx1)
51 |             h = max(0, yy2 - yy1)
52 | 
53 |             inter = w * h
54 |             ovr = inter / (iarea + areas[j] - inter)
55 |             if ovr >= nms_thresh:
56 |                 suppressed[j] = True
57 |     keep = np.nonzero(suppressed == 0)[0]
58 |     keep = torch.from_numpy(keep).to(origin_device)
59 |     return keep
60 | 
--------------------------------------------------------------------------------
/deep_sort/detector/YOLOv3/weight/.gitkeep:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WuPedin/Multi-class_Yolov5_DeepSort_Pytorch/507e0cec465fa2e01d88827abfa88708c7250392/deep_sort/detector/YOLOv3/weight/.gitkeep
--------------------------------------------------------------------------------
/deep_sort/detector/__init__.py:
--------------------------------------------------------------------------------
1 | from .YOLOv3 import YOLOv3
2 | 
3 | 
4 | __all__ = ['build_detector']
5 | 
6 | def build_detector(cfg, use_cuda):
7 |     return YOLOv3(cfg.YOLOV3.CFG, cfg.YOLOV3.WEIGHT, cfg.YOLOV3.CLASS_NAMES,
8 |                   score_thresh=cfg.YOLOV3.SCORE_THRESH, nms_thresh=cfg.YOLOV3.NMS_THRESH,
9 |                   is_xywh=True, use_cuda=use_cuda)
10 | 
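The `python_nms` fallback above implements classic greedy NMS: sort boxes by score, keep the highest-scoring one, suppress every remaining box whose IoU with it exceeds the threshold, repeat. A torch-free NumPy sketch of the same algorithm, with the inner loop vectorized (the function name `nms_numpy` and the toy boxes are illustrative, not part of the repository):

```python
import numpy as np

def nms_numpy(boxes, scores, iou_thresh):
    """Greedy NMS on xyxy boxes; returns indices of kept boxes, best score first."""
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = np.argsort(scores)[::-1]          # candidates, highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # IoU of the kept box against all remaining candidates
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        order = order[1:][iou < iou_thresh]    # drop overlapping candidates
    return np.array(keep)

boxes = np.array([[0, 0, 10, 10],     # box A
                  [1, 1, 10, 10],     # overlaps A heavily (IoU 0.81) -> suppressed
                  [20, 20, 30, 30]],  # disjoint -> kept
                 dtype=float)
scores = np.array([0.9, 0.8, 0.7])
print(nms_numpy(boxes, scores, iou_thresh=0.5))  # [0 2]
```

This is the same suppression rule as the nested loop in `python_nms`, just with the pairwise IoU computed in one vectorized step per kept box, which is also how `torchvision.ops.nms` is typically used as a drop-in replacement.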
--------------------------------------------------------------------------------
/deep_sort/ped_det_server.py:
--------------------------------------------------------------------------------
1 | """
2 | This module takes a video as input and outputs a
3 | JSON file with the coordinates of the bboxes in the video.
4 | 
5 | """
6 | from os.path import basename, splitext, join, isfile, isdir, dirname
7 | from os import makedirs
8 | 
9 | from tqdm import tqdm
10 | import cv2
11 | import argparse
12 | import torch
13 | 
14 | from detector import build_detector
15 | from deep_sort import build_tracker
16 | from utils.tools import tik_tok, is_video
17 | from utils.draw import compute_color_for_labels
18 | from utils.parser import get_config
19 | from utils.json_logger import BboxToJsonLogger
20 | import warnings
21 | 
22 | 
23 | def parse_args():
24 |     parser = argparse.ArgumentParser()
25 |     parser.add_argument("--VIDEO_PATH", type=str, default="./demo/ped.avi")
26 |     parser.add_argument("--config_detection", type=str, default="./configs/yolov3.yaml")
27 |     parser.add_argument("--config_deepsort", type=str, default="./configs/deep_sort.yaml")
28 |     parser.add_argument("--write-fps", type=int, default=20)
29 |     parser.add_argument("--frame_interval", type=int, default=1)
30 |     parser.add_argument("--save_path", type=str, default="./output")
31 |     parser.add_argument("--cpu", dest="use_cuda", action="store_false", default=True)
32 |     args = parser.parse_args()
33 | 
34 |     assert isfile(args.VIDEO_PATH), "Error: Video not found"
35 |     assert is_video(args.VIDEO_PATH), "Error: Unsupported format"
36 |     if args.frame_interval < 1: args.frame_interval = 1
37 | 
38 |     return args
39 | 
40 | 
41 | class VideoTracker(object):
42 |     def __init__(self, cfg, args):
43 |         self.cfg = cfg
44 |         self.args = args
45 |         use_cuda = args.use_cuda and torch.cuda.is_available()
46 |         if not use_cuda:
47 |             warnings.warn("Running in cpu mode!")
48 | 
49 |         self.vdo = cv2.VideoCapture()
50 |         self.detector = build_detector(cfg,
use_cuda=use_cuda)
51 |         self.deepsort = build_tracker(cfg, use_cuda=use_cuda)
52 |         self.class_names = self.detector.class_names
53 | 
54 |         # Configure output video and json
55 |         self.logger = BboxToJsonLogger()
56 |         filename, extension = splitext(basename(self.args.VIDEO_PATH))
57 |         self.output_file = join(self.args.save_path, f'{filename}.avi')
58 |         self.json_output = join(self.args.save_path, f'{filename}.json')
59 |         if not isdir(dirname(self.json_output)):
60 |             makedirs(dirname(self.json_output))
61 | 
62 |     def __enter__(self):
63 |         self.vdo.open(self.args.VIDEO_PATH)
64 |         self.total_frames = int(cv2.VideoCapture.get(self.vdo, cv2.CAP_PROP_FRAME_COUNT))
65 |         self.im_width = int(self.vdo.get(cv2.CAP_PROP_FRAME_WIDTH))
66 |         self.im_height = int(self.vdo.get(cv2.CAP_PROP_FRAME_HEIGHT))
67 | 
68 |         video_details = {'frame_width': self.im_width,
69 |                          'frame_height': self.im_height,
70 |                          'frame_rate': self.args.write_fps,
71 |                          'video_name': self.args.VIDEO_PATH}
72 |         codec = cv2.VideoWriter_fourcc(*'XVID')
73 |         self.writer = cv2.VideoWriter(self.output_file, codec, self.args.write_fps,
74 |                                       (self.im_width, self.im_height))
75 |         self.logger.add_video_details(**video_details)
76 | 
77 |         assert self.vdo.isOpened()
78 |         return self
79 | 
80 |     def __exit__(self, exc_type, exc_value, exc_traceback):
81 |         if exc_type:
82 |             print(exc_type, exc_value, exc_traceback)
83 | 
84 |     def run(self):
85 |         idx_frame = 0
86 |         pbar = tqdm(total=self.total_frames + 1)
87 |         while self.vdo.grab():
88 |             if idx_frame % self.args.frame_interval == 0:  # use self.args rather than relying on the module-level args
89 |                 _, ori_im = self.vdo.retrieve()
90 |                 timestamp = self.vdo.get(cv2.CAP_PROP_POS_MSEC)
91 |                 frame_id = int(self.vdo.get(cv2.CAP_PROP_POS_FRAMES))
92 |                 self.logger.add_frame(frame_id=frame_id, timestamp=timestamp)
93 |                 self.detection(frame=ori_im, frame_id=frame_id)
94 |                 self.save_frame(ori_im)
95 |             idx_frame += 1
96 |             pbar.update()
97 |         self.logger.json_output(self.json_output)
98 | 
99 |     @tik_tok
100 |     def detection(self, frame, frame_id):
101 |         im =
cv2.cvtColor(frame, cv2.COLOR_BGR2RGB) 102 | # do detection 103 | bbox_xywh, cls_conf, cls_ids = self.detector(im) 104 | if bbox_xywh is not None: 105 | # select person class 106 | mask = cls_ids == 0 107 | 108 | bbox_xywh = bbox_xywh[mask] 109 | bbox_xywh[:, 3:] *= 1.2 # bbox dilation just in case bbox too small 110 | cls_conf = cls_conf[mask] 111 | 112 | # do tracking 113 | outputs = self.deepsort.update(bbox_xywh, cls_conf, im) 114 | 115 | # draw boxes for visualization 116 | if len(outputs) > 0: 117 | frame = self.draw_boxes(img=frame, frame_id=frame_id, output=outputs) 118 | 119 | def draw_boxes(self, img, frame_id, output, offset=(0, 0)): 120 | for i, box in enumerate(output): 121 | x1, y1, x2, y2, identity = [int(ii) for ii in box] 122 | self.logger.add_bbox_to_frame(frame_id=frame_id, 123 | bbox_id=identity, 124 | top=y1, 125 | left=x1, 126 | width=x2 - x1, 127 | height=y2 - y1) 128 | x1 += offset[0] 129 | x2 += offset[0] 130 | y1 += offset[1] 131 | y2 += offset[1] 132 | 133 | # box text and bar 134 | self.logger.add_label_to_bbox(frame_id=frame_id, bbox_id=identity, category='pedestrian', confidence=0.9) 135 | color = compute_color_for_labels(identity) 136 | label = '{}{:d}'.format("", identity) 137 | t_size = cv2.getTextSize(label, cv2.FONT_HERSHEY_PLAIN, 2, 2)[0] 138 | cv2.rectangle(img, (x1, y1), (x2, y2), color, 3) 139 | cv2.rectangle(img, (x1, y1), (x1 + t_size[0] + 3, y1 + t_size[1] + 4), color, -1) 140 | cv2.putText(img, label, (x1, y1 + t_size[1] + 4), cv2.FONT_HERSHEY_PLAIN, 2, [255, 255, 255], 2) 141 | return img 142 | 143 | def save_frame(self, frame) -> None: 144 | if frame is not None: self.writer.write(frame) 145 | 146 | 147 | if __name__ == "__main__": 148 | args = parse_args() 149 | cfg = get_config() 150 | cfg.merge_from_file(args.config_detection) 151 | cfg.merge_from_file(args.config_deepsort) 152 | 153 | with VideoTracker(cfg, args) as vdo_trk: 154 | vdo_trk.run() 155 | 156 | 
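In `draw_boxes` above, each tracker output box arrives as `(x1, y1, x2, y2, identity)` but is logged via `add_bbox_to_frame` as top/left/width/height. A standalone sketch of that conversion (the helper name `xyxy_to_tlwh` is illustrative, not a function from the repository):

```python
def xyxy_to_tlwh(box):
    """Convert an (x1, y1, x2, y2) box into the top/left/width/height
    fields that BboxToJsonLogger.add_bbox_to_frame expects."""
    x1, y1, x2, y2 = [int(v) for v in box]
    return {"top": y1, "left": x1, "width": x2 - x1, "height": y2 - y1}

print(xyxy_to_tlwh((10.0, 20.0, 110.0, 220.0)))
# {'top': 20, 'left': 10, 'width': 100, 'height': 200}
```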
--------------------------------------------------------------------------------
/deep_sort/scripts/yolov3_deepsort.sh:
--------------------------------------------------------------------------------
1 | python yolov3_deepsort.py [VIDEO_PATH] --config_detection ./configs/yolov3.yaml
--------------------------------------------------------------------------------
/deep_sort/scripts/yolov3_tiny_deepsort.sh:
--------------------------------------------------------------------------------
1 | python yolov3_deepsort.py [VIDEO_PATH] --config_detection ./configs/yolov3_tiny.yaml
--------------------------------------------------------------------------------
/deep_sort/utils/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WuPedin/Multi-class_Yolov5_DeepSort_Pytorch/507e0cec465fa2e01d88827abfa88708c7250392/deep_sort/utils/__init__.py
--------------------------------------------------------------------------------
/deep_sort/utils/asserts.py:
--------------------------------------------------------------------------------
1 | from os import environ
2 | 
3 | 
4 | def assert_in(file, files_to_check):
5 |     if file not in files_to_check:
6 |         raise AssertionError("{} does not exist in the list".format(str(file)))
7 |     return True
8 | 
9 | 
10 | def assert_in_env(check_list: list):
11 |     for item in check_list:
12 |         assert_in(item, environ.keys())
13 |     return True
14 | 
--------------------------------------------------------------------------------
/deep_sort/utils/draw.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import cv2
3 | 
4 | palette = (2 ** 11 - 1, 2 ** 15 - 1, 2 ** 20 - 1)
5 | 
6 | 
7 | def compute_color_for_labels(label):
8 |     """
9 |     Simple function that adds fixed color depending on the class
10 |     """
11 |     color = [int((p * (label ** 2 - label + 1)) % 255) for p in palette]
12 |     return tuple(color)
13 | 
14 | 
15 | def draw_boxes(img, bbox,
identities=None, offset=(0, 0)):
16 |     for i, box in enumerate(bbox):
17 |         x1, y1, x2, y2 = [int(c) for c in box]  # distinct name so the loop index i is not shadowed
18 |         x1 += offset[0]
19 |         x2 += offset[0]
20 |         y1 += offset[1]
21 |         y2 += offset[1]
22 |         # box text and bar
23 |         id = int(identities[i]) if identities is not None else 0
24 |         color = compute_color_for_labels(id)
25 |         label = '{}{:d}'.format("", id)
26 |         t_size = cv2.getTextSize(label, cv2.FONT_HERSHEY_PLAIN, 2, 2)[0]
27 |         cv2.rectangle(img, (x1, y1), (x2, y2), color, 3)
28 |         cv2.rectangle(img, (x1, y1), (x1 + t_size[0] + 3, y1 + t_size[1] + 4), color, -1)
29 |         cv2.putText(img, label, (x1, y1 + t_size[1] + 4), cv2.FONT_HERSHEY_PLAIN, 2, [255, 255, 255], 2)
30 |     return img
31 | 
32 | 
33 | 
34 | if __name__ == '__main__':
35 |     for i in range(82):
36 |         print(compute_color_for_labels(i))
37 | 
--------------------------------------------------------------------------------
/deep_sort/utils/evaluation.py:
--------------------------------------------------------------------------------
1 | import os
2 | import numpy as np
3 | import copy
4 | import motmetrics as mm
5 | mm.lap.default_solver = 'lap'
6 | from utils.io import read_results, unzip_objs
7 | 
8 | 
9 | class Evaluator(object):
10 | 
11 |     def __init__(self, data_root, seq_name, data_type):
12 |         self.data_root = data_root
13 |         self.seq_name = seq_name
14 |         self.data_type = data_type
15 | 
16 |         self.load_annotations()
17 |         self.reset_accumulator()
18 | 
19 |     def load_annotations(self):
20 |         assert self.data_type == 'mot'
21 | 
22 |         gt_filename = os.path.join(self.data_root, self.seq_name, 'gt', 'gt.txt')
23 |         self.gt_frame_dict = read_results(gt_filename, self.data_type, is_gt=True)
24 |         self.gt_ignore_frame_dict = read_results(gt_filename, self.data_type, is_ignore=True)
25 | 
26 |     def reset_accumulator(self):
27 |         self.acc = mm.MOTAccumulator(auto_id=True)
28 | 
29 |     def eval_frame(self, frame_id, trk_tlwhs, trk_ids, rtn_events=False):
30 |         # results
31 |         trk_tlwhs = np.copy(trk_tlwhs)
32 |         trk_ids = np.copy(trk_ids)
33 | 
34 |         # gts
35 | 
gt_objs = self.gt_frame_dict.get(frame_id, []) 36 | gt_tlwhs, gt_ids = unzip_objs(gt_objs)[:2] 37 | 38 | # ignore boxes 39 | ignore_objs = self.gt_ignore_frame_dict.get(frame_id, []) 40 | ignore_tlwhs = unzip_objs(ignore_objs)[0] 41 | 42 | 43 | # remove ignored results 44 | keep = np.ones(len(trk_tlwhs), dtype=bool) 45 | iou_distance = mm.distances.iou_matrix(ignore_tlwhs, trk_tlwhs, max_iou=0.5) 46 | if len(iou_distance) > 0: 47 | match_is, match_js = mm.lap.linear_sum_assignment(iou_distance) 48 | match_is, match_js = map(lambda a: np.asarray(a, dtype=int), [match_is, match_js]) 49 | match_ious = iou_distance[match_is, match_js] 50 | 51 | match_js = np.asarray(match_js, dtype=int) 52 | match_js = match_js[np.logical_not(np.isnan(match_ious))] 53 | keep[match_js] = False 54 | trk_tlwhs = trk_tlwhs[keep] 55 | trk_ids = trk_ids[keep] 56 | 57 | # get distance matrix 58 | iou_distance = mm.distances.iou_matrix(gt_tlwhs, trk_tlwhs, max_iou=0.5) 59 | 60 | # acc 61 | self.acc.update(gt_ids, trk_ids, iou_distance) 62 | 63 | if rtn_events and iou_distance.size > 0 and hasattr(self.acc, 'last_mot_events'): 64 | events = self.acc.last_mot_events # only supported by https://github.com/longcw/py-motmetrics 65 | else: 66 | events = None 67 | return events 68 | 69 | def eval_file(self, filename): 70 | self.reset_accumulator() 71 | 72 | result_frame_dict = read_results(filename, self.data_type, is_gt=False) 73 | frames = sorted(list(set(self.gt_frame_dict.keys()) | set(result_frame_dict.keys()))) 74 | for frame_id in frames: 75 | trk_objs = result_frame_dict.get(frame_id, []) 76 | trk_tlwhs, trk_ids = unzip_objs(trk_objs)[:2] 77 | self.eval_frame(frame_id, trk_tlwhs, trk_ids, rtn_events=False) 78 | 79 | return self.acc 80 | 81 | @staticmethod 82 | def get_summary(accs, names, metrics=('mota', 'num_switches', 'idp', 'idr', 'idf1', 'precision', 'recall')): 83 | names = copy.deepcopy(names) 84 | if metrics is None: 85 | metrics = mm.metrics.motchallenge_metrics 86 | metrics = 
copy.deepcopy(metrics) 87 | 88 | mh = mm.metrics.create() 89 | summary = mh.compute_many( 90 | accs, 91 | metrics=metrics, 92 | names=names, 93 | generate_overall=True 94 | ) 95 | 96 | return summary 97 | 98 | @staticmethod 99 | def save_summary(summary, filename): 100 | import pandas as pd 101 | writer = pd.ExcelWriter(filename) 102 | summary.to_excel(writer) 103 | writer.save() 104 | -------------------------------------------------------------------------------- /deep_sort/utils/io.py: -------------------------------------------------------------------------------- 1 | import os 2 | from typing import Dict 3 | import numpy as np 4 | 5 | # from utils.log import get_logger 6 | 7 | 8 | def write_results(filename, results, data_type): 9 | if data_type == 'mot': 10 | save_format = '{frame},{id},{x1},{y1},{w},{h},-1,-1,-1,-1\n' 11 | elif data_type == 'kitti': 12 | save_format = '{frame} {id} pedestrian 0 0 -10 {x1} {y1} {x2} {y2} -10 -10 -10 -1000 -1000 -1000 -10\n' 13 | else: 14 | raise ValueError(data_type) 15 | 16 | with open(filename, 'w') as f: 17 | for frame_id, tlwhs, track_ids in results: 18 | if data_type == 'kitti': 19 | frame_id -= 1 20 | for tlwh, track_id in zip(tlwhs, track_ids): 21 | if track_id < 0: 22 | continue 23 | x1, y1, w, h = tlwh 24 | x2, y2 = x1 + w, y1 + h 25 | line = save_format.format(frame=frame_id, id=track_id, x1=x1, y1=y1, x2=x2, y2=y2, w=w, h=h) 26 | f.write(line) 27 | 28 | 29 | # def write_results(filename, results_dict: Dict, data_type: str): 30 | # if not filename: 31 | # return 32 | # path = os.path.dirname(filename) 33 | # if not os.path.exists(path): 34 | # os.makedirs(path) 35 | 36 | # if data_type in ('mot', 'mcmot', 'lab'): 37 | # save_format = '{frame},{id},{x1},{y1},{w},{h},1,-1,-1,-1\n' 38 | # elif data_type == 'kitti': 39 | # save_format = '{frame} {id} pedestrian -1 -1 -10 {x1} {y1} {x2} {y2} -1 -1 -1 -1000 -1000 -1000 -10 {score}\n' 40 | # else: 41 | # raise ValueError(data_type) 42 | 43 | # with open(filename, 'w') as 
f: 44 | # for frame_id, frame_data in results_dict.items(): 45 | # if data_type == 'kitti': 46 | # frame_id -= 1 47 | # for tlwh, track_id in frame_data: 48 | # if track_id < 0: 49 | # continue 50 | # x1, y1, w, h = tlwh 51 | # x2, y2 = x1 + w, y1 + h 52 | # line = save_format.format(frame=frame_id, id=track_id, x1=x1, y1=y1, x2=x2, y2=y2, w=w, h=h, score=1.0) 53 | # f.write(line) 54 | # logger.info('Save results to {}'.format(filename)) 55 | 56 | 57 | def read_results(filename, data_type: str, is_gt=False, is_ignore=False): 58 | if data_type in ('mot', 'lab'): 59 | read_fun = read_mot_results 60 | else: 61 | raise ValueError('Unknown data type: {}'.format(data_type)) 62 | 63 | return read_fun(filename, is_gt, is_ignore) 64 | 65 | 66 | """ 67 | labels={'ped', ... % 1 68 | 'person_on_vhcl', ... % 2 69 | 'car', ... % 3 70 | 'bicycle', ... % 4 71 | 'mbike', ... % 5 72 | 'non_mot_vhcl', ... % 6 73 | 'static_person', ... % 7 74 | 'distractor', ... % 8 75 | 'occluder', ... % 9 76 | 'occluder_on_grnd', ... %10 77 | 'occluder_full', ... % 11 78 | 'reflection', ... % 12 79 | 'crowd' ... 
% 13 80 | }; 81 | """ 82 | 83 | 84 | def read_mot_results(filename, is_gt, is_ignore): 85 | valid_labels = {1} 86 | ignore_labels = {2, 7, 8, 12} 87 | results_dict = dict() 88 | if os.path.isfile(filename): 89 | with open(filename, 'r') as f: 90 | for line in f.readlines(): 91 | linelist = line.split(',') 92 | if len(linelist) < 7: 93 | continue 94 | fid = int(linelist[0]) 95 | if fid < 1: 96 | continue 97 | results_dict.setdefault(fid, list()) 98 | 99 | if is_gt: 100 | if 'MOT16-' in filename or 'MOT17-' in filename: 101 | label = int(float(linelist[7])) 102 | mark = int(float(linelist[6])) 103 | if mark == 0 or label not in valid_labels: 104 | continue 105 | score = 1 106 | elif is_ignore: 107 | if 'MOT16-' in filename or 'MOT17-' in filename: 108 | label = int(float(linelist[7])) 109 | vis_ratio = float(linelist[8]) 110 | if label not in ignore_labels and vis_ratio >= 0: 111 | continue 112 | else: 113 | continue 114 | score = 1 115 | else: 116 | score = float(linelist[6]) 117 | 118 | tlwh = tuple(map(float, linelist[2:6])) 119 | target_id = int(linelist[1]) 120 | 121 | results_dict[fid].append((tlwh, target_id, score)) 122 | 123 | return results_dict 124 | 125 | 126 | def unzip_objs(objs): 127 | if len(objs) > 0: 128 | tlwhs, ids, scores = zip(*objs) 129 | else: 130 | tlwhs, ids, scores = [], [], [] 131 | tlwhs = np.asarray(tlwhs, dtype=float).reshape(-1, 4) 132 | 133 | return tlwhs, ids, scores -------------------------------------------------------------------------------- /deep_sort/utils/log.py: -------------------------------------------------------------------------------- 1 | import logging 2 | 3 | 4 | def get_logger(name='root'): 5 | formatter = logging.Formatter( 6 | # fmt='%(asctime)s [%(levelname)s]: %(filename)s(%(funcName)s:%(lineno)s) >> %(message)s') 7 | fmt='%(asctime)s [%(levelname)s]: %(message)s', datefmt='%Y-%m-%d %H:%M:%S') 8 | 9 | handler = logging.StreamHandler() 10 | handler.setFormatter(formatter) 11 | 12 | logger = 
logging.getLogger(name) 13 | logger.setLevel(logging.INFO) 14 | logger.addHandler(handler) 15 | return logger 16 | 17 | 18 | -------------------------------------------------------------------------------- /deep_sort/utils/parser.py: -------------------------------------------------------------------------------- 1 | import os 2 | import yaml 3 | from easydict import EasyDict as edict 4 | 5 | class YamlParser(edict): 6 | """ 7 | A yaml parser based on EasyDict. 8 | """ 9 | def __init__(self, cfg_dict=None, config_file=None): 10 | if cfg_dict is None: 11 | cfg_dict = {} 12 | 13 | if config_file is not None: 14 | assert(os.path.isfile(config_file)) 15 | with open(config_file, 'r') as fo: 16 | cfg_dict.update(yaml.safe_load(fo)) 17 | 18 | super(YamlParser, self).__init__(cfg_dict) 19 | 20 | 21 | def merge_from_file(self, config_file): 22 | with open(config_file, 'r') as fo: 23 | self.update(yaml.safe_load(fo)) 24 | 25 | 26 | def merge_from_dict(self, config_dict): 27 | self.update(config_dict) 28 | 29 | 30 | def get_config(config_file=None): 31 | return YamlParser(config_file=config_file) 32 | 33 | 34 | if __name__ == "__main__": 35 | cfg = YamlParser(config_file="../configs/yolov3.yaml") 36 | cfg.merge_from_file("../configs/deep_sort.yaml") 37 | 38 | import ipdb; ipdb.set_trace() -------------------------------------------------------------------------------- /deep_sort/utils/tools.py: -------------------------------------------------------------------------------- 1 | from functools import wraps 2 | from time import time 3 | 4 | 5 | def is_video(ext: str): 6 | """ 7 | Returns true if ext exists in 8 | allowed_exts for video files. 9 | 10 | Args: 11 | ext: 12 | 13 | Returns: 14 | 15 | """ 16 | 17 | allowed_exts = ('.mp4', '.webm', '.ogg', '.avi', '.wmv', '.mkv', '.3gp') 18 | return any((ext.endswith(x) for x in allowed_exts)) 19 | 20 | 21 | def tik_tok(func): 22 | """ 23 | keep track of time for each process.
24 | Args: 25 | func: 26 | 27 | Returns: 28 | 29 | """ 30 | @wraps(func) 31 | def _time_it(*args, **kwargs): 32 | start = time() 33 | try: 34 | return func(*args, **kwargs) 35 | finally: 36 | end_ = time() 37 | print("time: {:.03f}s, fps: {:.03f}".format(end_ - start, 1 / (end_ - start))) 38 | 39 | return _time_it 40 | -------------------------------------------------------------------------------- /deep_sort/webserver/.env: -------------------------------------------------------------------------------- 1 | project_root="C:\Users\ZQ_deep_sort_pytorch" 2 | model_type="yolov3" 3 | output_dir="public/" 4 | json_output="json_output/" # ignored for the moment in ped_det_online_server.py 5 | reid_ckpt="deep_sort/deep/checkpoint/ckpt.t7" 6 | yolov3_cfg="detector/YOLOv3/cfg/yolo_v3.cfg" 7 | yolov3_weight="detector/YOLOv3/weight/yolov3.weights" 8 | yolov3_tiny_cfg="detector/YOLOv3/cfg/yolov3-tiny.cfg" 9 | yolov3_tiny_weight="detector/YOLOv3/weight/yolov3-tiny.weights" 10 | yolov3_class_names="detector/YOLOv3/cfg/coco.names" 11 | analysis_output="video_analysis/" 12 | app="flask_stream_server.py" 13 | camera_stream= "rtsp://user@111.222.333.444:somesecretcode" 14 | -------------------------------------------------------------------------------- /deep_sort/webserver/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/WuPedin/Multi-class_Yolov5_DeepSort_Pytorch/507e0cec465fa2e01d88827abfa88708c7250392/deep_sort/webserver/__init__.py -------------------------------------------------------------------------------- /deep_sort/webserver/config/config.py: -------------------------------------------------------------------------------- 1 | import os 2 | 3 | app_dir = os.path.abspath(os.path.dirname(__file__)) 4 | 5 | 6 | class BaseConfig: 7 | SECRET_KEY = os.environ.get('SECRET_KEY') or 'Sm9obiBTY2hyb20ga2lja3MgYXNz' 8 | SERVER_NAME = '127.0.0.1:8888' 9 | 10 | 11 | class DevelopmentConfig(BaseConfig): 12 | 
ENV = 'development' 13 | DEBUG = True 14 | 15 | 16 | class TestingConfig(BaseConfig): 17 | DEBUG = True 18 | 19 | 20 | class ProductionConfig(BaseConfig): 21 | DEBUG = False 22 | -------------------------------------------------------------------------------- /deep_sort/webserver/images/Thumbs.db: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/WuPedin/Multi-class_Yolov5_DeepSort_Pytorch/507e0cec465fa2e01d88827abfa88708c7250392/deep_sort/webserver/images/Thumbs.db -------------------------------------------------------------------------------- /deep_sort/webserver/images/arc.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/WuPedin/Multi-class_Yolov5_DeepSort_Pytorch/507e0cec465fa2e01d88827abfa88708c7250392/deep_sort/webserver/images/arc.png -------------------------------------------------------------------------------- /deep_sort/webserver/images/request.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/WuPedin/Multi-class_Yolov5_DeepSort_Pytorch/507e0cec465fa2e01d88827abfa88708c7250392/deep_sort/webserver/images/request.png -------------------------------------------------------------------------------- /deep_sort/webserver/readme.md: -------------------------------------------------------------------------------- 1 | # Stream pedestrian detection web server 2 | 3 | ### Requirements 4 | 5 | - python = 3.7 6 | - redis 7 | - flask 8 | - opencv 9 | - pytorch 10 | - dotenv 11 | 12 | Please note that you need to install redis on your system. 13 | 14 | ### The architecture. 
15 | 16 | ![web server architecture](images/arc.png) 17 | 18 | 1 - `RealTimeTracking` reads frames from the rtsp link using threads 19 | (using threads makes the web server robust against network packet loss) 20 | 21 | 2 - In each iteration of `RealTimeTracking.run`, the frame is stored in the redis cache on the server. 22 | 23 | 3 - The cached frames can then be served from redis to clients. 24 | 25 | 4 - To start the pedestrian detection, after running 26 | `rtsp_webserver.py`, send a GET request to `127.0.0.1:8888/run` 27 | with these GET parameters 28 | 29 | | Param | Value | Description | 30 | | :-------------: | :-------------: | :-------------: | 31 | | run | 1/0 | set 1 to start the tracking service / set 0 to stop it | 32 | | camera_stream | 'rtsp://ip:port/admin...' | a valid rtsp link | 33 | 34 | for example: 35 | 36 | (to start the service) 127.0.0.1:8888/run?run=1 37 | (to stop the service) 127.0.0.1:8888/run?run=0 38 | (to change the camera) 39 | 1- 127.0.0.1:8888/run?run=0 (first stop the current service) 40 | 2- 127.0.0.1:8888/run?run=1&camera_stream=rtsp://ip:port/admin...
(then start it with another rtsp link) 41 | 42 | ![web server architecture](images/request.png) 43 | 44 | 45 | 5 - Get the pedestrian detection stream at `127.0.0.1:8888` 46 | -------------------------------------------------------------------------------- /deep_sort/webserver/rtsp_threaded_tracker.py: -------------------------------------------------------------------------------- 1 | import warnings 2 | from os import getenv 3 | import sys 4 | from os.path import dirname, abspath 5 | 6 | sys.path.append(dirname(dirname(abspath(__file__)))) 7 | 8 | import torch 9 | from deep_sort import build_tracker 10 | from detector import build_detector 11 | import cv2 12 | from utils.draw import compute_color_for_labels 13 | from concurrent.futures import ThreadPoolExecutor 14 | from redis import Redis 15 | 16 | redis_cache = Redis('127.0.0.1') 17 | 18 | 19 | class RealTimeTracking(object): 20 | """ 21 | Reads frames from the rtsp link and continuously 22 | assigns each frame to a `frame` attribute in order to 23 | compensate for network packet loss; flask then serves the 24 | frames to clients.
25 | Args: 26 | args: parse_args inputs 27 | cfg: deepsort dict and yolo-model cfg from server_cfg file 28 | 29 | """ 30 | 31 | def __init__(self, cfg, args): 32 | # Create a VideoCapture object 33 | self.cfg = cfg 34 | self.args = args 35 | use_cuda = self.args.use_cuda and torch.cuda.is_available() 36 | 37 | if not use_cuda: 38 | warnings.warn(UserWarning("Running in cpu mode!")) 39 | 40 | self.detector = build_detector(cfg, use_cuda=use_cuda) 41 | self.deepsort = build_tracker(cfg, use_cuda=use_cuda) 42 | self.class_names = self.detector.class_names 43 | 44 | self.vdo = cv2.VideoCapture(self.args.input) 45 | self.status, self.frame = None, None 46 | self.total_frames = int(cv2.VideoCapture.get(self.vdo, cv2.CAP_PROP_FRAME_COUNT)) 47 | self.im_width = int(self.vdo.get(cv2.CAP_PROP_FRAME_WIDTH)) 48 | self.im_height = int(self.vdo.get(cv2.CAP_PROP_FRAME_HEIGHT)) 49 | 50 | self.output_frame = None 51 | 52 | self.thread = ThreadPoolExecutor(max_workers=1) 53 | self.thread.submit(self.update) 54 | 55 | def update(self): 56 | while True: 57 | if self.vdo.isOpened(): 58 | (self.status, self.frame) = self.vdo.read() 59 | 60 | def run(self): 61 | print('streaming started ...') 62 | while getenv('in_progress') != 'off': 63 | try: 64 | frame = self.frame.copy() 65 | self.detection(frame=frame) 66 | frame_to_bytes = cv2.imencode('.jpg', frame)[1].tobytes() 67 | redis_cache.set('frame', frame_to_bytes) 68 | except AttributeError: 69 | pass 70 | print('streaming stopped ...') 71 | 72 | 73 | def detection(self, frame): 74 | im = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB) 75 | # do detection 76 | bbox_xywh, cls_conf, cls_ids = self.detector(im) 77 | if bbox_xywh is not None: 78 | # select person class 79 | mask = cls_ids == 0 80 | 81 | bbox_xywh = bbox_xywh[mask] 82 | bbox_xywh[:, 3:] *= 1.2 # bbox dilation just in case bbox too small 83 | cls_conf = cls_conf[mask] 84 | 85 | # do tracking 86 | outputs = self.deepsort.update(bbox_xywh, cls_conf, im) 87 | 88 | # draw boxes for 
visualization 89 | if len(outputs) > 0: 90 | self.draw_boxes(img=frame, output=outputs) 91 | 92 | @staticmethod 93 | def draw_boxes(img, output, offset=(0, 0)): 94 | for i, box in enumerate(output): 95 | x1, y1, x2, y2, identity = [int(ii) for ii in box] 96 | x1 += offset[0] 97 | x2 += offset[0] 98 | y1 += offset[1] 99 | y2 += offset[1] 100 | 101 | # box text and bar 102 | color = compute_color_for_labels(identity) 103 | label = '{}{:d}'.format("", identity) 104 | t_size = cv2.getTextSize(label, cv2.FONT_HERSHEY_PLAIN, 2, 2)[0] 105 | cv2.rectangle(img, (x1, y1), (x2, y2), color, 3) 106 | cv2.rectangle(img, (x1, y1), (x1 + t_size[0] + 3, y1 + t_size[1] + 4), color, -1) 107 | cv2.putText(img, label, (x1, y1 + t_size[1] + 4), cv2.FONT_HERSHEY_PLAIN, 2, [255, 255, 255], 2) 108 | return img 109 | -------------------------------------------------------------------------------- /deep_sort/webserver/rtsp_webserver.py: -------------------------------------------------------------------------------- 1 | """ 2 | 3 | # TODO: Load ML model with redis and keep it for sometime. 
4 | 1- detector/yolov3/detector.py |=> yolov3 weightfile -> redis cache 5 | 2- deepsort/deep/feature_extractor |=> model_path -> redis cache 6 | 3- Use tmpfs (Insert RAM as a virtual disk and store model state): https://pypi.org/project/memory-tempfile/ 7 | 8 | """ 9 | from os.path import join 10 | from os import getenv, environ 11 | from dotenv import load_dotenv 12 | import argparse 13 | from threading import Thread 14 | 15 | from redis import Redis 16 | from flask import Response, Flask, jsonify, request, abort 17 | 18 | from rtsp_threaded_tracker import RealTimeTracking 19 | from server_cfg import model, deep_sort_dict 20 | from config.config import DevelopmentConfig 21 | from utils.parser import get_config 22 | 23 | redis_cache = Redis('127.0.0.1') 24 | app = Flask(__name__) 25 | environ['in_progress'] = 'off' 26 | 27 | 28 | def parse_args(): 29 | """ 30 | Parses the arguments 31 | Returns: 32 | argparse Namespace 33 | """ 34 | assert 'project_root' in environ.keys() 35 | project_root = getenv('project_root') 36 | parser = argparse.ArgumentParser() 37 | 38 | parser.add_argument("--input", 39 | type=str, 40 | default=getenv('camera_stream')) 41 | 42 | parser.add_argument("--model", 43 | type=str, 44 | default=join(project_root, 45 | getenv('model_type'))) 46 | 47 | parser.add_argument("--cpu", 48 | dest="use_cuda", 49 | action="store_false", default=True) 50 | args = parser.parse_args() 51 | 52 | return args 53 | 54 | 55 | def gen(): 56 | """ 57 | 58 | Returns: video frames from redis cache 59 | 60 | """ 61 | while True: 62 | frame = redis_cache.get('frame') 63 | if frame is not None: 64 | yield b'--frame\r\n'b'Content-Type: image/jpeg\r\n\r\n' + frame + b'\r\n' 65 | 66 | 67 | def pedestrian_tracking(cfg, args): 68 | """ 69 | starts the pedestrian detection on rtsp link 70 | Args: 71 | cfg: 72 | args: 73 | 74 | Returns: 75 | 76 | """ 77 | tracker = RealTimeTracking(cfg, args) 78 | tracker.run() 79 | 80 | 81 | def trigger_process(cfg, args): 82 | """ 83 | 
Triggers the pedestrian_tracking process on the rtsp link using a thread 84 | Args: 85 | cfg: 86 | args: 87 | 88 | Returns: 89 | """ 90 | try: 91 | t = Thread(target=pedestrian_tracking, args=(cfg, args)) 92 | t.start() 93 | return jsonify({"message": "Pedestrian detection started successfully"}) 94 | except Exception: 95 | return jsonify({'message': "Unexpected exception occurred in process"}) 96 | 97 | 98 | @app.errorhandler(400) 99 | def bad_argument(error): 100 | return jsonify({'message': error.description['message']}) 101 | 102 | 103 | # Routes 104 | @app.route('/stream', methods=['GET']) 105 | def stream(): 106 | """ 107 | Provides video frames over http 108 | Returns: 109 | 110 | """ 111 | return Response(gen(), 112 | mimetype='multipart/x-mixed-replace; boundary=frame') 113 | 114 | 115 | @app.route("/run", methods=['GET']) 116 | def process_manager(): 117 | """ 118 | request parameters: 119 | run (bool): 1 -> start the pedestrian tracking 120 | 0 -> stop it 121 | camera_stream: str -> rtsp link to security camera 122 | 123 | :return: 124 | """ 125 | data = request.args 126 | 127 | status = data['run'] 128 | status = int(status) if status.isnumeric() else abort(400, {'message': f"bad argument for run {data['run']}"}) 129 | if status == 1: 130 | # if pedestrian tracking is not running, start it off!
131 | try: 132 | if environ.get('in_progress', 'off') == 'off': 133 | global cfg, args 134 | vdo = data.get('camera_stream') 135 | if vdo is not None: 136 | args.input = vdo  # the rtsp link is a string; casting it to int would raise 137 | environ['in_progress'] = 'on' 138 | return trigger_process(cfg, args) 139 | elif environ.get('in_progress') == 'on': 140 | # if pedestrian tracking is running, don't start another one (we are short of gpu resources) 141 | return jsonify({"message": "Pedestrian detection is already in progress."}) 142 | except Exception: 143 | environ['in_progress'] = 'off' 144 | return abort(503) 145 | elif status == 0: 146 | if environ.get('in_progress', 'off') == 'off': 147 | return jsonify({"message": "Pedestrian detection is already terminated!"}) 148 | else: 149 | environ['in_progress'] = 'off' 150 | return jsonify({"message": "Pedestrian detection terminated!"}) 151 | 152 | 153 | if __name__ == '__main__': 154 | load_dotenv() 155 | app.config.from_object(DevelopmentConfig) 156 | 157 | # BackProcess Initialization 158 | args = parse_args() 159 | cfg = get_config() 160 | cfg.merge_from_dict(model) 161 | cfg.merge_from_dict(deep_sort_dict) 162 | # Start the flask app 163 | app.run() 164 | -------------------------------------------------------------------------------- /deep_sort/webserver/server_cfg.py: -------------------------------------------------------------------------------- 1 | """""" 2 | import sys 3 | from os.path import dirname, abspath, isfile 4 | 5 | sys.path.append(dirname(dirname(abspath(__file__)))) 6 | 7 | from dotenv import load_dotenv 8 | from utils.asserts import assert_in_env 9 | from os import getenv 10 | from os.path import join 11 | 12 | load_dotenv('.env') 13 | # Configure deep sort info 14 | deep_sort_info = dict(REID_CKPT=join(getenv('project_root'), getenv('reid_ckpt')), 15 | MAX_DIST=0.2, 16 | MIN_CONFIDENCE=.3, 17 | NMS_MAX_OVERLAP=0.5, 18 | MAX_IOU_DISTANCE=0.7, 19 | N_INIT=3, 20 | MAX_AGE=70, 21 | NN_BUDGET=100) 22 | deep_sort_dict = {'DEEPSORT': deep_sort_info}
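For reference, `rtsp_webserver.py` above feeds dicts like `deep_sort_dict` into the runtime config via `YamlParser.merge_from_dict` (see `deep_sort/utils/parser.py`), which is just a `dict.update`. A minimal sketch of the resulting key layout — plain dicts stand in for the EasyDict-based parser here, and the checkpoint path is illustrative only:

```python
# Sketch of how the DEEPSORT section lands in the runtime config.
# Plain dicts stand in for YamlParser/EasyDict; merge_from_dict is update().
deep_sort_info = dict(REID_CKPT="deep_sort/deep/checkpoint/ckpt.t7",  # illustrative path
                      MAX_DIST=0.2,
                      MIN_CONFIDENCE=.3,
                      NMS_MAX_OVERLAP=0.5,
                      MAX_IOU_DISTANCE=0.7,
                      N_INIT=3,
                      MAX_AGE=70,
                      NN_BUDGET=100)
deep_sort_dict = {'DEEPSORT': deep_sort_info}

cfg = {}                    # get_config() with no file starts from an empty parser
cfg.update(deep_sort_dict)  # what cfg.merge_from_dict(deep_sort_dict) does internally

print(cfg['DEEPSORT']['MAX_AGE'])  # -> 70
print(cfg['DEEPSORT']['N_INIT'])   # -> 3
```

In the real code the tracker then reads values through EasyDict attribute access (e.g. `cfg.DEEPSORT.MAX_AGE`); this sketch only mirrors the nested key structure.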
23 | 24 | # Configure yolov3 info 25 | 26 | yolov3_info = dict(CFG=join(getenv('project_root'), getenv('yolov3_cfg')), 27 | WEIGHT=join(getenv('project_root'), getenv('yolov3_weight')), 28 | CLASS_NAMES=join(getenv('project_root'), getenv('yolov3_class_names')), 29 | SCORE_THRESH=0.5, 30 | NMS_THRESH=0.4 31 | ) 32 | yolov3_dict = {'YOLOV3': yolov3_info} 33 | 34 | # Configure yolov3-tiny info 35 | 36 | yolov3_tiny_info = dict(CFG=join(getenv('project_root'), getenv('yolov3_tiny_cfg')), 37 | WEIGHT=join(getenv('project_root'), getenv('yolov3_tiny_weight')), 38 | CLASS_NAMES=join(getenv('project_root'), getenv('yolov3_class_names')), 39 | SCORE_THRESH=0.5, 40 | NMS_THRESH=0.4 41 | ) 42 | yolov3_tiny_dict = {'YOLOV3': yolov3_tiny_info} 43 | 44 | 45 | check_list = ['project_root', 'reid_ckpt', 'yolov3_class_names', 'model_type', 'yolov3_cfg', 'yolov3_weight', 46 | 'yolov3_tiny_cfg', 'yolov3_tiny_weight'] 47 | 48 | if assert_in_env(check_list): 49 | assert isfile(deep_sort_info['REID_CKPT']) 50 | if getenv('model_type') == 'yolov3': 51 | assert isfile(yolov3_info['WEIGHT']) 52 | assert isfile(yolov3_info['CFG']) 53 | assert isfile(yolov3_info['CLASS_NAMES']) 54 | model = yolov3_dict.copy() 55 | 56 | elif getenv('model_type') == 'yolov3_tiny': 57 | assert isfile(yolov3_tiny_info['WEIGHT']) 58 | assert isfile(yolov3_tiny_info['CFG']) 59 | assert isfile(yolov3_tiny_info['CLASS_NAMES']) 60 | model = yolov3_tiny_dict.copy() 61 | else: 62 | raise ValueError("Value '{}' for model_type is not valid".format(getenv('model_type'))) 63 | -------------------------------------------------------------------------------- /deep_sort/webserver/templates/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | Floor 2 camera: Corridor 2 4 | 5 | 6 |

Floor 2

7 | 8 | 9 | -------------------------------------------------------------------------------- /deep_sort/yolov3_deepsort.py: -------------------------------------------------------------------------------- 1 | import os 2 | import cv2 3 | import time 4 | import argparse 5 | import torch 6 | import warnings 7 | import numpy as np 8 | 9 | from detector import build_detector 10 | from deep_sort import build_tracker 11 | from utils.draw import draw_boxes 12 | from utils.parser import get_config 13 | from utils.log import get_logger 14 | from utils.io import write_results 15 | 16 | 17 | class VideoTracker(object): 18 | def __init__(self, cfg, args, video_path): 19 | self.cfg = cfg 20 | self.args = args 21 | self.video_path = video_path 22 | self.logger = get_logger("root") 23 | 24 | use_cuda = args.use_cuda and torch.cuda.is_available() 25 | if not use_cuda: 26 | warnings.warn("Running in cpu mode which may be very slow!", UserWarning) 27 | 28 | if args.display: 29 | cv2.namedWindow("test", cv2.WINDOW_NORMAL) 30 | cv2.resizeWindow("test", args.display_width, args.display_height) 31 | 32 | if args.cam != -1: 33 | print("Using webcam " + str(args.cam)) 34 | self.vdo = cv2.VideoCapture(args.cam) 35 | else: 36 | self.vdo = cv2.VideoCapture() 37 | self.detector = build_detector(cfg, use_cuda=use_cuda) 38 | self.deepsort = build_tracker(cfg, use_cuda=use_cuda) 39 | self.class_names = self.detector.class_names 40 | 41 | def __enter__(self): 42 | if self.args.cam != -1: 43 | ret, frame = self.vdo.read() 44 | assert ret, "Error: Camera error" 45 | self.im_width = frame.shape[1]  # frame.shape is (height, width, channels) 46 | self.im_height = frame.shape[0] 47 | 48 | else: 49 | assert os.path.isfile(self.video_path), "Path error" 50 | self.vdo.open(self.video_path) 51 | self.im_width = int(self.vdo.get(cv2.CAP_PROP_FRAME_WIDTH)) 52 | self.im_height = int(self.vdo.get(cv2.CAP_PROP_FRAME_HEIGHT)) 53 | assert self.vdo.isOpened() 54 | 55 | if self.args.save_path: 56 | os.makedirs(self.args.save_path, exist_ok=True) 57 | 58 | # 
path of saved video and results 59 | self.save_video_path = os.path.join(self.args.save_path, "results.avi") 60 | self.save_results_path = os.path.join(self.args.save_path, "results.txt") 61 | 62 | # create video writer 63 | fourcc = cv2.VideoWriter_fourcc(*'MJPG') 64 | self.writer = cv2.VideoWriter(self.save_video_path, fourcc, 20, (self.im_width, self.im_height)) 65 | 66 | # logging 67 | self.logger.info("Save results to {}".format(self.args.save_path)) 68 | 69 | return self 70 | 71 | def __exit__(self, exc_type, exc_value, exc_traceback): 72 | if exc_type: 73 | print(exc_type, exc_value, exc_traceback) 74 | 75 | def run(self): 76 | results = [] 77 | idx_frame = 0 78 | while self.vdo.grab(): 79 | idx_frame += 1 80 | if idx_frame % self.args.frame_interval: 81 | continue 82 | 83 | start = time.time() 84 | _, ori_im = self.vdo.retrieve() 85 | im = cv2.cvtColor(ori_im, cv2.COLOR_BGR2RGB) 86 | 87 | # do detection 88 | bbox_xywh, cls_conf, cls_ids = self.detector(im) 89 | 90 | # select person class 91 | mask = cls_ids == 0 92 | 93 | bbox_xywh = bbox_xywh[mask] 94 | # bbox dilation just in case bbox too small, delete this line if using a better pedestrian detector 95 | bbox_xywh[:, 3:] *= 1.2 96 | cls_conf = cls_conf[mask] 97 | 98 | # do tracking 99 | outputs = self.deepsort.update(bbox_xywh, cls_conf, im) 100 | 101 | # draw boxes for visualization 102 | if len(outputs) > 0: 103 | bbox_tlwh = [] 104 | bbox_xyxy = outputs[:, :4] 105 | identities = outputs[:, -1] 106 | ori_im = draw_boxes(ori_im, bbox_xyxy, identities) 107 | 108 | for bb_xyxy in bbox_xyxy: 109 | bbox_tlwh.append(self.deepsort._xyxy_to_tlwh(bb_xyxy)) 110 | 111 | results.append((idx_frame - 1, bbox_tlwh, identities)) 112 | 113 | end = time.time() 114 | 115 | if self.args.display: 116 | cv2.imshow("test", ori_im) 117 | cv2.waitKey(1) 118 | 119 | if self.args.save_path: 120 | self.writer.write(ori_im) 121 | 122 | # save results 123 | write_results(self.save_results_path, results, 'mot') 124 | 125 | # logging 
126 | self.logger.info("time: {:.03f}s, fps: {:.03f}, detection numbers: {}, tracking numbers: {}" \ 127 | .format(end - start, 1 / (end - start), bbox_xywh.shape[0], len(outputs))) 128 | 129 | 130 | def parse_args(): 131 | parser = argparse.ArgumentParser() 132 | parser.add_argument("VIDEO_PATH", type=str) 133 | parser.add_argument("--config_detection", type=str, default="./configs/yolov3.yaml") 134 | parser.add_argument("--config_deepsort", type=str, default="./configs/deep_sort.yaml") 135 | # parser.add_argument("--ignore_display", dest="display", action="store_false", default=True) 136 | parser.add_argument("--display", action="store_true") 137 | parser.add_argument("--frame_interval", type=int, default=1) 138 | parser.add_argument("--display_width", type=int, default=800) 139 | parser.add_argument("--display_height", type=int, default=600) 140 | parser.add_argument("--save_path", type=str, default="./output/") 141 | parser.add_argument("--cpu", dest="use_cuda", action="store_false", default=True) 142 | parser.add_argument("--camera", action="store", dest="cam", type=int, default="-1") 143 | return parser.parse_args() 144 | 145 | 146 | if __name__ == "__main__": 147 | args = parse_args() 148 | cfg = get_config() 149 | cfg.merge_from_file(args.config_detection) 150 | cfg.merge_from_file(args.config_deepsort) 151 | 152 | with VideoTracker(cfg, args, video_path=args.VIDEO_PATH) as vdo_trk: 153 | vdo_trk.run() 154 | -------------------------------------------------------------------------------- /deep_sort/yolov3_deepsort_eval.py: -------------------------------------------------------------------------------- 1 | import os 2 | import os.path as osp 3 | import logging 4 | import argparse 5 | from pathlib import Path 6 | 7 | from utils.log import get_logger 8 | from yolov3_deepsort import VideoTracker 9 | from utils.parser import get_config 10 | 11 | import motmetrics as mm 12 | mm.lap.default_solver = 'lap' 13 | from utils.evaluation import Evaluator 14 | 15 | def 
mkdir_if_missing(dir): 16 | os.makedirs(dir, exist_ok=True) 17 | 18 | def main(data_root='', seqs=('',), args=""): 19 | logger = get_logger() 20 | logger.setLevel(logging.INFO) 21 | data_type = 'mot' 22 | result_root = os.path.join(Path(data_root), "mot_results") 23 | mkdir_if_missing(result_root) 24 | 25 | cfg = get_config() 26 | cfg.merge_from_file(args.config_detection) 27 | cfg.merge_from_file(args.config_deepsort) 28 | 29 | # run tracking 30 | accs = [] 31 | for seq in seqs: 32 | logger.info('start seq: {}'.format(seq)) 33 | result_filename = os.path.join(result_root, '{}.txt'.format(seq)) 34 | video_path = data_root+"/"+seq+"/video/video.mp4" 35 | 36 | with VideoTracker(cfg, args, video_path, result_filename) as vdo_trk: 37 | vdo_trk.run() 38 | 39 | # eval 40 | logger.info('Evaluate seq: {}'.format(seq)) 41 | evaluator = Evaluator(data_root, seq, data_type) 42 | accs.append(evaluator.eval_file(result_filename)) 43 | 44 | # get summary 45 | metrics = mm.metrics.motchallenge_metrics 46 | mh = mm.metrics.create() 47 | summary = Evaluator.get_summary(accs, seqs, metrics) 48 | strsummary = mm.io.render_summary( 49 | summary, 50 | formatters=mh.formatters, 51 | namemap=mm.io.motchallenge_metric_names 52 | ) 53 | print(strsummary) 54 | Evaluator.save_summary(summary, os.path.join(result_root, 'summary_global.xlsx')) 55 | 56 | 57 | def parse_args(): 58 | parser = argparse.ArgumentParser() 59 | parser.add_argument("--config_detection", type=str, default="./configs/yolov3.yaml") 60 | parser.add_argument("--config_deepsort", type=str, default="./configs/deep_sort.yaml") 61 | parser.add_argument("--ignore_display", dest="display", action="store_false", default=False) 62 | parser.add_argument("--frame_interval", type=int, default=1) 63 | parser.add_argument("--display_width", type=int, default=800) 64 | parser.add_argument("--display_height", type=int, default=600) 65 | parser.add_argument("--save_path", type=str, default="./demo/demo.avi") 66 | 
parser.add_argument("--cpu", dest="use_cuda", action="store_false", default=True) 67 | parser.add_argument("--camera", action="store", dest="cam", type=int, default="-1") 68 | return parser.parse_args() 69 | 70 | if __name__ == '__main__': 71 | args = parse_args() 72 | 73 | seqs_str = '''MOT16-02 74 | MOT16-04 75 | MOT16-05 76 | MOT16-09 77 | MOT16-10 78 | MOT16-11 79 | MOT16-13 80 | ''' 81 | data_root = 'data/dataset/MOT16/train/' 82 | 83 | seqs = [seq.strip() for seq in seqs_str.split()] 84 | 85 | main(data_root=data_root, 86 | seqs=seqs, 87 | args=args) -------------------------------------------------------------------------------- /label_split.py: -------------------------------------------------------------------------------- 1 | import xml.etree.ElementTree as ET 2 | import pickle 3 | import os 4 | #from os import listdir, getcwd 5 | import os 6 | from os.path import join 7 | import glob 8 | import numpy as np 9 | import shutil 10 | 11 | # ratio of sample size in train and val 12 | ratio = [0.8, 0.2] 13 | 14 | 15 | # Functions 16 | def convert(size, box): 17 | dw = 1./size[0] 18 | dh = 1./size[1] 19 | x = (box[0] + box[1])/2.0 20 | y = (box[2] + box[3])/2.0 21 | w = box[1] - box[0] 22 | h = box[3] - box[2] 23 | x = x*dw 24 | w = w*dw 25 | y = y*dh 26 | h = h*dh 27 | return (x,y,w,h) 28 | 29 | def convert_annotation(in_file, out_file, classes): 30 | tree=ET.parse(in_file) 31 | root = tree.getroot() 32 | size = root.find('size') 33 | w = int(size.find('width').text) 34 | h = int(size.find('height').text) 35 | 36 | for obj in root.iter('object'): 37 | cls = obj.find('name').text 38 | if cls not in classes: 39 | continue 40 | cls_id = classes.index(cls) 41 | xmlbox = obj.find('bndbox') 42 | b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text), float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text)) 43 | bb = convert((w,h), b) 44 | out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n') 45 | 46 | # Main 47 | def 
main(): 48 | 49 | # Initial 50 | folders = ['dataset/images/train', 'dataset/images/val', 'dataset/labels/train', 'dataset/labels/val'] 51 | for folder in folders: 52 | if os.path.isdir(folder)==False: 53 | os.makedirs(folder) 54 | print('Creating %s' % folder) 55 | else: 56 | shutil.rmtree(folder) 57 | os.makedirs(folder) 58 | 59 | # Get Classes 60 | file = open('yolov5/data/data.yaml', 'r') 61 | for line in file: 62 | if line.find('names')!=-1: 63 | line = line.split('[')[-1].split(']')[0] 64 | classes = line.split("'") 65 | classes = [j for i,j in enumerate(classes) if i%2==1] 66 | 67 | # Split 68 | ann_files = glob.glob('dataset/annotations/*.xml') 69 | size = len(ann_files) 70 | train_size = int(size*ratio[0]) 71 | dataset = [] 72 | train_list = list(np.random.choice(ann_files, train_size, replace=False)) 73 | dataset.append(train_list) # Train 74 | dataset.append([i for i in ann_files if i not in train_list]) # Val 75 | 76 | # Convert xml to txt 77 | for dataset_idx in range(len(dataset)): 78 | for file in dataset[dataset_idx]: 79 | file_name = os.path.basename(file).replace('.xml', '.txt') 80 | print('Creating %s ....' 
% file_name) 81 | in_file = open(file, 'r') 82 | out_file = open('dataset/labels/%s/%s' % (['train', 'val'][dataset_idx], file_name), 'w') 83 | convert_annotation(in_file, out_file, classes) 84 | in_file.close() 85 | out_file.close() 86 | os.rename('dataset/images/%s' % file_name.replace('.txt', '.jpg'), 'dataset/images/%s/%s' % (['train', 'val'][dataset_idx], file_name.replace('.txt', '.jpg'))) 87 | 88 | print('Finished....') 89 | 90 | 91 | if __name__=='__main__': 92 | main() 93 | -------------------------------------------------------------------------------- /nba.mp4: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/WuPedin/Multi-class_Yolov5_DeepSort_Pytorch/507e0cec465fa2e01d88827abfa88708c7250392/nba.mp4 -------------------------------------------------------------------------------- /nba_inf.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/WuPedin/Multi-class_Yolov5_DeepSort_Pytorch/507e0cec465fa2e01d88827abfa88708c7250392/nba_inf.gif -------------------------------------------------------------------------------- /nba_inf.mp4: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/WuPedin/Multi-class_Yolov5_DeepSort_Pytorch/507e0cec465fa2e01d88827abfa88708c7250392/nba_inf.mp4 -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | wheel>=0.35.1 2 | Cython>=0.29.21 3 | opencv-python>=4.4.0.42 4 | matplotlib>=3.3.1 5 | numpy>=1.19.1 6 | scipy>=1.5.2 7 | tqdm>=4.48.2 8 | pillow>=7.2.0 9 | PyYAML>=5.3.1 10 | tensorboard>=2.3.0 11 | easydict>=1.9 12 | -------------------------------------------------------------------------------- /requirements_yum.txt: -------------------------------------------------------------------------------- 1 | vim
-y 2 | epel-release -y 3 | python36 -y 4 | python3-devel -y 5 | libXext -y 6 | libSM -y 7 | libXrender -y 8 | libXdmc -y 9 | mesa-libGL.x86_64 -y 10 | -------------------------------------------------------------------------------- /yolov5/.dockerignore: -------------------------------------------------------------------------------- 1 | # Repo-specific DockerIgnore ------------------------------------------------------------------------------------------- 2 | # .git 3 | .cache 4 | .idea 5 | runs 6 | output 7 | coco 8 | storage.googleapis.com 9 | 10 | data/samples/* 11 | **/results*.txt 12 | *.jpg 13 | 14 | # Neural Network weights ----------------------------------------------------------------------------------------------- 15 | **/*.weights 16 | **/*.pt 17 | **/*.pth 18 | **/*.onnx 19 | **/*.mlmodel 20 | **/*.torchscript 21 | 22 | 23 | # Below Copied From .gitignore ----------------------------------------------------------------------------------------- 24 | # Below Copied From .gitignore ----------------------------------------------------------------------------------------- 25 | 26 | 27 | # GitHub Python GitIgnore ---------------------------------------------------------------------------------------------- 28 | # Byte-compiled / optimized / DLL files 29 | __pycache__/ 30 | *.py[cod] 31 | *$py.class 32 | 33 | # C extensions 34 | *.so 35 | 36 | # Distribution / packaging 37 | .Python 38 | env/ 39 | build/ 40 | develop-eggs/ 41 | dist/ 42 | downloads/ 43 | eggs/ 44 | .eggs/ 45 | lib/ 46 | lib64/ 47 | parts/ 48 | sdist/ 49 | var/ 50 | wheels/ 51 | *.egg-info/ 52 | .installed.cfg 53 | *.egg 54 | 55 | # PyInstaller 56 | # Usually these files are written by a python script from a template 57 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 
58 | *.manifest 59 | *.spec 60 | 61 | # Installer logs 62 | pip-log.txt 63 | pip-delete-this-directory.txt 64 | 65 | # Unit test / coverage reports 66 | htmlcov/ 67 | .tox/ 68 | .coverage 69 | .coverage.* 70 | .cache 71 | nosetests.xml 72 | coverage.xml 73 | *.cover 74 | .hypothesis/ 75 | 76 | # Translations 77 | *.mo 78 | *.pot 79 | 80 | # Django stuff: 81 | *.log 82 | local_settings.py 83 | 84 | # Flask stuff: 85 | instance/ 86 | .webassets-cache 87 | 88 | # Scrapy stuff: 89 | .scrapy 90 | 91 | # Sphinx documentation 92 | docs/_build/ 93 | 94 | # PyBuilder 95 | target/ 96 | 97 | # Jupyter Notebook 98 | .ipynb_checkpoints 99 | 100 | # pyenv 101 | .python-version 102 | 103 | # celery beat schedule file 104 | celerybeat-schedule 105 | 106 | # SageMath parsed files 107 | *.sage.py 108 | 109 | # dotenv 110 | .env 111 | 112 | # virtualenv 113 | .venv 114 | venv/ 115 | ENV/ 116 | 117 | # Spyder project settings 118 | .spyderproject 119 | .spyproject 120 | 121 | # Rope project settings 122 | .ropeproject 123 | 124 | # mkdocs documentation 125 | /site 126 | 127 | # mypy 128 | .mypy_cache/ 129 | 130 | 131 | # https://github.com/github/gitignore/blob/master/Global/macOS.gitignore ----------------------------------------------- 132 | 133 | # General 134 | .DS_Store 135 | .AppleDouble 136 | .LSOverride 137 | 138 | # Icon must end with two \r 139 | Icon 140 | Icon? 
141 | 142 | # Thumbnails 143 | ._* 144 | 145 | # Files that might appear in the root of a volume 146 | .DocumentRevisions-V100 147 | .fseventsd 148 | .Spotlight-V100 149 | .TemporaryItems 150 | .Trashes 151 | .VolumeIcon.icns 152 | .com.apple.timemachine.donotpresent 153 | 154 | # Directories potentially created on remote AFP share 155 | .AppleDB 156 | .AppleDesktop 157 | Network Trash Folder 158 | Temporary Items 159 | .apdisk 160 | 161 | 162 | # https://github.com/github/gitignore/blob/master/Global/JetBrains.gitignore 163 | # Covers JetBrains IDEs: IntelliJ, RubyMine, PhpStorm, AppCode, PyCharm, CLion, Android Studio and WebStorm 164 | # Reference: https://intellij-support.jetbrains.com/hc/en-us/articles/206544839 165 | 166 | # User-specific stuff: 167 | .idea/* 168 | .idea/**/workspace.xml 169 | .idea/**/tasks.xml 170 | .idea/dictionaries 171 | .html # Bokeh Plots 172 | .pg # TensorFlow Frozen Graphs 173 | .avi # videos 174 | 175 | # Sensitive or high-churn files: 176 | .idea/**/dataSources/ 177 | .idea/**/dataSources.ids 178 | .idea/**/dataSources.local.xml 179 | .idea/**/sqlDataSources.xml 180 | .idea/**/dynamic.xml 181 | .idea/**/uiDesigner.xml 182 | 183 | # Gradle: 184 | .idea/**/gradle.xml 185 | .idea/**/libraries 186 | 187 | # CMake 188 | cmake-build-debug/ 189 | cmake-build-release/ 190 | 191 | # Mongo Explorer plugin: 192 | .idea/**/mongoSettings.xml 193 | 194 | ## File-based project format: 195 | *.iws 196 | 197 | ## Plugin-specific files: 198 | 199 | # IntelliJ 200 | out/ 201 | 202 | # mpeltonen/sbt-idea plugin 203 | .idea_modules/ 204 | 205 | # JIRA plugin 206 | atlassian-ide-plugin.xml 207 | 208 | # Cursive Clojure plugin 209 | .idea/replstate.xml 210 | 211 | # Crashlytics plugin (for Android Studio and IntelliJ) 212 | com_crashlytics_export_strings.xml 213 | crashlytics.properties 214 | crashlytics-build.properties 215 | fabric.properties 216 | -------------------------------------------------------------------------------- /yolov5/.gitattributes: 
-------------------------------------------------------------------------------- 1 | # this drop notebooks from GitHub language stats 2 | *.ipynb linguist-vendored 3 | -------------------------------------------------------------------------------- /yolov5/.gitignore: -------------------------------------------------------------------------------- 1 | # Repo-specific GitIgnore ---------------------------------------------------------------------------------------------- 2 | *.jpg 3 | *.jpeg 4 | *.png 5 | *.bmp 6 | *.tif 7 | *.tiff 8 | *.heic 9 | *.JPG 10 | *.JPEG 11 | *.PNG 12 | *.BMP 13 | *.TIF 14 | *.TIFF 15 | *.HEIC 16 | *.mp4 17 | *.mov 18 | *.MOV 19 | *.avi 20 | *.data 21 | *.json 22 | 23 | *.cfg 24 | !cfg/yolov3*.cfg 25 | 26 | storage.googleapis.com 27 | runs/* 28 | data/* 29 | !data/samples/zidane.jpg 30 | !data/samples/bus.jpg 31 | !data/coco.names 32 | !data/coco_paper.names 33 | !data/coco.data 34 | !data/coco_*.data 35 | !data/coco_*.txt 36 | !data/trainvalno5k.shapes 37 | !data/*.sh 38 | 39 | pycocotools/* 40 | results*.txt 41 | gcp_test*.sh 42 | 43 | # MATLAB GitIgnore ----------------------------------------------------------------------------------------------------- 44 | *.m~ 45 | *.mat 46 | !targets*.mat 47 | 48 | # Neural Network weights ----------------------------------------------------------------------------------------------- 49 | *.weights 50 | *.pt 51 | *.onnx 52 | *.mlmodel 53 | *.torchscript 54 | darknet53.conv.74 55 | yolov3-tiny.conv.15 56 | 57 | # GitHub Python GitIgnore ---------------------------------------------------------------------------------------------- 58 | # Byte-compiled / optimized / DLL files 59 | __pycache__/ 60 | *.py[cod] 61 | *$py.class 62 | 63 | # C extensions 64 | *.so 65 | 66 | # Distribution / packaging 67 | .Python 68 | env/ 69 | build/ 70 | develop-eggs/ 71 | dist/ 72 | downloads/ 73 | eggs/ 74 | .eggs/ 75 | lib/ 76 | lib64/ 77 | parts/ 78 | sdist/ 79 | var/ 80 | wheels/ 81 | *.egg-info/ 82 | .installed.cfg 
83 | *.egg 84 | 85 | # PyInstaller 86 | # Usually these files are written by a python script from a template 87 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 88 | *.manifest 89 | *.spec 90 | 91 | # Installer logs 92 | pip-log.txt 93 | pip-delete-this-directory.txt 94 | 95 | # Unit test / coverage reports 96 | htmlcov/ 97 | .tox/ 98 | .coverage 99 | .coverage.* 100 | .cache 101 | nosetests.xml 102 | coverage.xml 103 | *.cover 104 | .hypothesis/ 105 | 106 | # Translations 107 | *.mo 108 | *.pot 109 | 110 | # Django stuff: 111 | *.log 112 | local_settings.py 113 | 114 | # Flask stuff: 115 | instance/ 116 | .webassets-cache 117 | 118 | # Scrapy stuff: 119 | .scrapy 120 | 121 | # Sphinx documentation 122 | docs/_build/ 123 | 124 | # PyBuilder 125 | target/ 126 | 127 | # Jupyter Notebook 128 | .ipynb_checkpoints 129 | 130 | # pyenv 131 | .python-version 132 | 133 | # celery beat schedule file 134 | celerybeat-schedule 135 | 136 | # SageMath parsed files 137 | *.sage.py 138 | 139 | # dotenv 140 | .env 141 | 142 | # virtualenv 143 | .venv 144 | venv/ 145 | ENV/ 146 | 147 | # Spyder project settings 148 | .spyderproject 149 | .spyproject 150 | 151 | # Rope project settings 152 | .ropeproject 153 | 154 | # mkdocs documentation 155 | /site 156 | 157 | # mypy 158 | .mypy_cache/ 159 | 160 | 161 | # https://github.com/github/gitignore/blob/master/Global/macOS.gitignore ----------------------------------------------- 162 | 163 | # General 164 | .DS_Store 165 | .AppleDouble 166 | .LSOverride 167 | 168 | # Icon must end with two \r 169 | Icon 170 | Icon? 
171 | 172 | # Thumbnails 173 | ._* 174 | 175 | # Files that might appear in the root of a volume 176 | .DocumentRevisions-V100 177 | .fseventsd 178 | .Spotlight-V100 179 | .TemporaryItems 180 | .Trashes 181 | .VolumeIcon.icns 182 | .com.apple.timemachine.donotpresent 183 | 184 | # Directories potentially created on remote AFP share 185 | .AppleDB 186 | .AppleDesktop 187 | Network Trash Folder 188 | Temporary Items 189 | .apdisk 190 | 191 | 192 | # https://github.com/github/gitignore/blob/master/Global/JetBrains.gitignore 193 | # Covers JetBrains IDEs: IntelliJ, RubyMine, PhpStorm, AppCode, PyCharm, CLion, Android Studio and WebStorm 194 | # Reference: https://intellij-support.jetbrains.com/hc/en-us/articles/206544839 195 | 196 | # User-specific stuff: 197 | .idea/* 198 | .idea/**/workspace.xml 199 | .idea/**/tasks.xml 200 | .idea/dictionaries 201 | .html # Bokeh Plots 202 | .pg # TensorFlow Frozen Graphs 203 | .avi # videos 204 | 205 | # Sensitive or high-churn files: 206 | .idea/**/dataSources/ 207 | .idea/**/dataSources.ids 208 | .idea/**/dataSources.local.xml 209 | .idea/**/sqlDataSources.xml 210 | .idea/**/dynamic.xml 211 | .idea/**/uiDesigner.xml 212 | 213 | # Gradle: 214 | .idea/**/gradle.xml 215 | .idea/**/libraries 216 | 217 | # CMake 218 | cmake-build-debug/ 219 | cmake-build-release/ 220 | 221 | # Mongo Explorer plugin: 222 | .idea/**/mongoSettings.xml 223 | 224 | ## File-based project format: 225 | *.iws 226 | 227 | ## Plugin-specific files: 228 | 229 | # IntelliJ 230 | out/ 231 | 232 | # mpeltonen/sbt-idea plugin 233 | .idea_modules/ 234 | 235 | # JIRA plugin 236 | atlassian-ide-plugin.xml 237 | 238 | # Cursive Clojure plugin 239 | .idea/replstate.xml 240 | 241 | # Crashlytics plugin (for Android Studio and IntelliJ) 242 | com_crashlytics_export_strings.xml 243 | crashlytics.properties 244 | crashlytics-build.properties 245 | fabric.properties 246 | -------------------------------------------------------------------------------- 
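The yolov5 `.gitignore` above combines broad ignore globs (e.g. `runs/*`, `data/*`) with `!`-prefixed negations (`!data/*.sh`, `!cfg/yolov3*.cfg`) to re-include specific files. The interaction of these patterns can be verified with `git check-ignore`; the sketch below uses made-up file names purely for illustration:

```shell
#!/bin/sh
# Sketch: verify gitignore negation patterns with `git check-ignore`.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
# Same style as the repo's .gitignore: ignore everything under data/,
# then re-include shell scripts.
printf 'data/*\n!data/*.sh\n' > .gitignore
mkdir data
touch data/sample.jpg data/get_voc.sh
git check-ignore -q data/sample.jpg && echo "data/sample.jpg is ignored"
git check-ignore -q data/get_voc.sh || echo "data/get_voc.sh is kept"
```

Note that the re-inclusion only works because the broad pattern is `data/*` (the files inside the directory) rather than `data/`: git will not re-include a file whose parent directory is itself excluded.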
/yolov5/data/get_coco2017.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # COCO 2017 dataset http://cocodataset.org 3 | # Download command: bash yolov5/data/get_coco2017.sh 4 | # Train command: python train.py --data coco.yaml 5 | # Default dataset location is next to /yolov5: 6 | # /parent_folder 7 | # /coco 8 | # /yolov5 9 | 10 | 11 | # Download labels from Google Drive, accepting presented query 12 | filename="coco2017labels.zip" 13 | fileid="1cXZR_ckHki6nddOmcysCuuJFM--T-Q6L" 14 | curl -c ./cookie -s -L "https://drive.google.com/uc?export=download&id=${fileid}" > /dev/null 15 | curl -Lb ./cookie "https://drive.google.com/uc?export=download&confirm=`awk '/download/ {print $NF}' ./cookie`&id=${fileid}" -o $filename 16 | rm ./cookie 17 | 18 | # Unzip labels 19 | unzip -q $filename # for coco.zip 20 | # tar -xzf $filename # for coco.tar.gz 21 | rm $filename 22 | 23 | # Download and unzip images 24 | cd coco/images 25 | f="train2017.zip" && curl http://images.cocodataset.org/zips/$f -o $f && unzip -q $f && rm $f # 19G, 118k images 26 | f="val2017.zip" && curl http://images.cocodataset.org/zips/$f -o $f && unzip -q $f && rm $f # 1G, 5k images 27 | # f="test2017.zip" && curl http://images.cocodataset.org/zips/$f -o $f && unzip -q $f && rm $f # 7G, 41k images 28 | 29 | # cd out 30 | cd ../.. 31 | -------------------------------------------------------------------------------- /yolov5/data/get_voc.sh: -------------------------------------------------------------------------------- 1 | # PASCAL VOC dataset http://host.robots.ox.ac.uk/pascal/VOC/ 2 | # Download command: bash ./data/get_voc.sh 3 | # Train command: python train.py --data voc.yaml 4 | # Default dataset location is next to /yolov5: 5 | # /parent_folder 6 | # /VOC 7 | # /yolov5 8 | 9 | 10 | start=`date +%s` 11 | 12 | # handle optional download dir 13 | if [ -z "$1" ] 14 | then 15 | # navigate to ~/tmp 16 | echo "navigating to ../tmp/ ..."
17 | mkdir -p ../tmp 18 | cd ../tmp/ 19 | else 20 | # check if is valid directory 21 | if [ ! -d $1 ]; then 22 | echo $1 "is not a valid directory" 23 | exit 1 24 | fi 25 | echo "navigating to" $1 "..." 26 | cd $1 27 | fi 28 | 29 | echo "Downloading VOC2007 trainval ..." 30 | # Download the data. 31 | curl -LO http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar 32 | echo "Downloading VOC2007 test data ..." 33 | curl -LO http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar 34 | echo "Done downloading." 35 | 36 | # Extract data 37 | echo "Extracting trainval ..." 38 | tar -xf VOCtrainval_06-Nov-2007.tar 39 | echo "Extracting test ..." 40 | tar -xf VOCtest_06-Nov-2007.tar 41 | echo "removing tars ..." 42 | rm VOCtrainval_06-Nov-2007.tar 43 | rm VOCtest_06-Nov-2007.tar 44 | 45 | end=`date +%s` 46 | runtime=$((end-start)) 47 | 48 | echo "Completed in" $runtime "seconds" 49 | 50 | start=`date +%s` 51 | 52 | # handle optional download dir 53 | if [ -z "$1" ] 54 | then 55 | # navigate to ~/tmp 56 | echo "navigating to ../tmp/ ..." 57 | mkdir -p ../tmp 58 | cd ../tmp/ 59 | else 60 | # check if is valid directory 61 | if [ ! -d $1 ]; then 62 | echo $1 "is not a valid directory" 63 | exit 1 64 | fi 65 | echo "navigating to" $1 "..." 66 | cd $1 67 | fi 68 | 69 | echo "Downloading VOC2012 trainval ..." 70 | # Download the data. 71 | curl -LO http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar 72 | echo "Done downloading." 73 | 74 | 75 | # Extract data 76 | echo "Extracting trainval ..." 77 | tar -xf VOCtrainval_11-May-2012.tar 78 | echo "removing tar ..." 79 | rm VOCtrainval_11-May-2012.tar 80 | 81 | end=`date +%s` 82 | runtime=$((end-start)) 83 | 84 | echo "Completed in" $runtime "seconds" 85 | 86 | cd ../tmp 87 | echo "Splitting dataset..." 88 | python3 - "$@" < train.txt 148 | cat 2007_train.txt 2007_val.txt 2007_test.txt 2012_train.txt 2012_val.txt > train.all.txt 150 | python3 - "$@" <