├── .gitignore ├── README.md ├── data └── cache │ ├── caltech │ ├── test │ ├── train_gt │ └── train_nogt │ ├── cityperson │ ├── train_h50 │ └── val_500 │ └── widerface │ ├── note.txt~ │ ├── test │ ├── train │ └── val ├── docs ├── face.jpg └── pipeline.png ├── download_weights.sh ├── eval_caltech ├── AS.mat ├── ResultsEval │ └── gt-new │ │ └── gt-Reasonable.mat ├── bbApply.m ├── bbGt.m ├── dbEval.m ├── dbInfo.m ├── extract_img_anno.m ├── getPrmDflt.m └── vbb.m ├── eval_city ├── cocoapi │ ├── .gitignore │ ├── LuaAPI │ │ ├── CocoApi.lua │ │ ├── MaskApi.lua │ │ ├── cocoDemo.lua │ │ ├── env.lua │ │ ├── init.lua │ │ └── rocks │ │ │ └── coco-scm-1.rockspec │ ├── MatlabAPI │ │ ├── CocoApi.m │ │ ├── CocoEval.m │ │ ├── CocoUtils.m │ │ ├── MaskApi.m │ │ ├── cocoDemo.m │ │ ├── evalDemo.m │ │ ├── gason.m │ │ └── private │ │ │ ├── gasonMex.cpp │ │ │ ├── gasonMex.mexa64 │ │ │ ├── gasonMex.mexmaci64 │ │ │ ├── getPrmDflt.m │ │ │ ├── maskApiMex.c │ │ │ ├── maskApiMex.mexa64 │ │ │ └── maskApiMex.mexmaci64 │ ├── PythonAPI │ │ ├── Makefile │ │ ├── pycocoDemo.ipynb │ │ ├── pycocoEvalDemo.ipynb │ │ ├── pycocotools │ │ │ ├── __init__.py │ │ │ ├── _mask.pyx │ │ │ ├── coco.py │ │ │ ├── cocoeval.py │ │ │ └── mask.py │ │ └── setup.py │ ├── README.txt │ ├── common │ │ ├── gason.cpp │ │ ├── gason.h │ │ ├── maskApi.c │ │ └── maskApi.h │ └── license.txt ├── dt_txt2json.m ├── eval_script │ ├── coco.py │ ├── eval_MR_multisetup.py │ ├── eval_demo.py │ └── readme.txt ├── readme.txt └── val_gt.json ├── generate_cache_caltech.py ├── generate_cache_city.py ├── generate_cache_wider.py ├── keras_csp ├── __init__.py ├── bbox_process.py ├── bbox_transform.py ├── config.py ├── data_augment.py ├── data_generators.py ├── keras_layer_L2Normalization.py ├── losses.py ├── mobilenet.py ├── nms │ ├── .gitignore │ ├── __init__.py │ ├── cpu_nms.pyx │ ├── gpu_nms.hpp │ ├── gpu_nms.pyx │ ├── nms_kernel.cu │ └── py_cpu_nms.py ├── nms_wrapper.py ├── parallel_model.py ├── resnet50.py └── utilsfunc.py ├── requirements.txt ├── setup.py ├── test_caltech.py ├── test_city.py ├── test_wider_ms.py ├── train_caltech.py ├── train_city.py └── train_wider.py /.gitignore: -------------------------------------------------------------------------------- 1 | .idea/ 2 | build/ 3 | data/ 4 | __pycache__ 5 | output 6 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # High-level Semantic Feature Detection: A New Perspective for Pedestrian Detection 2 | Keras implementation of [CSP], accepted by CVPR 2019. 3 | ## Introduction 4 | This paper provides a new perspective for pedestrian detection, where detection is formulated as Center and Scale Prediction (CSP); the pipeline is illustrated below. For more details, please refer to our [paper](http://openaccess.thecvf.com/content_CVPR_2019/papers/Liu_High-Level_Semantic_Feature_Detection_A_New_Perspective_for_Pedestrian_Detection_CVPR_2019_paper.pdf). 5 | ![img01](./docs/pipeline.png) 6 | 7 | Besides the superiority on pedestrian detection demonstrated in the paper, we take a step further towards the generality of CSP and validate it on face detection. Experimental results on the WiderFace benchmark also show the competitiveness of CSP.
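To make the formulation concrete, here is a minimal, hypothetical Keras sketch of a CSP-style prediction head: a shared 3x3 convolution over the backbone feature map feeding 1x1 convolutions that predict a center heatmap, a scale map, and an offset map (for faces, the scale branch predicts both height and width). The layer widths and names below are illustrative assumptions, not the exact architecture defined in [./keras_csp/resnet50.py](./keras_csp/resnet50.py) or [./keras_csp/mobilenet.py](./keras_csp/mobilenet.py):
```
# Illustrative CSP-style head; sizes and names are assumptions, not the repo's exact code.
from keras.layers import Input, Conv2D
from keras.models import Model

feat = Input(shape=(None, None, 256))  # backbone feature map (e.g. stride 4)
x = Conv2D(256, (3, 3), padding='same', activation='relu')(feat)
center = Conv2D(1, (1, 1), activation='sigmoid', name='center')(x)  # center heatmap
scale = Conv2D(1, (1, 1), name='scale')(x)    # log-scale (height); 2 channels for h+w
offset = Conv2D(2, (1, 1), name='offset')(x)  # sub-stride x/y offsets
model = Model(inputs=feat, outputs=[center, scale, offset])
```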
8 | ![img02](./docs/face.jpg) 9 | 10 | 11 | ### Dependencies 12 | 13 | * Python >= 3.6 14 | * Tensorflow >= 1.1.3 15 | * Keras >= 2.0.6 16 | * OpenCV >= 3.4.1.15 (note that versions other than 3.4.1.15 will result in different performance on Caltech) 17 | 18 | ## Contents 19 | 1. [Installation](#installation) 20 | 2. [Preparation](#preparation) 21 | 3. [Training](#training) 22 | 4. [Test](#test) 23 | 5. [Evaluation](#evaluation) 24 | 6. [Models](#models) 25 | 26 | ### Installation 27 | 1. Get the code. We will refer to the cloned directory as `$CSP`. 28 | ``` 29 | git clone https://github.com/liuwei16/CSP.git 30 | ``` 31 | 2. Install the requirements. 32 | ``` 33 | pip install -r requirements.txt 34 | ``` 35 | 36 | 3. Build the dependencies. 37 | ``` 38 | python setup.py build_ext --inplace 39 | ``` 40 | 41 | 4. Download the pretrained resnet50 weights (basenet only): 42 | ``` 43 | ./download_weights.sh 44 | ``` 45 | 46 | 47 | ### Preparation 48 | 1. Download the dataset. 49 | 50 | For pedestrian detection, we train and test our model on [Caltech](http://www.vision.caltech.edu/Image_Datasets/CaltechPedestrians/) and [CityPersons](https://bitbucket.org/shanshanzhang/citypersons); please download the datasets first. By default, we assume the datasets are stored in `./data/`. 51 | 52 | 2. Dataset preparation. 53 | 54 | For Caltech, you can run [./eval_caltech/extract_img_anno.m](./eval_caltech/extract_img_anno.m) to extract the official seq files into images. Training and testing are based on the new annotations provided by [Shanshan2016CVPR](https://www.mpi-inf.mpg.de/departments/computer-vision-and-multimodal-computing/research/people-detection-pose-estimation-and-tracking/how-far-are-we-from-solving-pedestrian-detection/). We use the train_10x setting (42782 images) for training; the official test set has 4024 images. By default, we assume that images and annotations are stored in `./data/caltech`, and the directory structure is 55 | ``` 56 | *DATA_PATH 57 | *train_3 58 | *annotations_new 59 | *set00_V000_I00002.txt 60 | *... 61 | *images 62 | *set00_V000_I00002.jpg 63 | *... 64 | *test 65 | *annotations_new 66 | *set06_V000_I00029.jpg.txt 67 | *... 68 | *images 69 | *set06_V000_I00029.jpg 70 | *... 71 | ``` 72 | For CityPersons, we use the training set (2975 images) for training and test on the validation set (500 images). We assume that images and annotations are stored in `./data/citypersons`, and the directory structure is 73 | ``` 74 | *DATA_PATH 75 | *annotations 76 | *anno_train.mat 77 | *anno_val.mat 78 | *images 79 | *train 80 | *val 81 | ``` 82 | We have provided the cache files of the training and validation subsets. Optionally, you can also run [./generate_cache_caltech.py](./generate_cache_caltech.py) and [./generate_cache_city.py](./generate_cache_city.py) to create the cache files for training and validation. By default, we assume the cache files are stored in `./data/cache/` (a quick way to inspect them is sketched at the end of this section). For Caltech, we split the training set into images with and without any pedestrian instance, resulting in 9105 and 33767 images, respectively. 83 | 84 | 3. Download the initialized models. 85 | 86 | We use the backbones [ResNet-50](https://github.com/fchollet/deep-learning-models/releases/download/v0.2/resnet50_weights_tf_dim_ordering_tf_kernels.h5) and [MobileNet_v1](https://github.com/fchollet/deep-learning-models/releases/download/v0.6/) in our experiments. By default, we assume the weight files are stored in `./data/models/`.
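As a quick sanity check after preparation, you can inspect a cache file directly. The sketch below assumes, as the `generate_cache_*.py` scripts suggest, that each cache file is a pickled list of per-image dicts; the key names `filepath` and `bboxes` are assumptions to verify against those scripts:
```
# Hedged example: inspect a pickled annotation cache (assumed format).
import pickle

with open('data/cache/cityperson/train_h50', 'rb') as f:
    records = pickle.load(f)  # if the cache was written by Python 2, add encoding='latin1'

print('%d images' % len(records))
first = records[0]  # assumed: dict with 'filepath', 'bboxes', ... entries
print(first['filepath'], 'with', len(first['bboxes']), 'boxes')
```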
87 | 88 | ### Training 89 | If needed, first set the training parameters in [./keras_csp/config.py](./keras_csp/config.py). 90 | 1. Train on Caltech. 91 | 92 | Run [./train_caltech.py](./train_caltech.py) to start training. The weight files of all epochs will be saved in `./output/valmodels/caltech`. 93 | 94 | 2. Train on CityPersons. 95 | 96 | Run [./train_city.py](./train_city.py) to start training. The weight files of all epochs will be saved in `./output/valmodels/city`. 97 | 98 | ### Test 99 | 1. Caltech. 100 | 101 | Run [./test_caltech.py](./test_caltech.py) to get the detection results. You can test from epoch 51, and the results will be saved in `./output/valresults/caltech`. 102 | 103 | 2. CityPersons. 104 | 105 | Run [./test_city.py](./test_city.py) to get the detection results. You can test from epoch 51, and the results will be saved in `./output/valresults/city`. 106 | 107 | ### Evaluation 108 | 1. Caltech. 109 | 110 | Run [./eval_caltech/dbEval.m](./eval_caltech/dbEval.m) to get the miss rates of the detections in `pth.resDir`, which is defined in line 25. The evaluation results will be saved as `eval-newReasonable.txt` in `./eval_caltech/ResultsEval`. 111 | 112 | 2. CityPersons. 113 | 114 | (1) Run [./eval_city/dt_txt2json.m](./eval_city/dt_txt2json.m) to convert the `.txt` files to `.json`. The specific `main_path` is defined in line 3. 115 | 116 | (2) Run [./eval_city/eval_script/eval_demo.py](./eval_city/eval_script/eval_demo.py) to get the miss rates of the detections in `main_path`, which is defined in line 9. 117 | 118 | ### Models 119 | To reproduce the results in our paper, we have provided the models trained on the different datasets. You can download them through [BaiduYun](https://pan.baidu.com/s/1SSPQnbDP6zf9xf8eCDi3Fw) (Code: jcgd). For Caltech, please make sure that the OpenCV version is 3.4.1.15; other versions will read the same image into different data values, resulting in slightly different performance. 120 | 1. For Caltech 121 | 122 | ResNet-50 initialized from ImageNet: 123 | 124 | Height prediction: [model_CSP/caltech/fromimgnet/h/nooffset](https://pan.baidu.com/s/1SSPQnbDP6zf9xf8eCDi3Fw) 125 | 126 | Height+Offset prediction: [model_CSP/caltech/fromimgnet/h/withoffset](https://pan.baidu.com/s/1SSPQnbDP6zf9xf8eCDi3Fw) 127 | 128 | Height+Width prediction: [model_CSP/caltech/fromimgnet/h+w/](https://pan.baidu.com/s/1SSPQnbDP6zf9xf8eCDi3Fw) 129 | 130 | ResNet-50 initialized from CityPersons: 131 | 132 | Height+Offset prediction: [model_CSP/caltech/fromcity/](https://pan.baidu.com/s/1SSPQnbDP6zf9xf8eCDi3Fw) 133 | 134 | 2. For CityPersons: 135 | 136 | Height prediction: [model_CSP/cityperson/nooffset](https://pan.baidu.com/s/1SSPQnbDP6zf9xf8eCDi3Fw) 137 | 138 | Height+Offset prediction: [model_CSP/cityperson/withoffset](https://pan.baidu.com/s/1SSPQnbDP6zf9xf8eCDi3Fw) 139 | 140 | On top of this codebase, we also ran 10 trials of Height+Offset prediction. Generally, models converge after epoch 50. For Caltech and CityPersons, we test the results from epoch 50 to 120 and from epoch 50 to 150, respectively, and report the best result (*MR* under the Reasonable setting) in the following table.
141 | 142 | | Trial | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 143 | |:-----:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:--:| 144 | |Caltech| 4.98 | 4.75 | 4.57 | 4.84 | 4.72 | 4.15 | 5.17 | 4.60 | 4.63 | 4.91 | 145 | |CityPersons| 11.31 | 11.17 | 11.42 | 11.69 | 11.56 | 11.05 | 11.59 | 11.78 | 11.27 | 10.62 | 146 | 147 | 148 | ### Extension--Face Detection 149 | 1. Data preparation 150 | 151 | First download the [WiderFace](http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/) dataset and put it in `./data/WiderFace`. We have provided the cache files in `./data/cache/widerface`, or you can run [./generate_cache_wider.py](./generate_cache_wider.py) to create them. 152 | 153 | 2. Training and Test 154 | 155 | For face detection, CSP is required to predict both the height and width of each instance, since faces come with various aspect ratios. Run [./train_wider.py](./train_wider.py) to start training and [./test_wider_ms.py](./test_wider_ms.py) for multi-scale testing. As a common practice, the model trained on the official training set is evaluated on both the validation and test sets, and the results are submitted to [WiderFace](http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/). To reproduce the result in the benchmark, we provide the model for Height+Width+Offset prediction in [model_CSP/widerface/](https://pan.baidu.com/s/1SSPQnbDP6zf9xf8eCDi3Fw). 156 | 157 | Note that we adopt data-augmentation and multi-scale testing strategies similar to those in [PyramidBox](https://arxiv.org/pdf/1803.07737.pdf), which helps us achieve better performance on this benchmark. 158 | 159 | ## Citation 160 | If you find our work useful in your research, please consider citing: 161 | ``` 162 | @inproceedings{liu2018high, 163 | title={High-level Semantic Feature Detection: A New Perspective for Pedestrian Detection}, 164 | author={Liu, Wei and Liao, Shengcai and Ren, Weiqiang and Hu, Weidong and Yu, Yinan}, 165 | booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)}, 166 | year={2019} 167 | } 168 | ``` 169 | 170 | 171 | 172 | -------------------------------------------------------------------------------- /data/cache/caltech/test: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dominikandreas/CSP/e4a5a3770609896d3ab4b164c434ce80746225e6/data/cache/caltech/test -------------------------------------------------------------------------------- /data/cache/caltech/train_gt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dominikandreas/CSP/e4a5a3770609896d3ab4b164c434ce80746225e6/data/cache/caltech/train_gt -------------------------------------------------------------------------------- /data/cache/caltech/train_nogt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dominikandreas/CSP/e4a5a3770609896d3ab4b164c434ce80746225e6/data/cache/caltech/train_nogt -------------------------------------------------------------------------------- /data/cache/cityperson/train_h50: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dominikandreas/CSP/e4a5a3770609896d3ab4b164c434ce80746225e6/data/cache/cityperson/train_h50 -------------------------------------------------------------------------------- /data/cache/cityperson/val_500: --------------------------------------------------------------------------------
https://raw.githubusercontent.com/dominikandreas/CSP/e4a5a3770609896d3ab4b164c434ce80746225e6/data/cache/cityperson/val_500 -------------------------------------------------------------------------------- /data/cache/widerface/note.txt~: -------------------------------------------------------------------------------- 1 | filter w>=5 and h>=5 2 | train: 3 | 12880 images and 12880 valid images and 159424 boxes 4 | 12880 images and 12876 valid images and 154301 boxes 5 | 6 | val: 7 | 3226 images and 3226 valid images and 39708 boxes 8 | 9 | test: 10 | 16097 images 11 | 12 | 13 | 14 | 15 | -------------------------------------------------------------------------------- /data/cache/widerface/test: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dominikandreas/CSP/e4a5a3770609896d3ab4b164c434ce80746225e6/data/cache/widerface/test -------------------------------------------------------------------------------- /data/cache/widerface/train: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dominikandreas/CSP/e4a5a3770609896d3ab4b164c434ce80746225e6/data/cache/widerface/train -------------------------------------------------------------------------------- /data/cache/widerface/val: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dominikandreas/CSP/e4a5a3770609896d3ab4b164c434ce80746225e6/data/cache/widerface/val -------------------------------------------------------------------------------- /docs/face.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dominikandreas/CSP/e4a5a3770609896d3ab4b164c434ce80746225e6/docs/face.jpg -------------------------------------------------------------------------------- /docs/pipeline.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dominikandreas/CSP/e4a5a3770609896d3ab4b164c434ce80746225e6/docs/pipeline.png -------------------------------------------------------------------------------- /download_weights.sh: -------------------------------------------------------------------------------- 1 | mkdir -p data/models 2 | wget -O ./data/models/resnet50_weights_tf_dim_ordering_tf_kernels.h5 https://github.com/fchollet/deep-learning-models/releases/download/v0.2/resnet50_weights_tf_dim_ordering_tf_kernels.h5 --no-check-certificate 3 | -------------------------------------------------------------------------------- /eval_caltech/AS.mat: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dominikandreas/CSP/e4a5a3770609896d3ab4b164c434ce80746225e6/eval_caltech/AS.mat -------------------------------------------------------------------------------- /eval_caltech/ResultsEval/gt-new/gt-Reasonable.mat: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dominikandreas/CSP/e4a5a3770609896d3ab4b164c434ce80746225e6/eval_caltech/ResultsEval/gt-new/gt-Reasonable.mat -------------------------------------------------------------------------------- /eval_caltech/bbGt.m: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dominikandreas/CSP/e4a5a3770609896d3ab4b164c434ce80746225e6/eval_caltech/bbGt.m 
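The `note.txt~` file above records the box filter used when building the WiderFace cache: boxes narrower or shorter than 5 pixels are dropped, which appears to be why the training set goes from 12880 valid images and 159424 boxes to 12876 valid images and 154301 boxes. A minimal, hypothetical sketch of such a filter (the `(x, y, w, h)` box format here is an assumption, not taken from the repo):
```
# Hypothetical sketch of the "filter w>=5 and h>=5" rule described in note.txt~.
def filter_small_boxes(boxes, min_w=5, min_h=5):
    """Keep only boxes at least min_w x min_h; boxes are (x, y, w, h) tuples."""
    return [b for b in boxes if b[2] >= min_w and b[3] >= min_h]

anno = {'img_a.jpg': [(10, 10, 40, 60), (3, 3, 4, 4)],
        'img_b.jpg': [(0, 0, 2, 2)]}
kept = {name: filter_small_boxes(bs) for name, bs in anno.items()}
valid = [name for name, bs in kept.items() if bs]  # images that still have boxes
print(len(valid), 'valid images,', sum(map(len, kept.values())), 'boxes')
```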
-------------------------------------------------------------------------------- /eval_caltech/dbInfo.m: -------------------------------------------------------------------------------- 1 | function [pth,setIds,vidIds,skip,ext] = dbInfo( name1 ) 2 | % Specifies data amount and location. 3 | % 4 | % 'name' specifies the name of the dataset. Valid options include: 'Usa', 5 | % 'UsaTest', 'UsaTrain', 'InriaTrain', 'InriaTest', 'Japan', 'TudBrussels', 6 | % 'ETH', and 'Daimler'. If dbInfo() is called without specifying the 7 | % dataset name, defaults to the last used name (or 'UsaTest' on first call 8 | % to dbInfo()). Finally, one can specify a subset of a dataset by appending 9 | % digits to the end of the name (e.g. 'UsaTest01' indicates the first set of 10 | % 'UsaTest' and 'UsaTest01005' indicates the first set, fifth video). 11 | % 12 | % USAGE 13 | % [pth,setIds,vidIds,skip,ext] = dbInfo( [name] ) 14 | % 15 | % INPUTS 16 | % name - ['UsaTest'] specify dataset, caches last passed in name 17 | % 18 | % OUTPUTS 19 | % pth - directory containing database 20 | % setIds - integer ids of each set 21 | % vidIds - [1xnSets] cell of vectors of integer ids of each video 22 | % skip - specify subset of frames to use for evaluation 23 | % ext - file extension determining image format ('jpg' or 'png') 24 | % 25 | % EXAMPLE 26 | % [pth,setIds,vidIds,skip,ext] = dbInfo 27 | % 28 | % See also 29 | % 30 | % Caltech Pedestrian Dataset Version 3.2.1 31 | % Copyright 2014 Piotr Dollar. [pdollar-at-gmail.com] 32 | % Licensed under the Simplified BSD License [see external/bsd.txt] 33 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 34 | %modification version: 35 | % 2017.01.18:the default of skip is set to be 3 instead of 30 36 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 37 | 38 | persistent name; % cache last used name 39 | if(nargin && ~isempty(name1)), name=lower(name1); else 40 | if(isempty(name)), name='usa'; end; end; name1=name; 41 | 42 | vidId=str2double(name1(end-2:end)); % check if name ends in 3 ints 43 | if(isnan(vidId)), vidId=[]; else name1=name1(1:end-3); end 44 | setId=str2double(name1(end-1:end)); % check if name ends in 2 ints 45 | if(isnan(setId)), setId=[]; else name1=name1(1:end-2); end 46 | 47 | switch name1 48 | case 'usa' % Caltech Pedestrian Datasets (all) 49 | setIds=0:10; subdir='USA'; skip=30; ext='jpg'; 50 | vidIds={0:14 0:5 0:11 0:12 0:11 0:12 0:18 0:11 0:10 0:11 0:11}; 51 | case 'usatrain' % Caltech Pedestrian Datasets (training) 52 | setIds=0:5; subdir='USA'; skip=30; ext='jpg'; 53 | vidIds={0:14 0:5 0:11 0:12 0:11 0:12}; 54 | case 'usatest' % Caltech Pedestrian Datasets (testing) 55 | setIds=6:10; subdir='USA'; skip=30; ext='jpg'; 56 | vidIds={0:18 0:11 0:10 0:11 0:11}; 57 | case 'kaist' % KAIST dataset (testing) 58 | setIds=6:11; subdir='kaist'; skip=20; ext='jpg'; 59 | vidIds={0:4 0:2 0:2 0 0:1 0:1}; 60 | case 'inriatrain' % INRIA peds (training) 61 | setIds=0; subdir='INRIA'; skip=1; ext='png'; vidIds={0:1}; 62 | case 'inriatest' % INRIA peds (testing) 63 | setIds=1; subdir='INRIA'; skip=1; ext='png'; vidIds={0}; 64 | case 'japan' % Caltech Japan data (not publicly available) 65 | setIds=0:12; subdir='Japan'; skip=30; ext='jpg'; 66 | vidIds={0:5 0:5 0:3 0:5 0:5 0:5 0:5 0:5 0:4 0:4 0:5 0:5 0:4}; 67 | case 'tudbrussels' % TUD-Brussels dataset 68 | setIds=0; subdir='TudBrussels'; skip=1; ext='png'; vidIds={0}; 69 | case 'eth' % ETH dataset 70 | setIds=0:2; subdir='ETH'; skip=1; ext='png'; vidIds={0 0 0}; 71 | case 'daimler' % Daimler dataset 72 | setIds=0; subdir='Daimler'; skip=1;
ext='png'; vidIds={0}; 73 | case 'pietro' 74 | setIds=0; subdir='Pietro'; skip=1; ext='jpg'; vidIds={0}; 75 | otherwise, error('unknown data type: %s',name); 76 | end 77 | 78 | % optionally select only specific set/vid if name ended in ints 79 | if(~isempty(setId)), setIds=setIds(setId); vidIds=vidIds(setId); end 80 | if(~isempty(vidId)), vidIds={vidIds{1}(vidId)}; end 81 | 82 | % actual directory where data is contained 83 | pth=fileparts(mfilename('fullpath')); 84 | pth=[pth filesep 'data-' subdir]; 85 | 86 | end 87 | -------------------------------------------------------------------------------- /eval_caltech/extract_img_anno.m: -------------------------------------------------------------------------------- 1 | % extract_img_anno() 2 | % -------------------------------------------------------- 3 | % RPN_BF 4 | % Copyright (c) 2016, Liliang Zhang 5 | % Licensed under The MIT License [see LICENSE for details] 6 | % -------------------------------------------------------- 7 | 8 | dataDir='./datasets/caltech/'; 9 | addpath(genpath('./code3.2.1')); 10 | addpath(genpath('./toolbox')); 11 | for s=1:2 12 | if(s==1), type='test'; skip=[];continue; else type='train'; skip=3; end 13 | dbInfo(['Usa' type]); 14 | if(exist([dataDir type '/annotations'],'dir')), continue; end 15 | dbExtract([dataDir type],1,skip); 16 | end 17 | 18 | -------------------------------------------------------------------------------- /eval_caltech/getPrmDflt.m: -------------------------------------------------------------------------------- 1 | function varargout = getPrmDflt( prm, dfs, checkExtra ) 2 | % Helper to set default values (if not already set) of parameter struct. 3 | % 4 | % Takes input parameters and a list of 'name'/default pairs, and for each 5 | % 'name' for which prm has no value (prm.(name) is not a field or 'name' 6 | % does not appear in prm list), getPrmDflt assigns the given default 7 | % value. If default value for variable 'name' is 'REQ', and value for 8 | % 'name' is not given, an error is thrown. See below for usage details. 9 | % 10 | % USAGE (nargout==1) 11 | % prm = getPrmDflt( prm, dfs, [checkExtra] ) 12 | % 13 | % USAGE (nargout>1) 14 | % [ param1 ... paramN ] = getPrmDflt( prm, dfs, [checkExtra] ) 15 | % 16 | % INPUTS 17 | % prm - param struct or cell of form {'name1' v1 'name2' v2 ...} 18 | % dfs - cell of form {'name1' def1 'name2' def2 ...} 19 | % checkExtra - [0] if 1 throw error if prm contains params not in dfs 20 | % if -1 if prm contains params not in dfs adds them 21 | % 22 | % OUTPUTS (nargout==1) 23 | % prm - parameter struct with fields 'name1' through 'nameN' assigned 24 | % 25 | % OUTPUTS (nargout>1) 26 | % param1 - value assigned to parameter with 'name1' 27 | % ... 28 | % paramN - value assigned to parameter with 'nameN' 29 | % 30 | % EXAMPLE 31 | % dfs = { 'x','REQ', 'y',0, 'z',[], 'eps',1e-3 }; 32 | % prm = getPrmDflt( struct('x',1,'y',1), dfs ) 33 | % [ x y z eps ] = getPrmDflt( {'x',2,'y',1}, dfs ) 34 | % 35 | % See also INPUTPARSER 36 | % 37 | % Piotr's Computer Vision Matlab Toolbox Version 2.60 38 | % Copyright 2014 Piotr Dollar. 
[pdollar-at-gmail.com] 39 | % Licensed under the Simplified BSD License [see external/bsd.txt] 40 | 41 | if( mod(length(dfs),2) ), error('odd number of default parameters'); end 42 | if nargin<=2, checkExtra = 0; end 43 | 44 | % get the input parameters as two cell arrays: prmVal and prmField 45 | if iscell(prm) && length(prm)==1, prm=prm{1}; end 46 | if iscell(prm) 47 | if(mod(length(prm),2)), error('odd number of parameters in prm'); end 48 | prmField = prm(1:2:end); prmVal = prm(2:2:end); 49 | else 50 | if(~isstruct(prm)), error('prm must be a struct or a cell'); end 51 | prmVal = struct2cell(prm); prmField = fieldnames(prm); 52 | end 53 | 54 | % get and update default values using quick for loop 55 | dfsField = dfs(1:2:end); dfsVal = dfs(2:2:end); 56 | if checkExtra>0 57 | for i=1:length(prmField) 58 | j = find(strcmp(prmField{i},dfsField)); 59 | if isempty(j), error('parameter %s is not valid', prmField{i}); end 60 | dfsVal(j) = prmVal(i); 61 | end 62 | elseif checkExtra<0 63 | for i=1:length(prmField) 64 | j = find(strcmp(prmField{i},dfsField)); 65 | if isempty(j), j=length(dfsVal)+1; dfsField{j}=prmField{i}; end 66 | dfsVal(j) = prmVal(i); 67 | end 68 | else 69 | for i=1:length(prmField) 70 | dfsVal(strcmp(prmField{i},dfsField)) = prmVal(i); 71 | end 72 | end 73 | 74 | % check for missing values 75 | if any(strcmp('REQ',dfsVal)) 76 | cmpArray = find(strcmp('REQ',dfsVal)); 77 | error(['Required field ''' dfsField{cmpArray(1)} ''' not specified.'] ); 78 | end 79 | 80 | % set output 81 | if nargout==1 82 | varargout{1} = cell2struct( dfsVal, dfsField, 2 ); 83 | else 84 | varargout = dfsVal; 85 | end 86 | -------------------------------------------------------------------------------- /eval_city/cocoapi/.gitignore: -------------------------------------------------------------------------------- 1 | images/ 2 | annotations/ 3 | results/ 4 | external/ 5 | .DS_Store 6 | 7 | MatlabAPI/analyze*/ 8 | MatlabAPI/visualize*/ 9 | 10 | PythonAPI/pycocotools/__init__.pyc 11 | PythonAPI/pycocotools/_mask.c 12 | PythonAPI/pycocotools/_mask.so 13 | PythonAPI/pycocotools/coco.pyc 14 | PythonAPI/pycocotools/cocoeval.pyc 15 | PythonAPI/pycocotools/mask.pyc 16 | -------------------------------------------------------------------------------- /eval_city/cocoapi/LuaAPI/MaskApi.lua: -------------------------------------------------------------------------------- 1 | --[[---------------------------------------------------------------------------- 2 | 3 | Interface for manipulating masks stored in RLE format. 4 | 5 | For an overview of RLE please see http://mscoco.org/dataset/#download. 6 | Additionally, more detailed information can be found in the Matlab MaskApi.m: 7 | https://github.com/pdollar/coco/blob/master/MatlabAPI/MaskApi.m 8 | 9 | The following API functions are defined: 10 | encode - Encode binary masks using RLE. 11 | decode - Decode binary masks encoded via RLE. 12 | merge - Compute union or intersection of encoded masks. 13 | iou - Compute intersection over union between masks. 14 | nms - Compute non-maximum suppression between ordered masks. 15 | area - Compute area of encoded masks. 16 | toBbox - Get bounding boxes surrounding encoded masks. 17 | frBbox - Convert bounding boxes to encoded masks. 18 | frPoly - Convert polygon to encoded mask. 19 | drawCirc - Draw circle into image (alters input). 20 | drawLine - Draw line into image (alters input). 21 | drawMasks - Draw masks into image (alters input). 
22 | 23 | Usage: 24 | Rs = MaskApi.encode( masks ) 25 | masks = MaskApi.decode( Rs ) 26 | R = MaskApi.merge( Rs, [intersect=false] ) 27 | o = MaskApi.iou( dt, gt, [iscrowd=false] ) 28 | keep = MaskApi.nms( dt, thr ) 29 | a = MaskApi.area( Rs ) 30 | bbs = MaskApi.toBbox( Rs ) 31 | Rs = MaskApi.frBbox( bbs, h, w ) 32 | R = MaskApi.frPoly( poly, h, w ) 33 | MaskApi.drawCirc( img, x, y, rad, clr ) 34 | MaskApi.drawLine( img, x0, y0, x1, y1, rad, clr ) 35 | MaskApi.drawMasks( img, masks, [maxn=n], [alpha=.4], [clrs] ) 36 | For detailed usage information please see cocoDemo.lua. 37 | 38 | In the API the following formats are used: 39 | R,Rs - [table] Run-length encoding of binary mask(s) 40 | masks - [nxhxw] Binary mask(s) 41 | bbs - [nx4] Bounding box(es) stored as [x y w h] 42 | poly - Polygon stored as {[x1 y1 x2 y2...],[x1 y1 ...],...} 43 | dt,gt - May be either bounding boxes or encoded masks 44 | Both poly and bbs are 0-indexed (bbox=[0 0 1 1] encloses first pixel). 45 | 46 | Common Objects in COntext (COCO) Toolbox. version 3.0 47 | Data, paper, and tutorials available at: http://mscoco.org/ 48 | Code written by Pedro O. Pinheiro and Piotr Dollar, 2016. 49 | Licensed under the Simplified BSD License [see coco/license.txt] 50 | 51 | ------------------------------------------------------------------------------]] 52 | 53 | local ffi = require 'ffi' 54 | local coco = require 'coco.env' 55 | 56 | coco.MaskApi = {} 57 | local MaskApi = coco.MaskApi 58 | 59 | coco.libmaskapi = ffi.load(package.searchpath('libmaskapi',package.cpath)) 60 | local libmaskapi = coco.libmaskapi 61 | 62 | -------------------------------------------------------------------------------- 63 | 64 | MaskApi.encode = function( masks ) 65 | local n, h, w = masks:size(1), masks:size(2), masks:size(3) 66 | masks = masks:type('torch.ByteTensor'):transpose(2,3) 67 | local data = masks:contiguous():data() 68 | local Qs = MaskApi._rlesInit(n) 69 | libmaskapi.rleEncode(Qs[0],data,h,w,n) 70 | return MaskApi._rlesToLua(Qs,n) 71 | end 72 | 73 | MaskApi.decode = function( Rs ) 74 | local Qs, n, h, w = MaskApi._rlesFrLua(Rs) 75 | local masks = torch.ByteTensor(n,w,h):zero():contiguous() 76 | libmaskapi.rleDecode(Qs,masks:data(),n) 77 | MaskApi._rlesFree(Qs,n) 78 | return masks:transpose(2,3) 79 | end 80 | 81 | MaskApi.merge = function( Rs, intersect ) 82 | intersect = intersect or 0 83 | local Qs, n, h, w = MaskApi._rlesFrLua(Rs) 84 | local Q = MaskApi._rlesInit(1) 85 | libmaskapi.rleMerge(Qs,Q,n,intersect) 86 | MaskApi._rlesFree(Qs,n) 87 | return MaskApi._rlesToLua(Q,1)[1] 88 | end 89 | 90 | MaskApi.iou = function( dt, gt, iscrowd ) 91 | if not iscrowd then iscrowd = NULL else 92 | iscrowd = iscrowd:type('torch.ByteTensor'):contiguous():data() 93 | end 94 | if torch.isTensor(gt) and torch.isTensor(dt) then 95 | local nDt, k = dt:size(1), dt:size(2); assert(k==4) 96 | local nGt, k = gt:size(1), gt:size(2); assert(k==4) 97 | local dDt = dt:type('torch.DoubleTensor'):contiguous():data() 98 | local dGt = gt:type('torch.DoubleTensor'):contiguous():data() 99 | local o = torch.DoubleTensor(nGt,nDt):contiguous() 100 | libmaskapi.bbIou(dDt,dGt,nDt,nGt,iscrowd,o:data()) 101 | return o:transpose(1,2) 102 | else 103 | local qDt, nDt = MaskApi._rlesFrLua(dt) 104 | local qGt, nGt = MaskApi._rlesFrLua(gt) 105 | local o = torch.DoubleTensor(nGt,nDt):contiguous() 106 | libmaskapi.rleIou(qDt,qGt,nDt,nGt,iscrowd,o:data()) 107 | MaskApi._rlesFree(qDt,nDt); MaskApi._rlesFree(qGt,nGt) 108 | return o:transpose(1,2) 109 | end 110 | end 111 | 112 | MaskApi.nms 
= function( dt, thr ) 113 | if torch.isTensor(dt) then 114 | local n, k = dt:size(1), dt:size(2); assert(k==4) 115 | local Q = dt:type('torch.DoubleTensor'):contiguous():data() 116 | local kp = torch.IntTensor(n):contiguous() 117 | libmaskapi.bbNms(Q,n,kp:data(),thr) 118 | return kp 119 | else 120 | local Q, n = MaskApi._rlesFrLua(dt) 121 | local kp = torch.IntTensor(n):contiguous() 122 | libmaskapi.rleNms(Q,n,kp:data(),thr) 123 | MaskApi._rlesFree(Q,n) 124 | return kp 125 | end 126 | end 127 | 128 | MaskApi.area = function( Rs ) 129 | local Qs, n, h, w = MaskApi._rlesFrLua(Rs) 130 | local a = torch.IntTensor(n):contiguous() 131 | libmaskapi.rleArea(Qs,n,a:data()) 132 | MaskApi._rlesFree(Qs,n) 133 | return a 134 | end 135 | 136 | MaskApi.toBbox = function( Rs ) 137 | local Qs, n, h, w = MaskApi._rlesFrLua(Rs) 138 | local bb = torch.DoubleTensor(n,4):contiguous() 139 | libmaskapi.rleToBbox(Qs,bb:data(),n) 140 | MaskApi._rlesFree(Qs,n) 141 | return bb 142 | end 143 | 144 | MaskApi.frBbox = function( bbs, h, w ) 145 | if bbs:dim()==1 then bbs=bbs:view(1,bbs:size(1)) end 146 | local n, k = bbs:size(1), bbs:size(2); assert(k==4) 147 | local data = bbs:type('torch.DoubleTensor'):contiguous():data() 148 | local Qs = MaskApi._rlesInit(n) 149 | libmaskapi.rleFrBbox(Qs[0],data,h,w,n) 150 | return MaskApi._rlesToLua(Qs,n) 151 | end 152 | 153 | MaskApi.frPoly = function( poly, h, w ) 154 | local n = #poly 155 | local Qs, Q = MaskApi._rlesInit(n), MaskApi._rlesInit(1) 156 | for i,p in pairs(poly) do 157 | local xy = p:type('torch.DoubleTensor'):contiguous():data() 158 | libmaskapi.rleFrPoly(Qs[i-1],xy,p:size(1)/2,h,w) 159 | end 160 | libmaskapi.rleMerge(Qs,Q[0],n,0) 161 | MaskApi._rlesFree(Qs,n) 162 | return MaskApi._rlesToLua(Q,1)[1] 163 | end 164 | 165 | -------------------------------------------------------------------------------- 166 | 167 | MaskApi.drawCirc = function( img, x, y, rad, clr ) 168 | assert(img:isContiguous() and img:dim()==3) 169 | local k, h, w, data = img:size(1), img:size(2), img:size(3), img:data() 170 | for dx=-rad,rad do for dy=-rad,rad do 171 | local xi, yi = torch.round(x+dx), torch.round(y+dy) 172 | if dx*dx+dy*dy<=rad*rad and xi>=0 and yi>=0 and xi<w and yi<h then 173 | for c=1,k do data[(c-1)*h*w+yi*w+xi]=clr[c] end 174 | end 175 | end end 176 | end -------------------------------------------------------------------------------- /eval_city/cocoapi/LuaAPI/rocks/coco-scm-1.rockspec: -------------------------------------------------------------------------------- 15 | dependencies = { 16 | "lua >= 5.1", 17 | "torch >= 7.0", 18 | "lua-cjson" 19 | } 20 | 21 | build = { 22 | type = "builtin", 23 | modules = { 24 | ["coco.env"] = "LuaAPI/env.lua", 25 | ["coco.init"] = "LuaAPI/init.lua", 26 | ["coco.MaskApi"] = "LuaAPI/MaskApi.lua", 27 | ["coco.CocoApi"] = "LuaAPI/CocoApi.lua", 28 | libmaskapi = { 29 | sources = { "common/maskApi.c" }, 30 | incdirs = { "common/" } 31 | } 32 | } 33 | } 34 | 35 | -- luarocks make LuaAPI/rocks/coco-scm-1.rockspec 36 | -- https://github.com/pdollar/coco/raw/master/LuaAPI/rocks/coco-scm-1.rockspec 37 | -------------------------------------------------------------------------------- /eval_city/cocoapi/MatlabAPI/MaskApi.m: -------------------------------------------------------------------------------- 1 | classdef MaskApi 2 | % Interface for manipulating masks stored in RLE format. 3 | % 4 | % RLE is a simple yet efficient format for storing binary masks. RLE 5 | % first divides a vector (or vectorized image) into a series of piecewise 6 | % constant regions and then for each piece simply stores the length of 7 | % that piece. For example, given M=[0 0 1 1 1 0 1] the RLE counts would 8 | % be [2 3 1 1], or for M=[1 1 1 1 1 1 0] the counts would be [0 6 1] 9 | % (note that the odd counts are always the numbers of zeros).
Instead of 10 | % storing the counts directly, additional compression is achieved with a 11 | % variable bitrate representation based on a common scheme called LEB128. 12 | % 13 | % Compression is greatest given large piecewise constant regions. 14 | % Specifically, the size of the RLE is proportional to the number of 15 | % *boundaries* in M (or for an image the number of boundaries in the y 16 | % direction). Assuming fairly simple shapes, the RLE representation is 17 | % O(sqrt(n)) where n is number of pixels in the object. Hence space usage 18 | % is substantially lower, especially for large simple objects (large n). 19 | % 20 | % Many common operations on masks can be computed directly using the RLE 21 | % (without need for decoding). This includes computations such as area, 22 | % union, intersection, etc. All of these operations are linear in the 23 | % size of the RLE, in other words they are O(sqrt(n)) where n is the area 24 | % of the object. Computing these operations on the original mask is O(n). 25 | % Thus, using the RLE can result in substantial computational savings. 26 | % 27 | % The following API functions are defined: 28 | % encode - Encode binary masks using RLE. 29 | % decode - Decode binary masks encoded via RLE. 30 | % merge - Compute union or intersection of encoded masks. 31 | % iou - Compute intersection over union between masks. 32 | % nms - Compute non-maximum suppression between ordered masks. 33 | % area - Compute area of encoded masks. 34 | % toBbox - Get bounding boxes surrounding encoded masks. 35 | % frBbox - Convert bounding boxes to encoded masks. 36 | % frPoly - Convert polygon to encoded mask. 37 | % 38 | % Usage: 39 | % Rs = MaskApi.encode( masks ) 40 | % masks = MaskApi.decode( Rs ) 41 | % R = MaskApi.merge( Rs, [intersect=false] ) 42 | % o = MaskApi.iou( dt, gt, [iscrowd=false] ) 43 | % keep = MaskApi.nms( dt, thr ) 44 | % a = MaskApi.area( Rs ) 45 | % bbs = MaskApi.toBbox( Rs ) 46 | % Rs = MaskApi.frBbox( bbs, h, w ) 47 | % R = MaskApi.frPoly( poly, h, w ) 48 | % 49 | % In the API the following formats are used: 50 | % R,Rs - [struct] Run-length encoding of binary mask(s) 51 | % masks - [hxwxn] Binary mask(s) (must have type uint8) 52 | % bbs - [nx4] Bounding box(es) stored as [x y w h] 53 | % poly - Polygon stored as {[x1 y1 x2 y2...],[x1 y1 ...],...} 54 | % dt,gt - May be either bounding boxes or encoded masks 55 | % Both poly and bbs are 0-indexed (bbox=[0 0 1 1] encloses first pixel). 56 | % 57 | % Finally, a note about the intersection over union (iou) computation. 58 | % The standard iou of a ground truth (gt) and detected (dt) object is 59 | % iou(gt,dt) = area(intersect(gt,dt)) / area(union(gt,dt)) 60 | % For "crowd" regions, we use a modified criteria. If a gt object is 61 | % marked as "iscrowd", we allow a dt to match any subregion of the gt. 62 | % Choosing gt' in the crowd gt that best matches the dt can be done using 63 | % gt'=intersect(dt,gt). Since by definition union(gt',dt)=dt, computing 64 | % iou(gt,dt,iscrowd) = iou(gt',dt) = area(intersect(gt,dt)) / area(dt) 65 | % For crowd gt regions we use this modified criteria above for the iou. 66 | % 67 | % To compile use the following (some precompiled binaries are included): 68 | % mex('CFLAGS=\$CFLAGS -Wall -std=c99','-largeArrayDims',... 69 | % 'private/maskApiMex.c','../common/maskApi.c',... 70 | % '-I../common/','-outdir','private'); 71 | % Please do not contact us for help with compiling. 72 | % 73 | % Microsoft COCO Toolbox. 
version 2.0 74 | % Data, paper, and tutorials available at: http://mscoco.org/ 75 | % Code written by Piotr Dollar and Tsung-Yi Lin, 2015. 76 | % Licensed under the Simplified BSD License [see coco/license.txt] 77 | 78 | methods( Static ) 79 | function Rs = encode( masks ) 80 | Rs = maskApiMex( 'encode', masks ); 81 | end 82 | 83 | function masks = decode( Rs ) 84 | masks = maskApiMex( 'decode', Rs ); 85 | end 86 | 87 | function R = merge( Rs, varargin ) 88 | R = maskApiMex( 'merge', Rs, varargin{:} ); 89 | end 90 | 91 | function o = iou( dt, gt, varargin ) 92 | o = maskApiMex( 'iou', dt', gt', varargin{:} ); 93 | end 94 | 95 | function keep = nms( dt, thr ) 96 | keep = maskApiMex('nms',dt',thr); 97 | end 98 | 99 | function a = area( Rs ) 100 | a = maskApiMex( 'area', Rs ); 101 | end 102 | 103 | function bbs = toBbox( Rs ) 104 | bbs = maskApiMex( 'toBbox', Rs )'; 105 | end 106 | 107 | function Rs = frBbox( bbs, h, w ) 108 | Rs = maskApiMex( 'frBbox', bbs', h, w ); 109 | end 110 | 111 | function R = frPoly( poly, h, w ) 112 | R = maskApiMex( 'frPoly', poly, h , w ); 113 | end 114 | end 115 | 116 | end 117 | -------------------------------------------------------------------------------- /eval_city/cocoapi/MatlabAPI/cocoDemo.m: -------------------------------------------------------------------------------- 1 | %% Demo for the CocoApi (see CocoApi.m) 2 | 3 | %% initialize COCO api (please specify dataType/annType below) 4 | annTypes = { 'instances', 'captions', 'person_keypoints' }; 5 | dataType='val2014'; annType=annTypes{1}; % specify dataType/annType 6 | annFile=sprintf('../annotations/%s_%s.json',annType,dataType); 7 | coco=CocoApi(annFile); 8 | 9 | %% display COCO categories and supercategories 10 | if( ~strcmp(annType,'captions') ) 11 | cats = coco.loadCats(coco.getCatIds()); 12 | nms={cats.name}; fprintf('COCO categories: '); 13 | fprintf('%s, ',nms{:}); fprintf('\n'); 14 | nms=unique({cats.supercategory}); fprintf('COCO supercategories: '); 15 | fprintf('%s, ',nms{:}); fprintf('\n'); 16 | end 17 | 18 | %% get all images containing given categories, select one at random 19 | catIds = coco.getCatIds('catNms',{'person','dog','skateboard'}); 20 | imgIds = coco.getImgIds('catIds',catIds); 21 | imgId = imgIds(randi(length(imgIds))); 22 | 23 | %% load and display image 24 | img = coco.loadImgs(imgId); 25 | I = imread(sprintf('../images/%s/%s',dataType,img.file_name)); 26 | figure(1); imagesc(I); axis('image'); set(gca,'XTick',[],'YTick',[]) 27 | 28 | %% load and display annotations 29 | annIds = coco.getAnnIds('imgIds',imgId,'catIds',catIds,'iscrowd',[]); 30 | anns = coco.loadAnns(annIds); coco.showAnns(anns); 31 | -------------------------------------------------------------------------------- /eval_city/cocoapi/MatlabAPI/evalDemo.m: -------------------------------------------------------------------------------- 1 | %% Demo demonstrating the algorithm result formats for COCO 2 | 3 | %% select results type for demo (either bbox or segm) 4 | type = {'segm','bbox','keypoints'}; type = type{1}; % specify type here 5 | fprintf('Running demo for *%s* results.\n\n',type); 6 | 7 | %% initialize COCO ground truth api 8 | dataDir='../'; prefix='instances'; dataType='val2014'; 9 | if(strcmp(type,'keypoints')), prefix='person_keypoints'; end 10 | annFile=sprintf('%s/annotations/%s_%s.json',dataDir,prefix,dataType); 11 | cocoGt=CocoApi(annFile); 12 | 13 | %% initialize COCO detections api 14 | resFile='%s/results/%s_%s_fake%s100_results.json'; 15 | 
resFile=sprintf(resFile,dataDir,prefix,dataType,type); 16 | cocoDt=cocoGt.loadRes(resFile); 17 | 18 | %% visualize gt and dt side by side 19 | imgIds=sort(cocoGt.getImgIds()); imgIds=imgIds(1:100); 20 | imgId = imgIds(randi(100)); img = cocoGt.loadImgs(imgId); 21 | I = imread(sprintf('%s/images/val2014/%s',dataDir,img.file_name)); 22 | figure(1); subplot(1,2,1); imagesc(I); axis('image'); axis off; 23 | annIds = cocoGt.getAnnIds('imgIds',imgId); title('ground truth') 24 | anns = cocoGt.loadAnns(annIds); cocoGt.showAnns(anns); 25 | figure(1); subplot(1,2,2); imagesc(I); axis('image'); axis off; 26 | annIds = cocoDt.getAnnIds('imgIds',imgId); title('results') 27 | anns = cocoDt.loadAnns(annIds); cocoDt.showAnns(anns); 28 | 29 | %% load raw JSON and show exact format for results 30 | fprintf('results structures have the following format:\n'); 31 | res = gason(fileread(resFile)); disp(res) 32 | 33 | %% the following command can be used to save the results back to disk 34 | if(0), f=fopen(resFile,'w'); fwrite(f,gason(res)); fclose(f); end 35 | 36 | %% run COCO evaluation code (see CocoEval.m) 37 | cocoEval=CocoEval(cocoGt,cocoDt,type); 38 | cocoEval.params.imgIds=imgIds; 39 | cocoEval.evaluate(); 40 | cocoEval.accumulate(); 41 | cocoEval.summarize(); 42 | 43 | %% generate Derek Hoiem style analysis of false positives (slow) 44 | if(0), cocoEval.analyze(); end 45 | -------------------------------------------------------------------------------- /eval_city/cocoapi/MatlabAPI/gason.m: -------------------------------------------------------------------------------- 1 | function out = gason( in ) 2 | % Convert between JSON strings and corresponding JSON objects. 3 | % 4 | % This parser is based on Gason written and maintained by Ivan Vashchaev: 5 | % https://github.com/vivkin/gason 6 | % Gason is a "lightweight and fast JSON parser for C++". Please see the 7 | % above link for license information and additional details about Gason. 8 | % 9 | % Given a JSON string, gason calls the C++ parser and converts the output 10 | % into an appropriate Matlab structure. As the parsing is performed in mex 11 | % the resulting parser is blazingly fast. Large JSON structs (100MB+) take 12 | % only a few seconds to parse (compared to hours for pure Matlab parsers). 13 | % 14 | % Given a JSON object, gason calls the C++ encoder to convert the object 15 | % back into a JSON string representation. Nearly any Matlab struct, cell 16 | % array, or numeric array represents a valid JSON object. Note that gason() 17 | % can be used to go both from JSON string to JSON object and back. 18 | % 19 | % Gason requires C++11 to compile (for GCC this requires version 4.7 or 20 | % later). The following command compiles the parser (may require tweaking): 21 | % mex('CXXFLAGS=\$CXXFLAGS -std=c++11 -Wall','-largeArrayDims',... 22 | % 'private/gasonMex.cpp','../common/gason.cpp',... 23 | % '-I../common/','-outdir','private'); 24 | % Note the use of the "-std=c++11" flag. A number of precompiled binaries 25 | % are included, please do not contact us for help with compiling. If needed 26 | % you can specify a compiler by adding the option 'CXX="/usr/bin/g++"'. 27 | % 28 | % Note that by default JSON arrays that contain only numbers are stored as 29 | % regular Matlab arrays. Likewise, JSON arrays that contain only objects of 30 | % the same type are stored as Matlab struct arrays. This is much faster and 31 | % can use considerably less memory than always using Matlab cell arrays.
32 | % 33 | % USAGE 34 | % object = gason( string ) 35 | % string = gason( object ) 36 | % 37 | % INPUTS/OUTPUTS 38 | % string - JSON string 39 | % object - JSON object 40 | % 41 | % EXAMPLE 42 | % o = struct('first',{'piotr','ty'},'last',{'dollar','lin'}) 43 | % s = gason( o ) % convert JSON object -> JSON string 44 | % p = gason( s ) % convert JSON string -> JSON object 45 | % 46 | % See also 47 | % 48 | % Microsoft COCO Toolbox. version 2.0 49 | % Data, paper, and tutorials available at: http://mscoco.org/ 50 | % Code written by Piotr Dollar and Tsung-Yi Lin, 2015. 51 | % Licensed under the Simplified BSD License [see coco/license.txt] 52 | 53 | out = gasonMex( 'convert', in ); 54 | -------------------------------------------------------------------------------- /eval_city/cocoapi/MatlabAPI/private/gasonMex.cpp: -------------------------------------------------------------------------------- 1 | /************************************************************************** 2 | * Microsoft COCO Toolbox. version 2.0 3 | * Data, paper, and tutorials available at: http://mscoco.org/ 4 | * Code written by Piotr Dollar and Tsung-Yi Lin, 2015. 5 | * Licensed under the Simplified BSD License [see coco/license.txt] 6 | **************************************************************************/ 7 | #include "gason.h" 8 | #include "mex.h" 9 | #include "string.h" 10 | #include "math.h" 11 | #include 12 | #include 13 | #include 14 | typedef std::ostringstream ostrm; 15 | typedef unsigned long siz; 16 | typedef unsigned short ushort; 17 | 18 | siz length( const JsonValue &a ) { 19 | // get number of elements in JSON_ARRAY or JSON_OBJECT 20 | siz k=0; auto n=a.toNode(); while(n) { k++; n=n->next; } return k; 21 | } 22 | 23 | bool isRegularObjArray( const JsonValue &a ) { 24 | // check if all JSON_OBJECTs in JSON_ARRAY have the same fields 25 | JsonValue o=a.toNode()->value; siz k, n; const char **keys; 26 | n=length(o); keys=new const char*[n]; 27 | k=0; for(auto j:o) keys[k++]=j->key; 28 | for( auto i:a ) { 29 | if(length(i->value)!=n) return false; k=0; 30 | for(auto j:i->value) if(strcmp(j->key,keys[k++])) return false; 31 | } 32 | delete [] keys; return true; 33 | } 34 | 35 | mxArray* json( const JsonValue &o ) { 36 | // convert JsonValue to Matlab mxArray 37 | siz k, m, n; mxArray *M; const char **keys; 38 | switch( o.getTag() ) { 39 | case JSON_NUMBER: 40 | return mxCreateDoubleScalar(o.toNumber()); 41 | case JSON_STRING: 42 | return mxCreateString(o.toString()); 43 | case JSON_ARRAY: { 44 | if(!o.toNode()) return mxCreateDoubleMatrix(1,0,mxREAL); 45 | JsonValue o0=o.toNode()->value; JsonTag tag=o0.getTag(); 46 | n=length(o); bool isRegular=true; 47 | for(auto i:o) isRegular=isRegular && i->value.getTag()==tag; 48 | if( isRegular && tag==JSON_OBJECT && isRegularObjArray(o) ) { 49 | m=length(o0); keys=new const char*[m]; 50 | k=0; for(auto j:o0) keys[k++]=j->key; 51 | M = mxCreateStructMatrix(1,n,m,keys); 52 | k=0; for(auto i:o) { m=0; for(auto j:i->value) 53 | mxSetFieldByNumber(M,k,m++,json(j->value)); k++; } 54 | delete [] keys; return M; 55 | } else if( isRegular && tag==JSON_NUMBER ) { 56 | M = mxCreateDoubleMatrix(1,n,mxREAL); double *p=mxGetPr(M); 57 | k=0; for(auto i:o) p[k++]=i->value.toNumber(); return M; 58 | } else { 59 | M = mxCreateCellMatrix(1,n); 60 | k=0; for(auto i:o) mxSetCell(M,k++,json(i->value)); 61 | return M; 62 | } 63 | } 64 | case JSON_OBJECT: 65 | if(!o.toNode()) return mxCreateStructMatrix(1,0,0,NULL); 66 | n=length(o); keys=new const char*[n]; 67 | k=0; for(auto i:o) 
keys[k++]=i->key; 68 | M = mxCreateStructMatrix(1,1,n,keys); k=0; 69 | for(auto i:o) mxSetFieldByNumber(M,0,k++,json(i->value)); 70 | delete [] keys; return M; 71 | case JSON_TRUE: 72 | return mxCreateDoubleScalar(1); 73 | case JSON_FALSE: 74 | return mxCreateDoubleScalar(0); 75 | case JSON_NULL: 76 | return mxCreateDoubleMatrix(0,0,mxREAL); 77 | default: return NULL; 78 | } 79 | } 80 | 81 | template ostrm& json( ostrm &S, T *A, siz n ) { 82 | // convert numeric array to JSON string with casting 83 | if(n==0) { S<<"[]"; return S; } if(n==1) { S< ostrm& json( ostrm &S, T *A, siz n ) { 89 | // convert numeric array to JSON string without casting 90 | return json(S,A,n); 91 | } 92 | 93 | ostrm& json( ostrm &S, const char *A ) { 94 | // convert char array to JSON string (handle escape characters) 95 | #define RPL(a,b) case a: { S << b; A++; break; } 96 | S << "\""; while( *A>0 ) switch( *A ) { 97 | RPL('"',"\\\""); RPL('\\',"\\\\"); RPL('/',"\\/"); RPL('\b',"\\b"); 98 | RPL('\f',"\\f"); RPL('\n',"\\n"); RPL('\r',"\\r"); RPL('\t',"\\t"); 99 | default: S << *A; A++; 100 | } 101 | S << "\""; return S; 102 | } 103 | 104 | ostrm& json( ostrm& S, const JsonValue *o ) { 105 | // convert JsonValue to JSON string 106 | switch( o->getTag() ) { 107 | case JSON_NUMBER: S << o->toNumber(); return S; 108 | case JSON_TRUE: S << "true"; return S; 109 | case JSON_FALSE: S << "false"; return S; 110 | case JSON_NULL: S << "null"; return S; 111 | case JSON_STRING: return json(S,o->toString()); 112 | case JSON_ARRAY: 113 | S << "["; for(auto i:*o) { 114 | json(S,&i->value) << (i->next ? "," : ""); } 115 | S << "]"; return S; 116 | case JSON_OBJECT: 117 | S << "{"; for(auto i:*o) { 118 | json(S,i->key) << ":"; 119 | json(S,&i->value) << (i->next ? "," : ""); } 120 | S << "}"; return S; 121 | default: return S; 122 | } 123 | } 124 | 125 | ostrm& json( ostrm& S, const mxArray *M ) { 126 | // convert Matlab mxArray to JSON string 127 | siz i, j, m, n=mxGetNumberOfElements(M); 128 | void *A=mxGetData(M); ostrm *nms; 129 | switch( mxGetClassID(M) ) { 130 | case mxDOUBLE_CLASS: return json(S,(double*) A,n); 131 | case mxSINGLE_CLASS: return json(S,(float*) A,n); 132 | case mxINT64_CLASS: return json(S,(int64_t*) A,n); 133 | case mxUINT64_CLASS: return json(S,(uint64_t*) A,n); 134 | case mxINT32_CLASS: return json(S,(int32_t*) A,n); 135 | case mxUINT32_CLASS: return json(S,(uint32_t*) A,n); 136 | case mxINT16_CLASS: return json(S,(int16_t*) A,n); 137 | case mxUINT16_CLASS: return json(S,(uint16_t*) A,n); 138 | case mxINT8_CLASS: return json(S,(int8_t*) A,n); 139 | case mxUINT8_CLASS: return json(S,(uint8_t*) A,n); 140 | case mxLOGICAL_CLASS: return json(S,(uint8_t*) A,n); 141 | case mxCHAR_CLASS: return json(S,mxArrayToString(M)); 142 | case mxCELL_CLASS: 143 | S << "["; for(i=0; i0) json(S,mxGetCell(M,n-1)); S << "]"; return S; 145 | case mxSTRUCT_CLASS: 146 | if(n==0) { S<<"{}"; return S; } m=mxGetNumberOfFields(M); 147 | if(m==0) { S<<"["; for(i=0; i1) S<<"["; nms=new ostrm[m]; 149 | for(j=0; j1) S<<"]"; delete [] nms; return S; 156 | default: 157 | mexErrMsgTxt( "Unknown type." 
); return S; 158 | } 159 | } 160 | 161 | mxArray* mxCreateStringRobust( const char* str ) { 162 | // convert char* to Matlab string (robust version of mxCreateString) 163 | mxArray *M; ushort *c; mwSize n[2]={1,strlen(str)}; 164 | M=mxCreateCharArray(2,n); c=(ushort*) mxGetData(M); 165 | for( siz i=0; i1 ) mexErrMsgTxt("One output expected."); 182 | 183 | if(!strcmp(action,"convert")) { 184 | if( nr!=1 ) mexErrMsgTxt("One input expected."); 185 | if( mxGetClassID(pr[0])==mxCHAR_CLASS ) { 186 | // object = mexFunction( string ) 187 | char *str = mxArrayToStringRobust(pr[0]); 188 | int status = jsonParse(str, &endptr, &val, allocator); 189 | if( status != JSON_OK) mexErrMsgTxt(jsonStrError(status)); 190 | pl[0] = json(val); mxFree(str); 191 | } else { 192 | // string = mexFunction( object ) 193 | ostrm S; S << std::setprecision(12); json(S,pr[0]); 194 | pl[0]=mxCreateStringRobust(S.str().c_str()); 195 | } 196 | 197 | } else if(!strcmp(action,"split")) { 198 | // strings = mexFunction( string, k ) 199 | if( nr!=2 ) mexErrMsgTxt("Two input expected."); 200 | char *str = mxArrayToStringRobust(pr[0]); 201 | int status = jsonParse(str, &endptr, &val, allocator); 202 | if( status != JSON_OK) mexErrMsgTxt(jsonStrError(status)); 203 | if( val.getTag()!=JSON_ARRAY ) mexErrMsgTxt("Array expected"); 204 | siz i=0, t=0, n=length(val), k=(siz) mxGetScalar(pr[1]); 205 | k=(k>n)?n:(k<1)?1:k; k=ceil(n/ceil(double(n)/k)); 206 | pl[0]=mxCreateCellMatrix(1,k); ostrm S; S<value); t--; if(!o->next) t=0; S << (t ? "," : "]"); 210 | if(!t) mxSetCell(pl[0],i++,mxCreateStringRobust(S.str().c_str())); 211 | } 212 | 213 | } else if(!strcmp(action,"merge")) { 214 | // string = mexFunction( strings ) 215 | if( nr!=1 ) mexErrMsgTxt("One input expected."); 216 | if(!mxIsCell(pr[0])) mexErrMsgTxt("Cell array expected."); 217 | siz n = mxGetNumberOfElements(pr[0]); 218 | ostrm S; S << std::setprecision(12); S << "["; 219 | for( siz i=0; ivalue) << (j->next ? "," : ""); 225 | mxFree(str); if(i1) 14 | % [ param1 ... paramN ] = getPrmDflt( prm, dfs, [checkExtra] ) 15 | % 16 | % INPUTS 17 | % prm - param struct or cell of form {'name1' v1 'name2' v2 ...} 18 | % dfs - cell of form {'name1' def1 'name2' def2 ...} 19 | % checkExtra - [0] if 1 throw error if prm contains params not in dfs 20 | % if -1 if prm contains params not in dfs adds them 21 | % 22 | % OUTPUTS (nargout==1) 23 | % prm - parameter struct with fields 'name1' through 'nameN' assigned 24 | % 25 | % OUTPUTS (nargout>1) 26 | % param1 - value assigned to parameter with 'name1' 27 | % ... 28 | % paramN - value assigned to parameter with 'nameN' 29 | % 30 | % EXAMPLE 31 | % dfs = { 'x','REQ', 'y',0, 'z',[], 'eps',1e-3 }; 32 | % prm = getPrmDflt( struct('x',1,'y',1), dfs ) 33 | % [ x y z eps ] = getPrmDflt( {'x',2,'y',1}, dfs ) 34 | % 35 | % See also INPUTPARSER 36 | % 37 | % Piotr's Computer Vision Matlab Toolbox Version 2.60 38 | % Copyright 2014 Piotr Dollar. 
[pdollar-at-gmail.com] 39 | % Licensed under the Simplified BSD License [see external/bsd.txt] 40 | 41 | if( mod(length(dfs),2) ), error('odd number of default parameters'); end 42 | if nargin<=2, checkExtra = 0; end 43 | 44 | % get the input parameters as two cell arrays: prmVal and prmField 45 | if iscell(prm) && length(prm)==1, prm=prm{1}; end 46 | if iscell(prm) 47 | if(mod(length(prm),2)), error('odd number of parameters in prm'); end 48 | prmField = prm(1:2:end); prmVal = prm(2:2:end); 49 | else 50 | if(~isstruct(prm)), error('prm must be a struct or a cell'); end 51 | prmVal = struct2cell(prm); prmField = fieldnames(prm); 52 | end 53 | 54 | % get and update default values using quick for loop 55 | dfsField = dfs(1:2:end); dfsVal = dfs(2:2:end); 56 | if checkExtra>0 57 | for i=1:length(prmField) 58 | j = find(strcmp(prmField{i},dfsField)); 59 | if isempty(j), error('parameter %s is not valid', prmField{i}); end 60 | dfsVal(j) = prmVal(i); 61 | end 62 | elseif checkExtra<0 63 | for i=1:length(prmField) 64 | j = find(strcmp(prmField{i},dfsField)); 65 | if isempty(j), j=length(dfsVal)+1; dfsField{j}=prmField{i}; end 66 | dfsVal(j) = prmVal(i); 67 | end 68 | else 69 | for i=1:length(prmField) 70 | dfsVal(strcmp(prmField{i},dfsField)) = prmVal(i); 71 | end 72 | end 73 | 74 | % check for missing values 75 | if any(strcmp('REQ',dfsVal)) 76 | cmpArray = find(strcmp('REQ',dfsVal)); 77 | error(['Required field ''' dfsField{cmpArray(1)} ''' not specified.'] ); 78 | end 79 | 80 | % set output 81 | if nargout==1 82 | varargout{1} = cell2struct( dfsVal, dfsField, 2 ); 83 | else 84 | varargout = dfsVal; 85 | end 86 | -------------------------------------------------------------------------------- /eval_city/cocoapi/MatlabAPI/private/maskApiMex.c: -------------------------------------------------------------------------------- 1 | /************************************************************************** 2 | * Microsoft COCO Toolbox. version 2.0 3 | * Data, paper, and tutorials available at: http://mscoco.org/ 4 | * Code written by Piotr Dollar and Tsung-Yi Lin, 2015. 5 | * Licensed under the Simplified BSD License [see coco/license.txt] 6 | **************************************************************************/ 7 | #include "mex.h" 8 | #include "maskApi.h" 9 | #include 10 | 11 | void checkType( const mxArray *M, mxClassID id ) { 12 | if(mxGetClassID(M)!=id) mexErrMsgTxt("Invalid type."); 13 | } 14 | 15 | mxArray* toMxArray( const RLE *R, siz n ) { 16 | const char *fs[] = {"size", "counts"}; 17 | mxArray *M=mxCreateStructMatrix(1,n,2,fs); 18 | for( siz i=0; i1) mexErrMsgTxt(err); 35 | for( i=0; i<*n; i++ ) { 36 | mxArray *S, *C; double *s; void *c; 37 | S=mxGetFieldByNumber(M,i,O[0]); checkType(S,mxDOUBLE_CLASS); 38 | C=mxGetFieldByNumber(M,i,O[1]); s=mxGetPr(S); c=mxGetData(C); 39 | h=(siz)s[0]; w=(siz)s[1]; m=mxGetNumberOfElements(C); 40 | if(same && i>0 && (h!=R[0].h || w!=R[0].w)) mexErrMsgTxt(err); 41 | if( mxGetClassID(C)==mxDOUBLE_CLASS ) { 42 | rleInit(R+i,h,w,m,0); 43 | for(j=0; j=2) ? (mxGetScalar(pr[1])>0) : false; 74 | rleMerge(R,&M,n,intersect); pl[0]=toMxArray(&M,1); rleFree(&M); 75 | 76 | } else if(!strcmp(action,"area")) { 77 | R=frMxArray(pr[0],&n,0); 78 | pl[0]=mxCreateNumericMatrix(1,n,mxUINT32_CLASS,mxREAL); 79 | uint *a=(uint*) mxGetPr(pl[0]); rleArea(R,n,a); 80 | 81 | } else if(!strcmp(action,"iou")) { 82 | if(nr>2) checkType(pr[2],mxUINT8_CLASS); siz nDt, nGt; 83 | byte *iscrowd = nr>2 ? 
(byte*) mxGetPr(pr[2]) : NULL; 84 | if(mxIsStruct(pr[0]) || mxIsStruct(pr[1])) { 85 | RLE *dt=frMxArray(pr[0],&nDt,1), *gt=frMxArray(pr[1],&nGt,1); 86 | pl[0]=mxCreateNumericMatrix(nDt,nGt,mxDOUBLE_CLASS,mxREAL); 87 | double *o=mxGetPr(pl[0]); rleIou(dt,gt,nDt,nGt,iscrowd,o); 88 | rlesFree(&dt,nDt); rlesFree(>,nGt); 89 | } else { 90 | checkType(pr[0],mxDOUBLE_CLASS); checkType(pr[1],mxDOUBLE_CLASS); 91 | double *dt=mxGetPr(pr[0]); nDt=mxGetN(pr[0]); 92 | double *gt=mxGetPr(pr[1]); nGt=mxGetN(pr[1]); 93 | pl[0]=mxCreateNumericMatrix(nDt,nGt,mxDOUBLE_CLASS,mxREAL); 94 | double *o=mxGetPr(pl[0]); bbIou(dt,gt,nDt,nGt,iscrowd,o); 95 | } 96 | 97 | } else if(!strcmp(action,"nms")) { 98 | siz n; uint *keep; double thr=(double) mxGetScalar(pr[1]); 99 | if(mxIsStruct(pr[0])) { 100 | RLE *dt=frMxArray(pr[0],&n,1); 101 | pl[0]=mxCreateNumericMatrix(1,n,mxUINT32_CLASS,mxREAL); 102 | keep=(uint*) mxGetPr(pl[0]); rleNms(dt,n,keep,thr); 103 | rlesFree(&dt,n); 104 | } else { 105 | checkType(pr[0],mxDOUBLE_CLASS); 106 | double *dt=mxGetPr(pr[0]); n=mxGetN(pr[0]); 107 | pl[0]=mxCreateNumericMatrix(1,n,mxUINT32_CLASS,mxREAL); 108 | keep=(uint*) mxGetPr(pl[0]); bbNms(dt,n,keep,thr); 109 | } 110 | 111 | } else if(!strcmp(action,"toBbox")) { 112 | R=frMxArray(pr[0],&n,0); 113 | pl[0]=mxCreateNumericMatrix(4,n,mxDOUBLE_CLASS,mxREAL); 114 | BB bb=mxGetPr(pl[0]); rleToBbox(R,bb,n); 115 | 116 | } else if(!strcmp(action,"frBbox")) { 117 | checkType(pr[0],mxDOUBLE_CLASS); 118 | double *bb=mxGetPr(pr[0]); n=mxGetN(pr[0]); 119 | h=(siz)mxGetScalar(pr[1]); w=(siz)mxGetScalar(pr[2]); 120 | rlesInit(&R,n); rleFrBbox(R,bb,h,w,n); pl[0]=toMxArray(R,n); 121 | 122 | } else if(!strcmp(action,"frPoly")) { 123 | checkType(pr[0],mxCELL_CLASS); n=mxGetNumberOfElements(pr[0]); 124 | h=(siz)mxGetScalar(pr[1]); w=(siz)mxGetScalar(pr[2]); rlesInit(&R,n); 125 | for(siz i=0; i 4 | 5 | #define JSON_ZONE_SIZE 4096 6 | #define JSON_STACK_SIZE 32 7 | 8 | const char *jsonStrError(int err) { 9 | switch (err) { 10 | #define XX(no, str) \ 11 | case JSON_##no: \ 12 | return str; 13 | JSON_ERRNO_MAP(XX) 14 | #undef XX 15 | default: 16 | return "unknown"; 17 | } 18 | } 19 | 20 | void *JsonAllocator::allocate(size_t size) { 21 | size = (size + 7) & ~7; 22 | 23 | if (head && head->used + size <= JSON_ZONE_SIZE) { 24 | char *p = (char *)head + head->used; 25 | head->used += size; 26 | return p; 27 | } 28 | 29 | size_t allocSize = sizeof(Zone) + size; 30 | Zone *zone = (Zone *)malloc(allocSize <= JSON_ZONE_SIZE ? 
JSON_ZONE_SIZE : allocSize); 31 | if (zone == nullptr) 32 | return nullptr; 33 | zone->used = allocSize; 34 | if (allocSize <= JSON_ZONE_SIZE || head == nullptr) { 35 | zone->next = head; 36 | head = zone; 37 | } else { 38 | zone->next = head->next; 39 | head->next = zone; 40 | } 41 | return (char *)zone + sizeof(Zone); 42 | } 43 | 44 | void JsonAllocator::deallocate() { 45 | while (head) { 46 | Zone *next = head->next; 47 | free(head); 48 | head = next; 49 | } 50 | } 51 | 52 | static inline bool isspace(char c) { 53 | return c == ' ' || (c >= '\t' && c <= '\r'); 54 | } 55 | 56 | static inline bool isdelim(char c) { 57 | return c == ',' || c == ':' || c == ']' || c == '}' || isspace(c) || !c; 58 | } 59 | 60 | static inline bool isdigit(char c) { 61 | return c >= '0' && c <= '9'; 62 | } 63 | 64 | static inline bool isxdigit(char c) { 65 | return (c >= '0' && c <= '9') || ((c & ~' ') >= 'A' && (c & ~' ') <= 'F'); 66 | } 67 | 68 | static inline int char2int(char c) { 69 | if (c <= '9') 70 | return c - '0'; 71 | return (c & ~' ') - 'A' + 10; 72 | } 73 | 74 | static double string2double(char *s, char **endptr) { 75 | char ch = *s; 76 | if (ch == '-') 77 | ++s; 78 | 79 | double result = 0; 80 | while (isdigit(*s)) 81 | result = (result * 10) + (*s++ - '0'); 82 | 83 | if (*s == '.') { 84 | ++s; 85 | 86 | double fraction = 1; 87 | while (isdigit(*s)) { 88 | fraction *= 0.1; 89 | result += (*s++ - '0') * fraction; 90 | } 91 | } 92 | 93 | if (*s == 'e' || *s == 'E') { 94 | ++s; 95 | 96 | double base = 10; 97 | if (*s == '+') 98 | ++s; 99 | else if (*s == '-') { 100 | ++s; 101 | base = 0.1; 102 | } 103 | 104 | unsigned int exponent = 0; 105 | while (isdigit(*s)) 106 | exponent = (exponent * 10) + (*s++ - '0'); 107 | 108 | double power = 1; 109 | for (; exponent; exponent >>= 1, base *= base) 110 | if (exponent & 1) 111 | power *= base; 112 | 113 | result *= power; 114 | } 115 | 116 | *endptr = s; 117 | return ch == '-' ? 
-result : result; 118 | } 119 | 120 | static inline JsonNode *insertAfter(JsonNode *tail, JsonNode *node) { 121 | if (!tail) 122 | return node->next = node; 123 | node->next = tail->next; 124 | tail->next = node; 125 | return node; 126 | } 127 | 128 | static inline JsonValue listToValue(JsonTag tag, JsonNode *tail) { 129 | if (tail) { 130 | auto head = tail->next; 131 | tail->next = nullptr; 132 | return JsonValue(tag, head); 133 | } 134 | return JsonValue(tag, nullptr); 135 | } 136 | 137 | int jsonParse(char *s, char **endptr, JsonValue *value, JsonAllocator &allocator) { 138 | JsonNode *tails[JSON_STACK_SIZE]; 139 | JsonTag tags[JSON_STACK_SIZE]; 140 | char *keys[JSON_STACK_SIZE]; 141 | JsonValue o; 142 | int pos = -1; 143 | bool separator = true; 144 | JsonNode *node; 145 | *endptr = s; 146 | 147 | while (*s) { 148 | while (isspace(*s)) { 149 | ++s; 150 | if (!*s) break; 151 | } 152 | *endptr = s++; 153 | switch (**endptr) { 154 | case '-': 155 | if (!isdigit(*s) && *s != '.') { 156 | *endptr = s; 157 | return JSON_BAD_NUMBER; 158 | } 159 | case '0': 160 | case '1': 161 | case '2': 162 | case '3': 163 | case '4': 164 | case '5': 165 | case '6': 166 | case '7': 167 | case '8': 168 | case '9': 169 | o = JsonValue(string2double(*endptr, &s)); 170 | if (!isdelim(*s)) { 171 | *endptr = s; 172 | return JSON_BAD_NUMBER; 173 | } 174 | break; 175 | case '"': 176 | o = JsonValue(JSON_STRING, s); 177 | for (char *it = s; *s; ++it, ++s) { 178 | int c = *it = *s; 179 | if (c == '\\') { 180 | c = *++s; 181 | switch (c) { 182 | case '\\': 183 | case '"': 184 | case '/': 185 | *it = c; 186 | break; 187 | case 'b': 188 | *it = '\b'; 189 | break; 190 | case 'f': 191 | *it = '\f'; 192 | break; 193 | case 'n': 194 | *it = '\n'; 195 | break; 196 | case 'r': 197 | *it = '\r'; 198 | break; 199 | case 't': 200 | *it = '\t'; 201 | break; 202 | case 'u': 203 | c = 0; 204 | for (int i = 0; i < 4; ++i) { 205 | if (isxdigit(*++s)) { 206 | c = c * 16 + char2int(*s); 207 | } else { 208 | *endptr = s; 209 | return JSON_BAD_STRING; 210 | } 211 | } 212 | if (c < 0x80) { 213 | *it = c; 214 | } else if (c < 0x800) { 215 | *it++ = 0xC0 | (c >> 6); 216 | *it = 0x80 | (c & 0x3F); 217 | } else { 218 | *it++ = 0xE0 | (c >> 12); 219 | *it++ = 0x80 | ((c >> 6) & 0x3F); 220 | *it = 0x80 | (c & 0x3F); 221 | } 222 | break; 223 | default: 224 | *endptr = s; 225 | return JSON_BAD_STRING; 226 | } 227 | } else if ((unsigned int)c < ' ' || c == '\x7F') { 228 | *endptr = s; 229 | return JSON_BAD_STRING; 230 | } else if (c == '"') { 231 | *it = 0; 232 | ++s; 233 | break; 234 | } 235 | } 236 | if (!isdelim(*s)) { 237 | *endptr = s; 238 | return JSON_BAD_STRING; 239 | } 240 | break; 241 | case 't': 242 | if (!(s[0] == 'r' && s[1] == 'u' && s[2] == 'e' && isdelim(s[3]))) 243 | return JSON_BAD_IDENTIFIER; 244 | o = JsonValue(JSON_TRUE); 245 | s += 3; 246 | break; 247 | case 'f': 248 | if (!(s[0] == 'a' && s[1] == 'l' && s[2] == 's' && s[3] == 'e' && isdelim(s[4]))) 249 | return JSON_BAD_IDENTIFIER; 250 | o = JsonValue(JSON_FALSE); 251 | s += 4; 252 | break; 253 | case 'n': 254 | if (!(s[0] == 'u' && s[1] == 'l' && s[2] == 'l' && isdelim(s[3]))) 255 | return JSON_BAD_IDENTIFIER; 256 | o = JsonValue(JSON_NULL); 257 | s += 3; 258 | break; 259 | case ']': 260 | if (pos == -1) 261 | return JSON_STACK_UNDERFLOW; 262 | if (tags[pos] != JSON_ARRAY) 263 | return JSON_MISMATCH_BRACKET; 264 | o = listToValue(JSON_ARRAY, tails[pos--]); 265 | break; 266 | case '}': 267 | if (pos == -1) 268 | return JSON_STACK_UNDERFLOW; 269 | if (tags[pos] != JSON_OBJECT) 
270 | return JSON_MISMATCH_BRACKET; 271 | if (keys[pos] != nullptr) 272 | return JSON_UNEXPECTED_CHARACTER; 273 | o = listToValue(JSON_OBJECT, tails[pos--]); 274 | break; 275 | case '[': 276 | if (++pos == JSON_STACK_SIZE) 277 | return JSON_STACK_OVERFLOW; 278 | tails[pos] = nullptr; 279 | tags[pos] = JSON_ARRAY; 280 | keys[pos] = nullptr; 281 | separator = true; 282 | continue; 283 | case '{': 284 | if (++pos == JSON_STACK_SIZE) 285 | return JSON_STACK_OVERFLOW; 286 | tails[pos] = nullptr; 287 | tags[pos] = JSON_OBJECT; 288 | keys[pos] = nullptr; 289 | separator = true; 290 | continue; 291 | case ':': 292 | if (separator || keys[pos] == nullptr) 293 | return JSON_UNEXPECTED_CHARACTER; 294 | separator = true; 295 | continue; 296 | case ',': 297 | if (separator || keys[pos] != nullptr) 298 | return JSON_UNEXPECTED_CHARACTER; 299 | separator = true; 300 | continue; 301 | case '\0': 302 | continue; 303 | default: 304 | return JSON_UNEXPECTED_CHARACTER; 305 | } 306 | 307 | separator = false; 308 | 309 | if (pos == -1) { 310 | *endptr = s; 311 | *value = o; 312 | return JSON_OK; 313 | } 314 | 315 | if (tags[pos] == JSON_OBJECT) { 316 | if (!keys[pos]) { 317 | if (o.getTag() != JSON_STRING) 318 | return JSON_UNQUOTED_KEY; 319 | keys[pos] = o.toString(); 320 | continue; 321 | } 322 | if ((node = (JsonNode *) allocator.allocate(sizeof(JsonNode))) == nullptr) 323 | return JSON_ALLOCATION_FAILURE; 324 | tails[pos] = insertAfter(tails[pos], node); 325 | tails[pos]->key = keys[pos]; 326 | keys[pos] = nullptr; 327 | } else { 328 | if ((node = (JsonNode *) allocator.allocate(sizeof(JsonNode) - sizeof(char *))) == nullptr) 329 | return JSON_ALLOCATION_FAILURE; 330 | tails[pos] = insertAfter(tails[pos], node); 331 | } 332 | tails[pos]->value = o; 333 | } 334 | return JSON_BREAKING_BAD; 335 | } 336 | -------------------------------------------------------------------------------- /eval_city/cocoapi/common/gason.h: -------------------------------------------------------------------------------- 1 | // https://github.com/vivkin/gason - pulled January 10, 2016 2 | #pragma once 3 | 4 | #include 5 | #include 6 | #include 7 | 8 | enum JsonTag { 9 | JSON_NUMBER = 0, 10 | JSON_STRING, 11 | JSON_ARRAY, 12 | JSON_OBJECT, 13 | JSON_TRUE, 14 | JSON_FALSE, 15 | JSON_NULL = 0xF 16 | }; 17 | 18 | struct JsonNode; 19 | 20 | #define JSON_VALUE_PAYLOAD_MASK 0x00007FFFFFFFFFFFULL 21 | #define JSON_VALUE_NAN_MASK 0x7FF8000000000000ULL 22 | #define JSON_VALUE_TAG_MASK 0xF 23 | #define JSON_VALUE_TAG_SHIFT 47 24 | 25 | union JsonValue { 26 | uint64_t ival; 27 | double fval; 28 | 29 | JsonValue(double x) 30 | : fval(x) { 31 | } 32 | JsonValue(JsonTag tag = JSON_NULL, void *payload = nullptr) { 33 | assert((uintptr_t)payload <= JSON_VALUE_PAYLOAD_MASK); 34 | ival = JSON_VALUE_NAN_MASK | ((uint64_t)tag << JSON_VALUE_TAG_SHIFT) | (uintptr_t)payload; 35 | } 36 | bool isDouble() const { 37 | return (int64_t)ival <= (int64_t)JSON_VALUE_NAN_MASK; 38 | } 39 | JsonTag getTag() const { 40 | return isDouble() ? 
JSON_NUMBER : JsonTag((ival >> JSON_VALUE_TAG_SHIFT) & JSON_VALUE_TAG_MASK); 41 | } 42 | uint64_t getPayload() const { 43 | assert(!isDouble()); 44 | return ival & JSON_VALUE_PAYLOAD_MASK; 45 | } 46 | double toNumber() const { 47 | assert(getTag() == JSON_NUMBER); 48 | return fval; 49 | } 50 | char *toString() const { 51 | assert(getTag() == JSON_STRING); 52 | return (char *)getPayload(); 53 | } 54 | JsonNode *toNode() const { 55 | assert(getTag() == JSON_ARRAY || getTag() == JSON_OBJECT); 56 | return (JsonNode *)getPayload(); 57 | } 58 | }; 59 | 60 | struct JsonNode { 61 | JsonValue value; 62 | JsonNode *next; 63 | char *key; 64 | }; 65 | 66 | struct JsonIterator { 67 | JsonNode *p; 68 | 69 | void operator++() { 70 | p = p->next; 71 | } 72 | bool operator!=(const JsonIterator &x) const { 73 | return p != x.p; 74 | } 75 | JsonNode *operator*() const { 76 | return p; 77 | } 78 | JsonNode *operator->() const { 79 | return p; 80 | } 81 | }; 82 | 83 | inline JsonIterator begin(JsonValue o) { 84 | return JsonIterator{o.toNode()}; 85 | } 86 | inline JsonIterator end(JsonValue) { 87 | return JsonIterator{nullptr}; 88 | } 89 | 90 | #define JSON_ERRNO_MAP(XX) \ 91 | XX(OK, "ok") \ 92 | XX(BAD_NUMBER, "bad number") \ 93 | XX(BAD_STRING, "bad string") \ 94 | XX(BAD_IDENTIFIER, "bad identifier") \ 95 | XX(STACK_OVERFLOW, "stack overflow") \ 96 | XX(STACK_UNDERFLOW, "stack underflow") \ 97 | XX(MISMATCH_BRACKET, "mismatch bracket") \ 98 | XX(UNEXPECTED_CHARACTER, "unexpected character") \ 99 | XX(UNQUOTED_KEY, "unquoted key") \ 100 | XX(BREAKING_BAD, "breaking bad") \ 101 | XX(ALLOCATION_FAILURE, "allocation failure") 102 | 103 | enum JsonErrno { 104 | #define XX(no, str) JSON_##no, 105 | JSON_ERRNO_MAP(XX) 106 | #undef XX 107 | }; 108 | 109 | const char *jsonStrError(int err); 110 | 111 | class JsonAllocator { 112 | struct Zone { 113 | Zone *next; 114 | size_t used; 115 | } *head = nullptr; 116 | 117 | public: 118 | JsonAllocator() = default; 119 | JsonAllocator(const JsonAllocator &) = delete; 120 | JsonAllocator &operator=(const JsonAllocator &) = delete; 121 | JsonAllocator(JsonAllocator &&x) : head(x.head) { 122 | x.head = nullptr; 123 | } 124 | JsonAllocator &operator=(JsonAllocator &&x) { 125 | head = x.head; 126 | x.head = nullptr; 127 | return *this; 128 | } 129 | ~JsonAllocator() { 130 | deallocate(); 131 | } 132 | void *allocate(size_t size); 133 | void deallocate(); 134 | }; 135 | 136 | int jsonParse(char *str, char **endptr, JsonValue *value, JsonAllocator &allocator); 137 | -------------------------------------------------------------------------------- /eval_city/cocoapi/common/maskApi.c: -------------------------------------------------------------------------------- 1 | /************************************************************************** 2 | * Microsoft COCO Toolbox. version 2.0 3 | * Data, paper, and tutorials available at: http://mscoco.org/ 4 | * Code written by Piotr Dollar and Tsung-Yi Lin, 2015. 5 | * Licensed under the Simplified BSD License [see coco/license.txt] 6 | **************************************************************************/ 7 | #include "maskApi.h" 8 | #include 9 | #include 10 | 11 | uint umin( uint a, uint b ) { return (ab) ? 
a : b; } 13 | 14 | void rleInit( RLE *R, siz h, siz w, siz m, uint *cnts ) { 15 | R->h=h; R->w=w; R->m=m; R->cnts=(m==0)?0:malloc(sizeof(uint)*m); 16 | siz j; if(cnts) for(j=0; jcnts[j]=cnts[j]; 17 | } 18 | 19 | void rleFree( RLE *R ) { 20 | free(R->cnts); R->cnts=0; 21 | } 22 | 23 | void rlesInit( RLE **R, siz n ) { 24 | siz i; *R = (RLE*) malloc(sizeof(RLE)*n); 25 | for(i=0; i0 ) { 61 | c=umin(ca,cb); cc+=c; ct=0; 62 | ca-=c; if(!ca && a0) { 83 | crowd=iscrowd!=NULL && iscrowd[g]; 84 | if(dt[d].h!=gt[g].h || dt[d].w!=gt[g].w) { o[g*m+d]=-1; continue; } 85 | siz ka, kb, a, b; uint c, ca, cb, ct, i, u; int va, vb; 86 | ca=dt[d].cnts[0]; ka=dt[d].m; va=vb=0; 87 | cb=gt[g].cnts[0]; kb=gt[g].m; a=b=1; i=u=0; ct=1; 88 | while( ct>0 ) { 89 | c=umin(ca,cb); if(va||vb) { u+=c; if(va&&vb) i+=c; } ct=0; 90 | ca-=c; if(!ca && athr) keep[j]=0; 105 | } 106 | } 107 | } 108 | 109 | void bbIou( BB dt, BB gt, siz m, siz n, byte *iscrowd, double *o ) { 110 | double h, w, i, u, ga, da; siz g, d; int crowd; 111 | for( g=0; gthr) keep[j]=0; 129 | } 130 | } 131 | } 132 | 133 | void rleToBbox( const RLE *R, BB bb, siz n ) { 134 | siz i; for( i=0; id?1:c=dy && xs>xe) || (dxye); 173 | if(flip) { t=xs; xs=xe; xe=t; t=ys; ys=ye; ye=t; } 174 | s = dx>=dy ? (double)(ye-ys)/dx : (double)(xe-xs)/dy; 175 | if(dx>=dy) for( d=0; d<=dx; d++ ) { 176 | t=flip?dx-d:d; u[m]=t+xs; v[m]=(int)(ys+s*t+.5); m++; 177 | } else for( d=0; d<=dy; d++ ) { 178 | t=flip?dy-d:d; v[m]=t+ys; u[m]=(int)(xs+s*t+.5); m++; 179 | } 180 | } 181 | /* get points along y-boundary and downsample */ 182 | free(x); free(y); k=m; m=0; double xd, yd; 183 | x=malloc(sizeof(int)*k); y=malloc(sizeof(int)*k); 184 | for( j=1; jw-1 ) continue; 187 | yd=(double)(v[j]h) yd=h; yd=ceil(yd); 189 | x[m]=(int) xd; y[m]=(int) yd; m++; 190 | } 191 | /* compute rle encoding given y-boundary points */ 192 | k=m; a=malloc(sizeof(uint)*(k+1)); 193 | for( j=0; j0) b[m++]=a[j++]; else { 199 | j++; if(jm, p=0; long x; int more; 206 | char *s=malloc(sizeof(char)*m*6); 207 | for( i=0; icnts[i]; if(i>2) x-=(long) R->cnts[i-2]; more=1; 209 | while( more ) { 210 | char c=x & 0x1f; x >>= 5; more=(c & 0x10) ? x!=-1 : x!=0; 211 | if(more) c |= 0x20; c+=48; s[p++]=c; 212 | } 213 | } 214 | s[p]=0; return s; 215 | } 216 | 217 | void rleFrString( RLE *R, char *s, siz h, siz w ) { 218 | siz m=0, p=0, k; long x; int more; uint *cnts; 219 | while( s[m] ) m++; cnts=malloc(sizeof(uint)*m); m=0; 220 | while( s[p] ) { 221 | x=0; k=0; more=1; 222 | while( more ) { 223 | char c=s[p]-48; x |= (c & 0x1f) << 5*k; 224 | more = c & 0x20; p++; k++; 225 | if(!more && (c & 0x10)) x |= -1 << 5*k; 226 | } 227 | if(m>2) x+=(long) cnts[m-2]; cnts[m++]=(uint) x; 228 | } 229 | rleInit(R,h,w,m,cnts); free(cnts); 230 | } 231 | -------------------------------------------------------------------------------- /eval_city/cocoapi/common/maskApi.h: -------------------------------------------------------------------------------- 1 | /************************************************************************** 2 | * Microsoft COCO Toolbox. version 2.0 3 | * Data, paper, and tutorials available at: http://mscoco.org/ 4 | * Code written by Piotr Dollar and Tsung-Yi Lin, 2015. 
5 | * Licensed under the Simplified BSD License [see coco/license.txt] 6 | **************************************************************************/ 7 | #pragma once 8 | 9 | typedef unsigned int uint; 10 | typedef unsigned long siz; 11 | typedef unsigned char byte; 12 | typedef double* BB; 13 | typedef struct { siz h, w, m; uint *cnts; } RLE; 14 | 15 | /* Initialize/destroy RLE. */ 16 | void rleInit( RLE *R, siz h, siz w, siz m, uint *cnts ); 17 | void rleFree( RLE *R ); 18 | 19 | /* Initialize/destroy RLE array. */ 20 | void rlesInit( RLE **R, siz n ); 21 | void rlesFree( RLE **R, siz n ); 22 | 23 | /* Encode binary masks using RLE. */ 24 | void rleEncode( RLE *R, const byte *mask, siz h, siz w, siz n ); 25 | 26 | /* Decode binary masks encoded via RLE. */ 27 | void rleDecode( const RLE *R, byte *mask, siz n ); 28 | 29 | /* Compute union or intersection of encoded masks. */ 30 | void rleMerge( const RLE *R, RLE *M, siz n, int intersect ); 31 | 32 | /* Compute area of encoded masks. */ 33 | void rleArea( const RLE *R, siz n, uint *a ); 34 | 35 | /* Compute intersection over union between masks. */ 36 | void rleIou( RLE *dt, RLE *gt, siz m, siz n, byte *iscrowd, double *o ); 37 | 38 | /* Compute non-maximum suppression between bounding masks */ 39 | void rleNms( RLE *dt, siz n, uint *keep, double thr ); 40 | 41 | /* Compute intersection over union between bounding boxes. */ 42 | void bbIou( BB dt, BB gt, siz m, siz n, byte *iscrowd, double *o ); 43 | 44 | /* Compute non-maximum suppression between bounding boxes */ 45 | void bbNms( BB dt, siz n, uint *keep, double thr ); 46 | 47 | /* Get bounding boxes surrounding encoded masks. */ 48 | void rleToBbox( const RLE *R, BB bb, siz n ); 49 | 50 | /* Convert bounding boxes to encoded masks. */ 51 | void rleFrBbox( RLE *R, const BB bb, siz h, siz w, siz n ); 52 | 53 | /* Convert polygon to encoded mask. */ 54 | void rleFrPoly( RLE *R, const double *xy, siz k, siz h, siz w ); 55 | 56 | /* Get compressed string representation of encoded mask. */ 57 | char* rleToString( const RLE *R ); 58 | 59 | /* Convert from compressed string representation of encoded mask. */ 60 | void rleFrString( RLE *R, char *s, siz h, siz w ); 61 | -------------------------------------------------------------------------------- /eval_city/cocoapi/license.txt: -------------------------------------------------------------------------------- 1 | Copyright (c) 2014, Piotr Dollar and Tsung-Yi Lin 2 | All rights reserved. 3 | 4 | Redistribution and use in source and binary forms, with or without 5 | modification, are permitted provided that the following conditions are met: 6 | 7 | 1. Redistributions of source code must retain the above copyright notice, this 8 | list of conditions and the following disclaimer. 9 | 2. Redistributions in binary form must reproduce the above copyright notice, 10 | this list of conditions and the following disclaimer in the documentation 11 | and/or other materials provided with the distribution. 12 | 13 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND 14 | ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 15 | WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 16 | DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR 17 | ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES 18 | (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; 19 | LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND 20 | ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT 21 | (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS 22 | SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 23 | 24 | The views and conclusions contained in the software and documentation are those 25 | of the authors and should not be interpreted as representing official policies, 26 | either expressed or implied, of the FreeBSD Project. 27 | -------------------------------------------------------------------------------- /eval_city/dt_txt2json.m: -------------------------------------------------------------------------------- 1 | clear 2 | addpath('./cocoapi/MatlabAPI') 3 | main_path = '../output/valresults/city/h/off'; 4 | subdir = dir(main_path); 5 | for j = 3 : length(subdir) 6 | ndt=0; 7 | dt_coco = struct(); 8 | dt_path = fullfile(main_path, subdir(j).name); 9 | % res = load([dt_path,'/val_500_det.txt'],'%f'); 10 | res = load([dt_path,'/val_det.txt'],'%f'); 11 | num_imgs = max(res(:,1)); 12 | out = [dt_path,'/val_dt.json']; 13 | if exist(out,'file') 14 | continue 15 | end 16 | for i = 1:num_imgs 17 | bbs = res(res(:,1)==i,:); 18 | for ibb=1:size(bbs,1) 19 | ndt=ndt+1; 20 | bb=bbs(ibb,2:6); 21 | dt_coco(ndt).image_id=i; 22 | dt_coco(ndt).category_id=1; 23 | dt_coco(ndt).bbox=bb(1:4); 24 | dt_coco(ndt).score=bb(5); 25 | end 26 | end 27 | dt_string = gason(dt_coco); 28 | fp = fopen(out,'w'); 29 | fprintf(fp,'%s',dt_string); 30 | fclose(fp); 31 | end -------------------------------------------------------------------------------- /eval_city/eval_script/eval_demo.py: -------------------------------------------------------------------------------- 1 | import os 2 | from coco import COCO 3 | from eval_MR_multisetup import COCOeval 4 | 5 | annType = 'bbox' #specify type here 6 | 7 | #initialize COCO ground truth api 8 | annFile = '../val_gt.json' 9 | main_path = '../../output/valresults/city/h/off' 10 | for f in sorted(os.listdir(main_path)): 11 | print(f) 12 | # initialize COCO detections api 13 | dt_path = os.path.join(main_path, f) 14 | resFile = os.path.join(dt_path,'val_dt.json') 15 | respath = os.path.join(dt_path,'results.txt') 16 | # if os.path.exists(respath): 17 | # continue 18 | ## running evaluation 19 | res_file = open(respath, "w") 20 | for id_setup in range(0,1): 21 | cocoGt = COCO(annFile) 22 | cocoDt = cocoGt.loadRes(resFile) 23 | imgIds = sorted(cocoGt.getImgIds()) 24 | cocoEval = COCOeval(cocoGt,cocoDt,annType) 25 | cocoEval.params.imgIds = imgIds 26 | cocoEval.evaluate(id_setup) 27 | cocoEval.accumulate() 28 | cocoEval.summarize(id_setup,res_file) 29 | 30 | res_file.close() -------------------------------------------------------------------------------- /eval_city/eval_script/readme.txt: -------------------------------------------------------------------------------- 1 | ################################ 2 | This script is created to produce miss rate numbers by making minor changes to the COCO python evaluation script [1]. It is a python re-implementation of the Caltech evaluation code, which is written in matlab. This python script produces exactly the same numbers as the matlab code. 
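For orientation, here is a minimal sketch (not part of this repo; the authoritative logic lives in eval_MR_multisetup.py) of how the log-average miss rate is usually computed, assuming the curve points are sorted by increasing FPPI and the standard 9 reference points log-spaced over [1e-2, 1e0]:

```python
import numpy as np

def log_average_miss_rate(fppi, miss_rate):
    # Hypothetical helper: sample the miss-rate curve at 9 FPPI values
    # evenly spaced in log-space over [1e-2, 1e0], then take the
    # geometric mean of the sampled miss rates (the reported "MR").
    fppi, miss_rate = np.asarray(fppi), np.asarray(miss_rate)
    refs = np.logspace(-2.0, 0.0, num=9)
    samples = []
    for ref in refs:
        idx = np.where(fppi <= ref)[0]
        # No curve point at or below this FPPI: fall back to the worst miss rate.
        samples.append(miss_rate[idx[-1]] if idx.size else miss_rate.max())
    return np.exp(np.log(np.maximum(samples, 1e-10)).mean())
```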
3 | [1] https://github.com/pdollar/coco/tree/master/PythonAPI 4 | [2] http://www.vision.caltech.edu/Image_Datasets/CaltechPedestrians/code/code3.2.1.zip 5 | ################################# 6 | Usage 7 | 1. Prepare detection results in COCO format, and write them in a single .json file. 8 | 2. Run eval_demo.py. 9 | 3. Detailed evaluations will be written to results.txt. 10 | ################################# 11 | 12 | -------------------------------------------------------------------------------- /eval_city/readme.txt: -------------------------------------------------------------------------------- 1 | Submitting results 2 | Please write the detection results for all test images in a single .json file (see dt_mat2json.m), and then send to shanshan.zhang@njust.edu.cn. 3 | You'll receive the evaulation results via e-mail, and then you decide whether to publish it or not. 4 | If you would like to publish your results on the dashboard, please also specify a name of your method. 5 | 6 | Metrics 7 | We use the same protocol as in [1] for evaluation. As a numerical measure of the performance, log-average miss rate (MR) is computed by averaging over the precision range of [10e-2; 10e0] FPPI (false positives per image). 8 | For detailed evaluation, we consider the following 4 subsets: 9 | 1. 'Reasonable': height [50, inf]; visibility [0.65, inf] 10 | 2. 'Reasonable_small': height [50, 75]; visibility [0.65, inf] 11 | 3. 'Reasonable_occ=heavy': height [50, inf]; visibility [0.2, 0.65] 12 | 4. 'All': height [20, inf]; visibility [0.2, inf] 13 | 14 | 15 | Reference 16 | [1] P. Dollar, C. Wojek, B. Schiele and P. Perona. Pedestrian Detection: An Evaluation of the State of the Art. TPAMI, 2012. 17 | -------------------------------------------------------------------------------- /generate_cache_caltech.py: -------------------------------------------------------------------------------- 1 | import os 2 | import pickle 3 | import numpy as np 4 | import matplotlib.pyplot as plt 5 | 6 | root_dir = 'data/caltech/train_3' 7 | # root_dir = 'data/caltech/test' 8 | all_img_path = os.path.join(root_dir, 'images') 9 | all_anno_path = os.path.join(root_dir, 'annotations_new/') 10 | res_path_gt = 'data/cache/caltech/train_gt' 11 | res_path_nogt = 'data/cache/caltech/train_nogt' 12 | 13 | rows, cols = 480, 640 14 | image_data_gt, image_data_nogt = [], [] 15 | 16 | valid_count = 0 17 | iggt_count = 0 18 | box_count = 0 19 | files = sorted(os.listdir(all_anno_path)) 20 | for l in range(len(files)): 21 | gtname = files[l] 22 | imgname = files[l].split('.')[0] + '.jpg' 23 | img_path = os.path.join(all_img_path, imgname) 24 | gt_path = os.path.join(all_anno_path, gtname) 25 | 26 | boxes = [] 27 | ig_boxes = [] 28 | with open(gt_path, 'rb') as fid: 29 | lines = fid.readlines() 30 | if len(lines) > 1: 31 | for i in range(1, len(lines)): 32 | info = lines[i].strip().split(' ') 33 | label = info[0] 34 | occ, ignore = info[5], info[10] 35 | x1, y1 = max(int(float(info[1])), 0), max(int(float(info[2])), 0) 36 | w, h = min(int(float(info[3])), cols - x1 - 1), min(int(float(info[4])), rows - y1 - 1) 37 | box = np.array([int(x1), int(y1), int(x1) + int(w), int(y1) + int(h)]) 38 | if int(ignore) == 0: 39 | boxes.append(box) 40 | else: 41 | ig_boxes.append(box) 42 | boxes = np.array(boxes) 43 | ig_boxes = np.array(ig_boxes) 44 | 45 | annotation = {} 46 | annotation['filepath'] = img_path 47 | box_count += len(boxes) 48 | iggt_count += len(ig_boxes) 49 | annotation['bboxes'] = boxes 50 | annotation['ignoreareas'] = ig_boxes 51 | if 
len(boxes) == 0: 52 | image_data_nogt.append(annotation) 53 | else: 54 | image_data_gt.append(annotation) 55 | valid_count += 1 56 | print('{} images and {} valid images, {} valid gt and {} ignored gt'.format(len(files), valid_count, box_count, 57 | iggt_count)) 58 | 59 | if not os.path.exists(res_path_gt): 60 | with open(res_path_gt, 'wb') as fid: 61 | pickle.dump(image_data_gt, fid, pickle.HIGHEST_PROTOCOL) 62 | if not os.path.exists(res_path_nogt): 63 | with open(res_path_nogt, 'wb') as fid: 64 | pickle.dump(image_data_nogt, fid, pickle.HIGHEST_PROTOCOL) 65 | -------------------------------------------------------------------------------- /generate_cache_city.py: -------------------------------------------------------------------------------- 1 | import os 2 | import cv2 3 | import pickle 4 | import numpy as np 5 | from scipy import io as scio 6 | import time 7 | import matplotlib.pyplot as plt 8 | import re 9 | 10 | root_dir = 'data/cityperson' 11 | all_img_path = os.path.join(root_dir, 'images') 12 | all_anno_path = os.path.join(root_dir, 'annotations') 13 | types = ['train'] 14 | rows, cols = 1024, 2048 15 | 16 | for type in types: 17 | anno_path = os.path.join(all_anno_path, 'anno_' + type + '.mat') 18 | res_path = os.path.join('data/cache/cityperson', type) 19 | image_data = [] 20 | annos = scio.loadmat(anno_path) 21 | index = 'anno_' + type + '_aligned' 22 | valid_count = 0 23 | iggt_count = 0 24 | box_count = 0 25 | for l in range(len(annos[index][0])): 26 | anno = annos[index][0][l] 27 | cityname = anno[0][0][0][0].encode() 28 | imgname = anno[0][0][1][0].encode() 29 | gts = anno[0][0][2] 30 | img_path = os.path.join(all_img_path, type + '/' + cityname + '/' + imgname) 31 | boxes = [] 32 | ig_boxes = [] 33 | vis_boxes = [] 34 | for i in range(len(gts)): 35 | label, x1, y1, w, h = gts[i, :5] 36 | x1, y1 = max(int(x1), 0), max(int(y1), 0) 37 | w, h = min(int(w), cols - x1 - 1), min(int(h), rows - y1 - 1) 38 | xv1, yv1, wv, hv = gts[i, 6:] 39 | xv1, yv1 = max(int(xv1), 0), max(int(yv1), 0) 40 | wv, hv = min(int(wv), cols - xv1 - 1), min(int(hv), rows - yv1 - 1) 41 | 42 | if label == 1 and h >= 50: 43 | box = np.array([int(x1), int(y1), int(x1) + int(w), int(y1) + int(h)]) 44 | boxes.append(box) 45 | vis_box = np.array([int(xv1), int(yv1), int(xv1) + int(wv), int(yv1) + int(hv)]) 46 | vis_boxes.append(vis_box) 47 | else: 48 | ig_box = np.array([int(x1), int(y1), int(x1) + int(w), int(y1) + int(h)]) 49 | ig_boxes.append(ig_box) 50 | boxes = np.array(boxes) 51 | vis_boxes = np.array(vis_boxes) 52 | ig_boxes = np.array(ig_boxes) 53 | 54 | if len(boxes) > 0: 55 | valid_count += 1 56 | annotation = {} 57 | annotation['filepath'] = img_path 58 | box_count += len(boxes) 59 | iggt_count += len(ig_boxes) 60 | annotation['bboxes'] = boxes 61 | annotation['vis_bboxes'] = vis_boxes 62 | annotation['ignoreareas'] = ig_boxes 63 | image_data.append(annotation) 64 | if not os.path.exists(res_path): 65 | with open(res_path, 'wb') as fid: 66 | pickle.dump(image_data, fid, pickle.HIGHEST_PROTOCOL) 67 | print('{} has {} images and {} valid images, {} valid gt and {} ignored gt'.format(type, len(annos[index][0]), 68 | valid_count, box_count, 69 | iggt_count)) 70 | -------------------------------------------------------------------------------- /generate_cache_wider.py: -------------------------------------------------------------------------------- 1 | import os 2 | import cv2 3 | import pickle 4 | import numpy as np 5 | import matplotlib.pyplot as plt 6 | 7 | root_dir = 'data/WiderFace/' 8 | img_path = 
os.path.join(root_dir, 'WIDER_train/images') 9 | anno_path = os.path.join(root_dir, 'wider_face_split', 'wider_face_train_bbx_gt.txt') 10 | # anno_path = os.path.join(root_dir, 'wider_face_split','wider_face_test_filelist.txt') 11 | 12 | res_path = 'data/cache/train' 13 | 14 | image_data = [] 15 | valid_count = 0 16 | img_count = 0 17 | box_count = 0 18 | with open(anno_path, 'rb') as fid: 19 | lines = fid.readlines() 20 | num_lines = len(lines) 21 | 22 | index = 0 23 | while index < num_lines: 24 | filename = lines[index].strip() 25 | img_count += 1 26 | if img_count % 1000 == 0: 27 | print(img_count) 28 | num_obj = int(lines[index + 1]) 29 | filepath = os.path.join(img_path, filename) 30 | img = cv2.imread(filepath) 31 | img_height, img_width = img.shape[:2] 32 | boxes = [] 33 | if num_obj > 0: 34 | for i in range(num_obj): 35 | info = lines[index + 2 + i].strip().split(' ') 36 | x1, y1 = max(int(info[0]), 0), max(int(info[1]), 0) 37 | w, h = min(int(info[2]), img_width - x1 - 1), min(int(info[3]), img_height - y1 - 1) 38 | if w >= 5 and h >= 5: 39 | box = np.array([x1, y1, x1 + w, y1 + h]) 40 | boxes.append(box) 41 | boxes = np.array(boxes) 42 | box_count += len(boxes) 43 | if len(boxes) > 0: 44 | valid_count += 1 45 | annotation = {} 46 | annotation['filepath'] = filepath 47 | annotation['bboxes'] = boxes 48 | image_data.append(annotation) 49 | index += (2 + num_obj) 50 | 51 | print('{} images and {} valid images and {} boxes'.format(img_count, valid_count, box_count)) 52 | with open(res_path, 'wb') as fid: 53 | pickle.dump(image_data, fid, pickle.HIGHEST_PROTOCOL) 54 | -------------------------------------------------------------------------------- /keras_csp/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dominikandreas/CSP/e4a5a3770609896d3ab4b164c434ce80746225e6/keras_csp/__init__.py -------------------------------------------------------------------------------- /keras_csp/bbox_process.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | from keras_csp.nms_wrapper import nms 3 | 4 | 5 | def parse_det(Y, C, score=0.1, down=4, scale='h'): 6 | seman = Y[0][0, :, :, 0] 7 | if scale == 'h': 8 | height = np.exp(Y[1][0, :, :, 0]) * down 9 | width = 0.41 * height 10 | elif scale == 'w': 11 | width = np.exp(Y[1][0, :, :, 0]) * down 12 | height = width / 0.41 13 | elif scale == 'hw': 14 | height = np.exp(Y[1][0, :, :, 0]) * down 15 | width = np.exp(Y[1][0, :, :, 1]) * down 16 | y_c, x_c = np.where(seman > score) 17 | boxs = [] 18 | if len(y_c) > 0: 19 | for i in range(len(y_c)): 20 | h = height[y_c[i], x_c[i]] 21 | w = width[y_c[i], x_c[i]] 22 | s = seman[y_c[i], x_c[i]] 23 | x1, y1 = max(0, (x_c[i] + 0.5) * down - w / 2), max(0, (y_c[i] + 0.5) * down - h / 2) 24 | boxs.append([x1, y1, min(x1 + w, C.size_test[1]), min(y1 + h, C.size_test[0]), s]) 25 | boxs = np.asarray(boxs, dtype=np.float32) 26 | keep = nms(boxs, 0.5, usegpu=False, gpu_id=0) 27 | boxs = boxs[keep, :] 28 | return boxs 29 | 30 | 31 | def parse_det_top(Y, C, score=0.1): 32 | seman = Y[0][0, :, :, 0] 33 | height = Y[1][0, :, :, 0] 34 | y_c, x_c = np.where(seman > score) 35 | boxs = [] 36 | if len(y_c) > 0: 37 | for i in range(len(y_c)): 38 | h = np.exp(height[y_c[i], x_c[i]]) * 4 39 | w = 0.41 * h 40 | s = seman[y_c[i], x_c[i]] 41 | x1, y1 = max(0, x_c[i] * 4 + 2 - w / 2), max(0, y_c[i] * 4 + 2) 42 | boxs.append([x1, y1, min(x1 + w, C.size_test[1]), min(y1 + h, C.size_test[0]), s]) 43 | 
boxs = np.asarray(boxs, dtype=np.float32) 44 | keep = nms(boxs, 0.5, usegpu=False, gpu_id=0) 45 | boxs = boxs[keep, :] 46 | return boxs 47 | 48 | 49 | def parse_det_bottom(Y, C, score=0.1): 50 | seman = Y[0][0, :, :, 0] 51 | height = Y[1][0, :, :, 0] 52 | y_c, x_c = np.where(seman > score) 53 | boxs = [] 54 | if len(y_c) > 0: 55 | for i in range(len(y_c)): 56 | h = np.exp(height[y_c[i], x_c[i]]) * 4 57 | w = 0.41 * h 58 | s = seman[y_c[i], x_c[i]] 59 | x1, y1 = max(0, x_c[i] * 4 + 2 - w / 2), max(0, y_c[i] * 4 + 2 - h) 60 | boxs.append([x1, y1, min(x1 + w, C.size_test[1]), min(y1 + h, C.size_test[0]), s]) 61 | boxs = np.asarray(boxs, dtype=np.float32) 62 | keep = nms(boxs, 0.5, usegpu=False, gpu_id=0) 63 | boxs = boxs[keep, :] 64 | return boxs 65 | 66 | 67 | def parse_det_offset(Y, C, score=0.1, down=4): 68 | seman = Y[0][0, :, :, 0] 69 | height = Y[1][0, :, :, 0] 70 | offset_y = Y[2][0, :, :, 0] 71 | offset_x = Y[2][0, :, :, 1] 72 | y_c, x_c = np.where(seman > score) 73 | boxs = [] 74 | if len(y_c) > 0: 75 | for i in range(len(y_c)): 76 | h = np.exp(height[y_c[i], x_c[i]]) * down 77 | w = 0.41 * h 78 | o_y = offset_y[y_c[i], x_c[i]] 79 | o_x = offset_x[y_c[i], x_c[i]] 80 | s = seman[y_c[i], x_c[i]] 81 | x1, y1 = max(0, (x_c[i] + o_x + 0.5) * down - w / 2), max(0, (y_c[i] + o_y + 0.5) * down - h / 2) 82 | boxs.append([x1, y1, min(x1 + w, C.size_test[1]), min(y1 + h, C.size_test[0]), s]) 83 | boxs = np.asarray(boxs, dtype=np.float32) 84 | keep = nms(boxs, 0.5, usegpu=False, gpu_id=0) 85 | boxs = boxs[keep, :] 86 | return boxs 87 | 88 | 89 | def parse_wider_offset(Y, C, score=0.1, down=4, nmsthre=0.5): 90 | seman = Y[0][0, :, :, 0] 91 | height = Y[1][0, :, :, 0] 92 | width = Y[1][0, :, :, 1] 93 | offset_y = Y[2][0, :, :, 0] 94 | offset_x = Y[2][0, :, :, 1] 95 | y_c, x_c = np.where(seman > score) 96 | boxs = [] 97 | if len(y_c) > 0: 98 | for i in range(len(y_c)): 99 | h = np.exp(height[y_c[i], x_c[i]]) * down 100 | w = np.exp(width[y_c[i], x_c[i]]) * down 101 | o_y = offset_y[y_c[i], x_c[i]] 102 | o_x = offset_x[y_c[i], x_c[i]] 103 | s = seman[y_c[i], x_c[i]] 104 | x1, y1 = max(0, (x_c[i] + o_x + 0.5) * down - w / 2), max(0, (y_c[i] + o_y + 0.5) * down - h / 2) 105 | x1, y1 = min(x1, C.size_test[1]), min(y1, C.size_test[0]) 106 | boxs.append([x1, y1, min(x1 + w, C.size_test[1]), min(y1 + h, C.size_test[0]), s]) 107 | boxs = np.asarray(boxs, dtype=np.float32) 108 | # keep = nms(boxs, nmsthre, usegpu=False, gpu_id=0) 109 | # boxs = boxs[keep, :] 110 | boxs = soft_bbox_vote(boxs, thre=nmsthre) 111 | return boxs 112 | 113 | 114 | def soft_bbox_vote(det, thre=0.35, score=0.05): 115 | if det.shape[0] <= 1: 116 | return det 117 | order = det[:, 4].ravel().argsort()[::-1] 118 | det = det[order, :] 119 | dets = [] 120 | while det.shape[0] > 0: 121 | # IOU 122 | area = (det[:, 2] - det[:, 0] + 1) * (det[:, 3] - det[:, 1] + 1) 123 | xx1 = np.maximum(det[0, 0], det[:, 0]) 124 | yy1 = np.maximum(det[0, 1], det[:, 1]) 125 | xx2 = np.minimum(det[0, 2], det[:, 2]) 126 | yy2 = np.minimum(det[0, 3], det[:, 3]) 127 | w = np.maximum(0.0, xx2 - xx1 + 1) 128 | h = np.maximum(0.0, yy2 - yy1 + 1) 129 | inter = w * h 130 | o = inter / (area[0] + area[:] - inter) 131 | 132 | # get needed merge det and delete these det 133 | merge_index = np.where(o >= thre)[0] 134 | det_accu = det[merge_index, :] 135 | det_accu_iou = o[merge_index] 136 | det = np.delete(det, merge_index, 0) 137 | 138 | if merge_index.shape[0] <= 1: 139 | try: 140 | dets = np.row_stack((dets, det_accu)) 141 | except: 142 | dets = det_accu 143 | 
continue 144 | else: 145 | soft_det_accu = det_accu.copy() 146 | soft_det_accu[:, 4] = soft_det_accu[:, 4] * (1 - det_accu_iou) 147 | soft_index = np.where(soft_det_accu[:, 4] >= score)[0] 148 | soft_det_accu = soft_det_accu[soft_index, :] 149 | 150 | det_accu[:, 0:4] = det_accu[:, 0:4] * np.tile(det_accu[:, -1:], (1, 4)) 151 | max_score = np.max(det_accu[:, 4]) 152 | det_accu_sum = np.zeros((1, 5)) 153 | det_accu_sum[:, 0:4] = np.sum(det_accu[:, 0:4], axis=0) / np.sum(det_accu[:, -1:]) 154 | det_accu_sum[:, 4] = max_score 155 | 156 | if soft_det_accu.shape[0] > 0: 157 | det_accu_sum = np.row_stack((soft_det_accu, det_accu_sum)) 158 | 159 | try: 160 | dets = np.row_stack((dets, det_accu_sum)) 161 | except: 162 | dets = det_accu_sum 163 | 164 | order = dets[:, 4].ravel().argsort()[::-1] 165 | dets = dets[order, :] 166 | return dets 167 | 168 | 169 | def bbox_vote(det, thre): 170 | if det.shape[0] <= 1: 171 | return det 172 | order = det[:, 4].ravel().argsort()[::-1] 173 | det = det[order, :] 174 | # det = det[np.where(det[:, 4] > 0.2)[0], :] 175 | dets = [] 176 | while det.shape[0] > 0: 177 | # IOU 178 | area = (det[:, 2] - det[:, 0] + 1) * (det[:, 3] - det[:, 1] + 1) 179 | xx1 = np.maximum(det[0, 0], det[:, 0]) 180 | yy1 = np.maximum(det[0, 1], det[:, 1]) 181 | xx2 = np.minimum(det[0, 2], det[:, 2]) 182 | yy2 = np.minimum(det[0, 3], det[:, 3]) 183 | w = np.maximum(0.0, xx2 - xx1 + 1) 184 | h = np.maximum(0.0, yy2 - yy1 + 1) 185 | inter = w * h 186 | o = inter / (area[0] + area[:] - inter) 187 | 188 | # get needed merge det and delete these det 189 | merge_index = np.where(o >= thre)[0] 190 | det_accu = det[merge_index, :] 191 | det = np.delete(det, merge_index, 0) 192 | 193 | if merge_index.shape[0] <= 1: 194 | try: 195 | dets = np.row_stack((dets, det_accu)) 196 | except: 197 | dets = det_accu 198 | continue 199 | else: 200 | det_accu[:, 0:4] = det_accu[:, 0:4] * np.tile(det_accu[:, -1:], (1, 4)) 201 | max_score = np.max(det_accu[:, 4]) 202 | det_accu_sum = np.zeros((1, 5)) 203 | det_accu_sum[:, 0:4] = np.sum(det_accu[:, 0:4], axis=0) / np.sum(det_accu[:, -1:]) 204 | det_accu_sum[:, 4] = max_score 205 | try: 206 | dets = np.row_stack((dets, det_accu_sum)) 207 | except: 208 | dets = det_accu_sum 209 | 210 | return dets 211 | -------------------------------------------------------------------------------- /keras_csp/bbox_transform.py: -------------------------------------------------------------------------------- 1 | # -------------------------------------------------------- 2 | # Fast R-CNN 3 | # Copyright (c) 2015 Microsoft 4 | # Licensed under The MIT License [see LICENSE for details] 5 | # Written by Ross Girshick 6 | # -------------------------------------------------------- 7 | 8 | import numpy as np 9 | 10 | 11 | def bbox_transform(ex_rois, gt_rois): 12 | ex_widths = ex_rois[:, 2] - ex_rois[:, 0] + 1.0 13 | ex_heights = ex_rois[:, 3] - ex_rois[:, 1] + 1.0 14 | ex_ctr_x = ex_rois[:, 0] + 0.5 * ex_widths 15 | ex_ctr_y = ex_rois[:, 1] + 0.5 * ex_heights 16 | 17 | gt_widths = gt_rois[:, 2] - gt_rois[:, 0] + 1.0 18 | gt_heights = gt_rois[:, 3] - gt_rois[:, 1] + 1.0 19 | gt_ctr_x = gt_rois[:, 0] + 0.5 * gt_widths 20 | gt_ctr_y = gt_rois[:, 1] + 0.5 * gt_heights 21 | 22 | targets_dx = (gt_ctr_x - ex_ctr_x) / ex_widths 23 | targets_dy = (gt_ctr_y - ex_ctr_y) / ex_heights 24 | targets_dw = np.log(gt_widths / ex_widths) 25 | targets_dh = np.log(gt_heights / ex_heights) 26 | 27 | targets = np.vstack( 28 | (targets_dx, targets_dy, targets_dw, targets_dh)).transpose() 29 | return targets 30 | 31 
| 32 | def bbox_transform_inv(boxes, deltas): 33 | if boxes.shape[0] == 0: 34 | return np.zeros((0, deltas.shape[1]), dtype=deltas.dtype) 35 | 36 | boxes = boxes.astype(deltas.dtype, copy=False) 37 | 38 | widths = boxes[:, 2] - boxes[:, 0] + 1.0 39 | heights = boxes[:, 3] - boxes[:, 1] + 1.0 40 | ctr_x = boxes[:, 0] + 0.5 * widths 41 | ctr_y = boxes[:, 1] + 0.5 * heights 42 | 43 | dx = deltas[:, 0::4] 44 | dy = deltas[:, 1::4] 45 | dw = deltas[:, 2::4] 46 | dh = deltas[:, 3::4] 47 | 48 | pred_ctr_x = dx * widths[:, np.newaxis] + ctr_x[:, np.newaxis] 49 | pred_ctr_y = dy * heights[:, np.newaxis] + ctr_y[:, np.newaxis] 50 | pred_w = np.exp(dw) * widths[:, np.newaxis] 51 | pred_h = np.exp(dh) * heights[:, np.newaxis] 52 | 53 | pred_boxes = np.zeros(deltas.shape, dtype=deltas.dtype) 54 | # x1 55 | pred_boxes[:, 0::4] = pred_ctr_x - 0.5 * pred_w 56 | # y1 57 | pred_boxes[:, 1::4] = pred_ctr_y - 0.5 * pred_h 58 | # x2 59 | pred_boxes[:, 2::4] = pred_ctr_x + 0.5 * pred_w 60 | # y2 61 | pred_boxes[:, 3::4] = pred_ctr_y + 0.5 * pred_h 62 | 63 | return pred_boxes 64 | 65 | 66 | def compute_targets(ex_rois, gt_rois, classifier_regr_std, std): 67 | """Compute bounding-box regression targets for an image.""" 68 | 69 | assert ex_rois.shape[0] == gt_rois.shape[0] 70 | assert ex_rois.shape[1] == 4 71 | assert gt_rois.shape[1] == 4 72 | 73 | targets = bbox_transform(ex_rois, gt_rois) 74 | # Optionally normalize targets by a precomputed mean and stdev 75 | if std: 76 | targets = targets / np.array(classifier_regr_std) 77 | return targets 78 | 79 | 80 | def clip_boxes(boxes, im_shape): 81 | # boxes[:, 0::4] = np.maximum(np.minimum(boxes[:, 0::4], im_shape[1] - 1), 0) 82 | # # y1 >= 0 83 | # boxes[:, 1::4] = np.maximum(np.minimum(boxes[:, 1::4], im_shape[0] - 1), 0) 84 | # # x2 < im_shape[1] 85 | # boxes[:, 2::4] = np.maximum(np.minimum(boxes[:, 2::4], im_shape[1] - 1), 0) 86 | # # y2 < im_shape[0] 87 | # boxes[:, 3::4] = np.maximum(np.minimum(boxes[:, 3::4], im_shape[0] - 1), 0) 88 | boxes[:, 0] = np.maximum(np.minimum(boxes[:, 0], im_shape[1] - 1), 0) 89 | # y1 >= 0 90 | boxes[:, 1] = np.maximum(np.minimum(boxes[:, 1], im_shape[0] - 1), 0) 91 | # x2 < im_shape[1] 92 | boxes[:, 2] = np.maximum(np.minimum(boxes[:, 2], im_shape[1] - 1), 0) 93 | # y2 < im_shape[0] 94 | boxes[:, 3] = np.maximum(np.minimum(boxes[:, 3], im_shape[0] - 1), 0) 95 | return boxes 96 | -------------------------------------------------------------------------------- /keras_csp/config.py: -------------------------------------------------------------------------------- 1 | class Config(object): 2 | def __init__(self): 3 | self.gpu_ids = '0' 4 | self.onegpu = 2 5 | self.num_epochs = 150 6 | self.add_epoch = 0 7 | self.iter_per_epoch = 2000 8 | self.init_lr = 1e-4 9 | self.alpha = 0.999 10 | 11 | # setting for network architechture 12 | self.network = 'resnet50' # or 'mobilenet' 13 | self.point = 'center' # or 'top', 'bottom 14 | self.scale = 'h' # or 'w', 'hw' 15 | self.num_scale = 1 # 1 for height (or width) prediction, 2 for height+width prediction 16 | self.offset = False # append offset prediction or not 17 | self.down = 4 # downsampling rate of the feature map for detection 18 | self.radius = 2 # surrounding areas of positives for the scale map 19 | 20 | # setting for data augmentation 21 | self.use_horizontal_flips = True 22 | self.brightness = (0.5, 2) 23 | self.size_train = (336, 448) 24 | 25 | # image channel-wise mean to subtract, the order is BGR 26 | self.img_channel_mean = [103.939, 116.779, 123.68] 27 | 
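For orientation, the train/test scripts (e.g., train_city.py) typically instantiate this Config and override fields before building the model; a hypothetical minimal sketch follows (the values are illustrative, not the repo's defaults):

```python
from keras_csp.config import Config

C = Config()
C.gpu_ids = '0,1'            # train on two GPUs
C.onegpu = 2                 # images per GPU
C.size_train = (640, 1280)   # (height, width) of training crops
C.offset = True              # also learn the center-offset branch
C.scale = 'h'                # predict height only; width follows as 0.41 * h

# Effective batch size is images-per-GPU times the number of GPUs.
batch_size = C.onegpu * len(C.gpu_ids.split(','))
```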
-------------------------------------------------------------------------------- /keras_csp/data_augment.py: -------------------------------------------------------------------------------- 1 | 2 | import cv2 3 | import numpy as np 4 | import copy 5 | 6 | def _brightness(image, min=0.5, max=2.0): 7 | ''' 8 | Randomly change the brightness of the input image. 9 | 10 | Protected against overflow. 11 | ''' 12 | hsv = cv2.cvtColor(image,cv2.COLOR_RGB2HSV) 13 | 14 | random_br = np.random.uniform(min,max) 15 | 16 | #To protect against overflow: Calculate a mask for all pixels 17 | #where adjustment of the brightness would exceed the maximum 18 | #brightness value and set the value to the maximum at those pixels. 19 | mask = hsv[:,:,2] * random_br > 255 20 | v_channel = np.where(mask, 255, hsv[:,:,2] * random_br) 21 | hsv[:,:,2] = v_channel 22 | 23 | return cv2.cvtColor(hsv,cv2.COLOR_HSV2RGB) 24 | 25 | def resize_image(image, gts,igs, scale=[0.4,1.5]): 26 | height, width = image.shape[0:2] 27 | ratio = np.random.uniform(scale[0], scale[1]) 28 | # if len(gts)>0 and np.max(gts[:,3]-gts[:,1])>300: 29 | # ratio = np.random.uniform(scale[0], 1.0) 30 | new_height, new_width = int(ratio*height), int(ratio*width) 31 | image = cv2.resize(image, (new_width, new_height)) 32 | if len(gts)>0: 33 | gts = np.asarray(gts,dtype=float) 34 | gts[:, 0:4:2] *= ratio 35 | gts[:, 1:4:2] *= ratio 36 | 37 | if len(igs)>0: 38 | igs = np.asarray(igs, dtype=float) 39 | igs[:, 0:4:2] *= ratio 40 | igs[:, 1:4:2] *= ratio 41 | 42 | return image, gts, igs 43 | 44 | def random_crop(image, gts, igs, crop_size, limit=8): 45 | img_height, img_width = image.shape[0:2] 46 | crop_h, crop_w = crop_size 47 | 48 | if len(gts)>0: 49 | sel_id = np.random.randint(0, len(gts)) 50 | sel_center_x = int((gts[sel_id, 0] + gts[sel_id, 2]) / 2.0) 51 | sel_center_y = int((gts[sel_id, 1] + gts[sel_id, 3]) / 2.0) 52 | else: 53 | sel_center_x = int(np.random.randint(0, img_width - crop_w+1) + crop_w * 0.5) 54 | sel_center_y = int(np.random.randint(0, img_height - crop_h+1) + crop_h * 0.5) 55 | 56 | crop_x1 = max(sel_center_x - int(crop_w * 0.5), int(0)) 57 | crop_y1 = max(sel_center_y - int(crop_h * 0.5), int(0)) 58 | diff_x = max(crop_x1 + crop_w - img_width, int(0)) 59 | crop_x1 -= diff_x 60 | diff_y = max(crop_y1 + crop_h - img_height, int(0)) 61 | crop_y1 -= diff_y 62 | cropped_image = np.copy(image[crop_y1:crop_y1 + crop_h, crop_x1:crop_x1 + crop_w]) 63 | # crop detections 64 | if len(igs)>0: 65 | igs[:, 0:4:2] -= crop_x1 66 | igs[:, 1:4:2] -= crop_y1 67 | igs[:, 0:4:2] = np.clip(igs[:, 0:4:2], 0, crop_w) 68 | igs[:, 1:4:2] = np.clip(igs[:, 1:4:2], 0, crop_h) 69 | keep_inds = ((igs[:, 2] - igs[:, 0]) >=8) & \ 70 | ((igs[:, 3] - igs[:, 1]) >=8) 71 | igs = igs[keep_inds] 72 | if len(gts)>0: 73 | ori_gts = np.copy(gts) 74 | gts[:, 0:4:2] -= crop_x1 75 | gts[:, 1:4:2] -= crop_y1 76 | gts[:, 0:4:2] = np.clip(gts[:, 0:4:2], 0, crop_w) 77 | gts[:, 1:4:2] = np.clip(gts[:, 1:4:2], 0, crop_h) 78 | 79 | before_area = (ori_gts[:, 2] - ori_gts[:, 0]) * (ori_gts[:, 3] - ori_gts[:, 1]) 80 | after_area = (gts[:, 2] - gts[:, 0]) * (gts[:, 3] - gts[:, 1]) 81 | 82 | keep_inds = ((gts[:, 2] - gts[:, 0]) >=limit) & \ 83 | (after_area >= 0.5 * before_area) 84 | gts = gts[keep_inds] 85 | 86 | return cropped_image, gts, igs 87 | 88 | 89 | def random_pave(image, gts, igs, pave_size, limit=8): 90 | img_height, img_width = image.shape[0:2] 91 | pave_h, pave_w = pave_size 92 | # paved_image = np.zeros((pave_h, pave_w, 3), dtype=image.dtype) 93 | paved_image = 
np.ones((pave_h, pave_w, 3), dtype=image.dtype) * np.mean(image, dtype=int) 94 | pave_x = int(np.random.randint(0, pave_w - img_width + 1)) 95 | pave_y = int(np.random.randint(0, pave_h - img_height + 1)) 96 | paved_image[pave_y:pave_y + img_height, pave_x:pave_x + img_width] = image 97 | # pave detections 98 | if len(igs) > 0: 99 | igs[:, 0:4:2] += pave_x 100 | igs[:, 1:4:2] += pave_y 101 | keep_inds = ((igs[:, 2] - igs[:, 0]) >= 8) & \ 102 | ((igs[:, 3] - igs[:, 1]) >= 8) 103 | igs = igs[keep_inds] 104 | 105 | if len(gts) > 0: 106 | gts[:, 0:4:2] += pave_x 107 | gts[:, 1:4:2] += pave_y 108 | keep_inds = ((gts[:, 2] - gts[:, 0]) >= limit) 109 | gts = gts[keep_inds] 110 | 111 | return paved_image, gts, igs 112 | 113 | def augment(img_data, c): 114 | assert 'filepath' in img_data 115 | assert 'bboxes' in img_data 116 | img_data_aug = copy.deepcopy(img_data) 117 | img = cv2.imread(img_data_aug['filepath']) 118 | img_height, img_width = img.shape[:2] 119 | 120 | # random brightness 121 | if c.brightness and np.random.randint(0, 2) == 0: 122 | img = _brightness(img, min=c.brightness[0], max=c.brightness[1]) 123 | # random horizontal flip 124 | if c.use_horizontal_flips and np.random.randint(0, 2) == 0: 125 | img = cv2.flip(img, 1) 126 | if len(img_data_aug['bboxes']) > 0: 127 | img_data_aug['bboxes'][:, [0, 2]] = img_width - img_data_aug['bboxes'][:, [2, 0]] 128 | if len(img_data_aug['ignoreareas']) > 0: 129 | img_data_aug['ignoreareas'][:, [0, 2]] = img_width - img_data_aug['ignoreareas'][:, [2, 0]] 130 | 131 | gts = np.copy(img_data_aug['bboxes']) 132 | igs = np.copy(img_data_aug['ignoreareas']) 133 | 134 | img, gts, igs = resize_image(img, gts, igs, scale=[0.4,1.5]) 135 | if img.shape[0]>=c.size_train[0]: 136 | img, gts, igs = random_crop(img, gts, igs, c.size_train,limit=16) 137 | else: 138 | img, gts, igs = random_pave(img, gts, igs, c.size_train,limit=16) 139 | 140 | img_data_aug['bboxes'] = gts 141 | img_data_aug['ignoreareas'] = igs 142 | 143 | img_data_aug['width'] = c.size_train[1] 144 | img_data_aug['height'] = c.size_train[0] 145 | 146 | return img_data_aug, img 147 | 148 | def augment_wider(img_data, c): 149 | assert 'filepath' in img_data 150 | assert 'bboxes' in img_data 151 | img_data_aug = copy.deepcopy(img_data) 152 | img = cv2.imread(img_data_aug['filepath']) 153 | img_height, img_width = img.shape[:2] 154 | 155 | # random brightness 156 | if c.brightness and np.random.randint(0, 2) == 0: 157 | img = _brightness(img, min=c.brightness[0], max=c.brightness[1]) 158 | # random horizontal flip 159 | if c.use_horizontal_flips and np.random.randint(0, 2) == 0: 160 | img = cv2.flip(img, 1) 161 | if len(img_data_aug['bboxes']) > 0: 162 | img_data_aug['bboxes'][:, [0, 2]] = img_width - img_data_aug['bboxes'][:, [2, 0]] 163 | gts = np.copy(img_data_aug['bboxes']) 164 | 165 | scales = np.asarray([16, 32, 64, 128, 256]) 166 | crop_p = c.size_train[0] 167 | if len(gts) > 0: 168 | sel_id = np.random.randint(0, len(gts)) 169 | s_face = np.sqrt((gts[sel_id, 2] - gts[sel_id, 0]) * (gts[sel_id, 3] - gts[sel_id, 1])) 170 | index = np.random.randint(0, np.argmin(np.abs(scales - s_face)) + 1) 171 | s_tar = np.random.uniform(np.power(2, 4 + index)*1.5, np.power(2, 4 + index) * 2) 172 | ratio = s_tar / s_face 173 | new_height, new_width = int(ratio * img_height), int(ratio * img_width) 174 | img = cv2.resize(img, (new_width, new_height)) 175 | gts = np.asarray(gts, dtype=float) * ratio 176 | 177 | crop_x1 = np.random.randint(0, int(gts[sel_id, 0])+1) 178 | crop_x1 = np.minimum(crop_x1, 
np.maximum(0, new_width - crop_p)) 179 | crop_y1 = np.random.randint(0, int(gts[sel_id, 1])+1) 180 | crop_y1 = np.minimum(crop_y1, np.maximum(0, new_height - crop_p)) 181 | img = img[crop_y1:crop_y1 + crop_p, crop_x1:crop_x1 + crop_p] 182 | # crop detections 183 | if len(gts) > 0: 184 | ori_gts = np.copy(gts) 185 | gts[:, 0:4:2] -= crop_x1 186 | gts[:, 1:4:2] -= crop_y1 187 | gts[:, 0:4:2] = np.clip(gts[:, 0:4:2], 0, crop_p) 188 | gts[:, 1:4:2] = np.clip(gts[:, 1:4:2], 0, crop_p) 189 | 190 | before_area = (ori_gts[:, 2] - ori_gts[:, 0]) * (ori_gts[:, 3] - ori_gts[:, 1]) 191 | after_area = (gts[:, 2] - gts[:, 0]) * (gts[:, 3] - gts[:, 1]) 192 | 193 | keep_inds = (after_area >= 0.5 * before_area) 194 | keep_inds_ig = (after_area < 0.5 * before_area) 195 | igs = gts[keep_inds_ig] 196 | gts = gts[keep_inds] 197 | 198 | if len(igs) > 0: 199 | w, h = igs[:, 2] - igs[:, 0], igs[:, 3] - igs[:, 1] 200 | igs = igs[np.logical_and(w >= 12, h >= 12), :] 201 | if len(gts) > 0: 202 | w, h = gts[:, 2] - gts[:, 0], gts[:, 3] - gts[:, 1] 203 | gts = gts[np.logical_and(w >= 12, h >= 12), :] 204 | else: 205 | img = img[0:crop_p, 0:crop_p] 206 | 207 | if np.minimum(img.shape[0], img.shape[1]) < c.size_train[0]: 208 | img, gts, igs = random_pave(img, gts, igs, c.size_train) 209 | 210 | img_data_aug['bboxes'] = gts 211 | img_data_aug['ignoreareas'] = igs 212 | return img_data_aug, img 213 | -------------------------------------------------------------------------------- /keras_csp/keras_layer_L2Normalization.py: -------------------------------------------------------------------------------- 1 | ''' 2 | A custom Keras layer to perform L2-normalization. 3 | 4 | Copyright (C) 2017 Pierluigi Ferrari 5 | 6 | This program is free software: you can redistribute it and/or modify 7 | it under the terms of the GNU General Public License as published by 8 | the Free Software Foundation, either version 3 of the License, or 9 | (at your option) any later version. 10 | 11 | This program is distributed in the hope that it will be useful, 12 | but WITHOUT ANY WARRANTY; without even the implied warranty of 13 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 14 | GNU General Public License for more details. 15 | 16 | You should have received a copy of the GNU General Public License 17 | along with this program. If not, see . 18 | ''' 19 | 20 | import keras.backend as K 21 | from keras.engine.topology import InputSpec 22 | from keras.engine.topology import Layer 23 | import numpy as np 24 | 25 | 26 | class L2Normalization(Layer): 27 | ''' 28 | Performs L2 normalization on the input tensor with a learnable scaling parameter 29 | as described in the paper "Parsenet: Looking Wider to See Better" (see references) 30 | and as used in the original SSD model. 31 | 32 | Arguments: 33 | gamma_init (int): The initial scaling parameter. Defaults to 20 following the 34 | SSD paper. 35 | 36 | Input shape: 37 | 4D tensor of shape `(batch, channels, height, width)` if `dim_ordering = 'th'` 38 | or `(batch, height, width, channels)` if `dim_ordering = 'tf'`. 39 | 40 | Returns: 41 | The scaled tensor. Same shape as the input tensor. 
42 | 43 | References: 44 | http://cs.unc.edu/~wliu/papers/parsenet.pdf 45 | ''' 46 | 47 | def __init__(self, gamma_init=20, **kwargs): 48 | if K.image_dim_ordering() == 'tf': 49 | self.axis = 3 50 | else: 51 | self.axis = 1 52 | self.gamma_init = gamma_init 53 | super(L2Normalization, self).__init__(**kwargs) 54 | 55 | def build(self, input_shape): 56 | self.input_spec = [InputSpec(shape=input_shape)] 57 | gamma = self.gamma_init * np.ones((input_shape[self.axis],)) 58 | self.gamma = K.variable(gamma, name='{}_gamma'.format(self.name)) 59 | self.trainable_weights = [self.gamma] 60 | super(L2Normalization, self).build(input_shape) 61 | 62 | def call(self, x, mask=None): 63 | output = K.l2_normalize(x, self.axis) 64 | output *= self.gamma 65 | return output 66 | -------------------------------------------------------------------------------- /keras_csp/losses.py: -------------------------------------------------------------------------------- 1 | from keras import backend as K 2 | from keras.objectives import categorical_crossentropy 3 | 4 | if K.image_dim_ordering() == 'tf': 5 | import tensorflow as tf 6 | 7 | epsilon = 1e-4 8 | 9 | 10 | def cls_center(y_true, y_pred): 11 | classification_loss = K.binary_crossentropy(y_pred[:, :, :, 0], y_true[:, :, :, 2]) 12 | # firstly we compute the focal weight 13 | positives = y_true[:, :, :, 2] 14 | negatives = y_true[:, :, :, 1] - y_true[:, :, :, 2] 15 | foreground_weight = positives * (1.0 - y_pred[:, :, :, 0]) ** 2.0 16 | # foreground_weight = positives 17 | background_weight = negatives * ((1.0 - y_true[:, :, :, 0]) ** 4.0) * (y_pred[:, :, :, 0] ** 2.0) 18 | # background_weight = negatives * ((1.0 - y_true[:, :, :, 0])**4.0)*(0.01 ** 2.0) 19 | 20 | # foreground_weight = y_true[:, :, :, 0] * (1- y_pred[:, :, :, 0]) ** 2.0 21 | # background_weight = negatives * y_pred[:, :, :, 0] ** 2.0 22 | 23 | focal_weight = foreground_weight + background_weight 24 | 25 | assigned_boxes = tf.reduce_sum(y_true[:, :, :, 2]) 26 | class_loss = 0.01 * tf.reduce_sum(focal_weight * classification_loss) / tf.maximum(1.0, assigned_boxes) 27 | 28 | # assigned_boxes = tf.reduce_sum(tf.reduce_sum(y_true[:, :, :, 1], axis=-1), axis=-1) 29 | # class_loss = tf.reduce_sum(tf.reduce_sum(classification_loss, axis=-1), axis=-1) / tf.maximum(1.0, assigned_boxes) 30 | 31 | return class_loss 32 | 33 | 34 | def regr_h(y_true, y_pred): 35 | absolute_loss = tf.abs(y_true[:, :, :, 0] - y_pred[:, :, :, 0]) / (y_true[:, :, :, 0] + 1e-10) 36 | square_loss = 0.5 * ((y_true[:, :, :, 0] - y_pred[:, :, :, 0]) / (y_true[:, :, :, 0] + 1e-10)) ** 2 37 | 38 | l1_loss = y_true[:, :, :, 1] * tf.where(tf.less(absolute_loss, 1.0), square_loss, absolute_loss - 0.5) 39 | 40 | assigned_boxes = tf.reduce_sum(y_true[:, :, :, 1]) 41 | class_loss = tf.reduce_sum(l1_loss) / tf.maximum(1.0, assigned_boxes) 42 | 43 | return class_loss 44 | 45 | 46 | def regr_hw(y_true, y_pred): 47 | absolute_loss = tf.abs(y_true[:, :, :, :2] - y_pred[:, :, :, :]) / (y_true[:, :, :, :2] + 1e-10) 48 | square_loss = 0.5 * ((y_true[:, :, :, :2] - y_pred[:, :, :, :]) / (y_true[:, :, :, :2] + 1e-10)) ** 2 49 | loss = y_true[:, :, :, 2] * tf.reduce_sum(tf.where(tf.less(absolute_loss, 1.0), square_loss, absolute_loss - 0.5), 50 | axis=-1) 51 | assigned_boxes = tf.reduce_sum(y_true[:, :, :, 2]) 52 | class_loss = tf.reduce_sum(loss) / tf.maximum(1.0, assigned_boxes) 53 | 54 | return class_loss 55 | 56 | 57 | def regr_offset(y_true, y_pred): 58 | absolute_loss = tf.abs(y_true[:, :, :, :2] - y_pred[:, :, :, :]) 59 | square_loss = 0.5 * 
(y_true[:, :, :, :2] - y_pred[:, :, :, :]) ** 2 60 | l1_loss = y_true[:, :, :, 2] * tf.reduce_sum( 61 | tf.where(tf.less(absolute_loss, 1.0), square_loss, absolute_loss - 0.5), axis=-1) 62 | 63 | assigned_boxes = tf.reduce_sum(y_true[:, :, :, 2]) 64 | class_loss = 0.1 * tf.reduce_sum(l1_loss) / tf.maximum(1.0, assigned_boxes) 65 | 66 | return class_loss 67 | -------------------------------------------------------------------------------- /keras_csp/nms/.gitignore: -------------------------------------------------------------------------------- 1 | *.c 2 | *.cpp 3 | *.so 4 | -------------------------------------------------------------------------------- /keras_csp/nms/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dominikandreas/CSP/e4a5a3770609896d3ab4b164c434ce80746225e6/keras_csp/nms/__init__.py -------------------------------------------------------------------------------- /keras_csp/nms/cpu_nms.pyx: -------------------------------------------------------------------------------- 1 | # -------------------------------------------------------- 2 | # Fast R-CNN 3 | # Copyright (c) 2015 Microsoft 4 | # Licensed under The MIT License [see LICENSE for details] 5 | # Written by Ross Girshick 6 | # -------------------------------------------------------- 7 | 8 | import numpy as np 9 | cimport numpy as np 10 | 11 | cdef inline np.float32_t max(np.float32_t a, np.float32_t b): 12 | return a if a >= b else b 13 | 14 | cdef inline np.float32_t min(np.float32_t a, np.float32_t b): 15 | return a if a <= b else b 16 | 17 | def cpu_nms(np.ndarray[np.float32_t, ndim=2] dets, np.float thresh): 18 | cdef np.ndarray[np.float32_t, ndim=1] x1 = dets[:, 0] 19 | cdef np.ndarray[np.float32_t, ndim=1] y1 = dets[:, 1] 20 | cdef np.ndarray[np.float32_t, ndim=1] x2 = dets[:, 2] 21 | cdef np.ndarray[np.float32_t, ndim=1] y2 = dets[:, 3] 22 | cdef np.ndarray[np.float32_t, ndim=1] scores = dets[:, 4] 23 | 24 | cdef np.ndarray[np.float32_t, ndim=1] areas = (x2 - x1 + 1) * (y2 - y1 + 1) 25 | cdef np.ndarray[np.int_t, ndim=1] order = scores.argsort()[::-1] 26 | 27 | cdef int ndets = dets.shape[0] 28 | cdef np.ndarray[np.int_t, ndim=1] suppressed = \ 29 | np.zeros((ndets), dtype=np.int) 30 | 31 | # nominal indices 32 | cdef int _i, _j 33 | # sorted indices 34 | cdef int i, j 35 | # temp variables for box i's (the box currently under consideration) 36 | cdef np.float32_t ix1, iy1, ix2, iy2, iarea 37 | # variables for computing overlap with box j (lower scoring box) 38 | cdef np.float32_t xx1, yy1, xx2, yy2 39 | cdef np.float32_t w, h 40 | cdef np.float32_t inter, ovr 41 | 42 | keep = [] 43 | for _i in range(ndets): 44 | i = order[_i] 45 | if suppressed[i] == 1: 46 | continue 47 | keep.append(i) 48 | ix1 = x1[i] 49 | iy1 = y1[i] 50 | ix2 = x2[i] 51 | iy2 = y2[i] 52 | iarea = areas[i] 53 | for _j in range(_i + 1, ndets): 54 | j = order[_j] 55 | if suppressed[j] == 1: 56 | continue 57 | xx1 = max(ix1, x1[j]) 58 | yy1 = max(iy1, y1[j]) 59 | xx2 = min(ix2, x2[j]) 60 | yy2 = min(iy2, y2[j]) 61 | w = max(0.0, xx2 - xx1 + 1) 62 | h = max(0.0, yy2 - yy1 + 1) 63 | inter = w * h 64 | ovr = inter / (iarea + areas[j] - inter) 65 | if ovr >= thresh: 66 | suppressed[j] = 1 67 | 68 | return keep 69 | -------------------------------------------------------------------------------- /keras_csp/nms/gpu_nms.hpp: -------------------------------------------------------------------------------- 1 | void _nms(int* keep_out, int* num_out, const float* boxes_host, int 
boxes_num, 2 | int boxes_dim, float nms_overlap_thresh, int device_id); 3 | -------------------------------------------------------------------------------- /keras_csp/nms/gpu_nms.pyx: -------------------------------------------------------------------------------- 1 | # -------------------------------------------------------- 2 | # Faster R-CNN 3 | # Copyright (c) 2015 Microsoft 4 | # Licensed under The MIT License [see LICENSE for details] 5 | # Written by Ross Girshick 6 | # -------------------------------------------------------- 7 | 8 | import numpy as np 9 | cimport numpy as np 10 | 11 | assert sizeof(int) == sizeof(np.int32_t) 12 | 13 | cdef extern from "gpu_nms.hpp": 14 | void _nms(np.int32_t*, int*, np.float32_t*, int, int, float, int) 15 | 16 | def gpu_nms(np.ndarray[np.float32_t, ndim=2] dets, np.float thresh, 17 | np.int32_t device_id=0): 18 | cdef int boxes_num = dets.shape[0] 19 | cdef int boxes_dim = dets.shape[1] 20 | cdef int num_out 21 | cdef np.ndarray[np.int32_t, ndim=1] \ 22 | keep = np.zeros(boxes_num, dtype=np.int32) 23 | cdef np.ndarray[np.float32_t, ndim=1] \ 24 | scores = dets[:, 4] 25 | cdef np.ndarray[np.int_t, ndim=1] \ 26 | order = scores.argsort()[::-1] 27 | cdef np.ndarray[np.float32_t, ndim=2] \ 28 | sorted_dets = dets[order, :] 29 | _nms(&keep[0], &num_out, &sorted_dets[0, 0], boxes_num, boxes_dim, thresh, device_id) 30 | keep = keep[:num_out] 31 | return list(order[keep]) 32 | -------------------------------------------------------------------------------- /keras_csp/nms/nms_kernel.cu: -------------------------------------------------------------------------------- 1 | // ------------------------------------------------------------------ 2 | // Faster R-CNN 3 | // Copyright (c) 2015 Microsoft 4 | // Licensed under The MIT License [see fast-rcnn/LICENSE for details] 5 | // Written by Shaoqing Ren 6 | // ------------------------------------------------------------------ 7 | 8 | #include "gpu_nms.hpp" 9 | #include <vector> 10 | #include <iostream> 11 | 12 | #define CUDA_CHECK(condition) \ 13 | /* Code block avoids redefinition of cudaError_t error */ \ 14 | do { \ 15 | cudaError_t error = condition; \ 16 | if (error != cudaSuccess) { \ 17 | std::cout << cudaGetErrorString(error) << std::endl; \ 18 | } \ 19 | } while (0) 20 | 21 | #define DIVUP(m,n) ((m) / (n) + ((m) % (n) > 0)) 22 | int const threadsPerBlock = sizeof(unsigned long long) * 8; 23 | 24 | __device__ inline float devIoU(float const * const a, float const * const b) { 25 | float left = max(a[0], b[0]), right = min(a[2], b[2]); 26 | float top = max(a[1], b[1]), bottom = min(a[3], b[3]); 27 | float width = max(right - left + 1, 0.f), height = max(bottom - top + 1, 0.f); 28 | float interS = width * height; 29 | float Sa = (a[2] - a[0] + 1) * (a[3] - a[1] + 1); 30 | float Sb = (b[2] - b[0] + 1) * (b[3] - b[1] + 1); 31 | return interS / (Sa + Sb - interS); 32 | } 33 | 34 | __global__ void nms_kernel(const int n_boxes, const float nms_overlap_thresh, 35 | const float *dev_boxes, unsigned long long *dev_mask) { 36 | const int row_start = blockIdx.y; 37 | const int col_start = blockIdx.x; 38 | 39 | // if (row_start > col_start) return; 40 | 41 | const int row_size = 42 | min(n_boxes - row_start * threadsPerBlock, threadsPerBlock); 43 | const int col_size = 44 | min(n_boxes - col_start * threadsPerBlock, threadsPerBlock); 45 | 46 | __shared__ float block_boxes[threadsPerBlock * 5]; 47 | if (threadIdx.x < col_size) { 48 | block_boxes[threadIdx.x * 5 + 0] = 49 | dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 0]; 50 
| block_boxes[threadIdx.x * 5 + 1] = 51 | dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 1]; 52 | block_boxes[threadIdx.x * 5 + 2] = 53 | dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 2]; 54 | block_boxes[threadIdx.x * 5 + 3] = 55 | dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 3]; 56 | block_boxes[threadIdx.x * 5 + 4] = 57 | dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 4]; 58 | } 59 | __syncthreads(); 60 | 61 | if (threadIdx.x < row_size) { 62 | const int cur_box_idx = threadsPerBlock * row_start + threadIdx.x; 63 | const float *cur_box = dev_boxes + cur_box_idx * 5; 64 | int i = 0; 65 | unsigned long long t = 0; 66 | int start = 0; 67 | if (row_start == col_start) { 68 | start = threadIdx.x + 1; 69 | } 70 | for (i = start; i < col_size; i++) { 71 | if (devIoU(cur_box, block_boxes + i * 5) > nms_overlap_thresh) { 72 | t |= 1ULL << i; 73 | } 74 | } 75 | const int col_blocks = DIVUP(n_boxes, threadsPerBlock); 76 | dev_mask[cur_box_idx * col_blocks + col_start] = t; 77 | } 78 | } 79 | 80 | void _set_device(int device_id) { 81 | int current_device; 82 | CUDA_CHECK(cudaGetDevice(&current_device)); 83 | if (current_device == device_id) { 84 | return; 85 | } 86 | // The call to cudaSetDevice must come before any calls to Get, which 87 | // may perform initialization using the GPU. 88 | CUDA_CHECK(cudaSetDevice(device_id)); 89 | } 90 | 91 | void _nms(int* keep_out, int* num_out, const float* boxes_host, int boxes_num, 92 | int boxes_dim, float nms_overlap_thresh, int device_id) { 93 | _set_device(device_id); 94 | 95 | float* boxes_dev = NULL; 96 | unsigned long long* mask_dev = NULL; 97 | 98 | const int col_blocks = DIVUP(boxes_num, threadsPerBlock); 99 | 100 | CUDA_CHECK(cudaMalloc(&boxes_dev, 101 | boxes_num * boxes_dim * sizeof(float))); 102 | CUDA_CHECK(cudaMemcpy(boxes_dev, 103 | boxes_host, 104 | boxes_num * boxes_dim * sizeof(float), 105 | cudaMemcpyHostToDevice)); 106 | 107 | CUDA_CHECK(cudaMalloc(&mask_dev, 108 | boxes_num * col_blocks * sizeof(unsigned long long))); 109 | 110 | dim3 blocks(DIVUP(boxes_num, threadsPerBlock), 111 | DIVUP(boxes_num, threadsPerBlock)); 112 | dim3 threads(threadsPerBlock); 113 | nms_kernel<<<blocks, threads>>>(boxes_num, 114 | nms_overlap_thresh, 115 | boxes_dev, 116 | mask_dev); 117 | 118 | std::vector<unsigned long long> mask_host(boxes_num * col_blocks); 119 | CUDA_CHECK(cudaMemcpy(&mask_host[0], 120 | mask_dev, 121 | sizeof(unsigned long long) * boxes_num * col_blocks, 122 | cudaMemcpyDeviceToHost)); 123 | 124 | std::vector<unsigned long long> remv(col_blocks); 125 | memset(&remv[0], 0, sizeof(unsigned long long) * col_blocks); 126 | 127 | int num_to_keep = 0; 128 | for (int i = 0; i < boxes_num; i++) { 129 | int nblock = i / threadsPerBlock; 130 | int inblock = i % threadsPerBlock; 131 | 132 | if (!(remv[nblock] & (1ULL << inblock))) { 133 | keep_out[num_to_keep++] = i; 134 | unsigned long long *p = &mask_host[0] + i * col_blocks; 135 | for (int j = nblock; j < col_blocks; j++) { 136 | remv[j] |= p[j]; 137 | } 138 | } 139 | } 140 | *num_out = num_to_keep; 141 | 142 | CUDA_CHECK(cudaFree(boxes_dev)); 143 | CUDA_CHECK(cudaFree(mask_dev)); 144 | } 145 | -------------------------------------------------------------------------------- /keras_csp/nms/py_cpu_nms.py: -------------------------------------------------------------------------------- 1 | # -------------------------------------------------------- 2 | # Fast R-CNN 3 | # Copyright (c) 2015 Microsoft 4 | # Licensed under The MIT License [see LICENSE for details] 5 | # Written by Ross Girshick 6 | # 
-------------------------------------------------------- 7 | 8 | import numpy as np 9 | 10 | def py_cpu_nms(dets, thresh): 11 | """Pure Python NMS baseline.""" 12 | x1 = dets[:, 0] 13 | y1 = dets[:, 1] 14 | x2 = dets[:, 2] 15 | y2 = dets[:, 3] 16 | scores = dets[:, 4] 17 | 18 | areas = (x2 - x1 + 1) * (y2 - y1 + 1) 19 | order = scores.argsort()[::-1] 20 | 21 | keep = [] 22 | while order.size > 0: 23 | i = order[0] 24 | keep.append(i) 25 | xx1 = np.maximum(x1[i], x1[order[1:]]) 26 | yy1 = np.maximum(y1[i], y1[order[1:]]) 27 | xx2 = np.minimum(x2[i], x2[order[1:]]) 28 | yy2 = np.minimum(y2[i], y2[order[1:]]) 29 | 30 | w = np.maximum(0.0, xx2 - xx1 + 1) 31 | h = np.maximum(0.0, yy2 - yy1 + 1) 32 | inter = w * h 33 | ovr = inter / (areas[i] + areas[order[1:]] - inter) 34 | 35 | inds = np.where(ovr <= thresh)[0] 36 | order = order[inds + 1] 37 | 38 | return keep 39 | -------------------------------------------------------------------------------- /keras_csp/nms_wrapper.py: -------------------------------------------------------------------------------- 1 | # -------------------------------------------------------- 2 | # Fast R-CNN 3 | # Copyright (c) 2015 Microsoft 4 | # Licensed under The MIT License [see LICENSE for details] 5 | # Written by Ross Girshick 6 | # -------------------------------------------------------- 7 | import pyximport; pyximport.install() 8 | from keras_csp.nms.gpu_nms import gpu_nms 9 | from keras_csp.nms.cpu_nms import cpu_nms 10 | import numpy as np 11 | 12 | def soft_nms(dets, sigma=0.5, Nt=0.3, threshold=0.001, method=1): 13 | # NOTE: cpu_soft_nms is not defined in keras_csp/nms/cpu_nms.pyx above, so calling soft_nms raises a NameError; a soft-NMS implementation (the original py-faster-rcnn ships one) must be added before this function is usable. 14 | keep = cpu_soft_nms(np.ascontiguousarray(dets, dtype=np.float32), 15 | np.float32(sigma), np.float32(Nt), 16 | np.float32(threshold), 17 | np.uint8(method)) 18 | return keep 19 | 20 | def nms(dets, thresh, usegpu, gpu_id): 21 | """Dispatch to either CPU or GPU NMS implementations.""" 22 | 23 | if dets.shape[0] == 0: 24 | return [] 25 | if usegpu: 26 | return gpu_nms(dets, thresh, device_id=gpu_id) 27 | else: 28 | return cpu_nms(dets, thresh) 29 | -------------------------------------------------------------------------------- /keras_csp/parallel_model.py: -------------------------------------------------------------------------------- 1 | """ 2 | Mask R-CNN 3 | Multi-GPU Support for Keras. 4 | 5 | Copyright (c) 2017 Matterport, Inc. 6 | Licensed under the MIT License (see LICENSE for details) 7 | Written by Waleed Abdulla 8 | 9 | Ideas and small code snippets from these sources: 10 | https://github.com/fchollet/keras/issues/2436 11 | https://medium.com/@kuza55/transparent-multi-gpu-training-on-tensorflow-with-keras-8b0016fd9012 12 | https://github.com/avolkov1/keras_experiments/blob/master/keras_exp/multigpu/ 13 | https://github.com/fchollet/keras/blob/master/keras/utils/training_utils.py 14 | """ 15 | 16 | import tensorflow as tf 17 | import keras.backend as K 18 | import keras.layers as KL 19 | import keras.models as KM 20 | 21 | 22 | class ParallelModel(KM.Model): 23 | """Subclasses the standard Keras Model and adds multi-GPU support. 24 | It works by creating a copy of the model on each GPU. Then it slices 25 | the inputs and sends a slice to each copy of the model, and then 26 | merges the outputs together and applies the loss on the combined 27 | outputs. 28 | """ 29 | 30 | def __init__(self, keras_model, gpu_count): 31 | """Class constructor. 32 | keras_model: The Keras model to parallelize 33 | gpu_count: Number of GPUs.
Must be > 1 34 | """ 35 | self.inner_model = keras_model 36 | self.gpu_count = gpu_count 37 | merged_outputs = self.make_parallel() 38 | super(ParallelModel, self).__init__(inputs=self.inner_model.inputs, 39 | outputs=merged_outputs) 40 | 41 | def __getattribute__(self, attrname): 42 | """Redirect loading and saving methods to the inner model. That's where 43 | the weights are stored.""" 44 | if 'load' in attrname or 'save' in attrname: 45 | return getattr(self.inner_model, attrname) 46 | return super(ParallelModel, self).__getattribute__(attrname) 47 | 48 | def summary(self, *args, **kwargs): 49 | """Override summary() to display summaries of both, the wrapper 50 | and inner models.""" 51 | super(ParallelModel, self).summary(*args, **kwargs) 52 | self.inner_model.summary(*args, **kwargs) 53 | 54 | def make_parallel(self): 55 | """Creates a new wrapper model that consists of multiple replicas of 56 | the original model placed on different GPUs. 57 | """ 58 | # Slice inputs. Slice inputs on the CPU to avoid sending a copy 59 | # of the full inputs to all GPUs. Saves on bandwidth and memory. 60 | input_slices = {name: tf.split(x, self.gpu_count) 61 | for name, x in zip(self.inner_model.input_names, 62 | self.inner_model.inputs)} 63 | 64 | output_names = self.inner_model.output_names 65 | outputs_all = [] 66 | for i in range(len(self.inner_model.outputs)): 67 | outputs_all.append([]) 68 | 69 | # Run the model call() on each GPU to place the ops there 70 | for i in range(self.gpu_count): 71 | with tf.device('/gpu:%d' % i): 72 | with tf.name_scope('tower_%d' % i): 73 | # Run a slice of inputs through this replica 74 | zipped_inputs = list(zip(self.inner_model.input_names, 75 | self.inner_model.inputs)) 76 | inputs = [ 77 | KL.Lambda(lambda s: input_slices[name][i], 78 | output_shape=lambda s: (None,)+s[1:])(tensor) 79 | for name, tensor in zipped_inputs] 80 | # Create the model replica and get the outputs 81 | outputs = self.inner_model(inputs) 82 | if not isinstance(outputs, list): 83 | outputs = [outputs] 84 | # Save the outputs for merging back together later 85 | for l, o in enumerate(outputs): 86 | outputs_all[l].append(o) 87 | 88 | # Merge outputs on CPU 89 | with tf.device('/cpu:0'): 90 | merged = [] 91 | for outputs, name in zip(outputs_all, output_names): 92 | # If outputs are numbers without dimensions, add a batch dim. 93 | def add_dim(tensor): 94 | """Add a dimension to tensors that don't have any.""" 95 | if K.int_shape(tensor) == (): 96 | return KL.Lambda(lambda t: K.reshape(t, [1, 1]))(tensor) 97 | return tensor 98 | outputs = list(map(add_dim, outputs)) 99 | 100 | # Concatenate 101 | merged.append(KL.Concatenate(axis=0, name=name)(outputs)) 102 | return merged 103 | 104 | 105 | if __name__ == "__main__": 106 | # Testing code below. It creates a simple model to train on MNIST and 107 | # tries to run it on 2 GPUs. It saves the graph so it can be viewed 108 | # in TensorBoard. Run it as: 109 | # 110 | # python3 parallel_model.py 111 | 112 | import os 113 | import numpy as np 114 | import keras.optimizers 115 | from keras.datasets import mnist 116 | from keras.preprocessing.image import ImageDataGenerator 117 | 118 | GPU_COUNT = 2 119 | 120 | # Root directory of the project 121 | ROOT_DIR = os.getcwd() 122 | 123 | # Directory to save logs and trained model 124 | MODEL_DIR = os.path.join(ROOT_DIR, "logs/parallel") 125 | 126 | def build_model(x_train, num_classes): 127 | # Reset default graph. 
Keras leaves old ops in the graph, 128 | # which are ignored for execution but clutter graph 129 | # visualization in TensorBoard. 130 | tf.reset_default_graph() 131 | 132 | inputs = KL.Input(shape=x_train.shape[1:], name="input_image") 133 | x = KL.Conv2D(32, (3, 3), activation='relu', padding="same", 134 | name="conv1")(inputs) 135 | x = KL.Conv2D(64, (3, 3), activation='relu', padding="same", 136 | name="conv2")(x) 137 | x = KL.MaxPooling2D(pool_size=(2, 2), name="pool1")(x) 138 | x = KL.Flatten(name="flat1")(x) 139 | x = KL.Dense(128, activation='relu', name="dense1")(x) 140 | x = KL.Dense(num_classes, activation='softmax', name="dense2")(x) 141 | 142 | return KM.Model(inputs, x, "digit_classifier_model") 143 | 144 | # Load MNIST Data 145 | (x_train, y_train), (x_test, y_test) = mnist.load_data() 146 | x_train = np.expand_dims(x_train, -1).astype('float32') / 255 147 | x_test = np.expand_dims(x_test, -1).astype('float32') / 255 148 | 149 | print(('x_train shape:', x_train.shape)) 150 | print(('x_test shape:', x_test.shape)) 151 | 152 | # Build data generator and model 153 | datagen = ImageDataGenerator() 154 | model = build_model(x_train, 10) 155 | 156 | # Add multi-GPU support. 157 | model = ParallelModel(model, GPU_COUNT) 158 | 159 | optimizer = keras.optimizers.SGD(lr=0.01, momentum=0.9, clipnorm=5.0) 160 | 161 | model.compile(loss='sparse_categorical_crossentropy', 162 | optimizer=optimizer, metrics=['accuracy']) 163 | 164 | model.summary() 165 | 166 | # Train 167 | model.fit_generator( 168 | datagen.flow(x_train, y_train, batch_size=64), 169 | steps_per_epoch=50, epochs=10, verbose=1, 170 | validation_data=(x_test, y_test), 171 | callbacks=[keras.callbacks.TensorBoard(log_dir=MODEL_DIR, 172 | write_graph=True)] 173 | ) 174 | -------------------------------------------------------------------------------- /keras_csp/utilsfunc.py: -------------------------------------------------------------------------------- 1 | import cv2 2 | import numpy as np 3 | 4 | 5 | def format_img_size(img, C): 6 | """ formats the image size based on config """ 7 | img_min_side = float(C.im_size) 8 | (height, width, _) = img.shape 9 | 10 | if width <= height: 11 | ratio = img_min_side / width 12 | new_height = int(ratio * height) 13 | new_width = int(img_min_side) 14 | else: 15 | ratio = img_min_side / height 16 | new_width = int(ratio * width) 17 | new_height = int(img_min_side) 18 | img = cv2.resize(img, (new_width, new_height), interpolation=cv2.INTER_CUBIC) 19 | return img, ratio 20 | 21 | 22 | def format_img_channels(img, C): 23 | """ formats the image channels based on config """ 24 | # img = img[:, :, (2, 1, 0)] 25 | img = img.astype(np.float32) 26 | img[:, :, 0] -= C.img_channel_mean[0] 27 | img[:, :, 1] -= C.img_channel_mean[1] 28 | img[:, :, 2] -= C.img_channel_mean[2] 29 | # img /= C.img_scaling_factor 30 | # img = np.transpose(img, (2, 0, 1)) 31 | img = np.expand_dims(img, axis=0) 32 | return img 33 | 34 | 35 | def format_img_batch(img, C): 36 | """ formats the image channels based on config """ 37 | # img = img[:, :, (2, 1, 0)] 38 | img = img.astype(np.float32) 39 | img[:, :, :, 0] -= C.img_channel_mean[0] 40 | img[:, :, :, 1] -= C.img_channel_mean[1] 41 | img[:, :, :, 2] -= C.img_channel_mean[2] 42 | # img /= C.img_scaling_factor 43 | # img = np.transpose(img, (2, 0, 1)) 44 | # img = np.expand_dims(img, axis=0) 45 | return img 46 | 47 | 48 | def format_img(img, C): 49 | """ formats an image for model prediction based on config """ 50 | # img, ratio = format_img_size(img, C) 51 | img = 
format_img_channels(img, C) 52 | return img # return img, ratio 53 | 54 | 55 | def format_img_inria(img, C): 56 | img_h, img_w = img.shape[:2] 57 | # img_h_new, img_w_new = int(round(img_h/16)*16), int(round(img_w/16)*16) 58 | # img = cv2.resize(img, (img_w_new, img_h_new)) 59 | img_h_new, img_w_new = int(np.ceil(img_h / 16) * 16), int(np.ceil(img_w / 16) * 16) 60 | paved_image = np.zeros((img_h_new, img_w_new, 3), dtype=img.dtype) 61 | paved_image[0:img_h, 0:img_w] = img 62 | img = format_img_channels(paved_image, C) 63 | return img 64 | 65 | 66 | def format_img_ratio(img, C, ratio): 67 | img = img.astype(np.float32) 68 | img[:, :, 0] -= C.img_channel_mean[0] 69 | img[:, :, 1] -= C.img_channel_mean[1] 70 | img[:, :, 2] -= C.img_channel_mean[2] 71 | img = cv2.resize(img, None, None, fx=ratio, fy=ratio) 72 | # img = cv2.resize(img, None, None, fx=ratio, fy=ratio, interpolation=cv2.INTER_CUBIC) 73 | img = np.expand_dims(img, axis=0) 74 | return img # return img, ratio 75 | 76 | 77 | def preprocess_input_test(x): 78 | x = x.astype(np.float32) 79 | x /= 255. 80 | x -= 0.5 81 | x *= 2. 82 | x = np.expand_dims(x, axis=0) 83 | return x 84 | 85 | 86 | # Method to transform the coordinates of the bounding box to its original size 87 | def get_real_coordinates(ratio, x1, y1, x2, y2): 88 | real_x1 = int(round(x1 // ratio)) 89 | real_y1 = int(round(y1 // ratio)) 90 | real_x2 = int(round(x2 // ratio)) 91 | real_y2 = int(round(y2 // ratio)) 92 | 93 | return (real_x1, real_y1, real_x2, real_y2) 94 | 95 | 96 | def intersection(ai, bi, area): 97 | x = max(ai[0], bi[0]) 98 | y = max(ai[1], bi[1]) 99 | w = min(ai[2], bi[2]) - x 100 | h = min(ai[3], bi[3]) - y 101 | if w < 0 or h < 0: 102 | return 0 103 | return w * h / area 104 | 105 | 106 | def box_grid_overlap(bboxes, get_img_output_length): 107 | width, height = 960, 540 108 | (resized_width, resized_height) = 960, 540 109 | # for VGG16, calculate the output map size 110 | (output_width, output_height) = get_img_output_length(resized_width, resized_height) 111 | 112 | downscale = float(16) 113 | 114 | num_bboxes = len(bboxes) 115 | # initialise output objectiveness 116 | y_grid_overlap = np.zeros((output_height, output_width)) 117 | 118 | if num_bboxes > 0: 119 | # get the GT box coordinates, and resize to account for image resizing 120 | gta = np.zeros((num_bboxes, 4)) 121 | gta[:, 0], gta[:, 1], gta[:, 2], gta[:, 3] = bboxes[:, 0], bboxes[:, 1], bboxes[:, 2] + bboxes[:, 0], bboxes[:, 122 | 3] + bboxes[ 123 | :, 1] 124 | # for bbox_num in range(num_bboxes): 125 | # # get the GT box coordinates, and resize to account for image resizing 126 | # gta[bbox_num, 0] = bboxes[bbox_num][0] 127 | # gta[bbox_num, 1] = bboxes[bbox_num][1] 128 | # gta[bbox_num, 2] = bboxes[bbox_num][2] 129 | # gta[bbox_num, 3] = bboxes[bbox_num][3] 130 | for ix in range(output_width): 131 | x1_anc = downscale * ix 132 | x2_anc = downscale * (ix + 1) 133 | for jy in range(output_height): 134 | y1_anc = downscale * jy 135 | y2_anc = downscale * (jy + 1) 136 | best_op = 0 137 | for b in range(num_bboxes): 138 | grid = [x1_anc, y1_anc, x2_anc, y2_anc] 139 | op = intersection(grid, gta[b, :], downscale ** 2) 140 | best_op = op if op > best_op else best_op 141 | y_grid_overlap[jy, ix] = best_op 142 | 143 | y_grid_overlap = np.expand_dims(y_grid_overlap.reshape((1, -1)), axis=0) 144 | return y_grid_overlap 145 | 146 | 147 | def integrate_motion_score(bboxes, probs, pred, stride=16): 148 | if len(bboxes) == 0: 149 | return [] 150 | probs = probs.reshape((-1, 1)) 151 | pred_anchor_score = 
np.zeros((probs.shape[0], 1)) 152 | for i in range(len(bboxes)): 153 | x1, y1, x2, y2 = int(bboxes[i][0] / stride), int(bboxes[i][1] / stride), int(bboxes[i][2] / stride), int( 154 | bboxes[i][3] / stride) 155 | pred_anchor_score[i, 0] = np.sum(pred[y1:y2, x1:x2]) / ((x2 - x1) * (y2 - y1)) 156 | # alpha, belta = 2/0.7, 0.1 157 | # pred_anchor_score = np.where(pred_anchor_score>0.7, pred_anchor_score*alpha, pred_anchor_score) 158 | # pred_anchor_score = np.maximum(pred_anchor_score*alpha, np.ones_like(pred_anchor_score)*belta) 159 | # all_probs = pred_anchor_score 160 | all_probs = probs * pred_anchor_score 161 | return all_probs 162 | 163 | 164 | def box_encoder_pp(anchors, boxes, Y1): 165 | A = np.copy(anchors[:, :, :, :4]) 166 | A = A.reshape((-1, 4)) 167 | 168 | # 1 calculate the iou scores 169 | max_overlaps = np.zeros((anchors.shape[0] * anchors.shape[1] * anchors.shape[2],), dtype=np.float32) 170 | if len(boxes) > 0: 171 | boxes[:, 2] += boxes[:, 0] 172 | boxes[:, 3] += boxes[:, 1] 173 | overlaps = bbox_overlaps(np.ascontiguousarray(A, dtype=np.float64), 174 | np.ascontiguousarray(boxes, dtype=np.float64)) 175 | max_overlaps = overlaps.max(axis=1) 176 | # normalize the iou scores 177 | if np.max(max_overlaps) > 0: 178 | max_overlaps = (max_overlaps - np.min(max_overlaps)) / np.max(max_overlaps) 179 | # 2 calculate the rpn scores 180 | rpn_score = Y1.reshape((-1)).astype(np.float32) 181 | inds = np.where(max_overlaps == 0) 182 | rpn_score[inds] = np.min(rpn_score) 183 | scores = (rpn_score + max_overlaps) / 2 184 | scores = np.expand_dims(scores.reshape((1, -1)).astype(np.float32), axis=0) 185 | return scores 186 | 187 | 188 | def box_encoder_iou(anchors, boxes): 189 | A = np.copy(anchors[:, :, :, :4]) 190 | A = A.reshape((-1, 4)) 191 | 192 | max_overlaps = np.zeros((anchors.shape[0] * anchors.shape[1] * anchors.shape[2],), dtype=np.float32) 193 | if len(boxes) > 0: 194 | boxes[:, 2] += boxes[:, 0] 195 | boxes[:, 3] += boxes[:, 1] 196 | overlaps = bbox_overlaps(np.ascontiguousarray(A, dtype=np.float64), 197 | np.ascontiguousarray(boxes, dtype=np.float64)) 198 | max_overlaps = overlaps.max(axis=1) 199 | scores = np.expand_dims(max_overlaps.reshape((1, -1)).astype(np.float32), axis=0) 200 | return scores 201 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | tensorflow-gpu==1.14.0 2 | easydict==1.6 3 | joblib==0.10.3 4 | numpy==1.16.2 5 | opencv-python==4.1.0.25 6 | Pillow==6.1.0 7 | keras==2.0.6 8 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | import os 2 | from Cython.Distutils import build_ext 3 | from distutils.core import setup, Extension 4 | from Cython.Build import cythonize 5 | import numpy as np 6 | 7 | try: 8 | numpy_include = np.get_include() 9 | except AttributeError: 10 | numpy_include = np.get_numpy_include() 11 | 12 | 13 | def customize_compiler_for_nvcc(self): 14 | """inject deep into distutils to customize how the dispatch 15 | to gcc/nvcc works. 16 | If you subclass UnixCCompiler, it's not trivial to get your subclass 17 | injected in, and still have the right customizations (i.e. 18 | distutils.sysconfig.customize_compiler) run on it. So instead of going 19 | the OO route, I have this. 
Note, it's kind of like a weird functional 20 | subclassing going on.""" 21 | 22 | # tell the compiler it can process .cu 23 | self.src_extensions.append('.cu') 24 | 25 | # save references to the default compiler_so and _compile methods 26 | default_compiler_so = self.compiler_so 27 | super = self._compile 28 | 29 | # now redefine the _compile method. This gets executed for each 30 | # object but distutils doesn't have the ability to change compilers 31 | # based on source extension: we add it. 32 | def _compile(obj, src, ext, cc_args, extra_postargs, pp_opts): 33 | if os.path.splitext(src)[1] == '.cu': 34 | # use the cuda for .cu files 35 | self.set_executable('compiler_so', CUDA['nvcc']) 36 | # use only a subset of the extra_postargs, which are 1-1 translated 37 | # from the extra_compile_args in the Extension class 38 | postargs = extra_postargs['nvcc'] 39 | else: 40 | postargs = extra_postargs['gcc'] 41 | 42 | super(obj, src, ext, cc_args, postargs, pp_opts) 43 | # reset the default compiler_so, which we might have changed for cuda 44 | self.compiler_so = default_compiler_so 45 | 46 | # inject our redefined _compile method into the class 47 | self._compile = _compile 48 | 49 | 50 | # run the customize_compiler 51 | class custom_build_ext(build_ext): 52 | def build_extensions(self): 53 | customize_compiler_for_nvcc(self.compiler) 54 | build_ext.build_extensions(self) 55 | 56 | 57 | def find_in_path(name, path): 58 | "Find a file in a search path" 59 | # Adapted from 60 | # http://code.activestate.com/recipes/52224-find-a-file-given-a-search-path/ 61 | for dir in path.split(os.pathsep): 62 | binpath = os.path.join(dir, name) 63 | if os.path.exists(binpath): 64 | return os.path.abspath(binpath) 65 | return None 66 | 67 | 68 | def locate_cuda(): 69 | """Locate the CUDA environment on the system 70 | Returns a dict with keys 'home', 'nvcc', 'include', and 'lib64' 71 | and values giving the absolute path to each directory. 72 | Starts by looking for the CUDAHOME env variable. If not found, everything 73 | is based on finding 'nvcc' in the PATH. 74 | """ 75 | 76 | # first check if the CUDAHOME env variable is in use 77 | if 'CUDAHOME' in os.environ: 78 | home = os.environ['CUDAHOME'] 79 | nvcc = os.path.join(home, 'bin', 'nvcc') 80 | else: 81 | # otherwise, search the PATH for NVCC 82 | default_path = os.path.join(os.sep, 'usr', 'local', 'cuda', 'bin') 83 | nvcc = find_in_path('nvcc', os.environ['PATH'] + os.pathsep + default_path) 84 | if nvcc is None: 85 | raise EnvironmentError('The nvcc binary could not be ' 86 | 'located in your $PATH. 
Either add it to your path, or set $CUDAHOME') 87 | home = os.path.dirname(os.path.dirname(nvcc)) 88 | 89 | cudaconfig = {'home': home, 'nvcc': nvcc, 90 | 'include': os.path.join(home, 'include'), 91 | 'lib64': os.path.join(home, 'lib64')} 92 | for k, v in cudaconfig.items(): 93 | if not os.path.exists(v): 94 | raise EnvironmentError('The CUDA %s path could not be located in %s' % (k, v)) 95 | 96 | return cudaconfig 97 | 98 | 99 | CUDA = locate_cuda() 100 | 101 | ext_modules = [ 102 | Extension( 103 | "keras_csp.nms.cpu_nms", 104 | ["keras_csp/nms/cpu_nms.pyx"], 105 | extra_compile_args={'gcc': ["-Wno-cpp", "-Wno-unused-function"]}, 106 | include_dirs=[numpy_include] 107 | ), 108 | Extension('keras_csp.nms.gpu_nms', 109 | ['keras_csp/nms/nms_kernel.cu', 'keras_csp/nms/gpu_nms.pyx'], 110 | library_dirs=[CUDA['lib64']], 111 | libraries=['cudart'], 112 | language='c++', 113 | runtime_library_dirs=[CUDA['lib64']], 114 | # this syntax is specific to this build system 115 | # we're only going to use certain compiler args with nvcc and not with 116 | # gcc the implementation of this trick is in customize_compiler() below 117 | extra_compile_args={'gcc': ["-Wno-unused-function"], 118 | 'nvcc': ['-arch=sm_35', 119 | '--ptxas-options=-v', 120 | '-c', 121 | '--compiler-options', 122 | "'-fPIC'"]}, 123 | include_dirs=[numpy_include, CUDA['include']] 124 | ) 125 | ] 126 | 127 | setup( 128 | ext_modules=ext_modules, 129 | cmdclass={'build_ext': custom_build_ext}, 130 | ) 131 | -------------------------------------------------------------------------------- /test_caltech.py: -------------------------------------------------------------------------------- 1 | import os 2 | import time 3 | import pickle 4 | from keras.layers import Input 5 | from keras.models import Model 6 | from keras_csp import config, bbox_process 7 | from keras_csp.utilsfunc import * 8 | from keras_csp import resnet50 as nn 9 | 10 | os.environ["CUDA_VISIBLE_DEVICES"] = '1' 11 | C = config.Config() 12 | C.offset = True 13 | cache_path = 'data/cache/caltech/test' 14 | with open(cache_path, 'rb') as fid: 15 | val_data = pickle.load(fid, encoding='latin1') 16 | num_imgs = len(val_data) 17 | print('num of val samples: {}'.format(num_imgs)) 18 | 19 | C.size_test = (480, 640) 20 | input_shape_img = (C.size_test[0], C.size_test[1], 3) 21 | 22 | img_input = Input(shape=input_shape_img) 23 | 24 | # define the network prediction 25 | preds = nn.nn_p3p4p5(img_input, offset=C.offset, num_scale=C.num_scale, trainable=True) 26 | model = Model(img_input, preds) 27 | 28 | if C.offset: 29 | w_path = 'output/valmodels/caltech/%s/off' % (C.scale) 30 | out_path = 'output/valresults/caltech/%s/off' % (C.scale) 31 | else: 32 | w_path = 'output/valmodels/caltech/%s/nooff' % (C.scale) 33 | out_path = 'output/valresults/caltech/%s/nooff' % (C.scale) 34 | 35 | if not os.path.exists(out_path): 36 | os.makedirs(out_path) 37 | files = sorted(os.listdir(w_path)) 38 | for w_ind in range(51, 121): 39 | for f in files: 40 | if f.split('_')[0] == 'net' and int(f.split('_')[1][1:]) == w_ind: 41 | cur_file = f 42 | break 43 | weight1 = os.path.join(w_path, cur_file) 44 | print('load weights from {}'.format(weight1)) 45 | model.load_weights(weight1, by_name=True) 46 | res_path = os.path.join(out_path, '%03d' % int(str(w_ind))) 47 | 48 | print(res_path) 49 | if not os.path.exists(res_path): 50 | os.mkdir(res_path) 51 | for st in range(6, 11): 52 | set_path = os.path.join(res_path, 'set' + '%02d' % st) 53 | if not os.path.exists(set_path): 54 | os.mkdir(set_path) 55 | 56 | 
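# How the evaluation loop below works (a summary inferred from the code, not
# part of the original sources): the Caltech toolkit consumes one text file per
# video, set##/V###.txt, whose rows are [frame_number, x, y, w, h, score].
# res_all is reset whenever a video's first sampled frame is reached (filenames
# I00029, I00059, ... map to frame 30, 60, ... under 30-frame sampling) and is
# flushed to disk once the next image belongs to a different video or the image
# list ends; boxes are converted from [x1, y1, x2, y2] to [x, y, w, h] in place
# right before writing.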
start_time = time.time() 57 | for f in range(num_imgs): 58 | filepath = val_data[f]['filepath'] 59 | filepath_next = val_data[f + 1]['filepath'] if f < num_imgs - 1 else val_data[f]['filepath'] 60 | set = filepath.split('/')[-1].split('_')[0] 61 | video = filepath.split('/')[-1].split('_')[1] 62 | frame_number = int(filepath.split('/')[-1].split('_')[2][1:6]) + 1 63 | frame_number_next = int(filepath_next.split('/')[-1].split('_')[2][1:6]) + 1 64 | set_path = os.path.join(res_path, set) 65 | video_path = os.path.join(set_path, video + '.txt') 66 | if os.path.exists(video_path): 67 | continue 68 | if frame_number == 30: 69 | res_all = [] 70 | img = cv2.imread(filepath) 71 | x_rcnn = format_img(img, C) 72 | Y = model.predict(x_rcnn) 73 | 74 | if C.offset: 75 | boxes = bbox_process.parse_det_offset(Y, C, score=0.01, down=4) 76 | else: 77 | boxes = bbox_process.parse_det(Y, C, score=0.01, down=4, scale=C.scale) 78 | 79 | if len(boxes) > 0: 80 | f_res = np.repeat(frame_number, len(boxes), axis=0).reshape((-1, 1)) 81 | boxes[:, [2, 3]] -= boxes[:, [0, 1]] 82 | res_all += np.concatenate((f_res, boxes), axis=-1).tolist() 83 | if frame_number_next == 30 or f == num_imgs - 1: 84 | np.savetxt(video_path, np.array(res_all), fmt='%6f') 85 | print(time.time() - start_time) 86 | -------------------------------------------------------------------------------- /test_city.py: -------------------------------------------------------------------------------- 1 | import os 2 | import time 3 | import pickle 4 | from keras.layers import Input 5 | from keras.models import Model 6 | from keras_csp import config, bbox_process 7 | from keras_csp.utilsfunc import * 8 | 9 | os.environ["CUDA_VISIBLE_DEVICES"] = '1' 10 | C = config.Config() 11 | C.offset = True 12 | cache_path = 'data/cache/cityperson/val_500' 13 | with open(cache_path, 'rb') as fid: 14 | val_data = pickle.load(fid, encoding='latin1') 15 | num_imgs = len(val_data) 16 | print('num of val samples: {}'.format(num_imgs)) 17 | 18 | C.size_test = (1024, 2048) 19 | input_shape_img = (C.size_test[0], C.size_test[1], 3) 20 | img_input = Input(shape=input_shape_img) 21 | 22 | # define the base network (resnet here, can be MobileNet, etc) 23 | if C.network == 'resnet50': 24 | from keras_csp import resnet50 as nn 25 | elif C.network == 'mobilenet': 26 | from keras_csp import mobilenet as nn 27 | else: 28 | raise NotImplementedError('Not support network: {}'.format(C.network)) 29 | 30 | # define the network prediction 31 | preds = nn.nn_p3p4p5(img_input, offset=C.offset, num_scale=C.num_scale, trainable=True) 32 | model = Model(img_input, preds) 33 | 34 | if C.offset: 35 | w_path = 'output/valmodels/city/%s/off' % (C.scale) 36 | out_path = 'output/valresults/city/%s/off' % (C.scale) 37 | else: 38 | w_path = 'output/valmodels/city/%s/nooff' % (C.scale) 39 | out_path = 'output/valresults/city/%s/nooff' % (C.scale) 40 | if not os.path.exists(out_path): 41 | os.makedirs(out_path) 42 | files = sorted(os.listdir(w_path)) 43 | # evaluate the checkpoint from epoch 150 only; widen the range to sweep more epochs 44 | for w_ind in range(150, 151): 45 | for f in files: 46 | if f.split('_')[0] == 'net' and int(f.split('_')[1][1:]) == w_ind: 47 | cur_file = f 48 | break 49 | weight1 = os.path.join(w_path, cur_file) 50 | print('load weights from {}'.format(weight1)) 51 | model.load_weights(weight1, by_name=True) 52 | res_path = os.path.join(out_path, '%03d' % int(str(w_ind))) 53 | if not os.path.exists(res_path): 54 | os.makedirs(res_path) 55 | print(res_path) 56 | res_file = os.path.join(res_path, 'val_det.txt') 57 | 
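# Output format note (inferred from the code): each row appended below and
# written to val_det.txt is [image_index (1-based), x, y, w, h, score];
# presumably eval_city/dt_txt2json.m then converts this file into the JSON
# detections consumed by the CityPersons miss-rate scripts under
# eval_city/eval_script.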
res_all = [] 58 | start_time = time.time() 59 | for f in range(num_imgs): 60 | filepath = val_data[f]['filepath'] 61 | img = cv2.imread(filepath) 62 | if img is None: 63 | raise RuntimeError("image at %s not found" % filepath) 64 | x_rcnn = format_img(img, C) 65 | Y = model.predict(x_rcnn) 66 | 67 | if C.offset: 68 | boxes = bbox_process.parse_det_offset(Y, C, score=0.1, down=4) 69 | else: 70 | boxes = bbox_process.parse_det(Y, C, score=0.1, down=4, scale=C.scale) 71 | if len(boxes) > 0: 72 | f_res = np.repeat(f + 1, len(boxes), axis=0).reshape((-1, 1)) 73 | boxes[:, [2, 3]] -= boxes[:, [0, 1]] 74 | res_all += np.concatenate((f_res, boxes), axis=-1).tolist() 75 | np.savetxt(res_file, np.array(res_all), fmt='%6f') 76 | print(time.time() - start_time) 77 | -------------------------------------------------------------------------------- /test_wider_ms.py: -------------------------------------------------------------------------------- 1 | import os 2 | import time 3 | import pickle 4 | from keras.layers import Input 5 | from keras.models import Model 6 | from keras_csp import config, bbox_process 7 | from keras_csp.utilsfunc import * 8 | 9 | os.environ["CUDA_VISIBLE_DEVICES"] = '0' 10 | C = config.Config() 11 | C.offset = True 12 | C.scale = 'hw' 13 | C.num_scale = 2 14 | cache_path = 'data/cache/widerface/val' 15 | with open(cache_path, 'rb') as fid: 16 | val_data = pickle.load(fid, encoding='latin1') 17 | num_imgs = len(val_data) 18 | print('num of val samples: {}'.format(num_imgs)) 19 | 20 | C.size_test = [0, 0] 21 | input_shape_img = (None, None, 3) 22 | img_input = Input(shape=input_shape_img) 23 | 24 | # define the base network (resnet here, can be MobileNet, etc) 25 | from keras_csp import resnet50 as nn 26 | 27 | # define the network prediction 28 | preds = nn.nn_p3p4p5(img_input, offset=C.offset, num_scale=C.num_scale, trainable=True) 29 | model = Model(img_input, preds) 30 | 31 | if C.offset: 32 | w_path = 'output/valmodels/wider/%s/off' % (C.scale) 33 | out_path = 'output/valresults/wider/%s/off' % (C.scale) 34 | else: 35 | w_path = 'output/valmodels/wider/%s/nooff' % (C.scale) 36 | out_path = 'output/valresults/wider/%s/nooff' % (C.scale) 37 | if not os.path.exists(out_path): 38 | os.makedirs(out_path) 39 | files = sorted(os.listdir(w_path)) 40 | # evaluate the checkpoint from epoch 382 only; widen the range to sweep more epochs 41 | for w_ind in range(382, 383): 42 | for f in files: 43 | if f.split('_')[0] == 'net' and int(f.split('_')[1][1:]) == w_ind: 44 | cur_file = f 45 | break 46 | weight1 = os.path.join(w_path, cur_file) 47 | print('load weights from {}'.format(weight1)) 48 | model.load_weights(weight1, by_name=True) 49 | res_path = os.path.join(out_path, '%03d' % int(str(w_ind))) 50 | if not os.path.exists(res_path): 51 | os.makedirs(res_path) 52 | print(res_path) 53 | 54 | start_time = time.time() 55 | for f in range(num_imgs): 56 | filepath = val_data[f]['filepath'] 57 | event = filepath.split('/')[-2] 58 | event_path = os.path.join(res_path, event) 59 | if not os.path.exists(event_path): 60 | os.mkdir(event_path) 61 | filename = filepath.split('/')[-1].split('.')[0] 62 | txtpath = os.path.join(event_path, filename + '.txt') 63 | if os.path.exists(txtpath): 64 | continue 65 | 66 | img = cv2.imread(filepath) 67 | 68 | 69 | def detect_face(img, scale=1, flip=False): 70 | img_h, img_w = img.shape[:2] 71 | img_h_new, img_w_new = int(np.ceil(scale * img_h / 16) * 16), int(np.ceil(scale * img_w / 16) * 16) 72 | scale_h, scale_w = img_h_new / img_h, img_w_new / img_w 73 | 74 | img_s = cv2.resize(img, None, None, 
fx=scale_w, fy=scale_h, interpolation=cv2.INTER_LINEAR) 75 | # img_h, img_w = img_s.shape[:2] 76 | # print frame_number 77 | C.size_test[0] = img_h_new 78 | C.size_test[1] = img_w_new 79 | 80 | if flip: 81 | img_sf = cv2.flip(img_s, 1) 82 | # x_rcnn = format_img_pad(img_sf, C) 83 | x_rcnn = format_img(img_sf, C) 84 | else: 85 | # x_rcnn = format_img_pad(img_s, C) 86 | x_rcnn = format_img(img_s, C) 87 | Y = model.predict(x_rcnn) 88 | boxes = bbox_process.parse_wider_offset(Y, C, score=0.05, nmsthre=0.6) 89 | if len(boxes) > 0: 90 | keep_index = np.where(np.minimum(boxes[:, 2] - boxes[:, 0], boxes[:, 3] - boxes[:, 1]) >= 12)[0] 91 | boxes = boxes[keep_index, :] 92 | if len(boxes) > 0: 93 | if flip: 94 | boxes[:, [0, 2]] = img_s.shape[1] - boxes[:, [2, 0]] 95 | boxes[:, 0:4:2] = boxes[:, 0:4:2] / scale_w 96 | boxes[:, 1:4:2] = boxes[:, 1:4:2] / scale_h 97 | else: 98 | boxes = np.empty(shape=[0, 5], dtype=np.float32) 99 | return boxes 100 | 101 | 102 | def im_det_ms_pyramid(image, max_im_shrink): 103 | # shrink detecting and shrink only detect big face 104 | det_s = np.row_stack((detect_face(image, 0.5), detect_face(image, 0.5, flip=True))) 105 | index = np.where(np.maximum(det_s[:, 2] - det_s[:, 0] + 1, det_s[:, 3] - det_s[:, 1] + 1) > 64)[0] 106 | det_s = det_s[index, :] 107 | 108 | det_temp = np.row_stack((detect_face(image, 0.75), detect_face(image, 0.75, flip=True))) 109 | index = np.where(np.maximum(det_temp[:, 2] - det_temp[:, 0] + 1, det_temp[:, 3] - det_temp[:, 1] + 1) > 32)[ 110 | 0] 111 | det_temp = det_temp[index, :] 112 | det_s = np.row_stack((det_s, det_temp)) 113 | 114 | det_temp = np.row_stack((detect_face(image, 0.25), detect_face(image, 0.25, flip=True))) 115 | index = np.where(np.maximum(det_temp[:, 2] - det_temp[:, 0] + 1, det_temp[:, 3] - det_temp[:, 1] + 1) > 96)[ 116 | 0] 117 | det_temp = det_temp[index, :] 118 | det_s = np.row_stack((det_s, det_temp)) 119 | 120 | st = [1.25, 1.5, 1.75, 2.0, 2.25] 121 | for i in range(len(st)): 122 | if (st[i] <= max_im_shrink): 123 | det_temp = np.row_stack((detect_face(image, st[i]), detect_face(image, st[i], flip=True))) 124 | # Enlarged images are only used to detect small faces. 
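# Size gates applied per enlargement factor (a plain-text summary of the
# elif chain that follows): 1.25 -> keep min side < 128 px, 1.5 -> < 96,
# 1.75 -> < 64, 2.0 -> < 48, 2.25 -> < 32. The larger the upsampling
# factor, the smaller the faces it is trusted with, mirroring the shrunken
# scales above, which keep only large faces.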
125 | if st[i] == 1.25: 126 | index = np.where( 127 | np.minimum(det_temp[:, 2] - det_temp[:, 0] + 1, det_temp[:, 3] - det_temp[:, 1] + 1) < 128)[ 128 | 0] 129 | det_temp = det_temp[index, :] 130 | elif st[i] == 1.5: 131 | index = np.where( 132 | np.minimum(det_temp[:, 2] - det_temp[:, 0] + 1, det_temp[:, 3] - det_temp[:, 1] + 1) < 96)[ 133 | 0] 134 | det_temp = det_temp[index, :] 135 | elif st[i] == 1.75: 136 | index = np.where( 137 | np.minimum(det_temp[:, 2] - det_temp[:, 0] + 1, det_temp[:, 3] - det_temp[:, 1] + 1) < 64)[ 138 | 0] 139 | det_temp = det_temp[index, :] 140 | elif st[i] == 2.0: 141 | index = np.where( 142 | np.minimum(det_temp[:, 2] - det_temp[:, 0] + 1, det_temp[:, 3] - det_temp[:, 1] + 1) < 48)[ 143 | 0] 144 | det_temp = det_temp[index, :] 145 | elif st[i] == 2.25: 146 | index = np.where( 147 | np.minimum(det_temp[:, 2] - det_temp[:, 0] + 1, det_temp[:, 3] - det_temp[:, 1] + 1) < 32)[ 148 | 0] 149 | det_temp = det_temp[index, :] 150 | det_s = np.row_stack((det_s, det_temp)) 151 | return det_s 152 | 153 | 154 | max_im_shrink = (0x7fffffff / 577.0 / (img.shape[0] * img.shape[1])) ** 0.5 # the max size of input image 155 | shrink = max_im_shrink if max_im_shrink < 1 else 1 156 | det0 = detect_face(img) 157 | det1 = detect_face(img, flip=True) 158 | det2 = im_det_ms_pyramid(img, max_im_shrink) 159 | # merge all test results via bounding box voting 160 | det = np.row_stack((det0, det1, det2)) 161 | keep_index = np.where(np.minimum(det[:, 2] - det[:, 0], det[:, 3] - det[:, 1]) >= 3)[0] 162 | det = det[keep_index, :] 163 | dets = bbox_process.soft_bbox_vote(det, thre=0.4) 164 | keep_index = np.where((dets[:, 2] - dets[:, 0] + 1) * (dets[:, 3] - dets[:, 1] + 1) >= 6 ** 2)[0] 165 | dets = dets[keep_index, :] 166 | 167 | with open(txtpath, 'w') as f: 168 | f.write('{:s}\n'.format(filename)) 169 | f.write('{:d}\n'.format(len(dets))) 170 | for line in dets: 171 | f.write('{:.0f} {:.0f} {:.0f} {:.0f} {:.3f}\n'. 
172 | format(line[0], line[1], line[2] - line[0] + 1, line[3] - line[1] + 1, line[4])) 173 | print(time.time() - start_time) 174 | -------------------------------------------------------------------------------- /train_caltech.py: -------------------------------------------------------------------------------- 1 | import random 2 | import sys, os 3 | import time 4 | import numpy as np 5 | import pickle 6 | from keras.utils import generic_utils 7 | from keras.optimizers import Adam 8 | from keras.layers import Input 9 | from keras.models import Model 10 | from keras_csp import config, data_generators 11 | from keras_csp import losses as losses 12 | 13 | # get the config parameters 14 | C = config.Config() 15 | C.gpu_ids = '0' 16 | C.onegpu = 16 17 | C.size_train = (336, 448) 18 | C.init_lr = 1e-4 19 | C.num_epochs = 120 20 | C.offset = True 21 | 22 | num_gpu = len(C.gpu_ids.split(',')) 23 | batchsize = C.onegpu * num_gpu 24 | os.environ["CUDA_VISIBLE_DEVICES"] = C.gpu_ids 25 | 26 | # get the training data 27 | cache_ped = 'data/cache/caltech/train_gt' 28 | cache_emp = 'data/cache/caltech/train_nogt' 29 | with open(cache_ped, 'rb') as fid: 30 | ped_data = pickle.load(fid, encoding='latin1') 31 | with open(cache_emp, 'rb') as fid: 32 | emp_data = pickle.load(fid, encoding='latin1') 33 | num_imgs_ped = len(ped_data) 34 | num_imgs_emp = len(emp_data) 35 | print(('num of ped and emp samples: {} {}'.format(num_imgs_ped, num_imgs_emp))) 36 | data_gen_train = data_generators.get_data_hybrid(ped_data, emp_data, C, batchsize=batchsize, hyratio=0.5) 37 | 38 | # define the base network (resnet here, can be MobileNet, etc) 39 | if C.network == 'resnet50': 40 | from keras_csp import resnet50 as nn 41 | 42 | weight_path = 'data/models/resnet50_weights_tf_dim_ordering_tf_kernels.h5' 43 | elif C.network == 'mobilenet': 44 | from keras_csp import mobilenet as nn 45 | 46 | weight_path = 'data/models/mobilenet_1_0_224_tf_no_top.h5' 47 | else: 48 | raise NotImplementedError('Not support network: {}'.format(C.network)) 49 | 50 | input_shape_img = (C.size_train[0], C.size_train[1], 3) 51 | img_input = Input(shape=input_shape_img) 52 | # define the network prediction 53 | preds = nn.nn_p3p4p5(img_input, offset=C.offset, num_scale=C.num_scale, trainable=True) 54 | preds_tea = nn.nn_p3p4p5(img_input, offset=C.offset, num_scale=C.num_scale, trainable=True) 55 | 56 | model = Model(img_input, preds) 57 | if num_gpu > 1: 58 | from keras_csp.parallel_model import ParallelModel 59 | 60 | model = ParallelModel(model, int(num_gpu)) 61 | model_stu = Model(img_input, preds) 62 | model_tea = Model(img_input, preds_tea) 63 | 64 | model.load_weights(weight_path, by_name=True) 65 | model_tea.load_weights(weight_path, by_name=True) 66 | print('load weights from {}'.format(weight_path)) 67 | 68 | if C.offset: 69 | out_path = 'output/valmodels/caltech/%s/off2' % (C.scale) 70 | else: 71 | out_path = 'output/valmodels/caltech/%s/nooff' % (C.scale) 72 | 73 | if not os.path.exists(out_path): 74 | os.makedirs(out_path) 75 | res_file = os.path.join(out_path, 'records.txt') 76 | 77 | optimizer = Adam(lr=C.init_lr) 78 | if C.offset: 79 | model.compile(optimizer=optimizer, loss=[losses.cls_center, losses.regr_h, losses.regr_offset]) 80 | else: 81 | if C.scale == 'hw': 82 | model.compile(optimizer=optimizer, loss=[losses.cls_center, losses.regr_hw]) 83 | else: 84 | model.compile(optimizer=optimizer, loss=[losses.cls_center, losses.regr_h]) 85 | 86 | epoch_length = int(C.iter_per_epoch / batchsize) 87 | iter_num = 0 88 | add_epoch = 0 89 | 
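# Note on the array defined next: it reuses the name `losses`, shadowing the
# keras_csp.losses module imported above. This is safe only because
# model.compile(...) has already captured the loss functions; the module is
# unreachable by that name from this point on. Columns hold the per-iteration
# [center cls, scale regr, offset regr] losses.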
losses = np.zeros((epoch_length, 3)) 90 | 91 | best_loss = np.Inf 92 | print(('Starting training with lr {} and alpha {}'.format(C.init_lr, C.alpha))) 93 | start_time = time.time() 94 | total_loss_r, cls_loss_r1, regr_loss_r1, offset_loss_r1 = [], [], [], [] 95 | for epoch_num in range(C.num_epochs): 96 | progbar = generic_utils.Progbar(epoch_length) 97 | print(('Epoch {}/{}'.format(epoch_num + 1 + add_epoch, C.num_epochs + C.add_epoch))) 98 | while True: 99 | try: 100 | X, Y = next(data_gen_train) 101 | loss_s1 = model.train_on_batch(X, Y) 102 | 103 | for l in model_tea.layers: 104 | weights_tea = l.get_weights() 105 | if len(weights_tea) > 0: 106 | if num_gpu > 1: 107 | weights_stu = model_stu.get_layer(name=l.name).get_weights() 108 | else: 109 | weights_stu = model.get_layer(name=l.name).get_weights() 110 | weights_tea = [C.alpha * w_tea + (1 - C.alpha) * w_stu for (w_tea, w_stu) in 111 | zip(weights_tea, weights_stu)] 112 | l.set_weights(weights_tea) 113 | # print loss_s1 114 | losses[iter_num, 0] = loss_s1[1] 115 | losses[iter_num, 1] = loss_s1[2] 116 | if C.offset: 117 | losses[iter_num, 2] = loss_s1[3] 118 | else: 119 | losses[iter_num, 2] = 0 120 | 121 | iter_num += 1 122 | if iter_num % 20 == 0: 123 | progbar.update(iter_num, 124 | [('cls', np.mean(losses[:iter_num, 0])), ('regr_h', np.mean(losses[:iter_num, 1])), 125 | ('offset', np.mean(losses[:iter_num, 2]))]) 126 | if iter_num == epoch_length: 127 | cls_loss1 = np.mean(losses[:, 0]) 128 | regr_loss1 = np.mean(losses[:, 1]) 129 | offset_loss1 = np.mean(losses[:, 2]) 130 | total_loss = cls_loss1 + regr_loss1 + offset_loss1 131 | 132 | total_loss_r.append(total_loss) 133 | cls_loss_r1.append(cls_loss1) 134 | regr_loss_r1.append(regr_loss1) 135 | offset_loss_r1.append(offset_loss1) 136 | print(('Total loss: {}'.format(total_loss))) 137 | print(('Elapsed time: {}'.format(time.time() - start_time))) 138 | 139 | iter_num = 0 140 | start_time = time.time() 141 | 142 | if total_loss < best_loss: 143 | print(('Total loss decreased from {} to {}, saving weights'.format(best_loss, total_loss))) 144 | best_loss = total_loss 145 | model_tea.save_weights( 146 | os.path.join(out_path, 'net_e{}_l{}.hdf5'.format(epoch_num + 1 + add_epoch, total_loss))) 147 | break 148 | except Exception as e: 149 | print(('Exception: {}'.format(e))) 150 | continue 151 | records = np.concatenate((np.asarray(total_loss_r).reshape((-1, 1)), 152 | np.asarray(cls_loss_r1).reshape((-1, 1)), 153 | np.asarray(regr_loss_r1).reshape((-1, 1)), 154 | np.asarray(offset_loss_r1).reshape((-1, 1)),), 155 | axis=-1) 156 | np.savetxt(res_file, np.array(records), fmt='%.6f') 157 | print('Training complete, exiting.') 158 | -------------------------------------------------------------------------------- /train_city.py: -------------------------------------------------------------------------------- 1 | import glob 2 | import random 3 | import sys, os 4 | import time 5 | import numpy as np 6 | import pickle 7 | from keras.utils import generic_utils 8 | from keras.optimizers import Adam 9 | from keras.layers import Input 10 | from keras.models import Model 11 | from keras_csp import config, data_generators 12 | from keras_csp import losses as losses 13 | 14 | # get the config parameters 15 | C = config.Config() 16 | C.gpu_ids = '0,1,2,3' 17 | C.onegpu = 2 18 | C.size_train = (640, 1280) 19 | C.init_lr = 2e-4 20 | C.num_epochs = 150 21 | C.offset = True 22 | 23 | num_gpu = len(C.gpu_ids.split(',')) 24 | batchsize = C.onegpu * num_gpu 25 | os.environ["CUDA_VISIBLE_DEVICES"] = 
C.gpu_ids 26 | 27 | # get the training data 28 | cache_path = 'data/cache/cityperson/train_h50' 29 | with open(cache_path, 'rb') as fid: 30 | train_data = pickle.load(fid, encoding='latin1') 31 | num_imgs_train = len(train_data) 32 | random.shuffle(train_data) 33 | print('num of training samples: {}'.format(num_imgs_train)) 34 | data_gen_train = data_generators.get_data(train_data, C, batchsize=batchsize) 35 | 36 | # define the base network (resnet here, can be MobileNet, etc) 37 | if C.network == 'resnet50': 38 | from keras_csp import resnet50 as nn 39 | 40 | weight_path = 'data/models/resnet50_weights_tf_dim_ordering_tf_kernels.h5' 41 | 42 | if C.offset: 43 | out_path = 'output/valmodels/city/%s/off' % (C.scale) 44 | else: 45 | out_path = 'output/valmodels/city/%s/nooff' % (C.scale) 46 | if not os.path.exists(out_path): 47 | os.makedirs(out_path) 48 | epoch = 0 49 | else: 50 | checkpoint_paths = glob.glob(out_path + "/net*.hdf5") 51 | checkpoint_names = [f.split("/")[-1] for f in checkpoint_paths] 52 | epochs = [*map(int, [f.split("net_e")[1].split("_")[0] for f in checkpoint_names if "net_e" in f])] 53 | max_epoch_idx = np.argmax(epochs) 54 | epoch = epochs[max_epoch_idx] 55 | weight_path = checkpoint_paths[max_epoch_idx] 56 | 57 | input_shape_img = (C.size_train[0], C.size_train[1], 3) 58 | img_input = Input(shape=input_shape_img) 59 | # define the network prediction 60 | preds = nn.nn_p3p4p5(img_input, offset=C.offset, num_scale=C.num_scale, trainable=True) 61 | preds_tea = nn.nn_p3p4p5(img_input, offset=C.offset, num_scale=C.num_scale, trainable=True) 62 | model = Model(img_input, preds) 63 | 64 | if num_gpu > 1: 65 | from keras_csp.parallel_model import ParallelModel 66 | 67 | model = ParallelModel(model, int(num_gpu)) 68 | model_stu = Model(img_input, preds) 69 | model_tea = Model(img_input, preds_tea) 70 | 71 | model.load_weights(weight_path, by_name=True) 72 | model_tea.load_weights(weight_path, by_name=True) 73 | print('load weights from {}'.format(weight_path)) 74 | 75 | res_file = os.path.join(out_path, 'records.txt') 76 | 77 | optimizer = Adam(lr=C.init_lr) 78 | if C.offset: 79 | model.compile(optimizer=optimizer, loss=[losses.cls_center, losses.regr_h, losses.regr_offset]) 80 | else: 81 | if C.scale == 'hw': 82 | model.compile(optimizer=optimizer, loss=[losses.cls_center, losses.regr_hw]) 83 | else: 84 | model.compile(optimizer=optimizer, loss=[losses.cls_center, losses.regr_h]) 85 | 86 | epoch_length = int(C.iter_per_epoch / batchsize) 87 | iter_num = 0 88 | add_epoch = 0 89 | losses = np.zeros((epoch_length, 3)) 90 | 91 | best_loss = np.Inf 92 | print(('Starting training with lr {} and alpha {}'.format(C.init_lr, C.alpha))) 93 | start_time = time.time() 94 | total_loss_r, cls_loss_r1, regr_loss_r1, offset_loss_r1 = [], [], [], [] 95 | for epoch_num in range(epoch, C.num_epochs): 96 | progbar = generic_utils.Progbar(epoch_length) 97 | print(('Epoch {}/{}'.format(epoch_num + 1 + add_epoch, C.num_epochs + C.add_epoch))) 98 | while True: 99 | X, Y = next(data_gen_train) 100 | loss_s1 = model.train_on_batch(X, Y) 101 | 102 | for l in model_tea.layers: 103 | weights_tea = l.get_weights() 104 | if len(weights_tea) > 0: 105 | if num_gpu > 1: 106 | weights_stu = model_stu.get_layer(name=l.name).get_weights() 107 | else: 108 | weights_stu = model.get_layer(name=l.name).get_weights() 109 | weights_tea = [C.alpha * w_tea + (1 - C.alpha) * w_stu for (w_tea, w_stu) in 110 | zip(weights_tea, weights_stu)] 111 | l.set_weights(weights_tea) 112 | # print loss_s1 113 | losses[iter_num, 0] = 
loss_s1[1] 114 | losses[iter_num, 1] = loss_s1[2] 115 | if C.offset: 116 | losses[iter_num, 2] = loss_s1[3] 117 | else: 118 | losses[iter_num, 2] = 0 119 | 120 | iter_num += 1 121 | if iter_num % 20 == 0: 122 | progbar.update(iter_num, 123 | [('cls', np.mean(losses[:iter_num, 0])), ('regr_h', np.mean(losses[:iter_num, 1])), 124 | ('offset', np.mean(losses[:iter_num, 2]))]) 125 | if iter_num == epoch_length: 126 | cls_loss1 = np.mean(losses[:, 0]) 127 | regr_loss1 = np.mean(losses[:, 1]) 128 | offset_loss1 = np.mean(losses[:, 2]) 129 | total_loss = cls_loss1 + regr_loss1 + offset_loss1 130 | 131 | total_loss_r.append(total_loss) 132 | cls_loss_r1.append(cls_loss1) 133 | regr_loss_r1.append(regr_loss1) 134 | offset_loss_r1.append(offset_loss1) 135 | print(('Total loss: {}'.format(total_loss))) 136 | print(('Elapsed time: {}'.format(time.time() - start_time))) 137 | 138 | iter_num = 0 139 | start_time = time.time() 140 | 141 | if total_loss < best_loss: 142 | print(('Total loss decreased from {} to {}, saving weights'.format(best_loss, total_loss))) 143 | best_loss = total_loss 144 | model_tea.save_weights( 145 | os.path.join(out_path, 'net_e{}_l{}.hdf5'.format(epoch_num + 1 + add_epoch, total_loss))) 146 | break 147 | 148 | records = np.concatenate((np.asarray(total_loss_r).reshape((-1, 1)), 149 | np.asarray(cls_loss_r1).reshape((-1, 1)), 150 | np.asarray(regr_loss_r1).reshape((-1, 1)), 151 | np.asarray(offset_loss_r1).reshape((-1, 1)),), 152 | axis=-1) 153 | np.savetxt(res_file, np.array(records), fmt='%.6f') 154 | print('Training complete, exiting.') 155 | -------------------------------------------------------------------------------- /train_wider.py: -------------------------------------------------------------------------------- 1 | import random 2 | import sys, os 3 | import time 4 | import numpy as np 5 | import pickle 6 | from keras.utils import generic_utils 7 | from keras.optimizers import Adam 8 | from keras.layers import Input 9 | from keras.models import Model 10 | from keras_csp import config, data_generators 11 | from keras_csp import losses as losses 12 | 13 | # get the config parameters 14 | C = config.Config() 15 | C.gpu_ids = '0,1,2,3,4,5,6,7' 16 | C.onegpu = 4 17 | C.size_train = (704, 704) 18 | C.init_lr = 2e-4 19 | C.offset = True 20 | C.scale = 'hw' 21 | C.num_scale = 2 22 | C.num_epochs = 400 23 | C.iter_per_epoch = 4000 24 | num_gpu = len(C.gpu_ids.split(',')) 25 | batchsize = C.onegpu * num_gpu 26 | os.environ["CUDA_VISIBLE_DEVICES"] = C.gpu_ids 27 | 28 | # get the training data 29 | cache_path = 'data/cache/widerface/train' 30 | with open(cache_path, 'rb') as fid: 31 | train_data = pickle.load(fid, encoding='latin1') 32 | num_imgs_train = len(train_data) 33 | print('num of training samples: {}'.format(num_imgs_train)) 34 | data_gen_train = data_generators.get_data_wider(train_data, C, batchsize=batchsize) 35 | 36 | # define the base network (resnet here, can be MobileNet, etc) 37 | if C.network == 'resnet50': 38 | from keras_csp import resnet50 as nn 39 | 40 | weight_path = 'data/models/resnet50_weights_tf_dim_ordering_tf_kernels.h5' 41 | 42 | input_shape_img = (C.size_train[0], C.size_train[1], 3) 43 | img_input = Input(shape=input_shape_img) 44 | # define the network prediction 45 | preds = nn.nn_p3p4p5(img_input, offset=C.offset, num_scale=C.num_scale, trainable=True) 46 | preds_tea = nn.nn_p3p4p5(img_input, offset=C.offset, num_scale=C.num_scale, trainable=True) 47 | 48 | model = Model(img_input, preds) 49 | if num_gpu > 1: 50 | from keras_csp.parallel_model 
epoch_length = int(C.iter_per_epoch / batchsize)
iter_num = 0
add_epoch = 0
# note: this array shadows the imported `losses` module; that is safe only
# because the model has already been compiled above
losses = np.zeros((epoch_length, 3))

best_loss = np.inf
print('Starting training with lr {} and alpha {}'.format(C.init_lr, C.alpha))
start_time = time.time()
total_loss_r, cls_loss_r1, regr_loss_r1, offset_loss_r1 = [], [], [], []
for epoch_num in range(C.num_epochs):
    progbar = generic_utils.Progbar(epoch_length)
    print('Epoch {}/{}'.format(epoch_num + 1 + add_epoch, C.num_epochs + C.add_epoch))
    while True:
        try:
            X, Y = next(data_gen_train)
            loss_s1 = model.train_on_batch(X, Y)

            # update the teacher as an exponential moving average of the student
            for l in model_tea.layers:
                weights_tea = l.get_weights()
                if len(weights_tea) > 0:
                    if num_gpu > 1:
                        weights_stu = model_stu.get_layer(name=l.name).get_weights()
                    else:
                        weights_stu = model.get_layer(name=l.name).get_weights()
                    weights_tea = [C.alpha * w_tea + (1 - C.alpha) * w_stu
                                   for (w_tea, w_stu) in zip(weights_tea, weights_stu)]
                    l.set_weights(weights_tea)

            losses[iter_num, 0] = loss_s1[1]
            losses[iter_num, 1] = loss_s1[2]
            if C.offset:
                losses[iter_num, 2] = loss_s1[3]
            else:
                losses[iter_num, 2] = 0

            iter_num += 1
            if iter_num % 20 == 0:
                progbar.update(iter_num,
                               [('cls', np.mean(losses[:iter_num, 0])),
                                ('regr_hw', np.mean(losses[:iter_num, 1])),
                                ('offset', np.mean(losses[:iter_num, 2]))])

            if iter_num == epoch_length:
                cls_loss1 = np.mean(losses[:, 0])
                regr_loss1 = np.mean(losses[:, 1])
                offset_loss1 = np.mean(losses[:, 2])
                total_loss = cls_loss1 + regr_loss1 + offset_loss1

                total_loss_r.append(total_loss)
                cls_loss_r1.append(cls_loss1)
                regr_loss_r1.append(regr_loss1)
                offset_loss_r1.append(offset_loss1)
                print('Total loss: {}'.format(total_loss))
                print('Elapsed time: {}'.format(time.time() - start_time))

                iter_num = 0
                start_time = time.time()

                if total_loss < best_loss:
                    print('Total loss decreased from {} to {}, saving weights'.format(best_loss, total_loss))
                    best_loss = total_loss
                    model_tea.save_weights(
                        os.path.join(out_path, 'net_e{}_l{}.hdf5'.format(epoch_num + 1 + add_epoch, total_loss)))
                break
        except Exception as e:
            # skip over occasional bad batches rather than aborting a long run
            print('Exception: {}'.format(e))
            continue

records = np.concatenate((np.asarray(total_loss_r).reshape((-1, 1)),
                          np.asarray(cls_loss_r1).reshape((-1, 1)),
                          np.asarray(regr_loss_r1).reshape((-1, 1)),
                          np.asarray(offset_loss_r1).reshape((-1, 1))),
                         axis=-1)
np.savetxt(res_file, np.array(records), fmt='%.6f')
print('Training complete, exiting.')
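The `records.txt` written above holds one row per completed epoch with four columns: total, classification, scale-regression, and offset loss. A minimal sketch for inspecting it after a run; the path assumes the WIDER settings above (`scale='hw'`, `offset=True`):
```
import numpy as np

rec = np.loadtxt('output/valmodels/wider/hw/off/records.txt')
rec = np.atleast_2d(rec)  # keep the shape stable when only one epoch has finished
total, cls, regr, offset = rec.T
print('epochs recorded:', rec.shape[0])
print('lowest total loss at row:', int(np.argmin(total)) + 1)
```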
--------------------------------------------------------------------------------
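Once training finishes, the saved teacher weights can be restored into a fresh graph for inference; the test scripts (`test_wider_ms.py` and friends) build on this same pattern. A minimal sketch, assuming the `nn_p3p4p5` signature used by the training scripts; the checkpoint filename is a hypothetical instance of the `net_e{epoch}_l{loss}.hdf5` naming scheme, so substitute your own:
```
from keras.layers import Input
from keras.models import Model
from keras_csp import resnet50 as nn

# hypothetical checkpoint produced by train_wider.py; replace with a real file
weight_file = 'output/valmodels/wider/hw/off/net_e100_l0.05.hdf5'

img_input = Input(shape=(704, 704, 3))
preds = nn.nn_p3p4p5(img_input, offset=True, num_scale=2, trainable=True)
model = Model(img_input, preds)
model.load_weights(weight_file, by_name=True)
```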