├── LICENSE ├── README.md ├── v1 └── utils.py └── v2 └── utils └── __init__.py /LICENSE: -------------------------------------------------------------------------------- 1 | The MIT License (MIT) 2 | 3 | Copyright (c) 2016 Visual Computing Institute 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # DROW: Deep Multiclass Detection in 2D Range ("Laser") Data 2 | 3 | All code related to our work on detection in laser (aka lidar aka 2D range) data, covering both of the following papers: 4 | 5 | - [DROW: Real-Time Deep Learning based Wheelchair Detection in 2D Range Data](http://arxiv.org/abs/1603.02636), henceforth called "v1". 6 | - [Deep Person Detection in 2D Range Data](https://arxiv.org/abs/1804.02463), henceforth called "v2". 
7 | 8 | If you use anything provided here, please cite both papers in your work; see the [citations below](#citations-and-thanks) for the citation format. 9 | 10 | # DROW v2 Detector Training and Evaluation 11 | 12 | Code for training and evaluating DROW (v2) resides in various notebooks in the `v2` subfolder. 13 | All notebooks are highly similar, and each notebook is used for obtaining one different curve in the paper. 14 | Our final best model was obtained in `v2/Clean Final* [T=5,net=drow3xLF2p,odom=rot,trainval].ipynb`. 15 | 16 | ## What's new in v2? 17 | 18 | Our second paper ("Deep Person Detection in 2D Range Data") adds the following: 19 | 20 | - Annotations of persons in the dataset. 21 | - Inclusion of odometry in the dataset. 22 | - New network architecture. 23 | - Publishing of pre-trained weights. 24 | - Temporal integration of information in the model while respecting odometry. 25 | - Comparison to well-tuned competing state-of-the-art person detectors. (In the paper only, not this repo.) 26 | 27 | ## Pre-trained weights 28 | 29 | For DROW v2, we are able to provide the weights of the various models used in the paper here on GitHub in the [releases section](https://github.com/VisualComputingInstitute/DROW/releases). 30 | The names correspond to the notebooks in the `v2` subfolder which were used to obtain these models. 31 | 32 | When trying to load these weights, you might encounter the following error: 33 | ``` 34 | cuda runtime error (10) : invalid device ordinal 35 | ``` 36 | which can easily be solved by adding `map_location={'cuda:1': 'cuda:0'}` to the `load()` call, [additional details here](https://discuss.pytorch.org/t/saving-and-loading-torch-models-on-2-machines-with-different-number-of-gpu-devices/6666). 37 | 38 | 39 | # DROW v1 Training and Evaluation 40 | 41 | All code for training and evaluating DROW (v1) resides in the `v1/train-eval.ipynb` notebook, which you can open here on GitHub or run for yourself. 
42 | Most, but not all, of this notebook was used during actual training of the final model for the paper. 43 | While it was not intended to facilitate training your own model, it could be used for that after some careful reading. 44 | 45 | 46 | ## DROW v1 Detector ROS Node 47 | 48 | A ROS detector node that can be used with a trained model and outputs standard `PoseArray` messages can be found in the [STRANDS repositories](https://github.com/strands-project/strands_perception_people/tree/indigo-devel/wheelchair_detector). 49 | 50 | 51 | # DROW Laser Dataset 52 | 53 | You can obtain our full dataset here on GitHub in the [releases section](https://github.com/VisualComputingInstitute/DROW/releases), 54 | specifically the file [DROWv2-data.zip](https://github.com/VisualComputingInstitute/DROW/releases/download/v2/DROWv2-data.zip). 55 | 56 | PLEASE read the v1 paper carefully before asking about the dataset, as we describe it at length in Section III.A. 57 | Further details about the data storage format are given below in this README. 58 | 59 | ## Citations and Thanks 60 | 61 | If you use this dataset or code in your work, please cite **both** the following papers: 62 | 63 | > Beyer*, L., Hermans*, A., Leibe, B. (2017). DROW: Real-Time Deep Learning-Based Wheelchair Detection in 2-D Range Data. IEEE Robotics and Automation Letters, 2(2), 585-592. 64 | 65 | BibTeX: 66 | 67 | ``` 68 | @article{BeyerHermans2016RAL, 69 | title = {{DROW: Real-Time Deep Learning based Wheelchair Detection in 2D Range Data}}, 70 | author = {Beyer*, Lucas and Hermans*, Alexander and Leibe, Bastian}, 71 | journal = {{IEEE Robotics and Automation Letters (RA-L)}}, 72 | year = {2016} 73 | } 74 | ``` 75 | 76 | > Beyer, L., Hermans, A., Linder, T., Arras, K. O., Leibe, B. (2018). Deep Person Detection in 2D Range Data. IEEE Robotics and Automation Letters, 3(3), 2726-2733. 
77 | 78 | BibTeX: 79 | 80 | ``` 81 | @article{Beyer2018RAL, 82 | title = {{Deep Person Detection in 2D Range Data}}, 83 | author = {Beyer, Lucas and Hermans, Alexander and Linder, Timm and Arras, Kai O. and Leibe, Bastian}, 84 | journal = {{IEEE Robotics and Automation Letters (RA-L)}}, 85 | year = {2018} 86 | } 87 | ``` 88 | 89 | Walker and wheelchair annotations by Lucas Beyer (@lucasb-eyer) and Alexander Hermans (@Pandoro), 90 | and huge thanks to Supinya Beyer (@SupinyaMay) who created the person annotations for v2. 91 | 92 | ## License 93 | 94 | The whole dataset is published under the MIT license, [roughly meaning](https://tldrlegal.com/license/mit-license) you can use it for whatever you want as long as you credit us. 95 | However, we encourage you to contribute any extensions back, so that this repository stays the central place. 96 | 97 | One exception to these licensing terms is the `reha` subset of the dataset, which we have converted from TU Ilmenau's data. 98 | The [original dataset](https://www.tu-ilmenau.de/de/neurob/data-sets-code/people-detection-in-2d-laser-range-data/) was released under [CC-BY-NC-SA 3.0 Unported License](http://creativecommons.org/licenses/by-nc-sa/3.0/), and our conversion of it included herein keeps that license. 99 | 100 | ## Data Recording Setup 101 | 102 | The exact recording setup is described in Section III.A of our v1 paper. 103 | In short, it was recorded using a SICK S300 spanning 225° in 450 points at 37cm height. 104 | Recording happened in an elderly care facility; the test-set is completely disjoint from the train and validation sets, as it was recorded in a different aisle of the facility. 105 | 106 | ## Data Annotation Setup 107 | 108 | Again, the exact setup is described in the v1-paper. 109 | We used [this annotator](https://github.com/lucasb-eyer/laser-detection-annotator) to create the annotations. 
110 | Instead of all the laser scans, we annotate small batches throughout every sequence as follows: 111 | A batch consists of 100 frames, out of which we annotate every 5th frame, resulting in 20 annotated frames per batch. 112 | Within a sequence, we only annotate every 4th batch, leading to a total of 5 % of the laser scans being annotated. 113 | 114 | ## Dataset Use and Format 115 | 116 | We highly recommend you use the `load_scan`, `load_dets`, and `load_odom` functions in `utils.py` for loading raw laser scans, detection annotations, and odometry data, respectively. 117 | Please see the code's doc-comments or the DROW reference code for details on how to use them. 118 | Please note that each scan (or frame), as well as detections and odometry, comes with a **sequence number that is only unique within a file, but not across files**. 119 | 120 | ### Detailed format description 121 | 122 | If you want to load the files yourself regardless, this is their format: 123 | 124 | One recording consists of a `.csv` file which contains all raw laser-scans, and one file per type of annotation, currently `.wc` for wheelchairs, `.wa` for walking-aids, and `.wp` for persons. 125 | 126 | The `.csv` files contain one line per scan, the first value is the sequence number of that scan, followed by 450 floating-point values representing the distance at which the laser-points hit something. 127 | There is at least one "magic value" for that distance at `29.96` which means N/A. 128 | Note that the laser values go from left-to-right, i.e. the first value corresponds to the leftmost laser point, from the robot's point of view. 129 | 130 | The files `.wa`/`.wc` again contain one line per frame and start with a sequence number which should be used to match the detections to the scan **in the corresponding `.csv` file only**. 131 | Then follows a json-encoded list of `(r,φ)` pairs, which are the detections in polar coordinates. 
132 | For each detection, `r` represents the distance from the laser scanner and `φ ∈ [-π,π]` the angle in radians, zero being straight ahead, centered in front of the scanner ("up"), positive values going to the left and negative ones to the right. 133 | There's an important difference between an empty frame and an un-annotated one: 134 | An empty frame is present in the data as `123456,[]` and means that no detection of that type (person/wheelchair/walker) is present in the frame, whereas an un-annotated frame is simply not present in the file: the sequence number is skipped. 135 | 136 | Finally, the `.odom2` files again contain one line per frame and start with a sequence number which should be used to match the odometry data to the scan **in the corresponding `.csv` file only**. 137 | Then follows a comma-separated sequence of floating-point values, which correspond to `time` in seconds, `Tx` and `Ty` translation in meters, and `φ ∈ [-π,π]` orientation in radians of the robot's scanner. 138 | These values are all relative to some arbitrary initial value which is not provided, so one should only work with differences. 139 | -------------------------------------------------------------------------------- /v1/utils.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import json 3 | import matplotlib as mpl 4 | import matplotlib.pyplot as plt 5 | import cv2 6 | import scipy.ndimage 7 | import scipy.interpolate 8 | 9 | 10 | laserFoV = np.radians(225) 11 | 12 | 13 | def laser_angles(N, fov=None): 14 | fov = fov or laserFoV 15 | return np.linspace(-fov*0.5, fov*0.5, N) 16 | 17 | 18 | def xy_to_rphi(x, y): 19 | # NOTE: Axes rotated by 90 CCW by intent, so that 0 is top. 
20 | return np.hypot(x, y), np.arctan2(-x, y) 21 | 22 | 23 | def rphi_to_xy(r, phi): 24 | return r * -np.sin(phi), r * np.cos(phi) 25 | 26 | 27 | def scan_to_xy(scan, thresh=None, fov=None): 28 | s = np.array(scan, copy=True) 29 | if thresh is not None: 30 | s[s > thresh] = np.nan 31 | return rphi_to_xy(s, laser_angles(len(scan), fov)) 32 | 33 | 34 | def load_scan(fname): 35 | data = np.genfromtxt(fname, delimiter=",") 36 | seqs, scans = data[:,0].astype(np.uint32), data[:,1:-1] 37 | return seqs, scans 38 | 39 | 40 | def load_dets(name): 41 | def _doload(fname): 42 | seqs, dets = [], [] 43 | with open(fname) as f: 44 | for line in f: 45 | seq, tail = line.split(',', 1) 46 | seqs.append(int(seq)) 47 | dets.append(json.loads(tail)) 48 | return seqs, dets 49 | 50 | s1, wcs = _doload(name + ".wc") 51 | s2, was = _doload(name + ".wa") 52 | 53 | assert all(a == b for a, b in zip(s1, s2)), "Uhhhh?" 54 | return s1, wcs, was 55 | 56 | 57 | def precrec_unvoted(preds, gts, radius, pred_rphi=False, gt_rphi=False): 58 | """ 59 | The "unvoted" precision/recall, meaning that multiple predictions for the same ground-truth are NOT penalized. 60 | 61 | - `preds` an iterable (scans) of iterables (per scan) containing predicted x/y or r/phi pairs. 62 | - `gts` an iterable (scans) of iterables (per scan) containing ground-truth x/y or r/phi pairs. 63 | - `radius` the cutoff-radius for "correct", in meters. 64 | - `pred_rphi` whether `preds` is r/phi (True) or x/y (False). 65 | - `gt_rphi` whether `gts` is r/phi (True) or x/y (False). 66 | 67 | Returns a pair of numbers: (precision, recall) 68 | """ 69 | # Tested against other code. 
70 | 71 | npred, npred_hit, ngt, ngt_hit = 0.0, 0.0, 0.0, 0.0 72 | for ps, gts in zip(preds, gts): 73 | # Distance between each ground-truth and predictions 74 | assoc = np.zeros((len(gts), len(ps))) 75 | 76 | for ip, p in enumerate(ps): 77 | for igt, gt in enumerate(gts): 78 | px, py = rphi_to_xy(*p) if pred_rphi else p 79 | gx, gy = rphi_to_xy(*gt) if gt_rphi else gt 80 | assoc[igt, ip] = np.hypot(px-gx, py-gy) 81 | 82 | # Now cutting it off at `radius`, we can get all we need. 83 | assoc = assoc < radius 84 | npred += len(ps) 85 | npred_hit += np.count_nonzero(np.sum(assoc, axis=0)) 86 | ngt += len(gts) 87 | ngt_hit += np.count_nonzero(np.sum(assoc, axis=1)) 88 | 89 | return ( 90 | npred_hit/npred if npred > 0 else np.nan, 91 | ngt_hit/ngt if ngt > 0 else np.nan 92 | ) 93 | 94 | 95 | def precrec(preds, gts, radius, pred_rphi=False, gt_rphi=False): 96 | """ 97 | Ideally, we'd use Hungarian algorithm instead of greedy one on all "hits" within the radius, but meh. 98 | 99 | - `preds` an iterable (scans) of iterables (per scan) containing predicted x/y or r/phi pairs. 100 | - `gts` an iterable (scans) of iterables (per scan) containing ground-truth x/y or r/phi pairs. 101 | - `radius` the cutoff-radius for "correct", in meters. 102 | - `pred_rphi` whether `preds` is r/phi (True) or x/y (False). 103 | - `gt_rphi` whether `gts` is r/phi (True) or x/y (False). 104 | 105 | Returns a pair of numbers: (precision, recall) 106 | """ 107 | tp, fp, fn = 0.0, 0.0, 0.0 108 | for ps, gts in zip(preds, gts): 109 | # Assign each ground-truth the prediction which is closest to it AND inside the radius. 110 | assoc = np.zeros((len(gts), len(ps))) 111 | for igt, gt in enumerate(gts): 112 | min_d = radius 113 | best = -1 114 | for ip, p in enumerate(ps): 115 | # Skip prediction if already associated. 
116 | if np.any(assoc[:,ip]): 117 | continue 118 | 119 | px, py = rphi_to_xy(*p) if pred_rphi else p 120 | gx, gy = rphi_to_xy(*gt) if gt_rphi else gt 121 | d = np.hypot(px-gx, py-gy) 122 | if d < min_d: 123 | min_d = d 124 | best = ip 125 | 126 | if best != -1: 127 | assoc[igt,best] = 1 128 | 129 | nassoc = np.sum(assoc) 130 | tp += nassoc # All associated predictions are true pos. 131 | fp += len(ps) - nassoc # All not-associated predictions are false pos. 132 | fn += len(gts) - nassoc # All not-associated ground-truths are false negs. 133 | 134 | return tp/(fp+tp) if fp+tp > 0 else np.nan, tp/(fn+tp) if fn+tp > 0 else np.nan 135 | 136 | 137 | # Tested with gts,gts -> 1,1 and the following -> (0.5, 0.6666) 138 | # precrec( 139 | # preds=[[(-1,0),(0,0),(1,0),(0,1)]], 140 | # gts=[[(-0.5,0),(0.5,0),(-2,-2)]], 141 | # radius=0.6 142 | # ) 143 | 144 | 145 | def prettify_pr_curve(ax): 146 | ax.plot([0,1], [0,1], ls="--", c=".6") 147 | ax.set_xlim(-0.02,1.02) 148 | ax.set_ylim(-0.02,1.02) 149 | ax.set_xlabel("Recall [%]") 150 | ax.set_ylabel("Precision [%]") 151 | ax.axes.xaxis.set_major_formatter(mpl.ticker.FuncFormatter(lambda x, pos: '{:.0f}'.format(x*100))) 152 | ax.axes.yaxis.set_major_formatter(mpl.ticker.FuncFormatter(lambda x, pos: '{:.0f}'.format(x*100))) 153 | return ax 154 | 155 | 156 | def votes_to_detections(locations, probas=None, in_rphi=True, out_rphi=True, bin_size=0.1, blur_win=21, blur_sigma=2.0, x_min=-15.0, x_max=15.0, y_min=-5.0, y_max=15.0, retgrid=False): 157 | ''' 158 | Convert a list of votes to a list of detections based on Non-Max suppression. 159 | 160 | - `locations` an iterable containing predicted x/y or r/phi pairs. 161 | - `probas` an iterable containing predicted probabilities. Considered all ones if `None`. 162 | - `in_rphi` whether `locations` is r/phi (True) or x/y (False). 163 | - `out_rphi` whether the output should be r/phi (True) or x/y (False). 164 | - `bin_size` the bin size (in meters) used for the grid where votes are cast. 
165 | - `blur_win` the window size (in bins) used to blur the voting grid. 166 | - `blur_sigma` the sigma used to compute the Gaussian in the blur window. 167 | - `x_min` the left limit for the voting grid, in meters. 168 | - `x_max` the right limit for the voting grid, in meters. 169 | - `y_min` the bottom limit for the voting grid in meters. 170 | - `y_max` the top limit for the voting grid in meters. 171 | 172 | Returns a list of tuples (x,y,class) or (r,phi,class) where `class` is 173 | the index into `probas` which was highest for each detection, thus starts at 0. 174 | 175 | NOTE/TODO: We really should replace `bin_size` by `nbins` so as to avoid "remainders". 176 | Right now, we simply ignore the remainder on the "max" side. 177 | ''' 178 | locations = np.array(locations) 179 | if len(locations) == 0: 180 | return [] 181 | 182 | if probas is None: 183 | probas = np.ones((len(locations),1)) 184 | else: 185 | probas = np.array(probas) 186 | assert len(probas) == len(locations) and probas.ndim == 2, "Invalid format of `probas`" 187 | 188 | x_range = int((x_max-x_min)/bin_size) 189 | y_range = int((y_max-y_min)/bin_size) 190 | grid = np.zeros((x_range, y_range, 1+probas.shape[1]), np.float32) 191 | 192 | # Update x/y max to correspond to the end of the last bin. 193 | # TODO: fix this as stated in the docstring. 194 | x_max = x_min + x_range*bin_size 195 | y_max = y_min + y_range*bin_size 196 | 197 | # Do the voting into the grid. 198 | for loc, p in zip(locations, probas): 199 | x,y = rphi_to_xy(*loc) if in_rphi else loc 200 | 201 | # Skip votes outside the grid. 202 | if not (x_min < x < x_max and y_min < y < y_max): 203 | continue 204 | 205 | x = int((x-x_min)/bin_size) 206 | y = int((y-y_min)/bin_size) 207 | grid[x,y,0] += np.sum(p) 208 | grid[x,y,1:] += p 209 | 210 | # Yes, this blurs each channel individually, just what we need! 
211 | grid = cv2.GaussianBlur(grid, (blur_win,blur_win), blur_sigma) 212 | 213 | # Find the maxima (NMS) only in the "common" voting grid. 214 | grid_all = grid[:,:,0] 215 | max_grid = scipy.ndimage.maximum_filter(grid_all, size=3) 216 | maxima = (grid_all == max_grid) & (grid_all != 0) 217 | m_x, m_y = np.where(maxima) 218 | 219 | # Probabilities of all classes where maxima were found. 220 | m_p = grid[m_x, m_y, 1:] 221 | 222 | # Back from grid-bins to real-world locations. 223 | m_x = m_x*bin_size + x_min + bin_size/2 224 | m_y = m_y*bin_size + y_min + bin_size/2 225 | maxima = [(xy_to_rphi(x,y) if out_rphi else (x,y)) + (np.argmax(p),) for x,y,p in zip(m_x, m_y, m_p)] 226 | return (maxima, grid) if retgrid else maxima 227 | 228 | 229 | def generate_cut_outs(scan, standard_depth=4.0, window_size=48, threshold_distance=1.0, npts=None, center=True, border=29.99, resample_type='cv', **kw): 230 | ''' 231 | Generate window cut outs that all have a fixed size independent of depth. 232 | This means areas close to the scanner will be subsampled and areas far away 233 | will be upsampled. 234 | All cut outs will have values between `-threshold_distance` and `+threshold_distance` 235 | as they are normalized by the center point. 236 | 237 | - `scan` an iterable of radii within a laser scan. 238 | - `standard_depth` the reference distance (in meters) at which a window with `window_size` gets cut out. 239 | - `window_size` the window of laser rays that will be extracted everywhere. 240 | - `npts` is the number of final samples to have per window. `None` means same as `window_size`. 241 | - `threshold_distance` the distance in meters from the center point that will be used to clamp the laser radii. 242 | Since we're talking about laser-radii, this means the cutout is a donut-shaped hull, as opposed to a rectangular hull. 243 | This can be `np.inf` to skip the clamping altogether. 
244 | - `center` whether to center the cutout around the current laser point's depth (True), or keep depth values raw (False). 245 | - `border` the radius value to fill the half of the outermost windows with. 246 | - `resample_type` specifies the resampling API to be used. Possible values are: 247 | - `cv` for OpenCV's `cv2.resize` function using LINEAR/AREA interpolation. 248 | - `zoom` for SciPy's `zoom` function, to which options such as `order=3` can be passed as extra kwargs. 249 | - `int1d` for SciPy's `interp1d` function, to which options such as `kind=3` can be passed as extra kwargs. 250 | ''' 251 | s_np = np.fromiter(iter(scan), dtype=np.float32) 252 | N = len(s_np) 253 | 254 | npts = npts or window_size 255 | cut_outs = np.zeros((N, npts), dtype=np.float32) 256 | 257 | current_size = (window_size * standard_depth / s_np).astype(np.int32) 258 | start = -current_size//2 + np.arange(N) 259 | end = start + current_size 260 | s_np_extended = np.append(s_np, border) 261 | 262 | # While we don't really need to special-case, it should save precious computation. 263 | if threshold_distance != np.inf: 264 | near = s_np-threshold_distance 265 | far = s_np+threshold_distance 266 | 267 | for i in range(N): 268 | # Get the window. 269 | sample_points = np.arange(start[i], end[i]) 270 | sample_points[sample_points < 0] = -1 271 | sample_points[sample_points >= N] = -1 272 | window = s_np_extended[sample_points] 273 | 274 | # Threshold the near and far values, then 275 | if threshold_distance != np.inf: 276 | window = np.clip(window, near[i], far[i]) 277 | 278 | # shift everything to be centered around the middle point. 279 | if center: 280 | window -= s_np[i] 281 | 282 | # Values will now span [-d,d] if `center` is True and clamping is enabled (finite `threshold_distance`). 283 | 284 | # resample it to the correct size. 285 | if resample_type == 'cv': 286 | # Use 'INTER_AREA' when down-sampling the image; LINEAR is ridiculous for that. 287 | # It's just 1ms slower for a whole scan in the worst case. 
288 | interp = cv2.INTER_AREA if npts < len(window) else cv2.INTER_LINEAR 289 | cut_outs[i,:] = cv2.resize(window[None], (npts,1), interpolation=interp)[0] 290 | elif resample_type == 'zoom': 291 | scipy.ndimage.interpolation.zoom(window, npts/len(window), output=cut_outs[i,:], **kw) 292 | elif resample_type == 'int1d': 293 | cut_outs[i,:] = scipy.interpolate.interp1d(np.linspace(0,1, num=len(window), endpoint=True), window, assume_sorted=True, copy=False, **kw)(np.linspace(0,1,num=npts, endpoint=True)) 294 | 295 | return cut_outs 296 | 297 | 298 | def generate_cut_outs_raw(scan, window_size=48, threshold_distance=np.inf, center=False, border=29.99): 299 | ''' 300 | Generate window cut outs that all have a fixed number of rays independent of depth. 301 | This means objects close to the scanner will cover more rays and those far away fewer. 302 | All cut outs will contain the raw values from the input scan. 303 | 304 | - `scan` an iterable of radii within a laser scan. 305 | - `window_size` the window of laser rays that will be extracted everywhere. 306 | - `threshold_distance` the distance in meters from the center point that will be used to clamp the laser radii. 307 | Since we're talking about laser-radii, this means the cutout is a donut-shaped hull, as opposed to a rectangular hull. 308 | This can be `np.inf` to skip the clamping altogether. 309 | - `center` whether to center the cutout around the current laser point's depth (True), or keep depth values raw (False). 310 | - `border` the radius value to fill the half of the outermost windows with. 311 | ''' 312 | s_np = np.fromiter(iter(scan), dtype=np.float32) 313 | N = len(s_np) 314 | 315 | cut_outs = np.zeros((N, window_size), dtype=np.float32) 316 | 317 | start = -window_size//2 + np.arange(N) 318 | end = start + window_size 319 | s_np_extended = np.append(s_np, border) 320 | 321 | # While we don't really need to special-case, it should save precious computation. 
322 | if threshold_distance != np.inf: 323 | near = s_np-threshold_distance 324 | far = s_np+threshold_distance 325 | 326 | for i in range(N): 327 | # Get the window. 328 | sample_points = np.arange(start[i], end[i]) 329 | sample_points[sample_points < 0] = -1 330 | sample_points[sample_points >= N] = -1 331 | window = s_np_extended[sample_points] 332 | 333 | # Threshold the near and far values, then 334 | if threshold_distance != np.inf: 335 | window = np.clip(window, near[i], far[i]) 336 | 337 | # shift everything to be centered around the middle point. 338 | if center: 339 | window -= s_np[i] 340 | 341 | cut_outs[i,:] = window 342 | 343 | return cut_outs 344 | 345 | 346 | def hyperopt(pred_conf): 347 | ho_wBG = 0.38395839618267696 348 | ho_wWC = 0.599481486880304 349 | ho_wWA = 0.4885948464627302 350 | 351 | # Unused 352 | ho_sigma = 2.93 353 | ho_binsz = 0.10 354 | 355 | # Compute "optimal" "tight" window-size dependent on blur-size. 356 | ho_blur_win = ho_sigma*5 357 | ho_blur_win = int(2*(ho_blur_win//2)+1) # Make odd 358 | 359 | # Weight network outputs 360 | newconf = pred_conf * [ho_wBG, ho_wWC, ho_wWA] 361 | # And re-normalize to get "real" probabilities 362 | newconf /= np.sum(newconf, axis=-1, keepdims=True) 363 | 364 | return newconf, {'bin_size': ho_binsz, 'blur_win': ho_blur_win, 'blur_sigma': ho_sigma} 365 | -------------------------------------------------------------------------------- /v2/utils/__init__.py: -------------------------------------------------------------------------------- 1 | from collections import defaultdict 2 | import numpy as np 3 | import json 4 | import matplotlib as mpl 5 | import matplotlib.pyplot as plt 6 | import cv2 7 | import scipy.ndimage 8 | import scipy.interpolate 9 | from scipy.optimize import linear_sum_assignment 10 | from scipy.spatial.distance import cdist 11 | from sklearn.metrics import auc 12 | 13 | import lbtoolbox.plotting as lbplt 14 | 15 | 16 | laserIncrement = np.radians(0.5) 17 | laserFoV = 
(450-1)*laserIncrement # 450 points. 18 | 19 | 20 | def laser_angles(N, fov=None): 21 | fov = fov or laserFoV 22 | return np.linspace(-fov*0.5, fov*0.5, N) 23 | 24 | 25 | def xy_to_rphi(x, y): 26 | # NOTE: Axes rotated by 90 CCW by intent, so that 0 is top. 27 | return np.hypot(x, y), np.arctan2(-x, y) 28 | 29 | 30 | def rphi_to_xy(r, phi): 31 | return r * -np.sin(phi), r * np.cos(phi) 32 | 33 | 34 | def scan_to_xy(scan, thresh=None, fov=None): 35 | s = np.array(scan, copy=True) 36 | if thresh is not None: 37 | s[s > thresh] = np.nan 38 | return rphi_to_xy(s, laser_angles(len(scan), fov)) 39 | 40 | 41 | def load_scan(fname): 42 | data = np.genfromtxt(fname, delimiter=",") 43 | seqs, times, scans = data[:,0].astype(np.uint32), data[:,1].astype(np.float32), data[:,2:].astype(np.float32) 44 | return seqs, times, scans 45 | 46 | 47 | def load_odom(fname): 48 | return np.genfromtxt(fname, delimiter=",", dtype=[('seq', 'u4'), ('t', 'f4'), ('xya', 'f4', 3)]) 49 | 50 | 51 | def load_dets(name, DATADIR, LABELDIR): 52 | def _doload(fname): 53 | seqs, dets = [], [] 54 | with open(fname) as f: 55 | for line in f: 56 | seq, tail = line.split(',', 1) 57 | seqs.append(int(seq)) 58 | dets.append(json.loads(tail)) 59 | return seqs, dets 60 | 61 | s1, wcs = _doload(name.replace(DATADIR, LABELDIR) + '.wc') 62 | s2, was = _doload(name.replace(DATADIR, LABELDIR) + '.wa') 63 | s3, wps = _doload(name.replace(DATADIR, LABELDIR) + '.wp') 64 | 65 | assert all(a == b == c for a, b, c in zip(s1, s2, s3)), "Uhhhh?" 66 | return np.array(s1), wcs, was, wps 67 | 68 | 69 | def linearize(all_seqs, all_scans, all_detseqs, all_wcs, all_was, all_wps): 70 | lin_seqs, lin_scans, lin_wcs, lin_was, lin_wps = [], [], [], [], [] 71 | # Loop through the "sessions" (correspond to files) 72 | for seqs, scans, detseqs, wcs, was, wps in zip(all_seqs, all_scans, all_detseqs, all_wcs, all_was, all_wps): 73 | # Note that sequence IDs may overlap between sessions! 
74 | s2s = dict(zip(seqs, scans)) 75 | # Go over the individual measurements/annotations of a session. 76 | for ds, wc, wa, wp in zip(detseqs, wcs, was, wps): 77 | lin_seqs.append(ds) 78 | lin_scans.append(s2s[ds]) 79 | lin_wcs.append(wc) 80 | lin_was.append(wa) 81 | lin_wps.append(wp) 82 | return lin_seqs, lin_scans, lin_wcs, lin_was, lin_wps 83 | 84 | 85 | def closest_detection(scan, dets, radii): 86 | """ 87 | Given a single `scan` (450 floats), a list of r,phi detections `dets` (Nx2), 88 | and a list of N `radii` for those detections, return a mapping from each 89 | point in `scan` to the closest detection for which the point falls inside its radius. 90 | The returned detection-index is a 1-based index, with 0 meaning no detection 91 | is close enough to that point. 92 | """ 93 | if len(dets) == 0: 94 | return np.zeros_like(scan, dtype=int) 95 | 96 | assert len(dets) == len(radii), "Need to give a radius for each detection!" 97 | 98 | scan_xy = np.array(scan_to_xy(scan)).T 99 | 100 | # Distance (in x,y space) of each laser-point with each detection. 101 | dists = cdist(scan_xy, np.array([rphi_to_xy(x,y) for x,y in dets])) 102 | 103 | # Subtract the radius from the distances, such that they are <0 if inside, >0 if outside. 104 | dists -= radii 105 | 106 | # Prepend zeros so that argmin is 0 for everything "outside". 107 | dists = np.hstack([np.zeros((len(scan),1)), dists]) 108 | 109 | # And find out who's closest, including the threshold! 110 | return np.argmin(dists, axis=1) 111 | 112 | 113 | def global2win(r, phi, dr, dphi): 114 | # Convert to relative, angle-aligned x/y coordinate-system. 115 | dx = np.sin(dphi-phi) * dr 116 | dy = np.cos(dphi-phi) * dr - r 117 | return dx, dy 118 | 119 | 120 | def win2global(r, phi, dx, dy): 121 | y = r + dy 122 | dphi = np.arctan2(dx, y) # dx first is correct due to problem geometry dx -> y axis and vice versa. 
123 | return y/np.cos(dphi), phi + dphi 124 | 125 | 126 | def precrec_unvoted(preds, gts, radius, pred_rphi=False, gt_rphi=False): 127 | """ 128 | The "unvoted" precision/recall, meaning that multiple predictions for the same ground-truth are NOT penalized. 129 | 130 | - `preds` an iterable (scans) of iterables (per scan) containing predicted x/y or r/phi pairs. 131 | - `gts` an iterable (scans) of iterables (per scan) containing ground-truth x/y or r/phi pairs. 132 | - `radius` the cutoff-radius for "correct", in meters. 133 | - `pred_rphi` whether `preds` is r/phi (True) or x/y (False). 134 | - `gt_rphi` whether `gts` is r/phi (True) or x/y (False). 135 | 136 | Returns a pair of numbers: (precision, recall) 137 | """ 138 | # Tested against other code. 139 | 140 | npred, npred_hit, ngt, ngt_hit = 0.0, 0.0, 0.0, 0.0 141 | for ps, gts in zip(preds, gts): 142 | # Distance between each ground-truth and predictions 143 | assoc = np.zeros((len(gts), len(ps))) 144 | 145 | for ip, p in enumerate(ps): 146 | for igt, gt in enumerate(gts): 147 | px, py = rphi_to_xy(*p) if pred_rphi else p 148 | gx, gy = rphi_to_xy(*gt) if gt_rphi else gt 149 | assoc[igt, ip] = np.hypot(px-gx, py-gy) 150 | 151 | # Now cutting it off at `radius`, we can get all we need. 152 | assoc = assoc < radius 153 | npred += len(ps) 154 | npred_hit += np.count_nonzero(np.sum(assoc, axis=0)) 155 | ngt += len(gts) 156 | ngt_hit += np.count_nonzero(np.sum(assoc, axis=1)) 157 | 158 | return ( 159 | npred_hit/npred if npred > 0 else np.nan, 160 | ngt_hit/ngt if ngt > 0 else np.nan 161 | ) 162 | 163 | 164 | def precrec(preds, gts, radius, pred_rphi=False, gt_rphi=False): 165 | """ 166 | Ideally, we'd use Hungarian algorithm instead of greedy one on all "hits" within the radius, but meh. 167 | 168 | - `preds` an iterable (scans) of iterables (per scan) containing predicted x/y or r/phi pairs. 169 | - `gts` an iterable (scans) of iterables (per scan) containing ground-truth x/y or r/phi pairs. 
170 | - `radius` the cutoff-radius for "correct", in meters. 171 | - `pred_rphi` whether `preds` is r/phi (True) or x/y (False). 172 | - `gt_rphi` whether `gts` is r/phi (True) or x/y (False). 173 | 174 | Returns a pair of numbers: (precision, recall) 175 | """ 176 | tp, fp, fn = 0.0, 0.0, 0.0 177 | for ps, gts in zip(preds, gts): 178 | # Assign each ground-truth the prediction which is closest to it AND inside the radius. 179 | assoc = np.zeros((len(gts), len(ps))) 180 | for igt, gt in enumerate(gts): 181 | min_d = radius 182 | best = -1 183 | for ip, p in enumerate(ps): 184 | # Skip prediction if already associated. 185 | if np.any(assoc[:,ip]): 186 | continue 187 | 188 | px, py = rphi_to_xy(*p) if pred_rphi else p 189 | gx, gy = rphi_to_xy(*gt) if gt_rphi else gt 190 | d = np.hypot(px-gx, py-gy) 191 | if d < min_d: 192 | min_d = d 193 | best = ip 194 | 195 | if best != -1: 196 | assoc[igt,best] = 1 197 | 198 | nassoc = np.sum(assoc) 199 | tp += nassoc # All associated predictions are true pos. 200 | fp += len(ps) - nassoc # All not-associated predictions are false pos. 201 | fn += len(gts) - nassoc # All not-associated ground-truths are false negs. 
202 | 203 | return tp/(fp+tp) if fp+tp > 0 else np.nan, tp/(fn+tp) if fn+tp > 0 else np.nan 204 | 205 | 206 | # Tested with gts,gts -> 1,1 and the following -> (0.5, 0.6666) 207 | # precrec( 208 | # preds=[[(-1,0),(0,0),(1,0),(0,1)]], 209 | # gts=[[(-0.5,0),(0.5,0),(-2,-2)]], 210 | # radius=0.6 211 | # ) 212 | 213 | 214 | def prettify_pr_curve(ax): 215 | ax.plot([0,1], [0,1], ls="--", c=".6") 216 | ax.set_xlim(-0.02,1.02) 217 | ax.set_ylim(-0.02,1.02) 218 | ax.set_xlabel("Recall [%]") 219 | ax.set_ylabel("Precision [%]") 220 | ax.axes.xaxis.set_major_formatter(mpl.ticker.FuncFormatter(lambda x, pos: '{:.0f}'.format(x*100))) 221 | ax.axes.yaxis.set_major_formatter(mpl.ticker.FuncFormatter(lambda x, pos: '{:.0f}'.format(x*100))) 222 | return ax 223 | 224 | 225 | def votes_to_detections(locations, probas=None, in_rphi=True, out_rphi=True, 226 | bin_size=0.1, blur_win=21, blur_sigma=2.0, 227 | x_min=-15.0, x_max=15.0, y_min=-5.0, y_max=15.0, retgrid=False): 228 | ''' 229 | Convert a list of votes to a list of detections based on Non-Max suppression. 230 | 231 | - `locations` an iterable containing predicted x/y or r/phi pairs. 232 | - `probas` an iterable containing predicted probabilities. Considered all ones if `None`. 233 | - `in_rphi` whether `locations` is r/phi (True) or x/y (False). 234 | - `out_rphi` whether the output should be r/phi (True) or x/y (False). 235 | - `bin_size` the bin size (in meters) used for the grid where votes are cast. 236 | - `blur_win` the window size (in bins) used to blur the voting grid. 237 | - `blur_sigma` the sigma used to compute the Gaussian in the blur window. 238 | - `x_min` the left limit for the voting grid, in meters. 239 | - `x_max` the right limit for the voting grid, in meters. 240 | - `y_min` the bottom limit for the voting grid in meters. 241 | - `y_max` the top limit for the voting grid in meters. 
242 | 243 | Returns a list of tuples (x,y,class) or (r,phi,class) where `class` is 244 | the index into `probas` which was highest for each detection, thus starts at 0. 245 | 246 | NOTE/TODO: We really should replace `bin_size` by `nbins` so as to avoid "remainders". 247 | Right now, we simply ignore the remainder on the "max" side. 248 | ''' 249 | locations = np.array(locations) 250 | if len(locations) == 0: 251 | return ([], 0) if retgrid else [] 252 | 253 | if probas is None: 254 | probas = np.ones((len(locations),1)) 255 | else: 256 | probas = np.array(probas) 257 | assert len(probas) == len(locations) and probas.ndim == 2, "Invalid format of `probas`" 258 | 259 | x_range = int((x_max-x_min)/bin_size) 260 | y_range = int((y_max-y_min)/bin_size) 261 | grid = np.zeros((x_range, y_range, 1+probas.shape[1]), np.float32) 262 | 263 | # Update x/y max to correspond to the end of the last bin. 264 | # TODO: fix this as stated in the docstring. 265 | x_max = x_min + x_range*bin_size 266 | y_max = y_min + y_range*bin_size 267 | 268 | # Do the voting into the grid. 269 | for loc, p in zip(locations, probas): 270 | x,y = rphi_to_xy(*loc) if in_rphi else loc 271 | 272 | # Skip votes outside the grid. 273 | if not (x_min < x < x_max and y_min < y < y_max): 274 | continue 275 | 276 | x = int((x-x_min)/bin_size) 277 | y = int((y-y_min)/bin_size) 278 | grid[x,y,0] += np.sum(p) 279 | grid[x,y,1:] += p 280 | 281 | # Yes, this blurs each channel individually, just what we need! 282 | grid = cv2.GaussianBlur(grid, (blur_win,blur_win), blur_sigma) 283 | 284 | # Find the maxima (NMS) only in the "common" voting grid. 285 | grid_all = grid[:,:,0] 286 | max_grid = scipy.ndimage.maximum_filter(grid_all, size=3) 287 | maxima = (grid_all == max_grid) & (grid_all != 0) 288 | m_x, m_y = np.where(maxima) 289 | 290 | # Probabilities of all classes where maxima were found. 291 | m_p = grid[m_x, m_y, 1:] 292 | 293 | # Back from grid-bins to real-world locations. 
294 | m_x = m_x*bin_size + x_min + bin_size/2 295 | m_y = m_y*bin_size + y_min + bin_size/2 296 | maxima = [(xy_to_rphi(x,y) if out_rphi else (x,y)) + (np.argmax(p),) for x,y,p in zip(m_x, m_y, m_p)] 297 | return (maxima, grid) if retgrid else maxima 298 | 299 | 300 | def generate_cut_outs(scan, standard_depth=4.0, window_size=48, threshold_distance=1.0, npts=None, 301 | center='point', border=29.99, resample_type='cv', **kw): 302 | ''' 303 | Generate window cut outs that all have a fixed size independent of depth. 304 | This means areas close to the scanner will be subsampled and areas far away 305 | will be upsampled. 306 | All cut outs will have values between `-threshold_distance` and `+threshold_distance` 307 | as they are normalized by the center point. 308 | 309 | - `scan` an iterable of radii within a laser scan. 310 | - `standard_depth` the reference distance (in meters) at which a window with `window_size` gets cut out. 311 | - `window_size` the window of laser rays that will be extracted everywhere. 312 | - `npts` is the number of final samples to have per window. `None` means same as `window_size`. 313 | - `threshold_distance` the distance in meters from the center point that will be used to clamp the laser radii. 314 | Since we're talking about laser-radii, this means the cutout is a donut-shaped hull, as opposed to a rectangular hull. 315 | This can be `np.inf` to skip the clamping altogether. 316 | - `center` Defines how the cutout depth-values will be centered/rescaled: 317 | - 'none'/'raw'/None: keep the raw depth values in meters. 318 | - 'point': move the depth values such that the center-point of the cutout is at zero. 319 | - 'near': move the depth values such that the 'near' cutoff is at zero (it's like 'point' + `threshold_distance`). 320 | - 'far': actually turn depth upside down, such that the 'far' threshold is at zero, 321 | the current laser point at `threshold_distance` and the 'near' at `2*threshold_distance`. 
322 | This, combined with `border=29.99` has the advantage of putting border to zero. 323 | - `border` the radius value to fill the half of the outermost windows with. 324 | - `resample_type` specifies the resampling API to be used. Possible values are: 325 | - `cv` for OpenCV's `cv2.resize` function using LINEAR/AREA interpolation. 326 | - `zoom` for SciPy's `zoom` function, to which options such as `order=3` can be passed as extra kwargs. 327 | - `int1d` for SciPy's `interp1d` function, to which options such as `kind=3` can be passed as extra kwargs. 328 | ''' 329 | s_np = np.fromiter(iter(scan), dtype=np.float32) 330 | N = len(s_np) 331 | 332 | npts = npts or window_size 333 | cut_outs = np.zeros((N, npts), dtype=np.float32) 334 | 335 | current_size = (window_size * standard_depth / s_np).astype(np.int32) 336 | start = -current_size//2 + np.arange(N) 337 | end = start + current_size 338 | s_np_extended = np.append(s_np, border) 339 | 340 | # While we don't really need to special-case, it should save precious computation. 341 | if threshold_distance != np.inf: 342 | near = s_np-threshold_distance 343 | far = s_np+threshold_distance 344 | 345 | for i in range(N): 346 | # Get the window. 347 | sample_points = np.arange(start[i], end[i]) 348 | sample_points[sample_points < 0] = -1 349 | sample_points[sample_points >= N] = -1 350 | window = s_np_extended[sample_points] 351 | 352 | # Threshold the near and far values, then 353 | if threshold_distance != np.inf: 354 | window = np.clip(window, near[i], far[i]) 355 | 356 | # shift everything to be centered around the middle point. 
357 | if center == 'point': 358 | window -= s_np[i] 359 | elif center == 'near' and threshold_distance != np.inf: 360 | window -= near[i] 361 | elif center == 'far' and threshold_distance != np.inf: 362 | window = far[i] - window 363 | elif center not in (None, 'none', 'raw'): 364 | raise ValueError("unknown `center` parameter " + str(center)) 365 | 366 | # Values will now span [-d,d] if `center` is 'point' and `threshold_distance` (d) is finite. 367 | 368 | # resample it to the correct size. 369 | if resample_type == 'cv': 370 | # Use 'INTER_AREA' when down-sampling, because LINEAR is ridiculous there. 371 | # It's just 1ms slower for a whole scan in the worst case. 372 | interp = cv2.INTER_AREA if npts < len(window) else cv2.INTER_LINEAR 373 | cut_outs[i,:] = cv2.resize(window[None], (npts,1), interpolation=interp)[0] 374 | elif resample_type == 'zoom': 375 | scipy.ndimage.zoom(window, npts/len(window), output=cut_outs[i,:], **kw) 376 | elif resample_type == 'int1d': 377 | cut_outs[i,:] = scipy.interpolate.interp1d(np.linspace(0,1, num=len(window), endpoint=True), window, assume_sorted=True, copy=False, **kw)(np.linspace(0,1,num=npts, endpoint=True)) 378 | 379 | return cut_outs 380 | 381 | 382 | def generate_cut_outs_raw(scan, window_size=48, threshold_distance=np.inf, center=False, border=29.99): 383 | ''' 384 | Generate window cut outs that all have a fixed number of rays independent of depth. 385 | This means objects close to the scanner will cover more rays and those far away fewer. 386 | All cut outs will contain the raw values from the input scan. 387 | 388 | - `scan` an iterable of radii within a laser scan. 389 | - `window_size` the window of laser rays that will be extracted everywhere. 390 | - `threshold_distance` the distance in meters from the center point that will be used to clamp the laser radii. 391 | Since we're talking about laser-radii, this means the cutout is a donut-shaped hull, as opposed to a rectangular hull. 
392 | This can be `np.inf` to skip the clamping altogether. 393 | - `center` whether to center the cutout around the current laser point's depth (True), or keep depth values raw (False). 394 | - `border` the radius value to fill the half of the outermost windows with. 395 | ''' 396 | s_np = np.fromiter(iter(scan), dtype=np.float32) 397 | N = len(s_np) 398 | 399 | cut_outs = np.zeros((N, window_size), dtype=np.float32) 400 | 401 | start = -window_size//2 + np.arange(N) 402 | end = start + window_size 403 | s_np_extended = np.append(s_np, border) 404 | 405 | # While we don't really need to special-case, it should save precious computation. 406 | if threshold_distance != np.inf: 407 | near = s_np-threshold_distance 408 | far = s_np+threshold_distance 409 | 410 | for i in range(N): 411 | # Get the window. 412 | sample_points = np.arange(start[i], end[i]) 413 | sample_points[sample_points < 0] = -1 414 | sample_points[sample_points >= N] = -1 415 | window = s_np_extended[sample_points] 416 | 417 | # Threshold the near and far values, then 418 | if threshold_distance != np.inf: 419 | window = np.clip(window, near[i], far[i]) 420 | 421 | # shift everything to be centered around the middle point. 422 | if center: 423 | window -= s_np[i] 424 | 425 | cut_outs[i,:] = window 426 | 427 | return cut_outs 428 | 429 | 430 | def hyperopt(pred_conf): 431 | ho_wBG = 0.38395839618267696 432 | ho_wWC = 0.599481486880304 433 | ho_wWA = 0.4885948464627302 434 | 435 | # Unused 436 | ho_sigma = 2.93 437 | ho_binsz = 0.10 438 | 439 | # Compute "optimal" "tight" window-size dependent on blur-size. 
440 | ho_blur_win = ho_sigma*5 441 | ho_blur_win = int(2*(ho_blur_win//2)+1) # Make odd 442 | 443 | # Weight network outputs 444 | newconf = pred_conf * [ho_wBG, ho_wWC, ho_wWA] 445 | # And re-normalize to get "real" probabilities 446 | newconf /= np.sum(newconf, axis=-1, keepdims=True) 447 | 448 | return newconf, {'bin_size': ho_binsz, 'blur_win': ho_blur_win, 'blur_sigma': ho_sigma} 449 | 450 | 451 | ######## 452 | # New eval 453 | ######## 454 | 455 | 456 | def vote_avg(vx, vy, p): 457 | return np.mean(vx), np.mean(vy), np.mean(p, axis=0) 458 | 459 | 460 | def agnostic_weighted_vote_avg(vx, vy, p): 461 | weights = np.sum(p[:,1:], axis=1) 462 | norm = 1.0/np.sum(weights) 463 | return norm*np.sum(weights*vx), norm*np.sum(weights*vy), norm*np.sum(weights[:,None]*p, axis=0) 464 | 465 | 466 | def max_weighted_vote_avg(vx, vy, p): 467 | weights = np.max(p[:,1:], axis=1) 468 | norm = 1.0/np.sum(weights) 469 | return norm*np.sum(weights*vx), norm*np.sum(weights*vy), norm*np.sum(weights[:,None]*p, axis=0) 470 | 471 | 472 | def votes_to_detections2(xs, ys, probas, weighted_avg=False, min_thresh=1e-5, 473 | bin_size=0.1, blur_win=21, blur_sigma=2.0, 474 | x_min=-15.0, x_max=15.0, y_min=-5.0, y_max=15.0, 475 | vote_collect_radius=0.3, retgrid=False, 476 | class_weights=None): 477 | ''' 478 | Convert a list of votes to a list of detections based on Non-Max suppression. 479 | 480 | - `weighted_avg` whether the votes of each detection are combined by `agnostic_weighted_vote_avg` (True) or `vote_avg` (False). 481 | - `bin_size` the bin size (in meters) used for the grid where votes are cast. 482 | - `blur_win` the window size (in bins) used to blur the voting grid. 483 | - `blur_sigma` the sigma used to compute the Gaussian in the blur window. 484 | - `x_min` the left limit for the voting grid, in meters. 485 | - `x_max` the right limit for the voting grid, in meters. 486 | - `y_min` the bottom limit for the voting grid in meters. 487 | - `y_max` the top limit for the voting grid in meters. 
488 | - `vote_collect_radius` the radius used during the collection of votes assigned 489 | to each detection. 490 | 491 | Returns a list of tuples (x,y,probs) where `probs` has the same layout as 492 | `probas`. 493 | ''' 494 | if class_weights is not None: 495 | probas = np.array(probas) # Make a copy. 496 | probas[:,:,1:] *= class_weights 497 | vote_combiner = agnostic_weighted_vote_avg if weighted_avg is True else vote_avg 498 | x_range = int((x_max-x_min)/bin_size) 499 | y_range = int((y_max-y_min)/bin_size) 500 | grid = np.zeros((x_range, y_range, probas.shape[2]), np.float32) 501 | 502 | vote_collect_radius_sq = vote_collect_radius * vote_collect_radius 503 | 504 | # Update x/y max to correspond to the end of the last bin. 505 | x_max = x_min + x_range*bin_size 506 | y_max = y_min + y_range*bin_size 507 | 508 | # Where we collect the outputs. 509 | all_dets = [] 510 | all_grids = [] 511 | 512 | # Iterate over the scans. TODO: We can do most of this outside the looping too, actually. 513 | for iscan, (x, y, probs) in enumerate(zip(xs, ys, probas)): 514 | # Clear the grid, for each scan its own. 515 | grid.fill(0) 516 | all_dets.append([]) 517 | 518 | # Filter out all the super-weak votes, as they wouldn't contribute much anyways 519 | # but waste time. 520 | voters_idxs = np.where(np.sum(probs[:,1:], axis=-1) > min_thresh)[0] 521 | # voters_idxs = np.where(probs[:,0] < 1-min_thresh)[0] 522 | # voters_idxs = np.where(np.any(probs[:,1:] > min_thresh, axis=-1))[0] 523 | 524 | # No voters, early bail 525 | if not len(voters_idxs): 526 | if retgrid: 527 | all_grids.append(np.array(grid)) # Be sure to make a copy. 528 | continue 529 | 530 | x = x[voters_idxs] 531 | y = y[voters_idxs] 532 | probs = probs[voters_idxs] 533 | 534 | # Convert x/y to grid-cells. 535 | x_idx = np.int64((x-x_min)/bin_size) 536 | y_idx = np.int64((y-y_min)/bin_size) 537 | 538 | # Discard data outside of the window. 
539 | mask = (0 <= x_idx) & (x_idx < x_range) & (0 <= y_idx) & (y_idx < y_range) 540 | x_idx = x_idx[mask] 541 | x = x[mask] 542 | y_idx = y_idx[mask] 543 | y = y[mask] 544 | probs = probs[mask] 545 | 546 | # Vote into the grid, including the agnostic vote as sum of class-votes! 547 | #TODO Do we need the class grids? 548 | np.add.at(grid, [x_idx, y_idx], np.concatenate([np.sum(probs[:,1:], axis=-1, keepdims=True), probs[:,1:]], axis=-1)) 549 | 550 | # Find the maxima (NMS) only in the "common" voting grid. 551 | grid_all = grid[:,:,0] 552 | if blur_win is not None and blur_sigma is not None: 553 | grid_all = cv2.GaussianBlur(grid_all, (blur_win,blur_win), blur_sigma) 554 | max_grid = scipy.ndimage.maximum_filter(grid_all, size=3) 555 | maxima = (grid_all == max_grid) & (grid_all > 0) 556 | m_x, m_y = np.where(maxima) 557 | 558 | if len(m_x) == 0: 559 | if retgrid: 560 | all_grids.append(np.array(grid)) # Be sure to make a copy. 561 | continue 562 | 563 | # Back from grid-bins to real-world locations. 564 | m_x = m_x*bin_size + x_min + bin_size/2 565 | m_y = m_y*bin_size + y_min + bin_size/2 566 | 567 | # For each vote, get which maximum/detection it contributed to. 568 | # Shape of `center_dist` (ndets, voters) and outer is (voters) 569 | center_dist = np.square(x - m_x[:,None]) + np.square(y - m_y[:,None]) 570 | det_voters = np.argmin(center_dist, axis=0) 571 | 572 | # Generate the final detections by average over their voters. 573 | for ipeak in range(len(m_x)): 574 | my_voter_idxs = np.where(det_voters == ipeak)[0] 575 | my_voter_idxs = my_voter_idxs[center_dist[ipeak, my_voter_idxs] < vote_collect_radius_sq] 576 | all_dets[-1].append(vote_combiner(x[my_voter_idxs], y[my_voter_idxs], probs[my_voter_idxs,:])) 577 | 578 | if retgrid: 579 | all_grids.append(np.array(grid)) # Be sure to make a copy. 
580 | 581 | if retgrid: 582 | return all_dets, all_grids 583 | return all_dets 584 | 585 | 586 | # Convert it to flat `x`, `y`, `probs` arrays and an extra `frame` array, 587 | # which is the index they had in the first place. 588 | def deep2flat(dets): 589 | all_x, all_y, all_p, all_frames = [], [], [], [] 590 | for i, ds in enumerate(dets): 591 | for (x, y, p) in ds: 592 | all_x.append(x) 593 | all_y.append(y) 594 | all_p.append(p) 595 | all_frames.append(i) 596 | return np.array(all_x), np.array(all_y), np.array(all_p), np.array(all_frames) 597 | 598 | 599 | # Same but slightly different for the ground-truth. 600 | def deep2flat_gt(gts, radius): 601 | all_x, all_y, all_r, all_frames = [], [], [], [] 602 | for i, gt in enumerate(gts): 603 | for (r, phi) in gt: 604 | x, y = rphi_to_xy(r, phi) 605 | all_x.append(x) 606 | all_y.append(y) 607 | all_r.append(radius) 608 | all_frames.append(i) 609 | return np.array(all_x), np.array(all_y), np.array(all_r), np.array(all_frames) 610 | 611 | 612 | def prec_rec_2d(det_scores, det_coords, det_frames, gt_coords, gt_frames, gt_radii): 613 | """ Computes full precision-recall curves at all possible thresholds. 614 | 615 | Arguments: 616 | - `det_scores` (D,) array containing the scores of the D detections. 617 | - `det_coords` (D,2) array containing the (x,y) coordinates of the D detections. 618 | - `det_frames` (D,) array containing the frame number of each of the D detections. 619 | - `gt_coords` (L,2) array containing the (x,y) coordinates of the L labels (ground-truth detections). 620 | - `gt_frames` (L,) array containing the frame number of each of the L labels. 621 | - `gt_radii` (L,) array containing the radius at which each of the L labels should consider detection associations. 622 | This will typically just be an np.full_like(gt_frames, 0.5) or similar, 623 | but could vary when mixing classes, for example. 
624 | 625 | Returns: (recs, precs, threshs) 626 | - `threshs`: (D,) array of sorted thresholds (scores), from higher to lower. 627 | - `recs`: (D,) array of recall scores corresponding to the thresholds. 628 | - `precs`: (D,) array of precision scores corresponding to the thresholds. 629 | """ 630 | # This means that all reported detection frames which are not in ground-truth frames 631 | # will be counted as false-positives. 632 | # TODO: do some sanity-checks in the "linearization" functions before calling `prec_rec_2d`. 633 | frames = np.unique(np.r_[det_frames, gt_frames]) 634 | 635 | det_accepted_idxs = defaultdict(list) 636 | tps = np.zeros(len(frames), dtype=np.uint32) 637 | fps = np.zeros(len(frames), dtype=np.uint32) 638 | fns = np.array([np.sum(gt_frames == f) for f in frames], dtype=np.uint32) 639 | 640 | precs = np.full_like(det_scores, np.nan) 641 | recs = np.full_like(det_scores, np.nan) 642 | threshs = np.full_like(det_scores, np.nan) 643 | 644 | indices = np.argsort(det_scores, kind='mergesort') # mergesort for determinism. 645 | for i, idx in enumerate(reversed(indices)): 646 | frame = det_frames[idx] 647 | iframe = np.where(frames == frame)[0][0] # Can only be a single one. 648 | 649 | # Accept this detection 650 | dets_idxs = det_accepted_idxs[frame] 651 | dets_idxs.append(idx) 652 | threshs[i] = det_scores[idx] 653 | 654 | dets = det_coords[dets_idxs] 655 | 656 | gts_mask = gt_frames == frame 657 | gts = gt_coords[gts_mask] 658 | radii = gt_radii[gts_mask] 659 | 660 | if len(gts) == 0: # No GT, but there is a detection. 661 | fps[iframe] += 1 662 | else: # There is GT and detection in this frame. 663 | not_in_radius = radii[:,None] < cdist(gts, dets) # -> ngts x ndets, True (=1) if too far, False (=0) if may match. 
664 | igt, idet = linear_sum_assignment(not_in_radius) 665 | 666 | tps[iframe] = np.sum(np.logical_not(not_in_radius[igt, idet])) # Could match within radius 667 | fps[iframe] = len(dets) - tps[iframe] # NB: dets is only the so-far accepted. 668 | fns[iframe] = len(gts) - tps[iframe] 669 | 670 | tp, fp, fn = np.sum(tps), np.sum(fps), np.sum(fns) 671 | precs[i] = tp/(fp+tp) if fp+tp > 0 else np.nan 672 | recs[i] = tp/(fn+tp) if fn+tp > 0 else np.nan 673 | 674 | return recs, precs, threshs 675 | 676 | 677 | def _prepare_prec_rec_softmax(scans, pred_offs): 678 | angles = laser_angles(scans.shape[-1])[None,:] 679 | return rphi_to_xy(*win2global(scans, angles, pred_offs[:,:,0], pred_offs[:,:,1])) 680 | 681 | def _prepare_prec_rec_sigmoids(scans, pred_offs, pred_conf): 682 | angles = laser_angles(scans.shape[-1])[None,:] 683 | x1, y1 = rphi_to_xy(*win2global(scans, angles, pred_offs[:,:,0], pred_offs[:,:,1])) 684 | x2, y2 = rphi_to_xy(*win2global(scans, angles, pred_offs[:,:,2], pred_offs[:,:,3])) 685 | x3, y3 = rphi_to_xy(*win2global(scans, angles, pred_offs[:,:,4], pred_offs[:,:,5])) 686 | x = np.c_[x1, x2, x3] 687 | y = np.c_[y1, y2, y3] 688 | zero = np.zeros_like(pred_conf[:,:,1]) 689 | pred_conf = np.concatenate([ 690 | np.stack([1-pred_conf[:,:,1], pred_conf[:,:,1], zero, zero], axis=2), 691 | np.stack([1-pred_conf[:,:,2], zero, pred_conf[:,:,2], zero], axis=2), 692 | np.stack([1-pred_conf[:,:,3], zero, zero, pred_conf[:,:,3]], axis=2), 693 | ], axis=1) 694 | return x, y, pred_conf 695 | 696 | def _process_detections(det_x, det_y, det_p, det_f, wcs, was, wps, eval_r): 697 | allgts = [wc+wa+wp for wc, wa, wp in zip(wcs, was, wps)] 698 | gts_x, gts_y, gts_r, gts_f = deep2flat_gt(allgts, radius=eval_r) 699 | wd_r, wd_p, wd_t = prec_rec_2d(np.sum(det_p[:,1:], axis=1), np.c_[det_x, det_y], det_f, np.c_[gts_x, gts_y], gts_f, gts_r) 700 | gts_x, gts_y, gts_r, gts_f = deep2flat_gt(wcs, radius=eval_r) 701 | # TODO possibly speed up the below significantly since a lot of 
them have 0 probability by design in some cases and can be dropped. 702 | wc_r, wc_p, wc_t = prec_rec_2d(det_p[:,1], np.c_[det_x, det_y], det_f, np.c_[gts_x, gts_y], gts_f, gts_r) 703 | gts_x, gts_y, gts_r, gts_f = deep2flat_gt(was, radius=eval_r) 704 | wa_r, wa_p, wa_t = prec_rec_2d(det_p[:,2], np.c_[det_x, det_y], det_f, np.c_[gts_x, gts_y], gts_f, gts_r) 705 | gts_x, gts_y, gts_r, gts_f = deep2flat_gt(wps, radius=eval_r) 706 | wp_r, wp_p, wp_t = prec_rec_2d(det_p[:,3], np.c_[det_x, det_y], det_f, np.c_[gts_x, gts_y], gts_f, gts_r) 707 | 708 | return [wd_r, wd_p, wd_t], [wc_r, wc_p, wc_t], [wa_r, wa_p, wa_t], [wp_r, wp_p, wp_t] 709 | 710 | def _process_detections_2class(det_x, det_y, det_p, det_f, wcs, was, eval_r): 711 | allgts = [wc+wa for wc, wa in zip(wcs, was)] 712 | gts_x, gts_y, gts_r, gts_f = deep2flat_gt(allgts, radius=eval_r) 713 | wd_r, wd_p, wd_t = prec_rec_2d(np.sum(det_p[:,1:], axis=1), np.c_[det_x, det_y], det_f, np.c_[gts_x, gts_y], gts_f, gts_r) 714 | gts_x, gts_y, gts_r, gts_f = deep2flat_gt(wcs, radius=eval_r) 715 | wc_r, wc_p, wc_t = prec_rec_2d(det_p[:,1], np.c_[det_x, det_y], det_f, np.c_[gts_x, gts_y], gts_f, gts_r) 716 | gts_x, gts_y, gts_r, gts_f = deep2flat_gt(was, radius=eval_r) 717 | wa_r, wa_p, wa_t = prec_rec_2d(det_p[:,2], np.c_[det_x, det_y], det_f, np.c_[gts_x, gts_y], gts_f, gts_r) 718 | 719 | return [wd_r, wd_p, wd_t], [wc_r, wc_p, wc_t], [wa_r, wa_p, wa_t] 720 | 721 | 722 | def comp_prec_rec_softmax(scans, wcs, was, wps, pred_conf, pred_offs, eval_r=0.5, **v2d_kw): 723 | x, y = _prepare_prec_rec_softmax(scans, pred_offs) 724 | det_x, det_y, det_p, det_f = deep2flat(votes_to_detections2(x, y, pred_conf, **v2d_kw)) 725 | 726 | return _process_detections(det_x, det_y, det_p, det_f, wcs, was, wps, eval_r) 727 | 728 | 729 | 730 | def comp_prec_rec_softmax2(scans, wcs, was, wps, pred_conf, pred_offs, eval_r=0.5, **v2d_kw): 731 | x, y = _prepare_prec_rec_softmax(scans, pred_offs) 732 | det_x, det_y, det_p, det_f = 
deep2flat(votes_to_detections3(x, y, pred_conf, **v2d_kw)) 733 | 734 | return _process_detections(det_x, det_y, det_p, det_f, wcs, was, wps, eval_r) 735 | 736 | 737 | def comp_prec_rec_sigmoids(scans, wcs, was, wps, pred_conf, pred_offs, eval_r=0.5, **v2d_kw): 738 | x, y, pred_conf = _prepare_prec_rec_sigmoids(scans, pred_offs, pred_conf) 739 | det_x, det_y, det_p, det_f = deep2flat(votes_to_detections2(x, y, pred_conf, **v2d_kw)) 740 | 741 | return _process_detections(det_x, det_y, det_p, det_f, wcs, was, wps, eval_r) 742 | 743 | 744 | def comp_prec_rec_sigmoids2(scans, wcs, was, wps, pred_conf, pred_offs, eval_r=0.5, **v2d_kw): 745 | x, y, pred_conf = _prepare_prec_rec_sigmoids(scans, pred_offs, pred_conf) 746 | det_x, det_y, det_p, det_f = deep2flat(votes_to_detections3(x, y, pred_conf, **v2d_kw)) 747 | 748 | return _process_detections(det_x, det_y, det_p, det_f, wcs, was, wps, eval_r) 749 | 750 | 751 | def peakf1(recs, precs): 752 | return np.max(2*precs*recs/np.clip(precs+recs, 1e-16, 2+1e-16)) 753 | 754 | 755 | def eer(recs, precs): 756 | # Find the first nonzero or else (0,0) will be the EER :) 757 | def first_nonzero_idx(arr): 758 | return np.where(arr != 0)[0][0] 759 | 760 | p1 = first_nonzero_idx(precs) 761 | r1 = first_nonzero_idx(recs) 762 | idx = np.argmin(np.abs(precs[p1:] - recs[r1:])) 763 | return (precs[p1+idx] + recs[r1+idx])/2 # They are often the exact same, but if not, use average. 
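
The two summary metrics above are easy to sanity-check on a toy curve. The following is a minimal sketch, assuming only NumPy: it repeats `peakf1` and `eer` as defined above so that it runs on its own, and the `recs`/`precs` values are made up purely for illustration, not taken from any DROW result.

```python
import numpy as np


def peakf1(recs, precs):
    # Same as above: the best F1 score over all points of the curve.
    return np.max(2*precs*recs/np.clip(precs+recs, 1e-16, 2+1e-16))


def eer(recs, precs):
    # Same as above: the point where precision and recall are closest.
    def first_nonzero_idx(arr):
        return np.where(arr != 0)[0][0]

    p1 = first_nonzero_idx(precs)
    r1 = first_nonzero_idx(recs)
    idx = np.argmin(np.abs(precs[p1:] - recs[r1:]))
    return (precs[p1+idx] + recs[r1+idx])/2


# Hypothetical curve: recall rises while precision falls as the threshold drops.
recs = np.array([0.2, 0.4, 0.6, 0.8, 1.0])
precs = np.array([1.0, 0.9, 0.6, 0.5, 0.4])

best_f1 = peakf1(recs, precs)    # best point is (rec=0.8, prec=0.5), i.e. 8/13
equal_error = eer(recs, precs)   # precision equals recall at 0.6
print(best_f1, equal_error)
```

Both functions only combine `recs` and `precs` symmetrically, so the argument order does not affect the result.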
764 | 765 | 766 | def plot_prec_rec(wds, wcs, was, wps, figsize=(15,10), title=None): 767 | fig, ax = plt.subplots(figsize=figsize) 768 | 769 | ax.plot(*wds[:2], label='agn (AUC: {:.1%}, F1: {:.1%}, EER: {:.1%})'.format(auc(*wds[:2], reorder=True), peakf1(*wds[:2]), eer(*wds[:2])), c='#E24A33') 770 | ax.plot(*wcs[:2], label='wcs (AUC: {:.1%}, F1: {:.1%}, EER: {:.1%})'.format(auc(*wcs[:2], reorder=True), peakf1(*wcs[:2]), eer(*wcs[:2])), c='#348ABD') 771 | ax.plot(*was[:2], label='was (AUC: {:.1%}, F1: {:.1%}, EER: {:.1%})'.format(auc(*was[:2], reorder=True), peakf1(*was[:2]), eer(*was[:2])), c='#988ED5') 772 | ax.plot(*wps[:2], label='wps (AUC: {:.1%}, F1: {:.1%}, EER: {:.1%})'.format(auc(*wps[:2], reorder=True), peakf1(*wps[:2]), eer(*wps[:2])), c='#8EBA42') 773 | 774 | if title is not None: 775 | fig.suptitle(title, fontsize=16, y=0.91) 776 | 777 | prettify_pr_curve(ax) 778 | lbplt.fatlegend(ax, loc='upper right') 779 | return fig, ax 780 | 781 | # results = comp_prec_rec(va, pred_yva_conf, pred_yva_offs, blur_win=5, blur_sigma=1) 782 | # fig, ax = plot_prec_rec(*results, title='WNet3x 50ep Adam+decay') 783 | 784 | 785 | def votes_to_detections3(xs, ys, probas, min_thresh=1e-5, 786 | bin_size=0.1, blur_win=21, blur_sigma=2.0, 787 | x_min=-15.0, x_max=15.0, y_min=-5.0, y_max=15.0, 788 | nms_radius=0.2, vote_collect_radius=0.3, 789 | weighted_avg=False, retgrid=False, 790 | class_weights=None): 791 | ''' 792 | Convert a list of votes to a list of detections based on Non-Max suppression. 793 | This version uses a separate voting grid for each class, thus needing an 794 | additional nms step at the end. 795 | 796 | - `bin_size` the bin size (in meters) used for the grid where votes are cast. 797 | - `blur_win` the window size (in bins) used to blur the voting grid. 798 | - `blur_sigma` the sigma used to compute the Gaussian in the blur window. 799 | - `x_min` the left limit for the voting grid, in meters. 
800 | - `x_max` the right limit for the voting grid, in meters. 801 | - `y_min` the bottom limit for the voting grid in meters. 802 | - `y_max` the top limit for the voting grid in meters. 803 | - `nms_radius` the radius used to suppress less confident maxima. 804 | - `vote_collect_radius` the radius used during the collection of votes assigned 805 | to each detection. 806 | 807 | Returns a list of tuples (x,y,probs) where `probs` has the same layout as 808 | `probas`. 809 | ''' 810 | if class_weights is not None: 811 | probas = np.array(probas) # Make a copy. 812 | probas[:,:,1:] *= class_weights 813 | x_range = int((x_max-x_min)/bin_size) 814 | y_range = int((y_max-y_min)/bin_size) 815 | grid = np.zeros((x_range, y_range, probas.shape[2]-1), np.float32) 816 | 817 | # Fix the blur_win and blur_sigma if they are scalars 818 | 819 | if isinstance(blur_win, (int, float)) or blur_win is None: 820 | blur_win = [blur_win] * (probas.shape[2]-1) 821 | blur_win = np.asarray(blur_win) 822 | if len(blur_win) != (probas.shape[2] - 1): 823 | raise ValueError('Blur window size has to be a scalar or an array with the ' 824 | 'length corresponding to the class count') 825 | if isinstance(blur_sigma, (int, float)) or blur_sigma is None: 826 | blur_sigma = [blur_sigma] * (probas.shape[2]-1) 827 | blur_sigma = np.asarray(blur_sigma) 828 | if len(blur_sigma) != (probas.shape[2] - 1): 829 | raise ValueError('Blur sigma has to be a scalar or an array with the ' 830 | 'length corresponding to the class count') 831 | 832 | vote_collect_radius_sq = vote_collect_radius * vote_collect_radius 833 | 834 | # Update x/y max to correspond to the end of the last bin. 835 | x_max = x_min + x_range*bin_size 836 | y_max = y_min + y_range*bin_size 837 | 838 | # Where we collect the outputs. 839 | all_dets = [] 840 | all_grids = [] 841 | 842 | # Iterate over the scans. 843 | for iscan, (x, y, probs) in enumerate(zip(xs, ys, probas)): 844 | # Clear the grid, for each scan its own. 
845 | grid.fill(0) 846 | dets_current = [] 847 | 848 | # Filter out all the super-weak votes, as they wouldn't contribute much anyways 849 | # but waste time. 850 | voters_idxs = np.where(np.sum(probs[:,1:], axis=-1) > min_thresh)[0] 851 | 852 | # No voters, early bail 853 | if not len(voters_idxs): 854 | if retgrid: 855 | all_grids.append(np.array(grid)) # Be sure to make a copy. 856 | continue 857 | 858 | x = x[voters_idxs] 859 | y = y[voters_idxs] 860 | probs = probs[voters_idxs] 861 | 862 | # Convert x/y to grid-cells. 863 | x_idx = np.int64((x-x_min)/bin_size) 864 | y_idx = np.int64((y-y_min)/bin_size) 865 | 866 | mask = (0 <= x_idx) & (x_idx < x_range) & (0 <= y_idx) & (y_idx < y_range) 867 | x_idx = x_idx[mask] 868 | x = x[mask] 869 | y_idx = y_idx[mask] 870 | y = y[mask] 871 | probs = probs[mask] 872 | 873 | # Vote into the grid, including the agnostic vote as sum of class-votes! 874 | np.add.at(grid, [x_idx, y_idx], probs[:,1:]) 875 | 876 | # Loop over all classes: 877 | for c in range(probas.shape[2]-1): 878 | grid_c = grid[..., c] 879 | if blur_win[c] is not None and blur_sigma is not None: 880 | grid_c = cv2.GaussianBlur(grid_c, (blur_win[c],blur_win[c]), blur_sigma[c]) 881 | max_grid = scipy.ndimage.maximum_filter(grid_c, size=3) 882 | maxima = (grid_c == max_grid) & (grid_c > 0) 883 | m_x, m_y = np.where(maxima) 884 | 885 | if len(m_x) == 0: 886 | continue 887 | 888 | # Back from grid-bins to real-world locations. 889 | m_x = m_x*bin_size + x_min + bin_size/2 890 | m_y = m_y*bin_size + y_min + bin_size/2 891 | 892 | # For each vote, get which maximum/detection it contributed to. 893 | # Shape of `center_dist` (ndets, voters) and outer is (voters) 894 | center_dist = np.square(x - m_x[:,None]) + np.square(y - m_y[:,None]) 895 | det_voters = np.argmin(center_dist, axis=0) 896 | 897 | # Generate the final detections by average over their voters. 
898 | for ipeak in range(len(m_x)): 899 | # Compute the vote indices, take the closest, but only within a radius. 900 | my_voter_idxs = np.where(det_voters == ipeak)[0] 901 | my_voter_idxs = my_voter_idxs[center_dist[ipeak, my_voter_idxs] < vote_collect_radius_sq] 902 | 903 | # Compute the final output for x, y, and probs. 904 | p = probs[my_voter_idxs, c + 1] 905 | if weighted_avg: 906 | norm = 1 / np.sum(p) 907 | new_x = np.sum(x[my_voter_idxs] * p) * norm 908 | new_y = np.sum(y[my_voter_idxs] * p) * norm 909 | p = np.sum(p * p) * norm 910 | else: 911 | new_x = np.mean(x[my_voter_idxs]) 912 | new_y = np.mean(y[my_voter_idxs]) 913 | p = np.mean(p) 914 | p_padded = np.zeros_like(probs[0]) 915 | p_padded[c + 1] = p 916 | dets_current.append((new_x, new_y, p_padded)) 917 | 918 | # Perform nms on the resulting detections 919 | keep = np.full([len(dets_current)], True, dtype=bool) 920 | if nms_radius > 0 and len(dets_current) > 0: 921 | # Store them in a slightly easier format 922 | all_det_xyp = np.stack([[d[0], d[1], np.max(d[2])] for d in dets_current]) 923 | 924 | # Compute the distances between all of them 925 | dist = cdist(all_det_xyp[:,:2], all_det_xyp[:,:2]) 926 | 927 | # Set those that don't influence each other to -1 928 | dist[dist > nms_radius] = -1 929 | dist -= np.eye(len(dist)) 930 | 931 | # Sort them from strongest to weakest detections. 932 | det_indices = np.argsort(-all_det_xyp[:,2]) 933 | for d in det_indices: 934 | if not keep[d]: 935 | continue 936 | # suppress other detections with lower probability 937 | neighbor_mask = dist[d] > -1 938 | suppresable_mask = all_det_xyp[:, 2] <= all_det_xyp[d, 2] 939 | discard = np.logical_and(np.logical_and(neighbor_mask, suppresable_mask), keep) 940 | for i in np.where(discard)[0]: 941 | keep[i] = False 942 | dist[i, :] = 0 943 | dist[:, i] = 0 944 | 945 | # Store those which passed through the nms.
946 | all_dets.append([d for d, k in zip(dets_current, keep) if k]) 947 | 948 | if retgrid: 949 | all_grids.append(np.array(grid)) # Be sure to make a copy. 950 | 951 | if retgrid: 952 | return all_dets, all_grids 953 | return all_dets 954 | 955 | 956 | def subsample_pr(precision, recall, dist_threshold): 957 | p_sample = [precision[0]] 958 | r_sample = [recall[0]] 959 | for p, r in zip(precision[1:], recall[1:]): 960 | if (np.square(p_sample[-1] - p) + np.square(r_sample[-1] - r)) > dist_threshold * dist_threshold: 961 | p_sample.append(p) 962 | r_sample.append(r) 963 | return np.asarray(p_sample), np.asarray(r_sample) 964 | 965 | 966 | def dump_paper_pr_curves(dump_file, precision, recall, 967 | store_prec=3, 968 | fast=0.01, fast_postfix='_fast.csv', 969 | slow=0.001, slow_postfix='.csv', 970 | meta_postfix='_meta.csv'): 971 | header = 'prec,rec' 972 | pr_fast = np.asarray(subsample_pr(precision, recall, dist_threshold=fast)).T * 100 973 | pr_slow = np.asarray(subsample_pr(precision, recall, dist_threshold=slow)).T * 100 974 | a = auc(recall, precision, reorder=True) * 100 975 | f1 = peakf1(recall, precision) * 100 976 | e = eer(recall, precision) * 100 977 | meta = np.asarray([a, f1, e])[None] 978 | 979 | np.savetxt(dump_file + fast_postfix, pr_fast, fmt='%.{}f'.format(store_prec), delimiter=',', header=header, comments='') 980 | np.savetxt(dump_file + slow_postfix, pr_slow, fmt='%.{}f'.format(store_prec), delimiter=',', header=header, comments='') 981 | np.savetxt(dump_file + meta_postfix, meta, fmt='%.{}f'.format(store_prec), delimiter=',', header='auc,f1,eer', comments='') 982 | 983 | 984 | import signal 985 | import multiprocessing 986 | 987 | class BackgroundFunction: 988 | def __init__(self, function, prefetch_count, reseed=True, **kwargs): 989 | """Parallelize a function to prefetch results using multiple processes. 990 | Args: 991 | function: Function to be executed in parallel. 992 | prefetch_count: Number of samples to prefetch.
993 | kwargs: Keyword args passed to the executed function. 994 | 995 | NOTE: This is taken from Alexander Hermans at 996 | https://github.com/Pandoro/tools/blob/master/utils.py 997 | """ 998 | self.function = function 999 | self.prefetch_count = prefetch_count 1000 | self.kwargs = kwargs 1001 | self.output_queue = multiprocessing.Queue(maxsize=prefetch_count) 1002 | self.procs = [] 1003 | for i in range(self.prefetch_count): 1004 | p = multiprocessing.Process( 1005 | target=BackgroundFunction._compute_next, 1006 | args=(self.function, self.kwargs, self.output_queue, reseed)) 1007 | p.daemon = True # To ensure it is killed if the parent dies. 1008 | p.start() 1009 | self.procs.append(p) 1010 | 1011 | def fill_status(self, normalize=False): 1012 | """Returns the fill status of the underlying queue. 1013 | Args: 1014 | normalize: If set to True, normalize the fill status by the max 1015 | queue size. Defaults to False. 1016 | Returns: 1017 | The possibly normalized fill status of the underlying queue. 1018 | """ 1019 | return (self.output_queue.qsize() / 1020 | (self.prefetch_count if normalize else 1)) 1021 | 1022 | def __call__(self): 1023 | """Obtain one of the prefetched results or wait for one. 1024 | Returns: 1025 | The output of the provided function and the given keyword args. 1026 | """ 1027 | output = self.output_queue.get(block=True) 1028 | return output 1029 | 1030 | def __del__(self): 1031 | """Signal the processes to stop and join them.""" 1032 | for p in self.procs: 1033 | p.terminate() 1034 | p.join() 1035 | 1036 | def _compute_next(function, kwargs, output_queue, reseed): 1037 | """Helper function to do the actual computation in a non-blocking way. 1038 | Since this will always run in a new process, we ignore the interrupt 1039 | signal for the processes. This should be handled by the parent process 1040 | which kills the children when the object is deleted.
1041 | Some more discussion can be found here: 1042 | https://stackoverflow.com/questions/1408356/keyboard-interrupts-with-pythons-multiprocessing-pool 1043 | """ 1044 | signal.signal(signal.SIGINT, signal.SIG_IGN) 1045 | 1046 | if reseed: 1047 | np.random.seed() 1048 | 1049 | while True: 1050 | output_queue.put(function(**kwargs)) 1051 | 1052 | 1053 | #### 1054 | # FOR FINAL EXPS 1055 | ### 1056 | 1057 | def generate_votes(scan, wcs, was, wps, rwc=0.6, rwa=0.4, rwp=0.35, lblwc=1, lblwa=2, lblwp=3): 1058 | N = len(scan) 1059 | y_conf = np.zeros( N, dtype=np.int64) 1060 | y_offs = np.zeros((N, 2), dtype=np.float32) 1061 | 1062 | alldets = list(wcs) + list(was) + list(wps) 1063 | radii = [rwc]*len(wcs) + [rwa]*len(was) + [rwp]*len(wps) 1064 | dets = closest_detection(scan, alldets, radii) 1065 | labels = [0] + [lblwc]*len(wcs) + [lblwa]*len(was) + [lblwp]*len(wps) 1066 | 1067 | for i, (r, phi) in enumerate(zip(scan, laser_angles(N))): 1068 | if 0 < dets[i]: 1069 | y_conf[i] = labels[dets[i]] 1070 | y_offs[i,:] = global2win(r, phi, *alldets[dets[i]-1]) 1071 | 1072 | return y_conf, y_offs 1073 | 1074 | 1075 | from functools import partial 1076 | 1077 | 1078 | class Dataset: 1079 | def __init__(self, filenames, DATADIR, LABELDIR, **votegenkw): 1080 | self.scansns, self.scants, self.scans = zip(*[load_scan(f + '.csv') for f in filenames]) 1081 | self.detsns, self.wcdets, self.wadets, self.wpdets = zip(*map( 1082 | lambda f: load_dets(f, DATADIR, LABELDIR), filenames)) 1083 | self.odoms = [load_odom(f + '.odom2') for f in filenames] 1084 | 1085 | # Pre-compute mappings from detection index to scan index. 1086 | self.idet2iscan = [{i: np.where(ss == d)[0][0] for i, d in enumerate(ds)} 1087 | for ss, ds in zip(self.scansns, self.detsns)] 1088 | 1089 | # This is in order to pick uniformly across annotated scans further down. 1090 | # It's significantly faster this way than computing sequence-probabilities and sampling that way. 
1091 | self._seq_picker = np.concatenate([[i]*len(sns) for i, sns in enumerate(self.detsns)]) 1092 | 1093 | # Targets. Kinda ugly, but at least correct, unlike what I had before: pretty but wrong! 1094 | self.y_conf, self.y_offs = [], [] 1095 | for iseq, detsns in enumerate(self.detsns): 1096 | y_confs, y_offss = [], [] 1097 | for idet, detsn in enumerate(detsns): 1098 | y_conf, y_offs = generate_votes( 1099 | self.scans[iseq][self.idet2iscan[iseq][idet]], 1100 | self.wcdets[iseq][idet], self.wadets[iseq][idet], self.wpdets[iseq][idet], 1101 | **votegenkw) 1102 | y_confs.append(y_conf) 1103 | y_offss.append(y_offs) 1104 | self.y_conf.append(np.array(y_confs)) 1105 | self.y_offs.append(np.array(y_offss)) 1106 | 1107 | def random_index(self, min_before=0): 1108 | iseq = np.random.choice(len(self._probs), p=self._probs) 1109 | iscan = min_before + np.random.choice(len(self.scans[iseq]) - min_before) 1110 | return iseq, iscan 1111 | 1112 | def random_labelled_index(self, min_before=0): 1113 | iseq = np.random.choice(self._seq_picker) 1114 | detsns = self.detsns[iseq] 1115 | 1116 | # Figure out for how many labelled scans we don't have enough scans before. 1117 | # Do so using the sequence-number, and we know they are sorted. 1118 | scan0 = self.scansns[iseq][0] 1119 | for skip, sn in enumerate(detsns): 1120 | if scan0 <= sn: 1121 | break 1122 | idet = np.random.randint(skip, len(detsns)) 1123 | return iseq, idet, self.idet2iscan[iseq][idet] 1124 | 1125 | 1126 | def cutout(scans, odoms, ipoint, out=None, odom='rot-rel', win_sz=1.66, thresh_dist=1, 1127 | center='point', center_time='now', value='donut', nsamp=48, UNK=29.99, laserIncrement=laserIncrement): 1128 | """ TODO: Probably we can still try to clean this up more. 1129 | This function here only creates a single cut-out; for training, 1130 | we'd want to get a batch of cutouts from each seq (can vectorize) and for testing 1131 | we'd want all cutouts for one scan, which we can vectorize too. 
1132 | But ain't got time for this shit! 1133 | 1134 | Args: 1135 | - scans: (T,N) the T scans (of scansize N) to cut out from, `T=-1` being the "current time". 1136 | - out: None or a (T,nsamp) buffer where to store the cutouts. 1137 | """ 1138 | T, N = scans.shape 1139 | 1140 | # Compute the size (width) of the window at the last time index: 1141 | z = scans[-1,ipoint] 1142 | half_alpha = float(np.arctan(0.5*win_sz/z)) 1143 | 1144 | # Pre-allocate some buffers 1145 | out = np.zeros((T,nsamp), np.float32) if out is None else out 1146 | SCANBUF = np.full(N+1, UNK, np.float32) # Pad by UNK for the border-padding by UNK. 1147 | for t in range(T): 1148 | # If necessary, compute the odometry of the current time relative to the "key" one. 1149 | # TODO: in principle we could also interpolate using the time, since they don't 1150 | # *exactly* line up with the scan's times. 1151 | if odom is not False: 1152 | odom_x, odom_y, odom_a = map(float, odoms[t]['xya'] - odoms[-1]['xya']) 1153 | else: 1154 | odom_x, odom_y, odom_a = 0.0, 0.0, 0.0 1155 | 1156 | # Compute the start and end indices of points in the scan to be considered. 1157 | start = int(round(ipoint - half_alpha/laserIncrement - odom_a/laserIncrement)) 1158 | end = int(round(ipoint + half_alpha/laserIncrement - odom_a/laserIncrement)) 1159 | 1160 | # Now compute the list of indices at which to take the points, 1161 | # using -1/end to access out-of-bounds which has been set to UNK. 1162 | support_points = np.arange(start, end+1) 1163 | support_points.clip(-1, len(SCANBUF)-1, out=support_points) 1164 | 1165 | # Write the scan into the buffer which has UNK at the end and then sample from it. 1166 | SCANBUF[:-1] = scans[t] 1167 | cutout = SCANBUF[support_points] 1168 | 1169 | # Now in case we want to apply "translation" odometry, the best effort we can do, 1170 | # is to project the relative odometry onto the "front" vector of the cutout, and 1171 | # apply that onto the radius (`z`) of the points. 
1172 | # TODO: Maybe we can actually do better in the 'undistorted' case below. 1173 | if odom in ('full', 'full-rel'): 1174 | #cutout += np.dot([np.cos(odom_a), np.sin(odom_a)], [odom_x, odom_y]) 1175 | cutout += np.cos(odom_a)*odom_x + np.sin(odom_a)*odom_y 1176 | 1177 | # Now we do the resampling of the cutout to a fixed number of points. We can do it two ways: 1178 | if 'undistorted' in value: 1179 | # In the 'undistorted' case, we actually use x/y cartesian space, i.e. the cut-out 1180 | # is not arc-shaped but really rectangle-shaped. 1181 | # For doing this, we need real interpolation-functionality since even the 'x' axis 1182 | # will be converted non-linearly from angles to points on a line. 1183 | dcorr_a = np.linspace(-half_alpha, half_alpha, len(cutout))# - odom_a 1184 | y = np.cos(dcorr_a) * cutout 1185 | x = np.sin(dcorr_a) * cutout 1186 | kw = {'fill_value': 'extrapolate'} if '(extra)' in value else {'bounds_error': False, 'fill_value': UNK} 1187 | interp = scipy.interpolate.interp1d(x, y, assume_sorted=False, copy=False, kind='linear', **kw) 1188 | out[t] = interp(np.linspace(-0.5*win_sz, 0.5*win_sz, nsamp)) 1189 | else: 1190 | # In the other case, we have a somewhat distorted world-view as the x-indices 1191 | # correspond to angles and the values to z-distances (radii) as in original DROW. 1192 | # The advantage here is we can use the much faster OpenCV resizing functions. 1193 | interp = cv2.INTER_AREA if nsamp < len(cutout) else cv2.INTER_LINEAR 1194 | cv2.resize(cutout[None], (nsamp,1), interpolation=interp, dst=out[None,t]) 1195 | 1196 | # Now we choose where to center the depth at before we will re-center/clip. 1197 | if center_time == 'each': 1198 | z = scans[t][ipoint] 1199 | 1200 | # Clip things too close and too far to create the "focus tunnel" since they are likely irrelevant. 
1201 | out[t].clip(z - thresh_dist, z + thresh_dist, out=out[t]) 1202 | #fastclip_(cutouts[i], z - thresh_dist, z + thresh_dist) 1203 | 1204 | # And finally, possibly re-align according to a few different choices. 1205 | if center == 'point': 1206 | out[t] -= z 1207 | elif center == 'near': 1208 | out[t] -= z - thresh_dist 1209 | elif center == 'far': 1210 | out[t] = (z + thresh_dist) - out[t] 1211 | 1212 | return out 1213 | 1214 | 1215 | def get_batch(data, bs, ntime, nsamp, dtime=1, repeat_before=True, **cutout_kw): 1216 | Xb = np.empty((bs, ntime, nsamp), np.float32) 1217 | yb_conf = np.empty(bs, np.int64) 1218 | yb_offs = np.empty((bs, 2), np.float32) 1219 | 1220 | for b in range(bs): 1221 | if repeat_before: 1222 | # Prepend the exact same scan/odom for the first few where there's no history. 1223 | iseq, idet, iscan = data.random_labelled_index() 1224 | times = np.arange(iscan - ntime*dtime + 1, iscan+1, dtime) 1225 | times[times < 0] = times[0 <= times][0] 1226 | 1227 | scans = np.array([data.scans[iseq][j] for j in times]) 1228 | odoms = np.array([data.odoms[iseq][j] for j in times]) 1229 | else: 1230 | iseq, idet, iscan = data.random_labelled_index(min_before=(ntime-1)*dtime) 1231 | scans = data.scans[iseq][iscan-(ntime-1)*dtime:iscan+1:dtime] 1232 | odoms = data.odoms[iseq][iscan-(ntime-1)*dtime:iscan+1:dtime] 1233 | 1234 | ipt = np.random.randint(len(scans[0])) 1235 | cutout(scans, odoms, ipt, out=Xb[b], nsamp=nsamp, **cutout_kw) 1236 | 1237 | yb_conf[b] = data.y_conf[iseq][idet][ipt] 1238 | yb_offs[b] = data.y_offs[iseq][idet][ipt] 1239 | 1240 | return Xb, yb_conf, yb_offs 1241 | --------------------------------------------------------------------------------
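The index arithmetic at the start of `cutout` can be tried in isolation. The following is a minimal sketch, not the repository's code: the helper name `window_indices` is hypothetical, the default `laser_increment` of 0.5 degrees is an assumption about the scanner, and the odometry correction (`odom_a`) is left out. It shows how a fixed metric window `win_sz` at depth `z` turns into a range of scan indices, and how clipping to `-1`/`N` makes out-of-range indices read the `UNK` padding slot.

```python
import numpy as np


def window_indices(ipoint, z, n_points, win_sz=1.66, laser_increment=np.radians(0.5)):
    """Scan indices covering a window of metric width `win_sz` at depth `z`.

    Hypothetical helper; `laser_increment` (radians per scan point) is an
    assumed value, and no odometry correction is applied.
    """
    # Half of the opening angle under which the window is seen from the sensor.
    half_alpha = float(np.arctan(0.5 * win_sz / z))
    start = int(round(ipoint - half_alpha / laser_increment))
    end = int(round(ipoint + half_alpha / laser_increment))
    idx = np.arange(start, end + 1)
    # Index -1 and index n_points both address the last slot of a buffer of
    # length n_points + 1, which is reserved for the UNK padding value.
    return np.clip(idx, -1, n_points)


UNK = 29.99
scan = np.full(450, 5.0, dtype=np.float32)   # synthetic scan: a wall at 5 m
buf = np.full(len(scan) + 1, UNK, np.float32)
buf[:-1] = scan                              # last slot stays UNK

centered = window_indices(225, 2.0, len(scan))
border = window_indices(0, 1.0, len(scan))
border_values = buf[border]                  # left part falls outside the scan
```

Closer points yield wider index ranges, which is why the cutout is resampled to a fixed `nsamp` afterwards.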