├── README.md
├── images
│   ├── img1.png
│   ├── img2.png
│   ├── img3.png
│   ├── img4.png
│   ├── img5.png
│   ├── img6.png
│   ├── img7.png
│   ├── img8.png
│   └── readme
└── math_2Dto3D.py

/README.md:
--------------------------------------------------------------------------------
# 3D-detection-with-monocular-RGB-image
Reference papers:
Paper1: 3D Bounding Box Estimation Using Deep Learning and Geometry
URL: https://arxiv.org/abs/1612.00496
Paper2: Monocular 3D Object Detection Leveraging Accurate Proposals and Shape Reconstruction
URL: https://arxiv.org/abs/1904.01690
Paper3: 3D Bounding Boxes for Road Vehicles: A One-Stage, Localization Prioritized Approach using Single Monocular Images
URL: https://link.springer.com/chapter/10.1007%2F978-3-030-11021-5_39

I did this 3D detection research during my internship at MEGVII. Most of the code, including the training, testing, and library code, is not allowed to be posted online, because it contains MEGVII's base-model and framework information.

I want to share my viewpoint and thoughts on 3D detection with monocular RGB images. The hardest and trickiest part is how to use monocular RGB images to predict location, so, with my mentor's approval, I am posting the code for this part. It uses only NumPy and the math module rather than a deep learning framework. Besides, I compare different methods for orientation prediction and location inference.

## Data set and structure
Kitti 2D object: http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=2d
Input: monocular RGB image; 2D boxes, dimensions, orientation, and location of objects; the camera's intrinsic and extrinsic parameters.
Data format: JSON.
## Overall thought
Class and 2D box prediction + orientation prediction + dimension prediction --> location inference --> visualization
### Class and 2D box prediction
Applied Faster R-CNN with a ResNet backbone and a two-layer fully connected head to predict the 2D boxes (top-left and bottom-right points).
### Orientation prediction
Because the full 2pi range is hard for a model to learn, dividing the 2pi range into several bins and predicting a bin class plus an offset regression gives better performance.
![image](https://github.com/ZhixinLai/3D-detection-with-monocular-RGB-image/blob/master/images/img1.png)
#### Different thoughts:
* Predict alpha directly and use the spatial constraint to infer theta_y, vs. predict theta_y directly.
Paper1 explains why the first method should be used for angle prediction, but the second method actually performed better in my experiments.
* Predict the angle directly vs. predict its sin & cos.
sin & cos is better.
* Number of bins.
Dividing 2pi into 4 bins gives the best performance (a decoding sketch follows this list).
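To make the bin-plus-offset scheme concrete, here is a minimal NumPy sketch of the decoding step. It assumes the network outputs one confidence per bin and a (sin, cos) pair for the residual angle within each bin; the names `bin_scores` and `bin_offsets` are mine, not from the internship code.

```python
import numpy as np

def decode_orientation(bin_scores, bin_offsets, num_bins=4):
    """Decode a binned orientation prediction back into a single angle.

    bin_scores:  (num_bins,) confidence per bin
    bin_offsets: (num_bins, 2) predicted (sin, cos) of the residual
                 angle relative to each bin center
    """
    bin_size = 2 * np.pi / num_bins
    best = int(np.argmax(bin_scores))        # most confident bin
    sin_r, cos_r = bin_offsets[best]
    residual = np.arctan2(sin_r, cos_r)      # offset within the bin
    angle = best * bin_size + residual       # bin center + offset
    return (angle + np.pi) % (2 * np.pi) - np.pi   # wrap into [-pi, pi)
```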
### Dimension prediction
Because object sizes in the Kitti set vary a lot, performance is bad if we regress the dimensions of every object directly.
First, calculate the average dimensions for each class. Second, regress each object's offset from that average. Third, according to the object class predicted in the first step, add the average dimensions and the offset (sketched below).
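A minimal sketch of this decoding step. The per-class averages below are illustrative placeholders (in the real pipeline they are computed from the training labels), and dimensions follow the Kitti (height, width, length) order.

```python
import numpy as np

# placeholder per-class average (height, width, length) in meters;
# in practice these are computed from the training labels
CLASS_MEAN_DIMS = {
    "car":        np.array([1.53, 1.63, 3.88]),
    "pedestrian": np.array([1.76, 0.66, 0.84]),
    "cyclist":    np.array([1.74, 0.60, 1.76]),
}

def decode_dimensions(pred_class, pred_offset):
    """Recover absolute dimensions from the regressed residual."""
    return CLASS_MEAN_DIMS[pred_class] + np.asarray(pred_offset)
```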
### Location inference (the tricky part)
This part is the hardest to comprehend and needs background on the projection principle and the different coordinate systems.
#### Different thoughts:
* Method one: According to paper1, we can use the relationship between the 2D box and the 3D box to infer the 3D location. As we can see from the figure below, some vertices of the 3D box lie on the sides of the 2D box, and we can use this principle to infer the location coordinates. First, write the location (the center point of the 3D box) as (x, y, z) and use the dimensions and orientation to express the coordinates of the 8 vertices in terms of x, y, z. Second, transform the 8 coordinates from the world coordinate system into the camera coordinate system. Third, each side of the 2D box can be touched by any of the 8 vertices of the 3D box, so there are 8^4 = 4096 cases; with some prior knowledge and the angle prediction result, the number of cases reduces to 64. This step yields 64 systems, each containing 4 equations. Fourth, solve the 64 systems and select the best solution as the location (x, y, z). This is what `calc_location` in math_2Dto3D.py below implements.
![image](https://github.com/ZhixinLai/3D-detection-with-monocular-RGB-image/blob/master/images/img2.png)

* Method two: According to paper2, because of the prior knowledge that objects in a self-driving scene sit on the ground, we can use the object height, the 2D box, and the projection constraint to infer the depth z, and then use z to infer x and y with the projection constraint (see the sketch after this list).

* Method three: Method two assumes that, after projection into the 2D image, the center point of the 3D box coincides with the center point of the 2D box. Actually, the two points do not coincide exactly. According to paper3, we can first predict the projected 3D center in the 2D image (shown below) and then feed that point into method two.
![image](https://github.com/ZhixinLai/3D-detection-with-monocular-RGB-image/blob/master/images/img3.png)
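To make method two concrete, here is a minimal sketch of the depth-from-height reasoning under a plain pinhole model, reading the focal lengths and principal point from the 3x4 projection matrix. The function name and the simplifications (the 2D box center standing in for the projected 3D center, ignoring the translation column of the calibration matrix) are mine, not from the papers.

```python
import numpy as np

def locate_from_height(box_2d, dims_3d, proj_matrix):
    """Method-two style location: depth from apparent height,
    then x, y by back-projecting the 2D box center.

    box_2d:      [(xmin, ymin), (xmax, ymax)] in pixels
    dims_3d:     (height, width, length) in meters
    proj_matrix: 3x4 camera projection matrix
    """
    fx, fy = proj_matrix[0][0], proj_matrix[1][1]
    cx, cy = proj_matrix[0][2], proj_matrix[1][2]
    (xmin, ymin), (xmax, ymax) = box_2d

    # pinhole model: pixel height h2d = fy * H3d / z  =>  z = fy * H3d / h2d
    h2d = ymax - ymin
    z = fy * dims_3d[0] / h2d

    # back-project the 2D box center at depth z to get x and y
    u = (xmin + xmax) / 2
    v = (ymin + ymax) / 2
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z])
```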
## Results
The best performance (Kitti AP, easy / moderate / hard):

| Task | Easy | Moderate | Hard |
| --- | --- | --- | --- |
| car_detection | 96.452675 | 86.783386 | 77.942184 |
| car_orientation | 93.204292 | 82.368660 | 73.366890 |
| pedestrian_detection | 69.537376 | 60.686756 | 52.112762 |
| pedestrian_orientation | 51.052326 | 44.875721 | 38.976936 |
| cyclist_detection | 65.076256 | 47.723835 | 46.861427 |
| cyclist_orientation | 40.380432 | 30.131805 | 29.838795 |

### 2D box prediction visualization
![image](https://github.com/ZhixinLai/3D-detection-with-monocular-RGB-image/blob/master/images/img4.png)
### 3D box prediction visualization
![image](https://github.com/ZhixinLai/3D-detection-with-monocular-RGB-image/blob/master/images/img5.png)
![image](https://github.com/ZhixinLai/3D-detection-with-monocular-RGB-image/blob/master/images/img6.png)
![image](https://github.com/ZhixinLai/3D-detection-with-monocular-RGB-image/blob/master/images/img7.png)
## Conclusion
For orientation prediction, dividing theta_y into 4 bins and regressing sin & cos is better.
For location inference, method one has better performance.

--------------------------------------------------------------------------------
/math_2Dto3D.py:
--------------------------------------------------------------------------------
import numpy as np


def rotation_matrix(yaw, pitch=0, roll=0):
    tx = roll
    ty = yaw
    tz = pitch

    Rx = np.array([[1, 0, 0], [0, np.cos(tx), -np.sin(tx)], [0, np.sin(tx), np.cos(tx)]])
    Ry = np.array([[np.cos(ty), 0, np.sin(ty)], [0, 1, 0], [-np.sin(ty), 0, np.cos(ty)]])
    Rz = np.array([[np.cos(tz), -np.sin(tz), 0], [np.sin(tz), np.cos(tz), 0], [0, 0, 1]])

    # only the yaw rotation is used; the full composition is left for reference
    return Ry.reshape([3, 3])
    # return np.dot(np.dot(Rz, Ry), Rx)


# option to rotate and shift (for label info)
def create_corners(dimension, location=None, R=None):

    # dimension order in the labels: height, width, length
    dx = dimension[2] / 2  # length
    dy = dimension[0] / 2  # height
    dz = dimension[1] / 2  # width

    x_corners = []
    y_corners = []
    z_corners = []

    for i in [1, -1]:
        for j in [1, -1]:
            for k in [1, -1]:
                x_corners.append(dx * i)
                y_corners.append(dy * j)
                z_corners.append(dz * k)

    corners = np.array([x_corners, y_corners, z_corners], dtype=float)

    # rotate if R is passed in
    if R is not None:
        corners = np.dot(R, corners)

    # shift if location is passed in
    if location is not None:
        for i, loc in enumerate(location):
            corners[i, :] = corners[i, :] + loc

    final_corners = []
    for i in range(8):
        final_corners.append([corners[0][i], corners[1][i], corners[2][i]])

    return final_corners
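
# Example (hypothetical numbers, not from the dataset): the 8 corners of a
# car-sized box placed 10 m in front of the camera and rotated 45 degrees
# about the vertical axis. Note that dimension = (height, width, length):
#
#   create_corners([1.53, 1.63, 3.88], location=[0.0, 1.7, 10.0],
#                  R=rotation_matrix(np.pi / 4))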

# this is based on the paper. Math!
# calib is a 3x4 matrix, box_2d is [(xmin, ymin), (xmax, ymax)]
def calc_location(dimension, proj_matrix, box_2d, alpha, theta_ray):
    # global orientation
    orient = alpha + theta_ray
    # rotation from the object (world) frame into the camera frame, so the
    # corners can be computed from location + orientation
    R = rotation_matrix(orient)

    # format 2d corners
    xmin = box_2d[0][0]
    ymin = box_2d[0][1]
    xmax = box_2d[1][0]
    ymax = box_2d[1][1]

    # left top right bottom
    box_corners = [xmin, ymin, xmax, ymax]

    # get the point constraints
    constraints = []

    left_constraints = []
    right_constraints = []
    top_constraints = []
    bottom_constraints = []

    # using a different coord system; in the original dataset the first entry
    # is height (of the car), the second is width (side to side), and the
    # third is length (front to back)
    dx = dimension[2] / 2  # length
    dy = dimension[0] / 2  # height
    dz = dimension[1] / 2  # width

    # below is very much based on trial and error

    # based on the relative angle, a different configuration occurs
    # negative is back of car, positive is front
    left_mult = 1
    right_mult = -1

    # about straight on but opposite way
    # left side -> front-right of the car, right side -> front-left
    if alpha < np.deg2rad(92) and alpha > np.deg2rad(88):
        left_mult = 1
        right_mult = 1
    # about straight on and same way
    # left side -> rear-left of the car, right side -> rear-right
    elif alpha < np.deg2rad(-88) and alpha > np.deg2rad(-92):
        left_mult = -1
        right_mult = -1
    # this works but doesn't make much sense
    # left side -> rear-left/right (alpha < 0: left; alpha > 0: right)
    elif alpha < np.deg2rad(90) and alpha > -np.deg2rad(90):
        left_mult = -1
        right_mult = 1

    # if the car is facing the opposite way, switch left and right
    switch_mult = -1
    if alpha > 0:
        switch_mult = 1

    # left and right could either be the front of the car or the back of the car
    # careful to use left and right based on image, not the actual car's left and right
    for i in (-2, 0):
        left_constraints.append([left_mult * dx, i * dy, -switch_mult * dz])
    for i in (-2, 0):
        right_constraints.append([right_mult * dx, i * dy, switch_mult * dz])

    """
    # left and right could either be the front of the car or the back of the car
    # careful to use left and right based on image, not the actual car's left and right
    for i in (-1, 1):
        for j in (-1, 1):
            for k in (-2, 0):
                left_constraints.append([i * dx, k * dy, j * dz])
    for i in (-1, 1):
        for j in (-1, 1):
            for k in (-2, 0):
                right_constraints.append([i * dx, k * dy, j * dz])
    """

    # top and bottom are easy, just the top and bottom of the car
    for i in (-1, 1):
        for j in (-1, 1):
            top_constraints.append([i * dx, -dy * 2, j * dz])
    for i in (-1, 1):
        for j in (-1, 1):
            bottom_constraints.append([i * dx, 0, j * dz])

    # now, 64 combinations
    for left in left_constraints:
        for top in top_constraints:
            for right in right_constraints:
                for bottom in bottom_constraints:
                    constraints.append([left, top, right, bottom])

    # filter out the ones with repeats
    constraints = filter(lambda x: len(x) == len(set(tuple(i) for i in x)), constraints)

    # create pre M (the term with I and the R*X)
    pre_M = np.zeros([4, 4])
    # 1's down diagonal
    for i in range(0, 4):
        pre_M[i][i] = 1

    best_loc = None
    best_error = 1e09
    best_X = None

    # loop through each possible constraint, hold on to the best guess
    # constraint will be 64 sets of 4 corners
    count = 0
    for constraint in constraints:
        # each corner
        Xa = constraint[0]
        Xb = constraint[1]
        Xc = constraint[2]
        Xd = constraint[3]

        # the four constraints, one per side (left/top/right/bottom); shape = (4, 3)
        X_array = [Xa, Xb, Xc, Xd]

        # M: all 1's down diagonal, and upper 3x1 is Rotation_matrix * [x, y, z]
        Ma = np.copy(pre_M)
        Mb = np.copy(pre_M)
        Mc = np.copy(pre_M)
        Md = np.copy(pre_M)

        M_array = [Ma, Mb, Mc, Md]  # four 4x4 identity matrices

        # create A, b
        A = np.zeros([4, 3], dtype=float)
        b = np.zeros([4, 1])

        # index 0 constrains a pixel x coordinate (left/right sides),
        # index 1 a pixel y coordinate (top/bottom sides)
        indices = [0, 1, 0, 1]
        for row, index in enumerate(indices):
            # X is one corner in the object (world) coordinate system, shape = (3,)
            X = X_array[row]
            M = M_array[row]  # a 4x4 identity matrix

            # create M for corner X
            RX = np.dot(R, X)  # the corner rotated into the camera frame, shape = (3,)
            # identity with R*X in the top three rows of the last column, shape = (4, 4)
            M[:3, 3] = RX.reshape(3)

            # project: shape = (3, 4); the first three columns come from the
            # projection matrix, the last column is the projected corner
            M = np.dot(proj_matrix, M)

            A[row, :] = M[index, :3] - box_corners[row] * M[2, :3]
            b[row] = box_corners[row] * M[2, 3] - M[index, 3]

        # solve with least squares; four equations over-determine the three
        # unknowns, so there is some residual error
        loc, error, rank, s = np.linalg.lstsq(A, b, rcond=None)

        # found a better estimation
        if error.size > 0 and error[0] < best_error:
            count += 1  # for debugging
            best_loc = loc
            best_error = error[0]
            best_X = X_array

    # return best_loc, [left_constraints, right_constraints] # for debugging
    if best_loc is not None:
        best_loc = [best_loc[0][0], best_loc[1][0], best_loc[2][0]]
    return best_loc, best_X
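
# Derivation of the linear system assembled above. A corner X in the object
# frame sits at R @ X + T in the camera frame, where T = (tx, ty, tz) is the
# unknown location. With M = [[I, R @ X], [0, 1]] the homogeneous corner is
# M @ [T, 1], and requiring its projection to touch a 2D box side at pixel
# value u gives (with PM = proj_matrix @ M, and i = 0 for the left/right
# sides, i = 1 for the top/bottom sides):
#
#   PM[i, :3] @ T + PM[i, 3] = u * (PM[2, :3] @ T + PM[2, 3])
#
# which rearranges to one row of A @ T = b with
#   A_row = PM[i, :3] - u * PM[2, :3]
#   b_row = u * PM[2, 3] - PM[i, 3]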

# this is based on the paper. Math!
# calib is a 3x4 matrix, box_2d is [(xmin, ymin), (xmax, ymax)]
def calc_location_new(dimension, proj_matrix, box_2d, alpha, theta_ray):
    # global orientation
    orient = alpha + theta_ray
    # rotation from the object (world) frame into the camera frame
    R = rotation_matrix(orient)

    # format 2d corners
    xmin = box_2d[0][0]
    ymin = box_2d[0][1]
    xmax = box_2d[1][0]
    ymax = box_2d[1][1]

    # left top right bottom
    box_corners = [xmin, ymin, xmax, ymax]

    # get the point constraints
    constraints = []

    left_constraints = []
    right_constraints = []
    top_constraints = []
    bottom_constraints = []

    dx = dimension[2] / 2  # length
    dy = dimension[0] / 2  # height
    dz = dimension[1] / 2  # width

    # below is very much based on trial and error

    # based on the relative angle, a different configuration occurs
    # negative is back of car, positive is front
    left_mult = 1
    right_mult = -1

    # about straight on but opposite way
    # left side -> front-right of the car, right side -> front-left
    if alpha < np.deg2rad(92) and alpha > np.deg2rad(88):
        left_mult = 1
        right_mult = 1
    # about straight on and same way
    # left side -> rear-left of the car, right side -> rear-right
    elif alpha < np.deg2rad(-88) and alpha > np.deg2rad(-92):
        left_mult = -1
        right_mult = -1
    # this works but doesn't make much sense
    # left side -> rear-left/right (alpha < 0: left; alpha > 0: right)
    elif alpha < np.deg2rad(90) and alpha > -np.deg2rad(90):
        left_mult = -1
        right_mult = 1

    # if the car is facing the opposite way, switch left and right
    switch_mult = -1
    if alpha > 0:
        switch_mult = 1

    # left and right could either be the front of the car or the back of the car
    # careful to use left and right based on image, not the actual car's left and right
    for i in (-2, 0):
        left_constraints.append([left_mult * dx, i * dy, -switch_mult * dz])
    for i in (-2, 0):
        right_constraints.append([right_mult * dx, i * dy, switch_mult * dz])

    # top and bottom are easy, just the top and bottom of the car
    for i in (-1, 1):
        for j in (-1, 1):
            top_constraints.append([i * dx, -dy * 2, j * dz])
    for i in (-1, 1):
        for j in (-1, 1):
            bottom_constraints.append([i * dx, 0, j * dz])

    # now, 64 combinations
    for left in left_constraints:
        for top in top_constraints:
            for right in right_constraints:
                for bottom in bottom_constraints:
                    constraints.append([left, top, right, bottom])

    # filter out the ones with repeats
    constraints = filter(lambda x: len(x) == len(set(tuple(i) for i in x)), constraints)

    # create pre M (the term with I and the R*X)
    pre_M = np.zeros([4, 4])
    # 1's down diagonal
    for i in range(0, 4):
        pre_M[i][i] = 1

    best_loc = None
    best_error = 1e09
    best_X = None

    # loop through each possible constraint, hold on to the best guess
    # constraint will be 64 sets of 4 corners
    count = 0
    for constraint in constraints:
        # each corner
        Xa = constraint[0]
        Xb = constraint[1]
        Xc = constraint[2]
        Xd = constraint[3]

        # the four constraints, one per side (left/top/right/bottom); shape = (4, 3)
        X_array = [Xa, Xb, Xc, Xd]

        # M: all 1's down diagonal, and upper 3x1 is Rotation_matrix * [x, y, z]
        Ma = np.copy(pre_M)
        Mb = np.copy(pre_M)
        Mc = np.copy(pre_M)
        Md = np.copy(pre_M)

        M_array = [Ma, Mb, Mc, Md]  # four 4x4 identity matrices

        # create A, b
        A = np.zeros([4, 3], dtype=float)
        b = np.zeros([4, 1])

        indices = [0, 1, 0, 1]
        for row, index in enumerate(indices):
            X = X_array[row]
            M = M_array[row]

            # create M for corner X
            RX = np.dot(R, X)
            M[:3, 3] = RX.reshape(3)

            M = np.dot(proj_matrix, M)

            A[row, :] = M[index, :3] - box_corners[row] * M[2, :3]
            b[row] = box_corners[row] * M[2, 3] - M[index, 3]

        # solve with least squares; four equations over-determine the three
        # unknowns, so there is some residual error
        loc, error, rank, s = np.linalg.lstsq(A, b, rcond=None)

        # 3d bounding box corners in the object frame used above
        # (x in [-dx, dx], y in [-2*dy, 0], z in [-dz, dz])
        x_corners = [dx, dx, -dx, -dx, dx, dx, -dx, -dx]
        y_corners = [0, 0, 0, 0, -2 * dy, -2 * dy, -2 * dy, -2 * dy]
        z_corners = [dz, -dz, -dz, dz, dz, -dz, -dz, dz]

        # rotate and translate 3d bounding box
        corners_3d = np.dot(R, np.vstack([x_corners, y_corners, z_corners]))
        corners_3d[0, :] = corners_3d[0, :] + loc[0]
        corners_3d[1, :] = corners_3d[1, :] + loc[1]
        corners_3d[2, :] = corners_3d[2, :] + loc[2]
        corners_3d = np.transpose(corners_3d)

        # project the 8 corners into the image
        N = corners_3d.shape[0]
        points = np.hstack([corners_3d, np.ones((N, 1))]).T
        points = np.matmul(proj_matrix, points)
        points /= points[2, :]
        points_2d = (points[0:2, :]).T

        # count how many projected corners fall inside the (slightly padded) 2D box
        included = 0
        print('box_corners', box_corners)
        for cor_point_2d in points_2d:
            if (cor_point_2d[0] < xmax + 3 and cor_point_2d[0] > xmin - 3 and
                    cor_point_2d[1] < ymax + 3 and cor_point_2d[1] > ymin - 3):
                included += 1

        # found a better estimation, and all 8 corners project into the 2D box
        if error.size > 0 and error[0] < best_error and included == 8:
            count += 1  # for debugging
            best_loc = loc
            best_error = error[0]
            best_X = X_array
            print('best_loc', best_loc)

    # return best_loc, [left_constraints, right_constraints] # for debugging
    if best_loc is not None:
        best_loc = [best_loc[0][0], best_loc[1][0], best_loc[2][0]]
    return best_loc, best_X
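
# calc_location_new differs from calc_location only in the extra filter above:
# a candidate location is kept only when all 8 projected corners of the
# resulting 3D box land inside the (slightly padded) 2D box, which rejects
# solutions that satisfy the four side equations but put the box in an
# implausible pose.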

def calc_theta_ray(img, box_2d, proj_matrix):
    # angle between the camera's forward axis and the ray through the 2D box center
    width = img.shape[1]
    # horizontal field of view from the focal length
    fovx = 2 * np.arctan(width / (2 * proj_matrix[0][0]))
    center = (box_2d[1][0] + box_2d[0][0]) / 2
    dx = center - (width / 2)

    mult = 1
    if dx < 0:
        mult = -1
    dx = abs(dx)
    angle = np.arctan((2 * dx * np.tan(fovx / 2)) / width)
    angle = angle * mult
    return angle
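
if __name__ == "__main__":
    # Hypothetical smoke test (numbers are made up, Kitti-style, not ground
    # truth): a car-sized box seen slightly right of the image center.
    P2 = np.array([[721.5377, 0.0, 609.5593, 44.85728],
                   [0.0, 721.5377, 172.854, 0.2163791],
                   [0.0, 0.0, 1.0, 0.002745884]])
    img = np.zeros((375, 1242, 3))      # stand-in image, only its shape is used
    box_2d = [(560, 175), (700, 250)]   # [(xmin, ymin), (xmax, ymax)]
    dims = [1.53, 1.63, 3.88]           # height, width, length (meters)
    alpha = -1.2                        # observation angle (radians)

    theta_ray = calc_theta_ray(img, box_2d, P2)
    loc, corners = calc_location(dims, P2, box_2d, alpha, theta_ray)
    print("theta_ray:", theta_ray, "location:", loc)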
--------------------------------------------------------------------------------