├── MVGlab02_single-view-metrology-main ├── images │ ├── line.jpg │ ├── line'.jpg │ ├── plane.jpg │ ├── chessboard.jpg │ ├── intersection.jpg │ └── chessboard_3D''.png ├── readme.md ├── single_view_metrology.ipynb └── labelme_data │ └── chessboard_plane.json ├── MVGlab03_FandH-estimation-main ├── images │ ├── DSC_0480.JPG │ └── DSC_0481.JPG ├── readme.md └── FandH_estimation.ipynb ├── MVGlab01_camera-calibration-master ├── images │ └── Jietu20200301-091513.jpg ├── readme.md └── camera_calibration.ipynb └── requirements.md /MVGlab02_single-view-metrology-main/images/line.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/CV-xueba/Total3DExercises/main/MVGlab02_single-view-metrology-main/images/line.jpg -------------------------------------------------------------------------------- /MVGlab03_FandH-estimation-main/images/DSC_0480.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/CV-xueba/Total3DExercises/main/MVGlab03_FandH-estimation-main/images/DSC_0480.JPG -------------------------------------------------------------------------------- /MVGlab03_FandH-estimation-main/images/DSC_0481.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/CV-xueba/Total3DExercises/main/MVGlab03_FandH-estimation-main/images/DSC_0481.JPG -------------------------------------------------------------------------------- /MVGlab02_single-view-metrology-main/images/line'.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/CV-xueba/Total3DExercises/main/MVGlab02_single-view-metrology-main/images/line'.jpg -------------------------------------------------------------------------------- /MVGlab02_single-view-metrology-main/images/plane.jpg: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/CV-xueba/Total3DExercises/main/MVGlab02_single-view-metrology-main/images/plane.jpg -------------------------------------------------------------------------------- /MVGlab02_single-view-metrology-main/images/chessboard.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/CV-xueba/Total3DExercises/main/MVGlab02_single-view-metrology-main/images/chessboard.jpg -------------------------------------------------------------------------------- /MVGlab02_single-view-metrology-main/images/intersection.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/CV-xueba/Total3DExercises/main/MVGlab02_single-view-metrology-main/images/intersection.jpg -------------------------------------------------------------------------------- /MVGlab02_single-view-metrology-main/images/chessboard_3D''.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/CV-xueba/Total3DExercises/main/MVGlab02_single-view-metrology-main/images/chessboard_3D''.png -------------------------------------------------------------------------------- /MVGlab01_camera-calibration-master/images/Jietu20200301-091513.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/CV-xueba/Total3DExercises/main/MVGlab01_camera-calibration-master/images/Jietu20200301-091513.jpg -------------------------------------------------------------------------------- /requirements.md: -------------------------------------------------------------------------------- 1 | # Assignment Code for "Computer Vision: 3D Reconstruction" (《计算机视觉之三维重建篇》) 2 | 3 | ## Assignments 4 | 5 | - Assignment 1: Camera calibration 6 | - Assignment 2: Single-view reconstruction 7 | - Assignment 3: Fundamental matrix and homography matrix estimation 8 | 9 | ## Tested Environment 10 | 11 | > **python == 3.8.11** 12 | > numpy == 1.20.3 13 | > labelme == 4.5.13 14 | > matplotlib == 3.4.3 15 | > open3d == 0.13.0 16 | > scipy == 1.7.1 17 | > ExifRead == 
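2.3.2 18 | > opencv_python == 4.5.4.58 19 | -------------------------------------------------------------------------------- /MVGlab03_FandH-estimation-main/readme.md: -------------------------------------------------------------------------------- 1 | # Exercise 03. Fundamental Matrix and Homography Matrix Estimation 2 | 3 | ## Task 4 | 5 | Using the feature correspondences produced by the provided functions, estimate the fundamental matrix with the normalized eight-point algorithm and the homography matrix with the four-point method. 6 | 7 | ## Deliverables 8 | 9 | 1. A design document covering the task statement, a description of the algorithm, the experimental environment and how to run the code, the results, and an analysis of the results. 10 | 2. Source code (clearly commented and able to run without errors). 11 | 12 | ## Grading Criteria 13 | 14 | 1. Design document (50 points) 15 | - Clear task statement (4 points) 16 | - Clear description of the algorithm (8 points) 17 | - Clear description of the experimental environment (3 points) 18 | - Sound results and verification (20 points) 19 | - Sound analysis of the results (15 points) 20 | 21 | 2. Code (50 points) 22 | - Runs without errors in the described environment (10 points) 23 | - Core functions implemented correctly (25 points) 24 | - Clear comments and clean code style (15 points) -------------------------------------------------------------------------------- /MVGlab02_single-view-metrology-main/readme.md: -------------------------------------------------------------------------------- 1 | # Exercise 02. Single-View Reconstruction 2 | 3 | ## Task 4 | 5 | Complete a single-view reconstruction from the provided image, which satisfies the requirements for single-view reconstruction. 6 | 7 | 1. Load the json data produced by annotating with Labelme to obtain the labeled parallel lines, plane intersection points, and masks of the perpendicular planes. 8 | 2. From the image intersections of the labeled parallel lines (obtained by computation), estimate the vanishing points and the vanishing line of each plane in the scene. 9 | 3. From the vanishing points of the three mutually orthogonal sets of parallel lines, compute the camera parameter matrix (calibration). 10 | 4. From the vanishing lines of the three perpendicular planes and the camera matrix, compute the normal direction (normal vector) of each scene plane. 11 | 5. Substitute the common point of the plane intersection lines into the plane equation with unknown parameter d (setting the depth λ=1) to compute the distance d from each scene plane to the camera center. 12 | 6. Substitute the 2D coordinates of all points in each mask region into the equation of the corresponding scene plane (with d known) to compute their 3D coordinates, and save them as a ply file; this completes the single-view reconstruction. 13 | 14 | ## Deliverables 15 | 16 | 1. A design document covering the task statement, a description of the algorithm, the experimental environment and how to run the code, the results (including screenshots of the single-view reconstruction), and an analysis of the results. 17 | 2. Source code (clearly commented and able to run without errors). 18 | 19 | ## Grading Criteria 20 | 21 | 1. Design document (50 points) 22 | - Clear task statement (4 points) 23 | - Clear description of the algorithm (8 points) 24 | - Clear description of the experimental environment (3 points) 25 | - Sound results and verification (20 points) 26 | - Sound analysis of the results (15 points) 27 | 28 | 2. Code (50 points) 29 | - Runs without errors in the described environment (10 points) 30 | - Core functions implemented correctly (25 points) 31 | - Clear comments and clean code style (15 points) -------------------------------------------------------------------------------- /MVGlab01_camera-calibration-master/readme.md: -------------------------------------------------------------------------------- 1 | # Exercise 01. 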
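Camera Calibration 2 | 3 | ## Task 4 | 5 | Calibrate a camera from a single image of a 3D calibration object. 6 | 7 | 1. This example uses an image of a 3D calibration object found online. Although no real calibration rig is available, we can assume that every small square has unit length and width, which does not conflict with the principles of camera calibration. 8 | 2. Use Labelme (or any other application with similar functionality) to mark some intersection points, and record the world coordinates of the corresponding points in the world coordinate system. In this example, the intersection lines of the three planes are chosen as the x, y, and z axes of the world coordinate system, which is right-handed (as shown in *Jietu20200301-091513.jpg*). Twelve points are selected in this example; although 6 points are enough in theory, more points help improve calibration accuracy. In the image, three quadrilaterals are marked with Labelme; only their vertices are useful, because Labelme records only the pixel coordinates of vertices. The corresponding world coordinates are written in pink in the image. The direction of each arrow indicates the order in which the coordinates are recorded. 9 | 3. Use the selected points to compute M, K, R, and t with the method taught in class. 10 | 11 | ## Deliverables 12 | 13 | 1. A design document covering the task statement, a description of the algorithm, the experimental environment and how to run the code, the results (compute M, K, R, and t from the selected points with the method taught in class, then verify and calibrate the results with an appropriate method), and an analysis of the results. 14 | 2. Source code (clearly commented and able to run without errors). 15 | 16 | ## Grading Criteria 17 | 18 | 1. Design document (50 points) 19 | - Clear task statement (4 points) 20 | - Clear description of the algorithm (8 points) 21 | - Clear description of the experimental environment (3 points) 22 | - Sound results and verification (20 points) 23 | - Sound analysis of the results (15 points) 24 | 25 | 2. Code (50 points) 26 | - Runs without errors in the described environment (10 points) 27 | - Core functions implemented correctly (25 points) 28 | - Clear comments and clean code style (15 points) -------------------------------------------------------------------------------- /MVGlab03_FandH-estimation-main/FandH_estimation.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Assignment 3: Fundamental Matrix and Homography Matrix Estimation " 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "## Task \n", 15 | "\n", 16 | "- Estimate the fundamental matrix with the normalized eight-point algorithm\n", 17 | "- Estimate the homography matrix with the four-point method\n", 18 | "- Use the two images **DSC_0480.JPG** and **DSC_0481.JPG** in the images folder as an example\n", 19 | "\n", 20 | "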
\n", 21 | "\n", 22 | "\n", 23 | "
" 24 | ] 25 | }, 26 | { 27 | "cell_type": "markdown", 28 | "metadata": {}, 29 | "source": [ 30 | "## 示例代码" 31 | ] 32 | }, 33 | { 34 | "cell_type": "markdown", 35 | "metadata": {}, 36 | "source": [ 37 | "- **图像信息提取函数(SIFT特征点和EXIF信息)**" 38 | ] 39 | }, 40 | { 41 | "cell_type": "code", 42 | "execution_count": 1, 43 | "metadata": {}, 44 | "outputs": [], 45 | "source": [ 46 | "class FeatureProcess:\n", 47 | " \"\"\"\n", 48 | " Simple SIFT feature process\n", 49 | " \"\"\"\n", 50 | "\n", 51 | " def __init__(self, image):\n", 52 | " \"\"\"\n", 53 | " Init\n", 54 | " :param image: (np.ndarray): Image in RGB\n", 55 | " \"\"\"\n", 56 | " self.image = image\n", 57 | " self.gray = cv2.cvtColor(self.image, cv2.COLOR_BGR2GRAY)\n", 58 | " self.keypoints = None\n", 59 | " self.descriptors = None\n", 60 | "\n", 61 | " def extract_features(self):\n", 62 | " \"\"\"\n", 63 | " Extract SIFT features in image.\n", 64 | " :return: (list, np.ndarray): keypoints, descriptors in image\n", 65 | " \"\"\"\n", 66 | " sift = cv2.SIFT_create()\n", 67 | " keypoints, descriptors = sift.detectAndCompute(self.gray, None)\n", 68 | "\n", 69 | " if len(keypoints) <= 20:\n", 70 | " return None, None\n", 71 | " else:\n", 72 | " self.keypoints = keypoints\n", 73 | " self.descriptors = descriptors\n", 74 | " return keypoints, descriptors\n", 75 | "\n", 76 | "class PhotoExifInfo:\n", 77 | " \"\"\"\n", 78 | " Extract photo exif info\n", 79 | " \"\"\"\n", 80 | "\n", 81 | " def __init__(self, photo_path):\n", 82 | " \"\"\"\n", 83 | " init\n", 84 | " :param photo_path: (str): photo path\n", 85 | " \"\"\"\n", 86 | " self.photo_path = photo_path\n", 87 | " self.focal_length = None\n", 88 | " self.image_width = None\n", 89 | " self.image_length = None\n", 90 | " self.sensor_pixel_size = None\n", 91 | "\n", 92 | " def get_tags(self):\n", 93 | " \"\"\"\n", 94 | " Get tags with interested info\n", 95 | " :return: None\n", 96 | " \"\"\"\n", 97 | " image_content = open(self.photo_path, 'rb')\n", 98 | " tags = 
exifread.process_file(image_content)\n", 99 | " self.focal_length = float(\n", 100 | " tags['EXIF FocalLength'].values[0].num) / float(tags['EXIF FocalLength'].values[0].den)\n", 101 | " self.image_width = float(tags['EXIF ExifImageWidth'].values[0])\n", 102 | " self.image_length = float(tags['EXIF ExifImageLength'].values[0])\n", 103 | " self.sensor_pixel_size = tags['MakerNote SensorPixelSize']\n", 104 | "\n", 105 | " def get_intrinsic_matrix(self):\n", 106 | " \"\"\"\n", 107 | " Get intrinsic matrix of photo's camera\n", 108 | " :return: (np.ndarray): intrinsic matrix K\n", 109 | " \"\"\"\n", 110 | " K = np.zeros([3, 3])\n", 111 | " dx = self.sensor_pixel_size.values[0].num / self.sensor_pixel_size.values[0].den / self.image_width\n", 112 | " dy = self.sensor_pixel_size.values[1].num / self.sensor_pixel_size.values[1].den / self.image_length\n", 113 | " fu = self.focal_length / dx\n", 114 | " fv = self.focal_length / dy\n", 115 | " u0 = self.image_width / 2\n", 116 | " v0 = self.image_length / 2\n", 117 | " K[0][0] = fu\n", 118 | " K[1][1] = fv\n", 119 | " K[0][2] = u0\n", 120 | " K[1][2] = v0\n", 121 | " K[2][2] = 1\n", 122 | " return K\n", 123 | "\n", 124 | " def get_area(self):\n", 125 | " \"\"\"\n", 126 | " Get area of photo\n", 127 | " :return: (int): area\n", 128 | " \"\"\"\n", 129 | " return int(self.image_width * self.image_length)\n", 130 | "\n", 131 | " def get_diam(self):\n", 132 | " \"\"\"\n", 133 | " Get diam of photo\n", 134 | " :return: (int): diam\n", 135 | " \"\"\"\n", 136 | " return int(max(self.image_width, self.image_length))\n", 137 | "\n", 138 | "def build_img_info(img_root):\n", 139 | " \"\"\"\n", 140 | " Get info(img,feat,K) from img\n", 141 | " :param img_root: (str): images root\n", 142 | " :return: (list[np.ndarray], list[dict], list[np.ndarray]): info from img\n", 143 | " \"\"\"\n", 144 | " imgs = []\n", 145 | " feats = []\n", 146 | " K = []\n", 147 | " for i, name in enumerate(os.listdir(img_root)):\n", 148 | " if '.jpg' in name or 
'.JPG' in name:\n", 149 | " path = os.path.join(img_root, name)\n", 150 | " img = cv2.imread(path)\n", 151 | " imgs.append(img)\n", 152 | " feature_process = FeatureProcess(img)\n", 153 | " kpt, des = feature_process.extract_features()\n", 154 | " photo_info = PhotoExifInfo(path)\n", 155 | " photo_info.get_tags()\n", 156 | " K.append(photo_info.get_intrinsic_matrix())\n", 157 | " A = photo_info.get_area()\n", 158 | " D = photo_info.get_diam()\n", 159 | " feats.append({'kpt': kpt, 'des': des, 'A': A, 'D': D})\n", 160 | " return imgs, feats, K" 161 | ] 162 | }, 163 | { 164 | "cell_type": "markdown", 165 | "metadata": {}, 166 | "source": [ 167 | "- **Normalized eight-point algorithm and four-point method (fundamental matrix and homography matrix)**" 168 | ] 169 | }, 170 | { 171 | "cell_type": "code", 172 | "execution_count": 2, 173 | "metadata": {}, 174 | "outputs": [], 175 | "source": [ 176 | "def get_matches(des_query, des_train):\n", 177 | " \"\"\"\n", 178 | " Match features between query and train\n", 179 | " :param des_query: (np.ndarray): query descriptors\n", 180 | " :param des_train: (np.ndarray): train descriptors\n", 181 | " :return: (list[cv2.DMatch]): Match info\n", 182 | " \"\"\"\n", 183 | " bf = cv2.BFMatcher(cv2.NORM_L2)\n", 184 | " matches = bf.knnMatch(des_query, des_train, k=2)\n", 185 | "\n", 186 | " good = []\n", 187 | " for m, m_ in matches:\n", 188 | " # Ratio test: a threshold of 0.6 still leaves enough matches\n", 189 | " if m.distance < 0.6 * m_.distance:\n", 190 | " good.append(m)\n", 191 | " return good\n", 192 | "\n", 193 | "def get_match_point(p, p_, matches):\n", 194 | " \"\"\"\n", 195 | " Find matched keypoints\n", 196 | " :param p: (list[cv2.KeyPoint]): query keypoints\n", 197 | " :param p_: (list[cv2.KeyPoint]): train keypoints\n", 198 | " :param matches: (list[cv2.DMatch]): match info between query and train\n", 199 | " :return: (np.ndarray, np.ndarray): matched keypoints between query and train\n", 200 | " \"\"\"\n", 201 | " points_query = np.asarray([p[m.queryIdx].pt for m in matches])\n", 202 | " points_train = 
np.asarray([p_[m.trainIdx].pt for m in matches])\n", 203 | " return points_query, points_train\n", 204 | "\n", 205 | "def normalize(pts, T=None):\n", 206 | " \"\"\"\n", 207 | " Normalize points: move the centroid to the origin and scale the mean distance to sqrt(2)\n", 208 | " :param pts: (np.ndarray): points to be normalized\n", 209 | " :param T: (np.ndarray): normalization matrix; if None, it is computed from pts\n", 210 | " :return: (np.ndarray, np.ndarray): normalized points and T\n", 211 | " \"\"\"\n", 212 | " if T is None:\n", 213 | " u = np.mean(pts, 0)\n", 214 | " d = np.mean(np.sqrt(np.sum(np.power(pts - u, 2), 1)))\n", 215 | " T = np.array([\n", 216 | " [np.sqrt(2) / d, 0, -(np.sqrt(2) / d * u[0])],\n", 217 | " [0, np.sqrt(2) / d, -(np.sqrt(2) / d * u[1])],\n", 218 | " [0, 0, 1]\n", 219 | " ])\n", 220 | " return homoco_pts_2_euco_pts(np.matmul(T, euco_pts_2_homoco_pts(pts).T).T), T\n", 221 | "\n", 222 | "def homoco_pts_2_euco_pts(pts):\n", 223 | " \"\"\"\n", 224 | " Homogeneous coordinates to Euclidean coordinates\n", 225 | " :param pts: (np.ndarray): Homogeneous coordinates\n", 226 | " :return: (np.ndarray): Euclidean coordinates\n", 227 | " \"\"\"\n", 228 | " if len(pts.shape) == 1:\n", 229 | " pts = pts.reshape(1, -1)\n", 230 | " res = pts / pts[:, -1, None]\n", 231 | " return res[:, :-1].squeeze()\n", 232 | "\n", 233 | "def euco_pts_2_homoco_pts(pts):\n", 234 | " \"\"\"\n", 235 | " Euclidean coordinates to Homogeneous coordinates\n", 236 | " :param pts: (np.ndarray): Euclidean coordinates\n", 237 | " :return: (np.ndarray): Homogeneous coordinates\n", 238 | " \"\"\"\n", 239 | " if len(pts.shape) == 1:\n", 240 | " pts = pts.reshape(1, -1)\n", 241 | " one = np.ones(pts.shape[0])\n", 242 | " res = np.c_[pts, one]\n", 243 | " return res.squeeze()\n", 244 | "\n", 245 | "def estimate_fundamental(pts1, pts2, num_sample=8):\n", 246 | " n = pts1.shape[0]\n", 247 | " pts_index = range(n)\n", 248 | " sample_index = random.sample(pts_index, num_sample)\n", 249 | " p1 = pts1[sample_index, :]\n", 250 | " p2 = pts2[sample_index, :]\n", 251 | " n = 
len(sample_index)\n", 252 | " p1_norm, T1 = normalize(p1, None)\n", 253 | " p2_norm, T2 = normalize(p2, None)\n", 254 | " w = np.zeros((n, 9))\n", 255 | " for i in range(n):\n", 256 | " w[i, 0] = p1_norm[i, 0] * p2_norm[i, 0]\n", 257 | " w[i, 1] = p1_norm[i, 1] * p2_norm[i, 0]\n", 258 | " w[i, 2] = p2_norm[i, 0]\n", 259 | " w[i, 3] = p1_norm[i, 0] * p2_norm[i, 1]\n", 260 | " w[i, 4] = p1_norm[i, 1] * p2_norm[i, 1]\n", 261 | " w[i, 5] = p2_norm[i, 1]\n", 262 | " w[i, 6] = p1_norm[i, 0]\n", 263 | " w[i, 7] = p1_norm[i, 1]\n", 264 | " w[i, 8] = 1\n", 265 | "\n", 266 | " U, sigma, VT = np.linalg.svd(w)\n", 267 | " f = VT[-1, :].reshape(3, 3)\n", 268 | " U, sigma, VT = np.linalg.svd(f)\n", 269 | " sigma[2] = 0\n", 270 | " f = U.dot(np.diag(sigma)).dot(VT)\n", 271 | " f = T2.T.dot(f).dot(T1)\n", 272 | " return f\n", 273 | "\n", 274 | "def estimate_homo(pts1, pts2, num_sample=4):\n", 275 | " n = pts1.shape[0]\n", 276 | " pts_index = range(n)\n", 277 | " sample_index = random.sample(pts_index, num_sample)\n", 278 | " p1 = pts1[sample_index, :]\n", 279 | " p2 = pts2[sample_index, :]\n", 280 | " n = len(sample_index)\n", 281 | " w = np.zeros((n * 2, 9))\n", 282 | " for i in range(n):\n", 283 | " w[2 * i, 0] = p1[i, 0]\n", 284 | " w[2 * i, 1] = p1[i, 1]\n", 285 | " w[2 * i, 2] = 1\n", 286 | " w[2 * i, 3] = 0\n", 287 | " w[2 * i, 4] = 0\n", 288 | " w[2 * i, 5] = 0\n", 289 | " w[2 * i, 6] = -p1[i, 0] * p2[i, 0]\n", 290 | " w[2 * i, 7] = -p1[i, 1] * p2[i, 0]\n", 291 | " w[2 * i, 8] = -p2[i, 0]\n", 292 | " w[2 * i + 1, 0] = 0\n", 293 | " w[2 * i + 1, 1] = 0\n", 294 | " w[2 * i + 1, 2] = 0\n", 295 | " w[2 * i + 1, 3] = p1[i, 0]\n", 296 | " w[2 * i + 1, 4] = p1[i, 1]\n", 297 | " w[2 * i + 1, 5] = 1\n", 298 | " w[2 * i + 1, 6] = -p1[i, 0] * p2[i, 1]\n", 299 | " w[2 * i + 1, 7] = -p1[i, 1] * p2[i, 1]\n", 300 | " w[2 * i + 1, 8] = -p2[i, 1]\n", 301 | " U, sigma, VT = np.linalg.svd(w)\n", 302 | " h = VT[-1, :].reshape(3, 3)\n", 303 | " return h" 304 | ] 305 | }, 306 | { 307 | 
"cell_type": "code", 308 | "execution_count": 3, 309 | "metadata": {}, 310 | "outputs": [], 311 | "source": [ 312 | "def build_F_H_pair_match(feats):\n", 313 | " \"\"\"\n", 314 | " Build F, H, pair and match\n", 315 | " :param feats: (list[dict]): feat of imgs\n", 316 | " :return: (np.ndarray, np.ndarray, dict, dict): F, H, pair of imgs, match of pairs\n", 317 | " \"\"\"\n", 318 | "\n", 319 | "\n", 320 | " pair = dict()\n", 321 | " match = dict()\n", 322 | "\n", 323 | " for i in range(len(feats)):\n", 324 | " for j in range(i + 1, len(feats)):\n", 325 | " print(i, j)\n", 326 | " matches = get_matches(\n", 327 | " feats[i]['des'], feats[j]['des'])\n", 328 | " pts1, pts2 = get_match_point(\n", 329 | " feats[i]['kpt'], feats[j]['kpt'], matches)\n", 330 | " assert pts1.shape == pts2.shape\n", 331 | " # Need 8 points to estimate models\n", 332 | " if pts1.shape[0] < 8:\n", 333 | " continue\n", 334 | "\n", 335 | " F_single = estimate_fundamental(pts1, pts2)\n", 336 | " H_single = estimate_homo(pts1, pts2)\n", 337 | "\n", 338 | " if pts1.shape[0] < 8:\n", 339 | " continue\n", 340 | "\n", 341 | " pair.update({(i, j): {'pts1': pts1, 'pts2': pts2}})\n", 342 | " match.update({(i, j): {'match': matches}})\n", 343 | "\n", 344 | "\n", 345 | " return F_single, H_single, pair, match" 346 | ] 347 | }, 348 | { 349 | "cell_type": "markdown", 350 | "metadata": {}, 351 | "source": [ 352 | "## 结果测试" 353 | ] 354 | }, 355 | { 356 | "cell_type": "markdown", 357 | "metadata": {}, 358 | "source": [ 359 | "- **导入所需模块**" 360 | ] 361 | }, 362 | { 363 | "cell_type": "code", 364 | "execution_count": 4, 365 | "metadata": {}, 366 | "outputs": [], 367 | "source": [ 368 | "import numpy as np\n", 369 | "import cv2\n", 370 | "import random\n", 371 | "import os\n", 372 | "import exifread" 373 | ] 374 | }, 375 | { 376 | "cell_type": "markdown", 377 | "metadata": {}, 378 | "source": [ 379 | "- **读取图像、提取特征点并计算基础矩阵和单应矩阵**" 380 | ] 381 | }, 382 | { 383 | "cell_type": "code", 384 | "execution_count": 6, 385 | 
"metadata": {}, 386 | "outputs": [ 387 | { 388 | "name": "stdout", 389 | "output_type": "stream", 390 | "text": [ 391 | "0 1\n", 392 | "The Fundamental Matrix is:\n", 393 | " [[ 2.04145376e-10 2.36540296e-08 -1.73467054e-05]\n", 394 | " [-2.17166341e-08 6.62880899e-10 -8.21073367e-05]\n", 395 | " [ 1.74208105e-05 7.92261019e-05 4.78852145e-03]]\n", 396 | "The Homography Matrix is:\n", 397 | " [[ 1.13554423e-02 -9.44761813e-05 -1.53000794e-01]\n", 398 | " [-4.50809624e-04 1.12960107e-02 9.88028524e-01]\n", 399 | " [-5.55802425e-07 1.25720789e-07 1.15600657e-02]]\n" 400 | ] 401 | } 402 | ], 403 | "source": [ 404 | "img_root = 'images/'\n", 405 | "imgs, feats, K = build_img_info(img_root)\n", 406 | "F, H, pair, match = build_F_H_pair_match(feats)\n", 407 | "print(\"The Fundamental Matrix is:\\n\", F)\n", 408 | "print(\"The Homography Matrix is:\\n\", H)" 409 | ] 410 | } 411 | ], 412 | "metadata": { 413 | "interpreter": { 414 | "hash": "2fd6ff00ff4a419d324d5b7e4b1b0789b8ee895e93e15d812a34184c59464f6c" 415 | }, 416 | "kernelspec": { 417 | "display_name": "Python 3.8.11 64-bit ('MyCV': conda)", 418 | "name": "python3" 419 | }, 420 | "language_info": { 421 | "codemirror_mode": { 422 | "name": "ipython", 423 | "version": 3 424 | }, 425 | "file_extension": ".py", 426 | "mimetype": "text/x-python", 427 | "name": "python", 428 | "nbconvert_exporter": "python", 429 | "pygments_lexer": "ipython3", 430 | "version": "3.8.11" 431 | }, 432 | "orig_nbformat": 4 433 | }, 434 | "nbformat": 4, 435 | "nbformat_minor": 2 436 | } 437 | -------------------------------------------------------------------------------- /MVGlab01_camera-calibration-master/camera_calibration.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# 第一次课程作业——摄像机标定" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "## 任务 \n", 15 | "\n", 16 | 
"- 通过3D摄像机标定物的单张图像进行摄像机标定\n", 17 | "- 以 imgages 文件夹中的 **Jietu20200301-091513.jpg** 图片为例,该示例中选择了三个平面的交线作为世界坐标系的 x、y 和 z 轴,其中共选择了 12 个点(三个四边形的顶点)。\n", 18 | "- 通过选定的点进行相机标定,算出 M、K、R、T。\n", 19 | "- 验证结果,从世界坐标系中引入一些点,并检查它们是否可以通过 M 映射到相应的像素。" 20 | ] 21 | }, 22 | { 23 | "cell_type": "markdown", 24 | "metadata": {}, 25 | "source": [ 26 | "
" 27 | ] 28 | }, 29 | { 30 | "cell_type": "markdown", 31 | "metadata": {}, 32 | "source": [ 33 | "## 示例代码" 34 | ] 35 | }, 36 | { 37 | "cell_type": "code", 38 | "execution_count": 21, 39 | "metadata": {}, 40 | "outputs": [], 41 | "source": [ 42 | "class SingleCamera:\n", 43 | "\n", 44 | " def __init__(self, world_coor, pixel_coor, n):\n", 45 | "\n", 46 | " self.__world_coor = world_coor\n", 47 | " self.__pixel_coor = pixel_coor\n", 48 | " self.__point_num = n\n", 49 | "\n", 50 | " '''\n", 51 | " 0. P is the appropriate form when Pm=0\n", 52 | " 1. SVD-solved M is known up to scale, \n", 53 | " which means that the true values of the camera matrix are some scalar multiple of M,\n", 54 | " recorded as __roM\n", 55 | " 2. __M can be represented as form [A b], where A is a 3x3 matrix and b is with shape 3x1\n", 56 | " 3. __K is the intrisic Camera Matrix \n", 57 | " 4. __R and __t for rotation and translation\n", 58 | " \n", 59 | " '''\n", 60 | " self.__P = np.empty([self.__point_num, 12], dtype=float)\n", 61 | " self.__roM = np.empty([3, 4], dtype=float)\n", 62 | " self.__A = np.empty([3, 3], dtype=float)\n", 63 | " self.__b = np.empty([3, 1], dtype=float)\n", 64 | " self.__K = np.empty([3, 3], dtype=float)\n", 65 | " self.__R = np.empty([3, 3], dtype=float)\n", 66 | " self.__t = np.empty([3, 1], dtype=float)\n", 67 | "\n", 68 | " def returnAb(self):\n", 69 | " return self.__A, self.__b\n", 70 | "\n", 71 | " def returnKRT(self):\n", 72 | " return self.__K, self.__R, self.__t\n", 73 | "\n", 74 | " def returnM(self):\n", 75 | " return self.__roM\n", 76 | "\n", 77 | " def myReadFile(filePath):\n", 78 | " pass\n", 79 | "\n", 80 | " def changeHomo(no_homo):\n", 81 | " pass\n", 82 | "\n", 83 | " # to compose P in right form s.t. 
we can get Pm=0\n", 84 | " def composeP(self):\n", 85 | " i = 0\n", 86 | " P = np.empty([2 * self.__point_num, 12], dtype=float)\n", 87 | " # each point contributes two rows: one for u (even i) and one for v (odd i)\n", 88 | " while i < 2 * self.__point_num:\n", 89 | " c = i // 2\n", 90 | " p1 = self.__world_coor[c]\n", 91 | " p2 = np.array([0, 0, 0, 0])\n", 92 | " if i % 2 == 0:\n", 93 | " p3 = -p1 * self.__pixel_coor[c][0]\n", 94 | " #print(p3)\n", 95 | " P[i] = np.hstack((p1, p2, p3))\n", 96 | "\n", 97 | " elif i % 2 == 1:\n", 98 | " p3 = -p1 * self.__pixel_coor[c][1]\n", 99 | " #print(p3)\n", 100 | " P[i] = np.hstack((p2, p1, p3))\n", 101 | " # M = P[i]\n", 102 | " # print(M)\n", 103 | " i = i + 1\n", 104 | " print(\"Now P is with form of :\")\n", 105 | " print(P)\n", 106 | " print('\\n')\n", 107 | " self.__P = P\n", 108 | "\n", 109 | " # svd to P,return A,b, where M=[A b]\n", 110 | " def svdP(self):\n", 111 | " U, sigma, VT = LA.svd(self.__P)\n", 112 | " # print(VT.shape)\n", 113 | " V = np.transpose(VT)\n", 114 | " preM = V[:, -1]\n", 115 | " roM = preM.reshape(3, 4)\n", 116 | " print(\"some scalar multiple of M,recorded as roM:\")\n", 117 | " print(roM)\n", 118 | " print('\\n')\n", 119 | " A = roM[0:3, 0:3].copy()\n", 120 | " b = roM[0:3, 3:4].copy()\n", 121 | " print(\"M can be written in form of [A b], where A is 3x3 and b is 3x1, as following:\")\n", 122 | " print(A)\n", 123 | " print(b)\n", 124 | " print('\\n')\n", 125 | " self.__roM = roM\n", 126 | " self.__A = A\n", 127 | " self.__b = b\n", 128 | "\n", 129 | " # solve the intrinsics and extrinsics\n", 130 | " def workInAndOut(self):\n", 131 | " # compute ro, where ro=1/|a3|, ro may be positive or negative,\n", 132 | " # we choose the positive ro and name it ro01\n", 133 | " a3T = self.__A[2]\n", 134 | " # print(a3T)\n", 135 | " under = LA.norm(a3T)\n", 136 | " # print(under)\n", 137 | " ro01 = 1 / under\n", 138 | " print(\"The ro is %f \\n\" % ro01)\n", 139 | "\n", 140 | " # compute cx and cy\n", 141 | " a1T = self.__A[0]\n", 142 | " a2T = self.__A[1]\n", 143 | " cx = 
ro01 * ro01 * (np.dot(a1T, a3T))\n", 144 | " cy = ro01 * ro01 * (np.dot(a2T, a3T))\n", 145 | " print(\"cx=%f,cy=%f \\n\" % (cx, cy))\n", 146 | "\n", 147 | " # compute theta\n", 148 | " a_cross13 = np.cross(a1T, a3T)\n", 149 | " a_cross23 = np.cross(a2T, a3T)\n", 150 | " theta = np.arccos((-1) * np.dot(a_cross13, a_cross23) / (LA.norm(a_cross13) * LA.norm(a_cross23)))\n", 151 | " print(\"theta is: %f \\n\" % theta)\n", 152 | "\n", 153 | " # compute alpha and beta\n", 154 | " alpha = ro01 * ro01 * LA.norm(a_cross13) * np.sin(theta)\n", 155 | " beta = ro01 * ro01 * LA.norm(a_cross23) * np.sin(theta)\n", 156 | " print(\"alpha:%f, beta:%f \\n\" % (alpha,beta))\n", 157 | "\n", 158 | " # compute K\n", 159 | " K = np.array([alpha, -alpha * (1 / np.tan(theta)), cx, 0, beta / (np.sin(theta)), cy, 0, 0, 1])\n", 160 | " K = K.reshape(3, 3)\n", 161 | " print(\"We can get K accordingly: \")\n", 162 | " print(K)\n", 163 | " print('\\n')\n", 164 | " self.__K = K\n", 165 | "\n", 166 | " # compute R\n", 167 | " r1 = a_cross23 / LA.norm(a_cross23)\n", 168 | " r301 = ro01 * a3T\n", 169 | " r2 = np.cross(r301, r1)\n", 170 | " #print(r1, r2, r301)\n", 171 | " R = np.hstack((r1, r2, r301))\n", 172 | " R = R.reshape(3,3)\n", 173 | " print(\"we can get R:\")\n", 174 | " print(R)\n", 175 | " print('\\n')\n", 176 | " self.__R = R\n", 177 | "\n", 178 | " # compute T\n", 179 | " T = ro01 * np.dot(LA.inv(K), self.__b)\n", 180 | " print(\"we can get t:\")\n", 181 | " print(T)\n", 182 | " print('\\n')\n", 183 | " self.__t = T\n", 184 | "\n", 185 | " def selfcheck(self, w_check, c_check):\n", 186 | " my_size = c_check.shape[0]\n", 187 | " my_err = np.empty([my_size])\n", 188 | " for i in range(my_size) :\n", 189 | " test_pix = np.dot(self.__roM, w_check[i])\n", 190 | " u = test_pix[0] / test_pix[2]\n", 191 | " v = test_pix[1] / test_pix[2]\n", 192 | " u_c = c_check[i][0]\n", 193 | " v_c = c_check[i][1]\n", 194 | " print(\"you get test point %d with result (%f,%f)\" % (i, u, v))\n", 195 | " 
print(\"the correct result is (%f,%f)\" % (u_c,v_c))\n", 196 | " my_err[i] = (abs(u-u_c)/u_c+abs(v-v_c)/v_c)/2\n", 197 | " average_err = my_err.sum()/my_size\n", 198 | " print(\"The average error is %f ,\" % average_err)\n", 199 | " if average_err > 0.1:\n", 200 | " print(\"which is more than 0.1\")\n", 201 | " else:\n", 202 | " print(\"which is smaller than 0.1, the M is acceptable\")" 203 | ] 204 | }, 205 | { 206 | "cell_type": "markdown", 207 | "metadata": {}, 208 | "source": [ 209 | "## Testing the Results" 210 | ] 211 | }, 212 | { 213 | "cell_type": "markdown", 214 | "metadata": {}, 215 | "source": [ 216 | "- **Import the required modules**" 217 | ] 218 | }, 219 | { 220 | "cell_type": "code", 221 | "execution_count": 22, 222 | "metadata": {}, 223 | "outputs": [], 224 | "source": [ 225 | "import numpy as np\n", 226 | "from numpy import linalg as LA" 227 | ] 228 | }, 229 | { 230 | "cell_type": "markdown", 231 | "metadata": {}, 232 | "source": [ 233 | "- **Enter the world and pixel coordinates of all points, and set up the test points**" 234 | ] 235 | }, 236 | { 237 | "cell_type": "code", 238 | "execution_count": 23, 239 | "metadata": {}, 240 | "outputs": [], 241 | "source": [ 242 | "# The homogeneous world coordinates\n", 243 | "\n", 244 | "# Although it would be more appropriate to write a function to read the coordinates, \n", 245 | "# we've simplified it by listing the coordinates directly in arrays.\n", 246 | "\n", 247 | "# world coordinates\n", 248 | "# points: (8, 0, 9), (8, 0, 1), (6, 0, 1), (6, 0, 9)\n", 249 | "w_xz = np.array([8, 0, 9, 1, 8, 0, 1, 1, 6, 0, 1, 1, 6, 0, 9, 1])\n", 250 | "w_xz = w_xz.reshape(4, 4)\n", 251 | "# points: (5, 1, 0), (5, 9, 0), (4, 9, 0), (4, 1, 0)\n", 252 | "w_xy = np.array([5, 1, 0, 1, 5, 9, 0, 1, 4, 9, 0, 1, 4, 1, 0, 1])\n", 253 | "w_xy = w_xy.reshape(4, 4)\n", 254 | "# points: (0, 4, 7), (0, 4, 3), (0, 8, 3), (0, 8, 7)\n", 255 | "w_yz = np.array([0, 4, 7, 1, 0, 4, 3, 1, 0, 8, 3, 1, 0, 8, 7, 1])\n", 256 | "w_yz = w_yz.reshape(4, 4)\n", 257 | "w_coor = np.vstack((w_xz, w_xy, w_yz))\n", 258 | "#print(w_coor)\n", 259 | "# 
pixel coordinates\n", 260 | "c_xz = np.array([275, 142, 312, 454, 382, 436, 357, 134])\n", 261 | "c_xz = c_xz.reshape(4, 2)\n", 262 | "c_xy = np.array([432, 473, 612, 623, 647, 606, 464, 465])\n", 263 | "c_xy = c_xy.reshape(4, 2)\n", 264 | "c_yz = np.array([654, 216, 644, 368, 761, 420, 781, 246])\n", 265 | "c_yz = c_yz.reshape(4, 2)\n", 266 | "c_coor = np.vstack((c_xz, c_xy, c_yz))\n", 267 | "#print(c_coor)\n", 268 | "# coordinates used to validate whether M is correct\n", 269 | "w_check = np.array([6, 0, 5, 1, 3, 3, 0, 1, 0, 4, 0, 1, 0, 4, 4, 1, 0, 0, 7, 1])\n", 270 | "w_check = w_check.reshape(5, 4)\n", 271 | "c_check = np.array([369, 297, 531, 484, 640, 468, 646, 333, 556, 194])\n", 272 | "c_check = c_check.reshape(5, 2)" 273 | ] 274 | }, 275 | { 276 | "cell_type": "markdown", 277 | "metadata": {}, 278 | "source": [ 279 | "- **Compute the camera parameters and test the projection matrix**" 280 | ] 281 | }, 282 | { 283 | "cell_type": "code", 284 | "execution_count": 24, 285 | "metadata": {}, 286 | "outputs": [ 287 | { 288 | "name": "stdout", 289 | "output_type": "stream", 290 | "text": [ 291 | "Now P is with form of :\n", 292 | "[[ 8.000e+00 0.000e+00 9.000e+00 1.000e+00 0.000e+00 0.000e+00\n", 293 | " 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00]\n", 294 | " [ 0.000e+00 0.000e+00 0.000e+00 0.000e+00 8.000e+00 0.000e+00\n", 295 | " 9.000e+00 1.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00]\n", 296 | " [ 8.000e+00 0.000e+00 1.000e+00 1.000e+00 0.000e+00 0.000e+00\n", 297 | " 0.000e+00 0.000e+00 -2.496e+03 0.000e+00 -3.120e+02 -3.120e+02]\n", 298 | " [ 0.000e+00 0.000e+00 0.000e+00 0.000e+00 8.000e+00 0.000e+00\n", 299 | " 1.000e+00 1.000e+00 -3.632e+03 0.000e+00 -4.540e+02 -4.540e+02]\n", 300 | " [ 6.000e+00 0.000e+00 1.000e+00 1.000e+00 0.000e+00 0.000e+00\n", 301 | " 0.000e+00 0.000e+00 -2.292e+03 0.000e+00 -3.820e+02 -3.820e+02]\n", 302 | " [ 0.000e+00 0.000e+00 0.000e+00 0.000e+00 6.000e+00 0.000e+00\n", 303 | " 1.000e+00 1.000e+00 -2.616e+03 0.000e+00 -4.360e+02 -4.360e+02]\n", 
304 | " [ 6.000e+00 0.000e+00 9.000e+00 1.000e+00 0.000e+00 0.000e+00\n", 305 | " 0.000e+00 0.000e+00 -2.142e+03 0.000e+00 -3.213e+03 -3.570e+02]\n", 306 | " [ 0.000e+00 0.000e+00 0.000e+00 0.000e+00 6.000e+00 0.000e+00\n", 307 | " 9.000e+00 1.000e+00 -8.040e+02 0.000e+00 -1.206e+03 -1.340e+02]\n", 308 | " [ 5.000e+00 1.000e+00 0.000e+00 1.000e+00 0.000e+00 0.000e+00\n", 309 | " 0.000e+00 0.000e+00 -2.160e+03 -4.320e+02 0.000e+00 -4.320e+02]\n", 310 | " [ 0.000e+00 0.000e+00 0.000e+00 0.000e+00 5.000e+00 1.000e+00\n", 311 | " 0.000e+00 1.000e+00 -2.365e+03 -4.730e+02 0.000e+00 -4.730e+02]\n", 312 | " [ 5.000e+00 9.000e+00 0.000e+00 1.000e+00 0.000e+00 0.000e+00\n", 313 | " 0.000e+00 0.000e+00 -3.060e+03 -5.508e+03 0.000e+00 -6.120e+02]\n", 314 | " [ 0.000e+00 0.000e+00 0.000e+00 0.000e+00 5.000e+00 9.000e+00\n", 315 | " 0.000e+00 1.000e+00 -3.115e+03 -5.607e+03 0.000e+00 -6.230e+02]]\n", 316 | "\n", 317 | "\n", 318 | "some scalar multiple of M,recorded as roM:\n", 319 | "[[ 5.40654233e-02 -5.48011712e-02 3.61180175e-02 -7.58221552e-01]\n", 320 | " [ 2.11829207e-02 -5.15006002e-02 5.25908724e-02 -6.41872412e-01]\n", 321 | " [ 6.18927969e-05 -4.42791434e-05 9.30771716e-05 -1.51459919e-03]]\n", 322 | "\n", 323 | "\n", 324 | "M can be written in form of [A b], where A is 3x3 and b is 3x1, as following:\n", 325 | "[[ 5.40654233e-02 -5.48011712e-02 3.61180175e-02]\n", 326 | " [ 2.11829207e-02 -5.15006002e-02 5.25908724e-02]\n", 327 | " [ 6.18927969e-05 -4.42791434e-05 9.30771716e-05]]\n", 328 | "[[-0.75822155]\n", 329 | " [-0.64187241]\n", 330 | " [-0.0015146 ]]\n", 331 | "\n", 332 | "\n", 333 | "The ro is 8317.544777 \n", 334 | "\n", 335 | "cx=631.943866,cy=587.108010 \n", 336 | "\n", 337 | "theta is: 2.030709 \n", 338 | "\n", 339 | "alpha:284.615004, beta:221.645270 \n", 340 | "\n", 341 | "We can get K accordingly: \n", 342 | "[[284.61500426 140.98127115 631.94386599]\n", 343 | " [ 0. 247.34678466 587.10801031]\n", 344 | " [ 0. 0. 1. 
]]\n", 345 | "\n", 346 | "\n", 347 | "we can get R:\n", 348 | "[[-0.68940554 0.35894597 0.62918819]\n", 349 | " [-0.50961256 -0.85762317 -0.06911977]\n", 350 | " [ 0.51479611 -0.36829376 0.77417354]]\n", 351 | "\n", 352 | "\n", 353 | "we can get t:\n", 354 | "[[ 1.69296063]\n", 355 | " [ 8.31801962]\n", 356 | " [-12.59774659]]\n", 357 | "\n", 358 | "\n", 359 | "you get test point 0 with result (373.587780,371.495304)\n", 360 | "the correct result is (369.000000,297.000000)\n", 361 | "you get test point 1 with result (520.215162,501.331503)\n", 362 | "the correct result is (531.000000,484.000000)\n", 363 | "you get test point 2 with result (577.772139,501.192239)\n", 364 | "the correct result is (640.000000,468.000000)\n", 365 | "you get test point 3 with result (631.309457,483.180160)\n", 366 | "the correct result is (646.000000,333.000000)\n", 367 | "you get test point 4 with result (585.586194,317.169867)\n", 368 | "the correct result is (556.000000,194.000000)\n", 369 | "The average error is 0.164937 ,\n", 370 | "which is more than 0.1\n" 371 | ] 372 | } 373 | ], 374 | "source": [ 375 | "aCamera = SingleCamera(w_coor, c_coor, 12) # 12 points in total are used\n", 376 | "aCamera.composeP()\n", 377 | "aCamera.svdP()\n", 378 | "aCamera.workInAndOut() # print computed result\n", 379 | "aCamera.selfcheck(w_check,c_check) # test 5 points and verify M" 380 | ] 381 | } 382 | ], 383 | "metadata": { 384 | "interpreter": { 385 | "hash": "2fd6ff00ff4a419d324d5b7e4b1b0789b8ee895e93e15d812a34184c59464f6c" 386 | }, 387 | "kernelspec": { 388 | "display_name": "Python 3.8.11 64-bit ('MyCV': conda)", 389 | "name": "python3" 390 | }, 391 | "language_info": { 392 | "codemirror_mode": { 393 | "name": "ipython", 394 | "version": 3 395 | }, 396 | "file_extension": ".py", 397 | "mimetype": "text/x-python", 398 | "name": "python", 399 | "nbconvert_exporter": "python", 400 | "pygments_lexer": "ipython3", 401 | "version": "3.8.11" 402 | }, 403 | "orig_nbformat": 4 404 | }, 405 | 
"nbformat": 4, 406 | "nbformat_minor": 2 407 | } 408 | -------------------------------------------------------------------------------- /MVGlab02_single-view-metrology-main/single_view_metrology.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# 第二次课程作业——单视图重建" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "## 任务 \n", 15 | "\n", 16 | "- 根据所提供的符合单视重构要求的单张图片完成单视图重建任务。\n", 17 | "\n", 18 | "1. 加载使用 LABEL ME 工具标注产生的 json 数据,得到标记的平行线、平面交点、垂直面掩模。\n", 19 | "2. 据所标记的平行线在图像上的交点(通过计算得到),估算隐消点和场景中各个平面的隐消线。\n", 20 | "3. 根据由三组正交平行线得到的影消点,计算摄像机参数矩阵(标定)。\n", 21 | "4. 根据三个垂直面对应的隐消线和摄像机矩阵,计算场景平面的法方向(法向量)。\n", 22 | "5. 将平面交线的公共点代入具未知参数 d 的平面方程(设深度入=1),计算出各个场景平面到摄像机中心的距离 d。\n", 23 | "6. 将掩模区域中所有 2D 点坐标代入各点对应的场景平面方程(已知 d),计算出 3D 坐标,保存为 ply 文件,即单视图重建完成。" 24 | ] 25 | }, 26 | { 27 | "cell_type": "markdown", 28 | "metadata": {}, 29 | "source": [ 30 | "## 预处理 \n", 31 | "\n", 32 | "- 以 images 文件里的 **chessboard.jpg** 图片为例,使用标注工具 labelme 对该图像进行标注,标记产生的 json 数据放在 **labelme_data** 文件夹中。首先在每个垂直面上标出若干条平行线,并确保三个平面上的平行线相互垂直。 \n", 33 | "
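 \n", 34 | "\n", 35 | "- As above, mark a second set of mutually perpendicular parallel lines on the three planes, in a direction different from the lines marked in the previous step (perpendicular to them in the example image). \n", 36 | "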
 \n", 37 | "\n", 38 | "- Mark the common point of the plane intersection lines\n", 39 | "
 \n", 40 | "\n", 41 | "- Mark the regions (masks) to be shown in the final 3D reconstruction\n", 42 | "
" 43 | ] 44 | }, 45 | { 46 | "cell_type": "markdown", 47 | "metadata": {}, 48 | "source": [ 49 | "## 示例代码" 50 | ] 51 | }, 52 | { 53 | "cell_type": "code", 54 | "execution_count": 6, 55 | "metadata": {}, 56 | "outputs": [], 57 | "source": [ 58 | "def solution(A): \n", 59 | "\n", 60 | " \"\"\"Solves homogenous linear equations.\"\"\"\n", 61 | "\n", 62 | " ## Ax = 0\n", 63 | "\n", 64 | " # SVD\n", 65 | " u, e, v = svd(A)\n", 66 | " x = v.T[:, -1] # min column of right singular vector\n", 67 | "\n", 68 | " \"\"\" \n", 69 | " ## Other Methods\n", 70 | " # find the eigenvalues and eigenvector of A.T * A -- right singular \n", 71 | " e_vals, e_vecs = np.linalg.eig(np.dot(A.T, A)) \n", 72 | " # extract the eigenvector (column) associated with the minimum eigenvalue \n", 73 | " x = np.array(e_vecs[:, np.argmin(np.abs(e_vals))])\n", 74 | " \"\"\"\n", 75 | "\n", 76 | " # np.set_printoptions(suppress=True)\n", 77 | " # print(A, x, np.dot(A, x.reshape(4, 1)))\n", 78 | "\n", 79 | " return x\n", 80 | "\n", 81 | "def get_label(img, data_line, data_plane):\n", 82 | "\n", 83 | " \"\"\"Gets parallel lines and vertical planes from label json.\"\"\"\n", 84 | "\n", 85 | " ## get parallel lines \n", 86 | " \n", 87 | " # print(data_line['shapes'])\n", 88 | " line2points = defaultdict(list) # store 2 points on each parallel line \n", 89 | " line_names = [\"XZ_line\", \"YZ_line\", \"XY_line\", \"XZ_line'\", \"YZ_line'\", \"XY_line'\",]\n", 90 | " intersect = [] # common point of plane intersection line\n", 91 | " for each_shape in data_line['shapes']:\n", 92 | " for each_line in line_names:\n", 93 | " if each_shape['label'] == each_line:\n", 94 | " # print(each_shape['points'])\n", 95 | " # print([list(reversed(each_shape['points'][0])), list(reversed(each_shape['points'][1]))])\n", 96 | " line2points[each_line].append([list(reversed(each_shape['points'][0])), list(reversed(each_shape['points'][1]))])\n", 97 | " elif each_shape['label'] == \"intersection\": # common point of plane 
intersection line\n", 98 | " intersect = list(reversed(each_shape['points'][0])) + [1]\n", 99 | " # print(line2points)\n", 100 | "\n", 101 | " ## get vertical planes\n", 102 | "\n", 103 | " lbl, lbl_names = utils.labelme_shapes_to_label(img.shape, data_plane['shapes']) \n", 104 | "\n", 105 | " # lbl:binary array in: 1 out: 0\n", 106 | " # lbl_names :dict _background_:0 other labels: positive\n", 107 | " # print(lbl_names)\n", 108 | " mask = [] # for 3 verticle planes\n", 109 | " for i in range(1, len(lbl_names)): # ignore background\n", 110 | " mask.append((lbl == i).astype(np.uint8)) \n", 111 | "\n", 112 | " # captions = ['%d: %s' % (l, name) for l, name in enumerate(lbl_names)]\n", 113 | " # lbl_viz = utils.draw_label(lbl, img, captions)\n", 114 | " # plt.imshow(lbl_viz)\n", 115 | " # plt.show()\n", 116 | "\n", 117 | " return line2points, intersect, mask\n", 118 | "\n", 119 | "def cal_vanish(line2points):\n", 120 | "\n", 121 | " \"\"\"Calculates vanishing points and horizon lines.\"\"\"\n", 122 | " \n", 123 | " ## calculate vanishing points\n", 124 | " vanish_points = defaultdict(list) # for pairs of parallel lines \n", 125 | " for each_line, points in line2points.items(): # dict\n", 126 | " # print(each_line, len(points))\n", 127 | " line = [] # for each pair of parallel lines, store several line equations\n", 128 | " for each_point in points: # points array\n", 129 | " # cross product -- calculate line equation\n", 130 | " # line.append(np.cross(each_point[0] + [1], each_point[1] + [1])) # homogenous coordinates\n", 131 | " line.append(solution(np.array([each_point[0] + [1], each_point[1] + [1]])))\n", 132 | " # print(line)\n", 133 | " # p = np.cross(line[0], line[1]) # homogenous -- up to a scale\n", 134 | " p = solution(np.array(line)) \n", 135 | " vanish_points[each_line] = p / p[-1]\n", 136 | " # print(vanish_points)\n", 137 | "\n", 138 | " ## calculate vanishing lines\n", 139 | " \"\"\"\n", 140 | " horizs = np.array([np.cross(vanish_points[\"XZ_line\"], 
vanish_points[\"XZ_line'\"]), \n", 141 | " np.cross(vanish_points[\"YZ_line\"], vanish_points[\"YZ_line'\"]),\n", 142 | " np.cross(vanish_points[\"XY_line\"], vanish_points[\"XY_line'\"])]).T \n", 143 | " \"\"\"\n", 144 | " horizs = np.array([solution(np.array([vanish_points[\"XZ_line\"], vanish_points[\"XZ_line'\"]])), \n", 145 | " solution(np.array([vanish_points[\"YZ_line\"], vanish_points[\"YZ_line'\"]])), \n", 146 | " solution(np.array([vanish_points[\"XY_line\"], vanish_points[\"XY_line'\"]]))]).T\n", 147 | " # print(horizs)\n", 148 | "\n", 149 | " return vanish_points, horizs\n", 150 | "\n", 151 | "def calibrate(vanish_points):\n", 152 | " \n", 153 | " \"\"\"Calibrates the camera (square pixels & no skew).\"\"\" \n", 154 | "\n", 155 | " # W = [[w1 w2 w4] [w2 w3 w5] [w4 w5 w6]]\n", 156 | " # (no skew: w2 = 0; square pixels: w1 = w3)\n", 157 | "\n", 158 | " # θ = 90° -- v1.T * W * v2 = 0 ; v1.T * W * v3 = 0 ; v2.T * W * v3 = 0\n", 159 | " # (a1a2 + b1b2)w1 + (c1a2 + a1c2)w4 + (c1b2 + b1c2)w5 + c1c2w6 = 0\n", 160 | " # (a1a3 + b1b3)w1 + (c1a3 + a1c3)w4 + (c1b3 + b1c3)w5 + c1c3w6 = 0\n", 161 | " # (a2a3 + b2b3)w1 + (c2a3 + a2c3)w4 + (c2b3 + b2c3)w5 + c2c3w6 = 0 \n", 162 | "\n", 163 | " a1, b1, c1 = vanish_points[\"XZ_line\"]\n", 164 | " a2, b2, c2 = vanish_points[\"YZ_line\"]\n", 165 | " a3, b3, c3 = vanish_points[\"XY_line\"]\n", 166 | "\n", 167 | " param = np.array([[a1 * a2 + b1 * b2, c1 * a2 + a1 * c2, c1 * b2 + b1 * c2, c1 * c2], \n", 168 | " [a1 * a3 + b1 * b3, c1 * a3 + a1 * c3, c1 * b3 + b1 * c3, c1 * c3],\n", 169 | " [a2 * a3 + b2 * b3, c2 * a3 + a2 * c3, c2 * b3 + b2 * c3, c2 * c3]]) # parameter matrix \n", 170 | "\n", 171 | " # print(np.linalg.matrix_rank(param)) # rank < 4 -- non-zero solution\n", 172 | " \n", 173 | " w1, w4, w5, w6 = solution(param) # solve w1, w4, w5, w6 -- up to a scale\n", 174 | " W = np.array([[w1, 0, w4], [0, w1, w5], [w4, w5, w6]])\n", 175 | " W /= W[-1, -1]\n", 176 | " \n", 177 | " ### Once W is calculated, we get camera 
matrix K: \n", 178 | "\n", 179 | " ## Cholesky factorization -- W = (K.inv).T * (K.inv)\n", 180 | " # e_vals, e_vecs = np.linalg.eig(W) \n", 181 | " # print(e_vals) # all e_vals > 0 -- positive definite\n", 182 | " L = cholesky(W, lower=False) # W = L.T * L, L is upper triangular matrix \n", 183 | " K = np.linalg.inv(L) # K is upper triangular matrix \n", 184 | " K /= K[-1][-1]\n", 185 | " # print(K)\n", 186 | "\n", 187 | " \"\"\"\n", 188 | " ### Other Methods\n", 189 | " ## directly -- W = (K * K.T).inv\n", 190 | " # K = [[f 0 u0] [0 f v0] [0 0 1]] (square pixels & no skew)\n", 191 | " # K * K.T = [[f^2 + u0^2 u0v0 u0] [u0v0 f^2 + v0^2 v0] [u0 v0 1]]\n", 192 | " W_inv = np.linalg.inv(W) \n", 193 | " dot = W_inv / W_inv[-1][-1]\n", 194 | " # print(W, W_inv, dot)\n", 195 | " u0 = dot[-1][0]\n", 196 | " v0 = dot[-1][1]\n", 197 | " # assert u0 * v0 == dot[-1][0] * dot[-1][1]\n", 198 | " # assert abs((u0 * u0 - v0 * v0) - (dot[0][0] - dot[1][1])) < 1e-3\n", 199 | " f = np.sqrt(dot[1][1] - v0 * v0)\n", 200 | " K = np.array([[f, 0, u0], [0, f, v0], [0, 0, 1]])\n", 201 | " # print(K)\n", 202 | " \"\"\"\n", 203 | " \n", 204 | " return K\n", 205 | "\n", 206 | "def cal_3D(K, p_2D_H):\n", 207 | "\n", 208 | " \"\"\"Calculates unit 3D pos without taking projective depth into account.\"\"\"\n", 209 | "\n", 210 | " p_3D = np.dot(np.linalg.inv(K), p_2D_H)\n", 211 | " p_3D /= np.linalg.norm(p_3D) # p / || p ||\n", 212 | "\n", 213 | " return p_3D\n", 214 | "\n", 215 | "def cal_scene(K, horizs, intersect):\n", 216 | "\n", 217 | " \"\"\"Calculates scene plane equations.\"\"\"\n", 218 | "\n", 219 | " # X_H.T * Pi = 0 (homogeneous) -- Pi = (unit_n.T, d).T\n", 220 | " # unit_n.T * X + d = 0\n", 221 | "\n", 222 | " ## calculate scene plane orientations -- normal vector\n", 223 | " \n", 224 | " # n = K.T * l_horiz\n", 225 | " N = np.dot(K.T, horizs)\n", 226 | " # unit_n = n / || n ||\n", 227 | " unit_N = N / np.linalg.norm(N, axis=0)\n", 228 | " # print(unit_N)\n", 229 | "\n", 230 | " ## 
calculate distance between plane and camera center\n", 231 | "\n", 232 | " # common point of plane intersection lines in 3D \n", 233 | " # X_H = (λ * K.inv * x / || K.inv * x ||, 1) -- suppose projective depth λ is 1\n", 234 | " intersect_3D = cal_3D(K, intersect).reshape((3,1))\n", 235 | " # print(intersect_3D)\n", 236 | " # intersect_3D_H = np.append(intersect_3D, 1)\n", 237 | "\n", 238 | " # substitute the common point into plane equations\n", 239 | " D = -1 * np.dot(unit_N.T, intersect_3D)\n", 240 | " # print(D)\n", 241 | "\n", 242 | " # unit_N.T * X + D = 0 -- X_H * Pi = 0\n", 243 | " Pi = np.concatenate((unit_N, D.T), axis=0) # (4, 3)\n", 244 | "\n", 245 | " return Pi\n", 246 | "\n", 247 | "def reconstruction(K, Pi, img, masks):\n", 248 | "\n", 249 | " \"\"\"Reconstructs 3D points in masked image for 3 verticle planes.\"\"\"\n", 250 | "\n", 251 | " assert img.shape[:-1] == np.array(masks).shape[1:]\n", 252 | "\n", 253 | " pos = [] # position in 3D\n", 254 | " rgb = [] # pixel in original image\n", 255 | " for axis, each_mask in enumerate(masks): # plane XZ; YZ; XY\n", 256 | " for i in range(img.shape[0]):\n", 257 | " for j in range(img.shape[1]):\n", 258 | " if each_mask[i, j] == 1:\n", 259 | " # X_H * Pi = 0 -- X * Pi[:-1] + Pi[-1] = 0\n", 260 | " # X_H = (λ * K.inv * x / || K.inv * x ||, 1) -- solve λ\n", 261 | " point = cal_3D(K, [i, j, 1]).reshape((3,1))\n", 262 | " lambd = -Pi[-1, axis] / np.dot(Pi[:-1, axis], point)\n", 263 | " pos.append(lambd * point)\n", 264 | " rgb.append(img[i, j])\n", 265 | "\n", 266 | " return pos, rgb\n", 267 | "\n", 268 | "def create_output(vertices, colors, filename):\n", 269 | "\n", 270 | " \"\"\"Creates point cloud file.\"\"\"\n", 271 | "\n", 272 | " vertices = np.hstack([np.array(vertices).reshape(-1, 3), np.array(colors).reshape(-1, 3)])\n", 273 | " np.savetxt(filename, vertices, fmt='%f %f %f %d %d %d') \n", 274 | " ply_header = '''ply\\nformat ascii 1.0\\nelement vertex %(vert_num)d\\nproperty float x\\nproperty float 
y\\nproperty float z\\nproperty uchar red\\nproperty uchar green\\nproperty uchar blue\\nend_header\\n'''\n", 275 | " with open(filename, 'r+') as f:\n", 276 | " old = f.read()\n", 277 | " f.seek(0)\n", 278 | " f.write(ply_header % dict(vert_num=len(vertices)))\n", 279 | " f.write(old)" 280 | ] 281 | }, 282 | { 283 | "cell_type": "markdown", 284 | "metadata": {}, 285 | "source": [ 286 | "## Testing the Results" 287 | ] 288 | }, 289 | { 290 | "cell_type": "markdown", 291 | "metadata": {}, 292 | "source": [ 293 | "- **Import the required modules**" 294 | ] 295 | }, 296 | { 297 | "cell_type": "code", 298 | "execution_count": 7, 299 | "metadata": {}, 300 | "outputs": [], 301 | "source": [ 302 | "import json\n", 303 | "import matplotlib.image as mpimg\n", 304 | "import matplotlib.pyplot as plt\n", 305 | "import numpy as np\n", 306 | "\n", 307 | "from labelme import utils\n", 308 | "from collections import defaultdict\n", 309 | "from scipy.linalg import svd, cholesky, qr\n", 310 | "import open3d as o3d\n", 311 | "from open3d import web_visualizer" 312 | ] 313 | }, 314 | { 315 | "cell_type": "markdown", 316 | "metadata": {}, 317 | "source": [ 318 | "- **Load the labeled data from the json files**" 319 | ] 320 | }, 321 | { 322 | "cell_type": "code", 323 | "execution_count": 8, 324 | "metadata": {}, 325 | "outputs": [], 326 | "source": [ 327 | "# load data\n", 328 | "\n", 329 | "json_file_line = r'labelme_data\\chessboard_line.json' # label the parallel lines\n", 330 | "json_file_plane = r'labelme_data\\chessboard_plane.json' # label the vertical planes\n", 331 | "data_line = json.load(open(json_file_line))\n", 332 | "data_plane = json.load(open(json_file_plane))\n", 333 | "# img = utils.img_b64_to_arr(data_plane['imageData']) # png: 4 channels\n", 334 | "img = mpimg.imread(r'images\\chessboard.jpg')" 335 | ] 336 | }, 337 | { 338 | "cell_type": "markdown", 339 | "metadata": {}, 340 | "source": [ 341 | "- **Single-view calibration and reconstruction, saving the 3D model**" 342 | ] 343 | }, 344 | { 345 | "cell_type": "code", 346 | "execution_count": 9, 347 | "metadata": {}, 348 | 
"outputs": [ 349 | { 350 | "name": "stderr", 351 | "output_type": "stream", 352 | "text": [ 353 | "[WARNING] shape:labelme_shapes_to_label:79 - labelme_shapes_to_label is deprecated, so please use shapes_to_label.\n" 354 | ] 355 | } 356 | ], 357 | "source": [ 358 | "line2points, intersect, masks = get_label(img, data_line, data_plane)\n", 359 | "\n", 360 | "# calculate points and lines at infinity from labeled parallel lines\n", 361 | "\n", 362 | "vanish_points, vanish_lines = cal_vanish(line2points)\n", 363 | "\n", 364 | "# according to intersection points of 3 pairs parallel lines which are mutually orthogonal,\n", 365 | "# calculate camera matrix\n", 366 | "\n", 367 | "K = calibrate(vanish_points)\n", 368 | "\n", 369 | "# according to lines at infinity of 3 vertical planes and camera matrix, \n", 370 | "# calculate scene plane orientations (normal vectors);\n", 371 | "# substitute the common point(assuming projective depth) of plane intersection lines into plane equations, \n", 372 | "# calculate distance between each plane and camera center\n", 373 | "\n", 374 | "Pi = cal_scene(K, vanish_lines, intersect)\n", 375 | "\n", 376 | "# substituting 2D points into corresponding plane equation,\n", 377 | "# calculate 3D positions for masked image (up to a unknown scale)\n", 378 | "\n", 379 | "pos, rgb = reconstruction(K, Pi, img, masks)\n", 380 | "\n", 381 | "# save as ply file\n", 382 | "\n", 383 | "create_output(pos, rgb, r'chessboard_3D.ply')" 384 | ] 385 | }, 386 | { 387 | "cell_type": "markdown", 388 | "metadata": {}, 389 | "source": [ 390 | "- **可视化 3D 模型**" 391 | ] 392 | }, 393 | { 394 | "cell_type": "code", 395 | "execution_count": 10, 396 | "metadata": {}, 397 | "outputs": [ 398 | { 399 | "data": { 400 | "application/vnd.jupyter.widget-view+json": { 401 | "model_id": "5aaa2b66541d464d84dbd60dcf73d768", 402 | "version_major": 2, 403 | "version_minor": 0 404 | }, 405 | "text/plain": [ 406 | "WebVisualizer(window_uid='window_1')" 407 | ] 408 | }, 409 | 
"metadata": {}, 410 | "output_type": "display_data" 411 | } 412 | ], 413 | "source": [ 414 | "# visualization of point clouds.\n", 415 | "pcd = o3d.io.read_point_cloud('chessboard_3D.ply')\n", 416 | "# o3d.visualization.draw_geometries([pcd])\n", 417 | "web_visualizer.draw(pcd)" 418 | ] 419 | } 420 | ], 421 | "metadata": { 422 | "interpreter": { 423 | "hash": "2fd6ff00ff4a419d324d5b7e4b1b0789b8ee895e93e15d812a34184c59464f6c" 424 | }, 425 | "kernelspec": { 426 | "display_name": "Python 3.8.11 64-bit ('MyCV': conda)", 427 | "name": "python3" 428 | }, 429 | "language_info": { 430 | "codemirror_mode": { 431 | "name": "ipython", 432 | "version": 3 433 | }, 434 | "file_extension": ".py", 435 | "mimetype": "text/x-python", 436 | "name": "python", 437 | "nbconvert_exporter": "python", 438 | "pygments_lexer": "ipython3", 439 | "version": "3.8.11" 440 | }, 441 | "orig_nbformat": 4 442 | }, 443 | "nbformat": 4, 444 | "nbformat_minor": 2 445 | } 446 | -------------------------------------------------------------------------------- /MVGlab02_single-view-metrology-main/labelme_data/chessboard_plane.json: -------------------------------------------------------------------------------- 1 | { 2 | "version": "3.16.7", 3 | "flags": {}, 4 | "shapes": [ 5 | { 6 | "label": "XZ", 7 | "line_color": null, 8 | "fill_color": null, 9 | "points": [ 10 | [ 11 | 15.15384615384616, 12 | 41.452991452991455 13 | ], 14 | [ 15 | 176.69230769230774, 16 | 25.213675213675216 17 | ], 18 | [ 19 | 175.4102564102564, 20 | 240.17094017094018 21 | ], 22 | [ 23 | 33.52991452991455, 24 | 279.0598290598291 25 | ] 26 | ], 27 | "shape_type": "polygon", 28 | "flags": {} 29 | }, 30 | { 31 | "label": "YZ", 32 | "line_color": null, 33 | "fill_color": null, 34 | "points": [ 35 | [ 36 | 176.69230769230774, 37 | 25.213675213675216 38 | ], 39 | [ 40 | 365.1538461538462, 41 | 56.83760683760684 42 | ], 43 | [ 44 | 333.957264957265, 45 | 331.19658119658123 46 | ], 47 | [ 48 | 175.83760683760687, 49 | 241.45299145299145 50 | 
] 51 | ], 52 | "shape_type": "polygon", 53 | "flags": {} 54 | }, 55 | { 56 | "label": "XY", 57 | "line_color": null, 58 | "fill_color": null, 59 | "points": [ 60 | [ 61 | 33.957264957264954, 62 | 279.4871794871795 63 | ], 64 | [ 65 | 164.29914529914532, 66 | 397.86324786324786 67 | ], 68 | [ 69 | 333.52991452991455, 70 | 330.34188034188037 71 | ], 72 | [ 73 | 174.982905982906, 74 | 241.02564102564105 75 | ] 76 | ], 77 | "shape_type": "polygon", 78 | "flags": {} 79 | } 80 | ], 81 | "lineColor": [ 82 | 0, 83 | 255, 84 | 0, 85 | 128 86 | ], 87 | "fillColor": [ 88 | 255, 89 | 0, 90 | 0, 91 | 128 92 | ], 93 | "imagePath": "chessboard_plane.png", 94 | "imageData": "", 95 | "imageHeight": 404, 96 | "imageWidth": 404 97 | } --------------------------------------------------------------------------------
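Note on the vanishing-point step in single_view_metrology.ipynb: `cal_vanish` keeps an `np.cross` alternative in comments next to the SVD-based `solution` helper. The sketch below illustrates that cross-product idea in plain Python for the two-segment case only. The helper names (`cross`, `line_through`, `intersect_lines`) and the sample segments are hypothetical and are not part of the notebook; with more than two labeled segments per direction, the notebook's least-squares `solution` is the better fit.

```python
def cross(a, b):
    """Cross product of two 3-vectors (homogeneous points or lines)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def line_through(p, q):
    """Homogeneous line through two 2D points p and q."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))


def intersect_lines(l1, l2, eps=1e-12):
    """Intersection of two homogeneous lines, or None if they are parallel in the image."""
    x, y, w = cross(l1, l2)
    if abs(w) < eps:
        return None  # the intersection is a point at infinity
    return (x / w, y / w)


# Hypothetical image segments: images of two parallel scene lines.
seg_a = ((0.0, 0.0), (1.0, 1.0))   # lies on y = x
seg_b = ((0.0, 2.0), (2.0, 3.0))   # lies on y = 0.5 * x + 2

# The vanishing point is where the two image lines meet.
vp = intersect_lines(line_through(*seg_a), line_through(*seg_b))
print(vp)
```

The same `cross` call also gives a vanishing line from two vanishing points in homogeneous form, which mirrors the commented-out `horizs` computation in the notebook.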