├── .gitignore
├── README.md
├── data
│   ├── __init__.py
│   ├── data_train_gym_lidar_aerial
│   │   └── ground_truth_gym_lidar.csv
│   ├── data_train_qsdjt_lidar_aerial
│   │   └── ground_truth_qsdjt_lidar.csv
│   ├── data_train_qsdjt_lidar_sat
│   │   └── ground_truth_qsdjt_lidar.csv
│   ├── data_train_qsdjt_stereo_aerial
│   │   └── ground_truth_qsdjt_lidar.csv
│   ├── data_train_qsdjt_stereo_sat
│   │   └── ground_truth_qsdjt_lidar.csv
│   ├── data_utils.py
│   ├── dataset.py
│   ├── dataset_DPCN.py
│   ├── ortho
│   │   └── ground_truth_gym_lidar.csv
│   └── simulation.py
├── detect.py
├── fft
│   ├── __init__.py
│   ├── dft_test.py
│   ├── fft_demo.py
│   ├── imreg_test.py
│   └── test.py
├── images_for_readme
│   ├── Result1.png
│   ├── Result2.png
│   └── simulation.png
├── ipynb
│   ├── pytorch_fcn.ipynb
│   ├── pytorch_resnet18_unet.ipynb
│   └── pytorch_unet.ipynb
├── log_polar
│   ├── __init__.py
│   ├── log_polar.py
│   ├── polar.py
│   ├── polarizeLayer.py
│   └── tf_test.py
├── phase_correlation
│   ├── __init__.py
│   ├── phase_corr.py
│   └── tmp.dot
├── requirements.txt
├── trainDPCN.py
├── unet
│   ├── LICENSE
│   ├── README.md
│   ├── __init__.py
│   ├── helper.py
│   ├── loss.py
│   └── pytorch_DPCN.py
├── utils
│   ├── __init__.py
│   ├── detect_utils.py
│   ├── train_utils.py
│   ├── utils.py
│   └── validate_utils.py
└── validate.py

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
*.jpg
*.pth
*.zip
*.pt
*.mav-lab
*.png
*.DS_Store
*.pyc

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
[![arXiv](https://img.shields.io/badge/arxiv-2008.09474-B31B1B.svg)](https://arxiv.org/abs/2008.09474)
[![video](https://img.shields.io/badge/Video-CoRL2020-ff69b4.svg)](https://www.youtube.com/watch?v=_xPoHFf_8yI)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

# DPCN: Deep Phase Correlation for End-to-End Heterogeneous Sensor Measurements Matching

This is the official repository for DPCN, with the paper "[Deep Phase Correlation for End-to-End Heterogeneous Sensor Measurements Matching](https://proceedings.mlr.press/v155/chen21g.html)" accepted and presented at CoRL 2020.

## Dependencies

There are a few dependencies required to run the code. They are listed below:

### System Environments:

`Python >= 3.5`

`CUDA 10.1`

`CuDnn`

### Pip dependencies:

`Pytorch 1.5.0`

`Torchvision 0.6.0`

`Kornia 0.3.1`

`Graphviz`

`Opencv`

`Scipy`

`Matplotlib`

`Pandas`

`TensorboardX`

You can install these dependencies by changing your current directory to this DPCN directory and running:

`pip install -r requirements.txt -f https://download.pytorch.org/whl/torch_stable.html` (if using pip),

or install these dependencies manually if that makes you feel more comfortable and assured.

We suggest setting up the whole environment in Conda or Virtualenv so that your original system is not affected.

Also, we are working on an upgrade to be compatible with the latest versions of the packages (e.g. torch==1.7.1, CUDA 11, and kornia 0.4.1), coming soon!
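As a concrete sketch of that suggestion (the environment name `dpcn` and the Python 3.7 choice are illustrative, not fixed by the repository; any `Python >= 3.5` should do):

`conda create -n dpcn python=3.7`

`conda activate dpcn`

`pip install -r requirements.txt -f https://download.pytorch.org/whl/torch_stable.html`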
### Pre-trained Models:

***Pre-trained Models can be downloaded from this link:*** https://drive.google.com/file/d/1GZ8hz3cfaBP7F7KEdQK7M_WcwOM1GF6z/view?usp=sharing

## How to Run The Code

### Step 0

Before you start to train or validate on the simulation dataset, you might want to choose one specific dataset type out of the following three: Homogeneous, Heterogeneous, and Dynamic Obstacles. Demonstrations are shown below:

![](./images_for_readme/simulation.png)

**By default, the code runs on the Heterogeneous dataset**, and you can change this by modifying the following lines in "./data/simulation.py".

1. To decide whether the image pairs are heterogeneous, modify lines 78~80:

   ```python
   # for heterogeneous image, comment out if you want them to be homogeneous
   rot = cv2.GaussianBlur(rot, (9,9), 13)
   kernel = np.array([[-1,-1,-1],[-1,9,-1],[-1,-1,-1]])
   rot = cv2.filter2D(rot, -1, kernel)
   ```

2. To decide whether there are dynamic obstacles, modify lines 83~85:

   ```python
   # for dynamic obstacles, comment out if you dont want any dynamic obstacles
   arr[0,] = add_plus(arr[0], *plus_location4)
   arr = np.reshape(arr, (1, height, width)).astype(np.float32)
   ```

### Training

If you want to train the DPCN network from scratch on the simulation dataset, simply run:

`python trainDPCN.py --simulation`

By default, this will train the network on the heterogeneous dataset with a batch size of 2, and will run on the GPU. There are several settings you can change by adding the arguments below:

| Arguments           | What it will trigger                                         | Default                     |
| ------------------- | ------------------------------------------------------------ | --------------------------- |
| --save_path         | The path to save the checkpoint of every epoch               | ./checkpoints/              |
| --simulation        | The training will be applied on a randomly generated simulation dataset | False                       |
| --cpu               | The program will use the CPU for training                    | False                       |
| --load_pretrained   | Choose whether to use a pretrained model to fine-tune        | False                       |
| --load_path         | The path to load a pretrained checkpoint                     | ./checkpoints/checkpoint.pt |
| --load_optimizer    | When using a pretrained model, whether to also load its optimizer | False                       |
| --pretrained_mode   | Three options: 'all' for loading rotation and translation; 'rot' for loading only rotation; 'trans' for loading only translation | All                         |
| --use_dsnt          | When enabled, the loss is calculated via DSNT and MSELoss; otherwise a CELoss is used | False                       |
| --batch_size_train  | The batch size for training                                  | 2                           |
| --batch_size_val    | The batch size for validation                                | 2                           |
| --train_writer_path | Where to write the log of training                           | ./checkpoints/log/train/    |
| --val_writer_path   | Where to write the log of validation                         | ./checkpoints/log/val/      |
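Putting several of these together, a typical fine-tuning run on the simulation dataset might look like the line below (a sketch only; the checkpoint filename assumes the heterogeneous pre-trained model from the link above):

`python trainDPCN.py --simulation --load_pretrained --load_path=./checkpoints/checkpoint_simulation_hetero.pt --batch_size_train=4`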
### Validating

If you are only interested in validating on the randomly generated simulation dataset, you can simply run one of the following commands, based on the specific dataset type you chose in **Step 0**.

For **Homogeneous** sets:

`python validate.py --simulation --only_valid --load_path=./checkpoints/checkpoint_simulation_homo.pt`

For **Heterogeneous** sets:

`python validate.py --simulation --only_valid --load_path=./checkpoints/checkpoint_simulation_hetero.pt`

For **Dynamic Obstacle** sets:

`python validate.py --simulation --only_valid --load_path=./checkpoints/checkpoint_simulation_dynamic.pt`

***Again, Pre-trained Models can be downloaded from this link:*** https://drive.google.com/file/d/1GZ8hz3cfaBP7F7KEdQK7M_WcwOM1GF6z/view?usp=sharing

Similarly, there are several options that you can choose when running validation:

| Arguments         | What it will trigger                                         | Default                     |
| ----------------- | ------------------------------------------------------------ | --------------------------- |
| --only_valid      | You have to use this flag if you run validation alone        | False                       |
| --simulation      | The validation will be applied on a randomly generated simulation dataset | False                       |
| --cpu             | The program will use the CPU for validation                  | False                       |
| --load_path       | The path to load a pretrained checkpoint                     | ./checkpoints/checkpoint.pt |
| --use_dsnt        | When enabled, the loss is calculated via DSNT and MSELoss; otherwise a CELoss is used | False                       |
| --batch_size_val  | The batch size for validation                                | 2                           |
| --val_writer_path | Where to write the log of validation                         | ./checkpoints/log/val/      |
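For example, validating the heterogeneous checkpoint on the CPU and keeping its logs in a separate directory could look like this (the log path is illustrative):

`python validate.py --simulation --only_valid --cpu --load_path=./checkpoints/checkpoint_simulation_hetero.pt --val_writer_path=./checkpoints/log/val_hetero/`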
### Detecting and Inferring:

This repository also provides a single-pair detection script so that you can see the result of DPCN directly. A few demos are given in the "./demo" directory, including image pairs from both the simulation dataset and the Aero-Ground Dataset. You can customize the script `detect.py` to test a chosen image pair against the corresponding pre-trained model, and then run:

`python detect.py`
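The image pair and the checkpoint are selected by three module-level paths near the bottom of `detect.py`; point them at your own files to test a different pair. The shipped defaults are:

```python
checkpoint_path = "./checkpoints/checkpoint_simulation_hetero.pt"
template_path = "./demo/temp_1.png"
source_path = "./demo/src_1.png"
```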
The results should be something like this:

![](./images_for_readme/Result1.png)

![Result2](./images_for_readme/Result2.png)

### Citation
If our source code could help you in your project, please cite the following:
```bibtex
@InProceedings{chen2020deep,
  title = {Deep Phase Correlation for End-to-End Heterogeneous Sensor Measurements Matching},
  author = {Chen, Zexi and Xu, Xuecheng and Wang, Yue and Xiong, Rong},
  booktitle = {Proceedings of the 2020 Conference on Robot Learning},
  pages = {2359--2375},
  year = {2021},
  volume = {155},
  series = {Proceedings of Machine Learning Research},
  month = {16--18 Nov},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v155/chen21g/chen21g.pdf},
  url = {https://proceedings.mlr.press/v155/chen21g.html},
}
```

## HAVE FUN with the CODE!!!!

--------------------------------------------------------------------------------
/data/__init__.py:
--------------------------------------------------------------------------------
"""This package includes a miscellaneous collection of useful helper functions."""
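Both the dataset classes and `detect.py` load images through `default_loader`, defined at the top of `data_utils.py` below. A minimal sketch of calling it directly (the path is the demo default that `detect.py` uses):

```python
from data.data_utils import default_loader

# Grayscale image resized to 256x256; with change_scale=False the returned
# angle/translation are zeroed and scaling_factor stays 1.
tensor, angle, translation, scale, h0, w0 = default_loader("./demo/temp_1.png", resize_shape=256)
print(tensor.shape)  # torch.Size([1, 256, 256])
```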
--------------------------------------------------------------------------------
/data/data_utils.py:
--------------------------------------------------------------------------------
1 | from torch.utils.data import Dataset, DataLoader 2 | from torchvision import transforms, utils 3 | import pandas as pd 4 | from pandas import Series,DataFrame 5 | import numpy as np 6 | import torch 7 | import os 8 | import cv2 9 | from PIL import Image 10 | import matplotlib.pyplot as plt 11 | import sys 12 | import os 13 | sys.path.append(os.path.abspath("..")) 14 | from utils.utils import * 15 | 16 | def default_loader(path, resize_shape, change_scale = False): 17 | trans = transforms.Compose([ 18 | transforms.ToTensor(), 19 | # transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]) # imagenet 20 | ]) 21 | image = cv2.imread(path) 22 | image = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY) 23 | (h_original, w_original) = image.shape 24 | image = cv2.resize(image, dsize=(resize_shape,resize_shape), interpolation=cv2.INTER_CUBIC) 25 | angle = (np.random.rand()-0.5) * 0. 26 | angle += 180. 27 | angle %= 360 28 | angle -= 180. 29 | (h, w) = image.shape 30 | (cX, cY) = (w//2, h//2) 31 | 32 | t_x = np.random.rand() * 0. 33 | t_y = np.random.rand() * 0. 34 | translation = np.array((t_y, t_x)) 35 | 36 | # arr = arr[0,] 37 | # rot = ndii.rotate(arr, angle) 38 | 39 | # N = np.float32([[1,0,t_x],[0,1,t_y]]) 40 | # image = cv2.warpAffine(image, N, (w, h)) 41 | # M = cv2.getRotationMatrix2D((cX, cY), angle, 1.0) 42 | # image = cv2.warpAffine(image, M, (w, h)) 43 | # image = cv2.resize(image, (h, w), interpolation=cv2.INTER_CUBIC) 44 | 45 | np_image_data = np.asarray(image) 46 | image_tensor = trans(np_image_data) 47 | scaling_factor = 1 48 | if change_scale: 49 | center = torch.ones(1,2) 50 | center[:, 0] = h // 2 51 | center[:, 1] = w // 2 52 | scaling_factor = torch.tensor(np.random.rand()*0.2+1) 53 | angle_source = torch.ones(1) * 0.
54 | scale_source = torch.ones(1) * scaling_factor 55 | image_tensor = image_tensor.unsqueeze(0) 56 | rot_mat = kornia.get_rotation_matrix2d(center, angle_source, scale_source) 57 | image_tensor = kornia.warp_affine(image_tensor, rot_mat, dsize=(h, w)) 58 | image_tensor = image_tensor.squeeze(0) 59 | # image = Image.open(path) 60 | # image = image.convert("1") 61 | # # image.show() 62 | # image = image.resize((128,128)) 63 | # image_tensor = trans(image) 64 | return image_tensor, angle, translation, scaling_factor, h_original, w_original 65 | 66 | def get_gt_tensor(this_gt, size): 67 | this_gt = this_gt +180 68 | gt_tensor_self = torch.zeros(size,size) 69 | angle_convert = this_gt*size/360 70 | angle_index = angle_convert//1 + (angle_convert%1+0.5)//1 71 | if angle_index.long() == size: 72 | angle_index = size-1 73 | gt_tensor_self[angle_index,0] = 1 74 | else: 75 | gt_tensor_self[angle_index.long(),0] = 1 76 | # print("angle_index", angle_index) 77 | 78 | return gt_tensor_self 79 | -------------------------------------------------------------------------------- /data/dataset.py: -------------------------------------------------------------------------------- 1 | from torch.utils.data import Dataset, DataLoader 2 | from torchvision import transforms, datasets, models 3 | import numpy as np 4 | import torch 5 | import sys 6 | import data.simulation as simulation 7 | from data.data_utils import * 8 | 9 | 10 | class SimDataset(Dataset): 11 | def __init__(self, count, transform=None): 12 | self.input_images, self.rotate_image, self.gt_rot, self.gt_trans = simulation.generate_random_data(256, 256, count=count) 13 | self.transform = transform 14 | 15 | def __len__(self): 16 | return len(self.input_images) 17 | 18 | def __getitem__(self, idx): 19 | image = self.input_images[idx] 20 | rot = self.rotate_image[idx] 21 | gt_rot = self.gt_rot[idx] 22 | gt_trans = self.gt_trans[idx] 23 | gt_scale = torch.tensor(1.) 24 | 25 | if self.transform: 26 | image = self.transform(image) 27 | rot = self.transform(rot) 28 | gt_rots = torch.tensor(gt_rot) 29 | gt_scale = torch.tensor(1.) 
30 | gt_trans = torch.tensor(gt_trans) 31 | # print("gt = ", gt) 32 | # print("gt tensor = ", gt_tensor) 33 | 34 | return [image, rot, gt_rots, gt_scale, gt_trans] 35 | 36 | def generate_dataloader(batch_size): 37 | # use the same transformations for train/val in this example 38 | trans = transforms.Compose([ 39 | transforms.ToTensor(), 40 | # transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]) # imagenet 41 | ]) 42 | 43 | train_set = SimDataset(2000, transform = trans) 44 | val_set = SimDataset(1000, transform = trans) 45 | 46 | image_datasets = { 47 | 'train': train_set, 'val': val_set 48 | } 49 | 50 | dataloaders = { 51 | 'train': DataLoader(train_set, batch_size=batch_size, shuffle=True, num_workers=0), 52 | 'val': DataLoader(val_set, batch_size=batch_size, shuffle=True, num_workers=0) 53 | } 54 | return dataloaders -------------------------------------------------------------------------------- /data/dataset_DPCN.py: -------------------------------------------------------------------------------- 1 | from torch.utils.data import Dataset, DataLoader 2 | from torchvision import transforms, utils 3 | import pandas as pd 4 | from pandas import Series,DataFrame 5 | import numpy as np 6 | import torch 7 | import os 8 | import cv2 9 | from PIL import Image 10 | import matplotlib.pyplot as plt 11 | import sys 12 | import os 13 | 14 | sys.path.append(os.path.abspath("..")) 15 | from utils.utils import * 16 | from data.data_utils import * 17 | 18 | # datacsv = pd.read_csv("./data_train/ground_truth.csv") 19 | 20 | # template_list = datacsv["name"].values 21 | # template_list = [i+".jpg" for i in template_list] 22 | # template_list = [os.path.join("./data_train/ground/",i) for i in template_list ] 23 | # template_train_list = template_list[:5000] 24 | # template_val_list = template_list[:5000] 25 | 26 | # source_list = datacsv["name"].values 27 | # source_list = [i+".jpg" for i in source_list] 28 | # source_list = [os.path.join("./data_train/aerial/",i) for i in source_list ] 29 | # source_train_list = source_list[:5000] 30 | # source_val_list = source_list[:5000] 31 | 32 | # ground_truth_list = datacsv["rotation"].values 33 | # ground_truth_train_list = ground_truth_list[5000:] 34 | # ground_truth_val_list = ground_truth_list[:5000] 35 | # ground_truth_train_list = torch.from_numpy(ground_truth_train_list) 36 | # ground_truth_val_list = torch.from_numpy(ground_truth_val_list) 37 | 38 | # trans = transforms.Compose([ 39 | # transforms.ToTensor(), 40 | # # transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]) # imagenet 41 | # ]) 42 | 43 | 44 | class AeroGroundDataset_train(Dataset): 45 | def __init__(self, template_train_list, source_train_list, gt_rot_train_list, gt_scale_train_list, gt_x_train_list, gt_y_train_list, loader=default_loader): 46 | self.template_path_list = template_train_list #this is a list of the path to the template image 47 | self.source_path_list = source_train_list 48 | self.gt_rot_list = gt_rot_train_list 49 | self.gt_scale_list = gt_scale_train_list 50 | self.loader = loader 51 | self.gt_trans_x = gt_x_train_list 52 | self.gt_trans_y = gt_y_train_list 53 | 54 | # add x y theta 55 | def __len__(self): 56 | return len(self.template_path_list) 57 | 58 | def __getitem__(self, index): 59 | # print(np.shape(this_source)) 60 | this_template_path = self.template_path_list[index] 61 | this_source_path = self.source_path_list[index] 62 | rot_gt = self.gt_rot_list[index] 63 | scale_gt = self.gt_scale_list[index] 64 | trans_x = self.gt_trans_x[index] 65 | 
trans_y = self.gt_trans_y[index] 66 | 67 | this_template, _, _, _, _,_ = self.loader(this_template_path, resize_shape=256) 68 | this_source, _, _, scaling_factor, h_original, w_original = self.loader(this_source_path, resize_shape=256, change_scale=True) 69 | # print("this gt =", rot_gt) 70 | # gt_tensor = get_gt_tensor(rot_gt, this_template.size(1)) 71 | # print("gt_tensor =", gt_tensor) 72 | # rot_gt += angle_t + angle_s 73 | # rot_gt += 180. 74 | # rot_gt %= 360 75 | # rot_gt -= 180. 76 | rot_gt = torch.tensor(rot_gt) 77 | scale_gt = torch.tensor(scale_gt) * scaling_factor 78 | 79 | trans = np.array((trans_y/(h_original/this_template.size(1)), trans_x/(w_original/this_template.size(1)))) 80 | 81 | gt_trans = torch.tensor(trans) 82 | # add x y theta 83 | return [this_template, this_source, rot_gt, scale_gt, gt_trans] 84 | 85 | class AeroGroundDataset_val(Dataset): 86 | def __init__(self, template_val_list, source_val_list, gt_rot_val_list, gt_scale_val_list, gt_x_val_list, gt_y_val_list, loader=default_loader): 87 | self.template_path_list = template_val_list #this is a list of the path to the template image 88 | self.source_path_list = source_val_list 89 | self.gt_rot_list = gt_rot_val_list 90 | self.gt_scale_list = gt_scale_val_list 91 | self.loader = loader 92 | self.gt_trans_x = gt_x_val_list 93 | self.gt_trans_y = gt_y_val_list 94 | 95 | def __len__(self): 96 | return len(self.template_path_list) 97 | 98 | def __getitem__(self, index): 99 | 100 | this_template_path = self.template_path_list[index] 101 | this_source_path = self.source_path_list[index] 102 | rot_gt = self.gt_rot_list[index] 103 | scale_gt = self.gt_scale_list[index] 104 | trans_x = self.gt_trans_x[index] 105 | trans_y = self.gt_trans_y[index] 106 | this_template, _, _, _, _, _ = self.loader(this_template_path, resize_shape=256) 107 | this_source, _, _, scaling_factor, h_original, w_original = self.loader(this_source_path, resize_shape=256, change_scale=True) 108 | 109 | # gt_tensor = get_gt_tensor(rot_gt, this_template.size(1)) 110 | 111 | # rot_gt += angle_t + angle_s 112 | # rot_gt += 180. 113 | # rot_gt %= 360 114 | # rot_gt -= 180. 
115 | rot_gt = torch.tensor(rot_gt) 116 | scale_gt = torch.tensor(scale_gt) * scaling_factor 117 | 118 | # trans_x = (torch.sign(-trans_x) + 1) / 2 * 256 + trans_x 119 | # trans_y = (torch.sign(-trans_y) + 1) / 2 * 256 + trans_y 120 | trans = np.array((trans_y/(h_original/this_template.size(1)), trans_x/(w_original/this_template.size(1)))) 121 | gt_trans = torch.tensor(trans) 122 | 123 | return [this_template, this_source, rot_gt, scale_gt, gt_trans] 124 | 125 | 126 | def DPCNdataloader(batch_size): 127 | # use the same transformations for train/val in this example 128 | path = "./data" 129 | datacsv = pd.read_csv(path + "/data_train_qsdjt_stereo_sat/ground_truth_qsdjt_lidar.csv") 130 | train_upper = 6000 131 | val_num = 2000 132 | val_upper = train_upper+val_num 133 | 134 | template_list = datacsv["name"].values 135 | template_list = [i+".jpg" for i in template_list] 136 | template_list = [os.path.join(path + "/data_train_qsdjt_stereo_sat/ground/",i) for i in template_list ] 137 | template_train_list = template_list[:train_upper] 138 | template_val_list = template_list[train_upper:val_upper] 139 | 140 | source_list = datacsv["name"].values 141 | source_list = [i+".jpg" for i in source_list] 142 | source_list = [os.path.join(path + "/data_train_qsdjt_stereo_sat/aerial/",i) for i in source_list ] 143 | source_train_list = source_list[:train_upper] 144 | source_val_list = source_list[train_upper:val_upper] 145 | 146 | gt_rot_list = datacsv["rotation"].values 147 | gt_rot_train_list = gt_rot_list[:train_upper] 148 | gt_rot_val_list = gt_rot_list[train_upper:val_upper] 149 | gt_rot_train_list = torch.from_numpy(gt_rot_train_list) 150 | gt_rot_val_list = torch.from_numpy(gt_rot_val_list) 151 | 152 | gt_scale_list = datacsv["rotation"].values * 0 +1.0 153 | gt_scale_train_list = gt_scale_list[:train_upper] 154 | gt_scale_val_list = gt_scale_list[train_upper:val_upper] 155 | gt_scale_train_list = torch.from_numpy(gt_scale_train_list) 156 | gt_scale_val_list = torch.from_numpy(gt_scale_val_list) 157 | 158 | gt_x_list = datacsv["shift_x"].values 159 | gt_x_train_list = gt_x_list[:train_upper] 160 | gt_x_val_list = gt_x_list[train_upper:val_upper] 161 | gt_x_train_list = torch.from_numpy(gt_x_train_list) 162 | gt_x_val_list = torch.from_numpy(gt_x_val_list) 163 | 164 | gt_y_list = datacsv["shift_y"].values 165 | gt_y_train_list = gt_y_list[:train_upper] 166 | gt_y_val_list = gt_y_list[train_upper:val_upper] 167 | gt_y_train_list = torch.from_numpy(gt_y_train_list) 168 | gt_y_val_list = torch.from_numpy(gt_y_val_list) 169 | 170 | train_set = AeroGroundDataset_train(template_train_list, source_train_list, gt_rot_train_list, gt_scale_train_list, gt_x_train_list, gt_y_train_list) 171 | val_set = AeroGroundDataset_val(template_val_list, source_val_list, gt_rot_val_list, gt_scale_val_list, gt_x_val_list, gt_y_val_list) 172 | 173 | image_datasets = { 174 | 'train': train_set, 'val': val_set 175 | } 176 | 177 | 178 | dataloaders = { 179 | 'train': DataLoader(train_set, batch_size=batch_size, shuffle=True, num_workers=0), 180 | 'val': DataLoader(val_set, batch_size=batch_size, shuffle=True, num_workers=0) 181 | } 182 | return dataloaders 183 | -------------------------------------------------------------------------------- /data/ortho/ground_truth_gym_lidar.csv: -------------------------------------------------------------------------------- 1 | name,rotation,shift_x,shift_y 2 | 0_,172.4768822,37,-28 3 | 28_,172.4768822,24,-43 4 | 
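The ground-truth CSVs consumed by `DPCNdataloader` above all use this header. A minimal sketch of reading one, mirroring the pandas calls in `dataset_DPCN.py` (the path points at the small sample file above):

```python
import pandas as pd

datacsv = pd.read_csv("./data/ortho/ground_truth_gym_lidar.csv")
names = [n + ".jpg" for n in datacsv["name"].values]  # image filenames, e.g. "0_.jpg"
rotations = datacsv["rotation"].values                # ground-truth rotation in degrees
shifts_x = datacsv["shift_x"].values                  # ground-truth x translation in pixels
shifts_y = datacsv["shift_y"].values                  # ground-truth y translation in pixels
```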
-------------------------------------------------------------------------------- /data/simulation.py: -------------------------------------------------------------------------------- 1 | import cv2 2 | import numpy as np 3 | import random 4 | try: 5 | import scipy.ndimage.interpolation as ndii 6 | except ImportError: 7 | import ndimage.interpolation as ndii 8 | import matplotlib.pyplot as plt 9 | 10 | 11 | def generate_random_data(height, width, count): 12 | x, y, gt, trans = zip(*[generate_img_and_rot_img(height, width) for i in range(0, count)]) 13 | 14 | X = np.asarray(x) * 255 15 | X = X.repeat(1, axis=1).transpose([0, 2, 3, 1]).astype(np.uint8) 16 | Y = np.asarray(y) * 100 17 | Y = Y.repeat(1, axis=1).transpose([0, 2, 3, 1]).astype(np.uint8) 18 | 19 | return X, Y, gt, trans 20 | 21 | def generate_img_and_rot_img(height, width): 22 | shape = (height, width) 23 | 24 | triangle_location = get_random_location(*shape) 25 | triangle_location1 = get_random_location(*shape) 26 | triangle_location2 = get_random_location(*shape) 27 | circle_location1 = get_random_location(*shape, zoom=0.7) 28 | circle_location2 = get_random_location(*shape, zoom=0.5) 29 | circle_location3 = get_random_location(*shape, zoom=0.9) 30 | mesh_location = get_random_location(*shape) 31 | square_location = get_random_location(*shape, zoom=0.8) 32 | plus_location = get_random_location(*shape, zoom=1.2) 33 | plus_location1 = get_random_location(*shape, zoom=1.2) 34 | plus_location2 = get_random_location(*shape, zoom=1.2) 35 | plus_location3 = get_random_location(*shape, zoom=1.2) 36 | plus_location4 = get_random_location(*shape, zoom=1.2) 37 | 38 | # Create input image 39 | arr = np.zeros(shape, dtype=bool) 40 | arr = add_triangle(arr, *triangle_location) 41 | arr = add_triangle(arr, *triangle_location1) 42 | arr = add_triangle(arr, *triangle_location2) 43 | arr = add_circle(arr, *circle_location1) 44 | arr = add_circle(arr, *circle_location2, fill=True) 45 | arr = add_circle(arr, *circle_location3) 46 | arr = add_mesh_square(arr, *mesh_location) 47 | arr = add_filled_square(arr, *square_location) 48 | arr = add_plus(arr, *plus_location) 49 | arr = add_plus(arr, *plus_location1) 50 | arr = add_plus(arr, *plus_location2) 51 | arr = add_plus(arr, *plus_location3) 52 | arr = np.reshape(arr, (1, height, width)).astype(np.float32) 53 | 54 | angle = np.random.rand() * 180 55 | t_x = (np.random.rand()-0.5) * 50. 56 | t_y = (np.random.rand()-0.5) * 50. 
57 | trans = np.array((t_y, t_x)) 58 | 59 | if angle < -180.0: 60 | angle = angle + 360.0 61 | elif angle > 180.0: 62 | angle = angle - 360.0 63 | 64 | # arr = arr[0,] 65 | # rot = ndii.rotate(arr, angle) 66 | 67 | (_, h, w) = arr.shape 68 | (cX, cY) = (w//2, h//2) 69 | rot = arr[0,] 70 | 71 | N = np.float32([[1,0,t_x],[0,1,t_y]]) 72 | rot = cv2.warpAffine(rot, N, (w, h)) 73 | 74 | M = cv2.getRotationMatrix2D((cX, cY), angle, 1.0) 75 | rot = cv2.warpAffine(rot, M, (w, h)) 76 | rot = cv2.resize(rot, (h, w), interpolation=cv2.INTER_CUBIC) 77 | # for heterogeneous image, comment out if you want them to be homogeneous 78 | rot = cv2.GaussianBlur(rot, (9,9), 13) 79 | kernel = np.array([[-1,-1,-1],[-1,9,-1],[-1,-1,-1]]) 80 | rot = cv2.filter2D(rot, -1, kernel) 81 | 82 | rot = rot[np.newaxis, :] 83 | # for dynamic obstacles, comment out if you dont want any dynamic obstacles 84 | # arr[0,] = add_plus(arr[0], *plus_location4) 85 | # arr = np.reshape(arr, (1, height, width)).astype(np.float32) 86 | 87 | return arr, rot, angle, trans 88 | 89 | def add_square(arr, x, y, size): 90 | s = int(size / 2) 91 | arr[x-s,y-s:y+s] = True 92 | arr[x+s,y-s:y+s] = True 93 | arr[x-s:x+s,y-s] = True 94 | arr[x-s:x+s,y+s] = True 95 | 96 | return arr 97 | 98 | def add_filled_square(arr, x, y, size): 99 | s = int(size / 2) 100 | 101 | xx, yy = np.mgrid[:arr.shape[0], :arr.shape[1]] 102 | 103 | return np.logical_or(arr, logical_and([xx > x - s, xx < x + s, yy > y - s, yy < y + s])) 104 | 105 | def logical_and(arrays): 106 | new_array = np.ones(arrays[0].shape, dtype=bool) 107 | for a in arrays: 108 | new_array = np.logical_and(new_array, a) 109 | 110 | return new_array 111 | 112 | def add_mesh_square(arr, x, y, size): 113 | s = int(size / 2) 114 | 115 | xx, yy = np.mgrid[:arr.shape[0], :arr.shape[1]] 116 | 117 | return np.logical_or(arr, logical_and([xx > x - s, xx < x + s, xx % 2 == 1, yy > y - s, yy < y + s, yy % 2 == 1])) 118 | 119 | def add_triangle(arr, x, y, size): 120 | s = int(size / 2) 121 | 122 | triangle = np.tril(np.ones((size, size), dtype=bool)) 123 | 124 | arr[x-s:x-s+triangle.shape[0],y-s:y-s+triangle.shape[1]] = triangle 125 | 126 | return arr 127 | 128 | def add_circle(arr, x, y, size, fill=False): 129 | xx, yy = np.mgrid[:arr.shape[0], :arr.shape[1]] 130 | circle = np.sqrt((xx - x) ** 2 + (yy - y) ** 2) 131 | new_arr = np.logical_or(arr, np.logical_and(circle < size, circle >= size * 0.7 if not fill else True)) 132 | 133 | return new_arr 134 | 135 | def add_plus(arr, x, y, size): 136 | s = int(size / 2) 137 | arr[x-1:x+1,y-s:y+s] = True 138 | arr[x-s:x+s,y-1:y+1] = True 139 | 140 | return arr 141 | 142 | def get_random_location(width, height, zoom=1.0): 143 | x = int(width * random.uniform(0.22, 0.78)) 144 | y = int(height * random.uniform(0.22, 0.78)) 145 | 146 | size = int(min(width, height) * random.uniform(0.06, 0.12) * zoom) 147 | 148 | return (x, y, size) -------------------------------------------------------------------------------- /detect.py: -------------------------------------------------------------------------------- 1 | from collections import defaultdict 2 | import torch.nn.functional as F 3 | import torch 4 | import torch.optim as optim 5 | import torch.nn as nn 6 | from torch.optim import lr_scheduler 7 | import time 8 | import copy 9 | from unet.pytorch_DPCN import FFT2, UNet, LogPolar, PhaseCorr, Corr2Softmax 10 | from data.dataset_DPCN import * 11 | import numpy as np 12 | import shutil 13 | from utils.utils import * 14 | import kornia 15 | from data.dataset import * 16 | from 
utils.detect_utils import * 17 | 18 | 19 | def detect_model(template_path, source_path, model_template, model_source, model_corr2softmax,\ 20 | model_trans_template, model_trans_source, model_trans_corr2softmax): 21 | batch_size_inner = 1 22 | 23 | since = time.time() 24 | 25 | # Each epoch has a training and validation phase 26 | device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu") 27 | # device = torch.device("cpu") 28 | phase = "val" 29 | 30 | model_template.eval() # Set model to evaluate mode 31 | model_source.eval() 32 | model_corr2softmax.eval() 33 | model_trans_template.eval() 34 | model_trans_source.eval() 35 | model_trans_corr2softmax.eval() 36 | with torch.no_grad(): 37 | 38 | iters = 0 39 | acc = 0. 40 | template, _, _, _, _, _ = default_loader(template_path, 256) 41 | source, _, _, _, _, _ = default_loader(source_path, 256) 42 | template = template.to(device) 43 | source = source.to(device) 44 | template = template.unsqueeze(0) 45 | template = template.permute(1,0,2,3) 46 | source = source.unsqueeze(0) 47 | source = source.permute(1,0,2,3) 48 | 49 | iters += 1 50 | since = time.time() 51 | rotation_cal, scale_cal = detect_rot_scale(template, source,\ 52 | model_template, model_source, model_corr2softmax, device ) 53 | tranformation_y, tranformation_x, image_aligned, source_rotated = detect_translation(template, source, rotation_cal, scale_cal, \ 54 | model_trans_template, model_trans_source, model_trans_corr2softmax, device) 55 | time_elapsed = time.time() - since 56 | # print('in detection time {:.0f}m {:.0f}s'.format(time_elapsed // 60, time_elapsed % 60)) 57 | print("in detection time", time_elapsed) 58 | plot_and_save_result(template[0,:,:], source[0,:,:], source_rotated[0,:,:], image_aligned) 59 | 60 | 61 | 62 | 63 | 64 | checkpoint_path = "./checkpoints/checkpoint_simulation_hetero.pt" 65 | template_path = "./demo/temp_1.png" 66 | source_path = "./demo/src_1.png" 67 | 68 | load_pretrained =True 69 | device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu") 70 | print("The devices that the code is running on:", device) 71 | # device = torch.device("cpu") 72 | batch_size = 1 73 | num_class = 1 74 | start_epoch = 0 75 | model_template = UNet(num_class).to(device) 76 | model_source = UNet(num_class).to(device) 77 | model_corr2softmax = Corr2Softmax(200., 0.).to(device) 78 | model_trans_template = UNet(num_class).to(device) 79 | model_trans_source = UNet(num_class).to(device) 80 | model_trans_corr2softmax = Corr2Softmax(200., 0.).to(device) 81 | 82 | optimizer_ft_temp = optim.Adam(filter(lambda p: p.requires_grad, model_template.parameters()), lr=2e-4) 83 | optimizer_ft_src = optim.Adam(filter(lambda p: p.requires_grad, model_source.parameters()), lr=2e-4) 84 | optimizer_c2s = optim.Adam(filter(lambda p: p.requires_grad, model_corr2softmax.parameters()), lr=1e-1) 85 | optimizer_trans_ft_temp = optim.Adam(filter(lambda p: p.requires_grad, model_template.parameters()), lr=2e-4) 86 | optimizer_trans_ft_src = optim.Adam(filter(lambda p: p.requires_grad, model_source.parameters()), lr=2e-4) 87 | optimizer_trans_c2s = optim.Adam(filter(lambda p: p.requires_grad, model_corr2softmax.parameters()), lr=1e-1) 88 | 89 | if load_pretrained: 90 | model_template, model_source, model_corr2softmax, model_trans_template, model_trans_source, model_trans_corr2softmax,\ 91 | _, _, _, _, _, _,\ 92 | start_epoch = load_checkpoint(\ 93 | checkpoint_path, model_template, model_source, model_corr2softmax, model_trans_template, model_trans_source, 
model_trans_corr2softmax,\ 94 | optimizer_ft_temp, optimizer_ft_src, optimizer_c2s, optimizer_trans_ft_temp, optimizer_trans_ft_src, optimizer_trans_c2s, device) 95 | 96 | detect_model(template_path, source_path, model_template, model_source, model_corr2softmax, model_trans_template, model_trans_source, model_trans_corr2softmax) 97 | 98 | 99 | 100 | 101 | 102 | 103 | 104 | 105 | -------------------------------------------------------------------------------- /fft/__init__.py: -------------------------------------------------------------------------------- 1 | """This package includes a miscellaneous collection of useful helper functions.""" 2 | -------------------------------------------------------------------------------- /fft/dft_test.py: -------------------------------------------------------------------------------- 1 | """ 2 | FFT based image registration. --- main functions 3 | """ 4 | 5 | from __future__ import division, print_function 6 | 7 | import math 8 | 9 | import numpy as np 10 | try: 11 | import pyfftw.interfaces.numpy_fft as fft 12 | except ImportError: 13 | import numpy.fft as fft 14 | import scipy.ndimage.interpolation as ndii 15 | 16 | import imreg_dft.utils as utils 17 | 18 | 19 | def _logpolar_filter(shape): 20 | """ 21 | Make a radial cosine filter for the logpolar transform. 22 | This filter suppresses low frequencies and completely removes 23 | the zero freq. 24 | """ 25 | yy = np.linspace(- np.pi / 2., np.pi / 2., shape[0])[:, np.newaxis] 26 | xx = np.linspace(- np.pi / 2., np.pi / 2., shape[1])[np.newaxis, :] 27 | # Supressing low spatial frequencies is a must when using log-polar 28 | # transform. The scale stuff is poorly reflected with low freqs. 29 | rads = np.sqrt(yy ** 2 + xx ** 2) 30 | filt = 1.0 - np.cos(rads) ** 2 31 | # vvv This doesn't really matter, very high freqs are not too usable anyway 32 | filt[np.abs(rads) > np.pi / 2] = 1 33 | return filt 34 | 35 | 36 | def _get_pcorr_shape(shape): 37 | ret = (int(max(shape) * 1.0),) * 2 38 | return ret 39 | 40 | 41 | def _get_ang_scale(ims, bgval, exponent='inf', constraints=None, reports=None): 42 | """ 43 | Given two images, return their scale and angle difference. 44 | Args: 45 | ims (2-tuple-like of 2D ndarrays): The images 46 | bgval: We also pad here in the :func:`map_coordinates` 47 | exponent (float or 'inf'): The exponent stuff, see :func:`similarity` 48 | constraints (dict, optional) 49 | reports (optional) 50 | Returns: 51 | tuple: Scale, angle. Describes the relationship of 52 | the subject image to the first one. 
53 | """ 54 | assert len(ims) == 2, \ 55 | "Only two images are supported as input" 56 | shape = ims[0].shape 57 | 58 | ims_apod = [utils._apodize(im) for im in ims] 59 | dfts = [fft.fftshift(abs(fft.fft2(im/255))) for im in ims] 60 | # imshow(dfts[0]*1000,dfts[0]*1000,dfts[0]*1000) 61 | # imshow(ims_apod[0], ims_apod[0], ims_apod[0]) 62 | 63 | filt = _logpolar_filter(shape) 64 | dfts = [dft * filt for dft in dfts] 65 | # imshow(dfts[0]*1000,dfts[0]*1000,dfts[0]*1000) 66 | 67 | # High-pass filtering used to be here, but we have moved it to a higher 68 | # level interface 69 | 70 | pcorr_shape = _get_pcorr_shape(shape) 71 | log_base = _get_log_base(shape, pcorr_shape[1]) 72 | 73 | stuffs = [_logpolar(np.abs(dft), pcorr_shape, log_base) 74 | for dft in dfts] 75 | # print(stuffs[0]) 76 | (arg_ang, arg_rad), success = _phase_correlation( 77 | stuffs[0], stuffs[1], 78 | utils.argmax_angscale, log_base, exponent, constraints, reports) 79 | 80 | angle = -np.pi * arg_ang / float(pcorr_shape[0]) 81 | angle = np.rad2deg(angle) 82 | angle = utils.wrap_angle(angle, 360) 83 | scale = log_base ** arg_rad 84 | 85 | angle = - angle 86 | scale = 1.0 / scale 87 | 88 | if reports is not None: 89 | reports["shape"] = filt.shape 90 | reports["base"] = log_base 91 | 92 | if reports.show("spectra"): 93 | reports["dfts_filt"] = dfts 94 | if reports.show("inputs"): 95 | reports["ims_filt"] = [fft.ifft2(np.fft.ifftshift(dft)) 96 | for dft in dfts] 97 | if reports.show("logpolar"): 98 | reports["logpolars"] = stuffs 99 | 100 | if reports.show("scale_angle"): 101 | reports["amas-result-raw"] = (arg_ang, arg_rad) 102 | reports["amas-result"] = (scale, angle) 103 | reports["amas-success"] = success 104 | extent_el = pcorr_shape[1] / 2.0 105 | reports["amas-extent"] = ( 106 | log_base ** (-extent_el), log_base ** extent_el, 107 | -90, 90 108 | ) 109 | 110 | if not 0.5 < scale < 2: 111 | raise ValueError( 112 | "Images are not compatible. Scale change %g too big to be true." 113 | % scale) 114 | 115 | return scale, angle 116 | 117 | 118 | def translation(im0, im1, filter_pcorr=0, odds=1, constraints=None, 119 | reports=None): 120 | """ 121 | Return translation vector to register images. 122 | It tells how to translate the im1 to get im0. 123 | Args: 124 | im0 (2D numpy array): The first (template) image 125 | im1 (2D numpy array): The second (subject) image 126 | filter_pcorr (int): Radius of the minimum spectrum filter 127 | for translation detection, use the filter when detection fails. 128 | Values > 3 are likely not useful. 129 | constraints (dict or None): Specify preference of seeked values. 130 | For more detailed documentation, refer to :func:`similarity`. 131 | The only difference is that here, only keys ``tx`` and/or ``ty`` 132 | (i.e. both or any of them or none of them) are used. 133 | odds (float): The greater the odds are, the higher is the preferrence 134 | of the angle + 180 over the original angle. Odds of -1 are the same 135 | as inifinity. 136 | The value 1 is neutral, the converse of 2 is 1 / 2 etc. 137 | Returns: 138 | dict: Contains following keys: ``angle``, ``tvec`` (Y, X), 139 | and ``success``. 140 | """ 141 | angle = 0 142 | report_one = report_two = None 143 | if reports is not None and reports.show("translation"): 144 | report_one = reports.copy_empty() 145 | report_two = reports.copy_empty() 146 | 147 | # We estimate translation for the original image... 148 | tvec, succ = _translation(im0, im1, filter_pcorr, constraints, report_one) 149 | # ... 
and for the 180-degrees rotated image (the rotation estimation 150 | # doesn't distinguish rotation of x vs x + 180deg). 151 | tvec2, succ2 = _translation(im0, utils.rot180(im1), filter_pcorr, 152 | constraints, report_two) 153 | 154 | pick_rotated = False 155 | if succ2 * odds > succ or odds == -1: 156 | pick_rotated = True 157 | 158 | if reports is not None and reports.show("translation"): 159 | reports["t0-orig"] = report_one["amt-orig"] 160 | reports["t0-postproc"] = report_one["amt-postproc"] 161 | reports["t0-success"] = succ 162 | reports["t0-tvec"] = tuple(tvec) 163 | 164 | reports["t1-orig"] = report_two["amt-orig"] 165 | reports["t1-postproc"] = report_two["amt-postproc"] 166 | reports["t1-success"] = succ2 167 | reports["t1-tvec"] = tuple(tvec2) 168 | 169 | if reports is not None and reports.show("transformed"): 170 | toapp = [ 171 | transform_img(utils.rot180(im1), tvec=tvec2, mode="wrap", order=3), 172 | transform_img(im1, tvec=tvec, mode="wrap", order=3), 173 | ] 174 | if pick_rotated: 175 | toapp = toapp[::-1] 176 | reports["after_tform"].extend(toapp) 177 | 178 | if pick_rotated: 179 | tvec = tvec2 180 | succ = succ2 181 | angle = angle + 180 182 | 183 | ret = dict(tvec=tvec, success=succ, angle=angle) 184 | return ret 185 | 186 | 187 | def _get_precision(shape, scale=1): 188 | """ 189 | Given the parameters of the log-polar transform, get width of the interval 190 | where the correct values are. 191 | Args: 192 | shape (tuple): Shape of images 193 | scale (float): The scale difference (precision varies) 194 | """ 195 | pcorr_shape = _get_pcorr_shape(shape) 196 | log_base = _get_log_base(shape, pcorr_shape[1]) 197 | # * 0.5 <= max deviation is half of the step 198 | # * 0.25 <= we got subpixel precision now and 0.5 / 2 == 0.25 199 | # sccale: Scale deviation depends on the scale value 200 | Dscale = scale * (log_base - 1) * 0.25 201 | # angle: Angle deviation is constant 202 | Dangle = 180.0 / pcorr_shape[0] * 0.25 203 | return Dangle, Dscale 204 | 205 | 206 | def _similarity(im0, im1, numiter=1, order=3, constraints=None, 207 | filter_pcorr=0, exponent='inf', bgval=None, reports=None): 208 | """ 209 | This function takes some input and returns mutual rotation, scale 210 | and translation. 211 | It does these things during the process: 212 | * Handles correct constraints handling (defaults etc.). 213 | * Performs angle-scale determination iteratively. 214 | This involves keeping constraints in sync. 215 | * Performs translation determination. 216 | * Calculates precision. 217 | Returns: 218 | Dictionary with results. 219 | """ 220 | if bgval is None: 221 | bgval = utils.get_borderval(im1, 5) 222 | 223 | shape = im0.shape 224 | if shape != im1.shape: 225 | raise ValueError("Images must have same shapes.") 226 | elif im0.ndim != 2: 227 | raise ValueError("Images must be 2-dimensional.") 228 | 229 | # We are going to iterate and precise scale and angle estimates 230 | scale = 1.0 231 | angle = 0.0 232 | im2 = im1 233 | 234 | constraints_default = dict(angle=[0, None], scale=[1, None]) 235 | if constraints is None: 236 | constraints = constraints_default 237 | 238 | # We guard against case when caller passes only one constraint key. 239 | # Now, the provided ones just replace defaults. 240 | constraints_default.update(constraints) 241 | constraints = constraints_default 242 | 243 | # During iterations, we have to work with constraints too. 
244 | # So we make the copy in order to leave the original intact 245 | constraints_dynamic = constraints.copy() 246 | constraints_dynamic["scale"] = list(constraints["scale"]) 247 | constraints_dynamic["angle"] = list(constraints["angle"]) 248 | 249 | if reports is not None and reports.show("transformed"): 250 | reports["after_tform"] = [im2.copy()] 251 | 252 | for ii in range(numiter): 253 | newscale, newangle = _get_ang_scale([im0, im2], bgval, exponent, 254 | constraints_dynamic, reports) 255 | print("angle, scale",newangle, newscale) 256 | scale = scale*newscale 257 | angle = angle+newangle 258 | 259 | constraints_dynamic["scale"][0] = constraints_dynamic["scale"][0]/newscale 260 | constraints_dynamic["angle"][0] = constraints_dynamic["angle"][0]-newangle 261 | im2 = transform_img(im1, scale, angle, bgval=bgval, order=order) 262 | 263 | if reports is not None and reports.show("transformed"): 264 | reports["after_tform"].append(im2.copy()) 265 | 266 | # Here we look how is the turn-180 267 | target, stdev = constraints.get("angle", (0, None)) 268 | odds = _get_odds(angle, target, stdev) 269 | # now we can use pcorr to guess the translation 270 | res = translation(im0, im2, filter_pcorr, odds, 271 | constraints, reports) 272 | print("odd") 273 | 274 | # print("newangle, newscale",angle, scale) 275 | # The log-polar transform may have got the angle wrong by 180 degrees. 276 | # The phase correlation can help us to correct that 277 | angle = angle + res["angle"] 278 | res["angle"] = utils.wrap_angle(angle, 360) 279 | 280 | # don't know what it does, but it alters the scale a little bit 281 | # scale = (im1.shape[1] - 1) / (int(im1.shape[1] / scale) - 1) 282 | 283 | Dangle, Dscale = _get_precision(shape, scale) 284 | 285 | res["scale"] = scale 286 | res["Dscale"] = Dscale 287 | res["Dangle"] = Dangle 288 | # 0.25 because we go subpixel now 289 | res["Dt"] = 0.25 290 | 291 | return res 292 | 293 | 294 | def similarity(im0, im1, numiter=1, order=3, constraints=None, 295 | filter_pcorr=0, exponent='inf', reports=None): 296 | """ 297 | Return similarity transformed image im1 and transformation parameters. 298 | Transformation parameters are: isotropic scale factor, rotation angle (in 299 | degrees), and translation vector. 300 | A similarity transformation is an affine transformation with isotropic 301 | scale and without shear. 302 | Args: 303 | im0 (2D numpy array): The first (template) image 304 | im1 (2D numpy array): The second (subject) image 305 | numiter (int): How many times to iterate when determining scale and 306 | rotation 307 | order (int): Order of approximation (when doing transformations). 1 = 308 | linear, 3 = cubic etc. 309 | filter_pcorr (int): Radius of a spectrum filter for translation 310 | detection 311 | exponent (float or 'inf'): The exponent value used during processing. 312 | Refer to the docs for a thorough explanation. Generally, pass "inf" 313 | when feeling conservative. Otherwise, experiment, values below 5 314 | are not even supposed to work. 315 | constraints (dict or None): Specify preference of seeked values. 316 | Pass None (default) for no constraints, otherwise pass a dict with 317 | keys ``angle``, ``scale``, ``tx`` and/or ``ty`` (i.e. you can pass 318 | all, some of them or none of them, all is fine). The value of a key 319 | is supposed to be a mutable 2-tuple (e.g. 
a list), where the first 320 | value is related to the constraint center and the second one to 321 | softness of the constraint (the higher is the number, 322 | the more soft a constraint is). 323 | More specifically, constraints may be regarded as weights 324 | in form of a shifted Gaussian curve. 325 | However, for precise meaning of keys and values, 326 | see the documentation section :ref:`constraints`. 327 | Names of dictionary keys map to names of command-line arguments. 328 | Returns: 329 | dict: Contains following keys: ``scale``, ``angle``, ``tvec`` (Y, X), 330 | ``success`` and ``timg`` (the transformed subject image) 331 | .. note:: There are limitations 332 | * Scale change must be less than 2. 333 | * No subpixel precision (but you can use *resampling* to get 334 | around this). 335 | """ 336 | bgval = utils.get_borderval(im1, 5) 337 | 338 | res = _similarity(im0, im1, numiter, order, constraints, 339 | filter_pcorr, exponent, bgval, reports) 340 | 341 | im2 = transform_img_dict(im1, res, bgval, order) 342 | # Order of mask should be always 1 - higher values produce strange results. 343 | imask = transform_img_dict(np.ones_like(im1), res, 0, 1) 344 | # This removes some weird artifacts 345 | imask[imask > 0.8] = 1.0 346 | 347 | # Framing here = just blending the im2 with its BG according to the mask 348 | im3 = utils.frame_img(im2, imask, 10) 349 | 350 | res["timg"] = im3 351 | return res 352 | 353 | 354 | def _get_odds(angle, target, stdev): 355 | """ 356 | Determine whether we are more likely to choose the angle, or angle + 180 357 | Args: 358 | angle (float, degrees): The base angle. 359 | target (float, degrees): The angle we think is the right one. 360 | Typically, we take this from constraints. 361 | stdev (float, degrees): The relevance of the target value. 362 | Also typically taken from constraints. 363 | Return: 364 | float: The greater the odds are, the higher is the preferrence 365 | of the angle + 180 over the original angle. Odds of -1 are the same 366 | as inifinity. 367 | """ 368 | ret = 1 369 | if stdev is not None: 370 | diffs = [abs(utils.wrap_angle(ang, 360)) 371 | for ang in (target - angle, target - angle + 180)] 372 | odds0, odds1 = 0, 0 373 | if stdev > 0: 374 | odds0, odds1 = [np.exp(- diff ** 2 / stdev ** 2) for diff in diffs] 375 | if odds0 == 0 and odds1 > 0: 376 | # -1 is treated as infinity in _translation 377 | ret = -1 378 | elif stdev == 0 or (odds0 == 0 and odds1 == 0): 379 | ret = -1 380 | if diffs[0] < diffs[1]: 381 | ret = 0 382 | else: 383 | ret = odds1 / odds0 384 | return ret 385 | 386 | 387 | def _translation(im0, im1, filter_pcorr=0, constraints=None, reports=None): 388 | """ 389 | The plain wrapper for translation phase correlation, no big deal. 390 | """ 391 | # Apodization and pcorr don't play along 392 | # im0, im1 = [utils._apodize(im, ratio=1) for im in (im0, im1)] 393 | ret, succ = _phase_correlation( 394 | im0, im1, 395 | utils.argmax_translation, filter_pcorr, constraints, reports) 396 | return ret, succ 397 | 398 | 399 | def _phase_correlation(im0, im1, callback=None, *args): 400 | """ 401 | Computes phase correlation between im0 and im1 402 | Args: 403 | im0 404 | im1 405 | callback (function): Process the cross-power spectrum (i.e. choose 406 | coordinates of the best element, usually of the highest one). 407 | Defaults to :func:`imreg_dft.utils.argmax2D` 408 | Returns: 409 | tuple: The translation vector (Y, X). Translation vector of (0, 0) 410 | means that the two images match. 
411 | """ 412 | if callback is None: 413 | callback = utils._argmax2D 414 | 415 | # TODO: Implement some form of high-pass filtering of PHASE correlation 416 | f0, f1 = [fft.fft2(arr) for arr in (im0, im1)] 417 | print(im0) 418 | # spectrum can be filtered (already), 419 | # so we have to take precaution against dividing by 0 420 | eps = abs(f1).max() * 1e-15 421 | 422 | # cps == cross-power spectrum of im0 and im1 423 | cps = abs(fft.ifft2((f0 * f1.conjugate()) / (abs(f0) * abs(f1) + eps))) 424 | 425 | #scps = shifted cps 426 | scps = fft.fftshift(cps) 427 | # imshow(scps*1000,scps*1000,scps*1000) 428 | imshow(scps,scps, scps) 429 | #scps = cps 430 | 431 | (t0, t1), success = callback(scps, *args) 432 | ret = np.array((t0, t1)) 433 | 434 | # _compensate_fftshift is not appropriate here, this is OK. 435 | # t0 -= f0.shape[0] // 2 436 | # t1 -= f0.shape[1] // 2 437 | print("ret_oir",ret) 438 | 439 | ret = ret-np.array(f0.shape, int) // 2 440 | print("ret",ret) 441 | return ret, success 442 | 443 | 444 | def transform_img_dict(img, tdict, bgval=None, order=1, invert=False): 445 | """ 446 | Wrapper of :func:`transform_img`, works well with the :func:`similarity` 447 | output. 448 | Args: 449 | img 450 | tdict (dictionary): Transformation dictionary --- supposed to contain 451 | keys "scale", "angle" and "tvec" 452 | bgval 453 | order 454 | invert (bool): Whether to perform inverse transformation --- doesn't 455 | work very well with the translation. 456 | Returns: 457 | np.ndarray: .. seealso:: :func:`transform_img` 458 | """ 459 | scale = tdict["scale"] 460 | angle = tdict["angle"] 461 | tvec = np.array(tdict["tvec"]) 462 | if invert: 463 | scale = 1.0 / scale 464 | angle = -angle 465 | tvec = -tvec 466 | res = transform_img(img, scale, angle, tvec, bgval=bgval, order=order) 467 | return res 468 | 469 | 470 | def transform_img(img, scale=1.0, angle=0.0, tvec=(0, 0), 471 | mode="constant", bgval=None, order=1): 472 | """ 473 | Return translation vector to register images. 474 | Args: 475 | img (2D or 3D numpy array): What will be transformed. 476 | If a 3D array is passed, it is treated in a manner in which RGB 477 | images are supposed to be handled - i.e. assume that coordinates 478 | are (Y, X, channels). 479 | Complex images are handled in a way that treats separately 480 | the real and imaginary parts. 481 | scale (float): The scale factor (scale > 1.0 means zooming in) 482 | angle (float): Degrees of rotation (clock-wise) 483 | tvec (2-tuple): Pixel translation vector, Y and X component. 484 | mode (string): The transformation mode (refer to e.g. 485 | :func:`scipy.ndimage.shift` and its kwarg ``mode``). 486 | bgval (float): Shade of the background (filling during transformations) 487 | If None is passed, :func:`imreg_dft.utils.get_borderval` with 488 | radius of 5 is used to get it. 489 | order (int): Order of approximation (when doing transformations). 1 = 490 | linear, 3 = cubic etc. Linear works surprisingly well. 491 | Returns: 492 | np.ndarray: The transformed img, may have another 493 | i.e. (bigger) shape than the source. 
494 | """ 495 | if img.ndim == 3: 496 | # A bloody painful special case of RGB images 497 | ret = np.empty_like(img) 498 | for idx in range(img.shape[2]): 499 | sli = (slice(None), slice(None), idx) 500 | ret[sli] = transform_img(img[sli], scale, angle, tvec, 501 | mode, bgval, order) 502 | return ret 503 | elif np.iscomplexobj(img): 504 | decomposed = np.empty(img.shape + (2,), float) 505 | decomposed[:, :, 0] = img.real 506 | decomposed[:, :, 1] = img.imag 507 | # The bgval makes little sense now, as we decompose the image 508 | res = transform_img(decomposed, scale, angle, tvec, mode, None, order) 509 | ret = res[:, :, 0] + 1j * res[:, :, 1] 510 | return ret 511 | 512 | if bgval is None: 513 | bgval = utils.get_borderval(img) 514 | 515 | bigshape = np.round(np.array(img.shape) * 1.2).astype(int) 516 | bg = np.zeros(bigshape, img.dtype) + bgval 517 | 518 | dest0 = utils.embed_to(bg, img.copy()) 519 | # TODO: We have problems with complex numbers 520 | # that are not supported by zoom(), rotate() or shift() 521 | if scale != 1.0: 522 | dest0 = ndii.zoom(dest0, scale, order=order, mode=mode, cval=bgval) 523 | if angle != 0.0: 524 | dest0 = ndii.rotate(dest0, angle, order=order, mode=mode, cval=bgval) 525 | 526 | if tvec[0] != 0 or tvec[1] != 0: 527 | dest0 = ndii.shift(dest0, tvec, order=order, mode=mode, cval=bgval) 528 | 529 | bg = np.zeros_like(img) + bgval 530 | dest = utils.embed_to(bg, dest0) 531 | return dest 532 | 533 | 534 | def similarity_matrix(scale, angle, vector): 535 | """ 536 | Return homogeneous transformation matrix from similarity parameters. 537 | Transformation parameters are: isotropic scale factor, rotation angle (in 538 | degrees), and translation vector (of size 2). 539 | The order of transformations is: scale, rotate, translate. 540 | """ 541 | raise NotImplementedError("We have no idea what this is supposed to do") 542 | m_scale = np.diag([scale, scale, 1.0]) 543 | m_rot = np.identity(3) 544 | angle = math.radians(angle) 545 | m_rot[0, 0] = math.cos(angle) 546 | m_rot[1, 1] = math.cos(angle) 547 | m_rot[0, 1] = -math.sin(angle) 548 | m_rot[1, 0] = math.sin(angle) 549 | m_transl = np.identity(3) 550 | m_transl[:2, 2] = vector 551 | return np.dot(m_transl, np.dot(m_rot, m_scale)) 552 | 553 | 554 | EXCESS_CONST = 1.1 555 | 556 | 557 | def _get_log_base(shape, new_r): 558 | """ 559 | Basically common functionality of :func:`_logpolar` 560 | and :func:`_get_ang_scale` 561 | This value can be considered fixed, if you want to mess with the logpolar 562 | transform, mess with the shape. 563 | Args: 564 | shape: Shape of the original image. 565 | new_r (float): The r-size of the log-polar transform array dimension. 566 | Returns: 567 | float: Base of the log-polar transform. 568 | The following holds: 569 | :math:`log\_base = \exp( \ln [ \mathit{spectrum\_dim} ] / \mathit{loglpolar\_scale\_dim} )`, 570 | or the equivalent :math:`log\_base^{\mathit{loglpolar\_scale\_dim}} = \mathit{spectrum\_dim}`. 571 | """ 572 | # The highest radius we have to accomodate is 'old_r', 573 | # However, we cut some parts out as only a thin part of the spectra has 574 | # these high frequencies 575 | old_r = shape[0] * EXCESS_CONST 576 | # We are radius, so we divide the diameter by two. 577 | old_r /= 2.0 578 | # we have at most 'new_r' of space. 
579 | log_base = np.exp(np.log(old_r) / new_r) 580 | return log_base 581 | 582 | 583 | def _logpolar(image, shape, log_base, bgval=None): 584 | """ 585 | Return log-polar transformed image 586 | Takes into account anisotropicity of the freq spectrum 587 | of rectangular images 588 | Args: 589 | image: The image to be transformed 590 | shape: Shape of the transformed image 591 | log_base: Parameter of the transformation, get it via 592 | :func:`_get_log_base` 593 | bgval: The backround value. If None, use minimum of the image. 594 | Returns: 595 | The transformed image 596 | """ 597 | if bgval is None: 598 | bgval = np.percentile(image, 1) 599 | imshape = np.array(image.shape) 600 | center = imshape[0] / 2.0, imshape[1] / 2.0 601 | # 0 .. pi = only half of the spectrum is used 602 | theta = utils._get_angles(shape) 603 | radius_x = utils._get_lograd(shape, log_base) 604 | radius_y = radius_x.copy() 605 | ellipse_coef = imshape[0] / float(imshape[1]) 606 | # We have to acknowledge that the frequency spectrum can be deformed 607 | # if the image aspect ratio is not 1.0 608 | # The image is x-thin, so we acknowledge that the frequency spectra 609 | # scale in x is shrunk. 610 | # radius_x /= ellipse_coef 611 | 612 | y = radius_y * np.sin(theta) + center[0] 613 | x = radius_x * np.cos(theta) + center[1] 614 | output = np.empty_like(y) 615 | ndii.map_coordinates(image, [y, x], output=output, order=3, 616 | mode="constant", cval=bgval) 617 | # print(radius_x) 618 | # imshow(image, image, image) 619 | # imshow(output, output, output) 620 | # print(output) 621 | return output 622 | 623 | 624 | def imshow(im0, im1, im2, cmap=None, fig=None, **kwargs): 625 | """ 626 | Plot images using matplotlib. 627 | Opens a new figure with four subplots: 628 | :: 629 | +----------------------+---------------------+ 630 | | | | 631 | |