├── README.md ├── data_utils.py ├── doc ├── .DS_Store └── baseline.png ├── evaluate.py ├── log ├── params_ensemble_w_fe.pkl ├── params_single_w_fe.pkl └── params_single_wo_fe.pkl ├── modelnet_data.py ├── point_utils.py ├── pointhop.py └── train.py /README.md: -------------------------------------------------------------------------------- 1 | # PointHop++: *A Lightweight Learning Model on Point Sets for 3D Classification* 2 | Created by [Min Zhang](https://github.com/minzhang-1), [Yifan Wang](https://github.com/yifan-fanyi), Pranav Kadam, Shan Liu, C.-C. Jay Kuo from University of Southern California. 3 | 4 | ![introduction](https://github.com/minzhang-1/PointHop2/blob/master/doc/baseline.png) 5 | 6 | ### Introduction 7 | This work is an official implementation of our [arXiv tech report](https://arxiv.org/abs/2002.03281). We improve the [PointHop method](https://arxiv.org/abs/1907.12766) further in two aspects: 1) reducing its model complexity in terms of the number of model parameters, and 2) ordering discriminant features automatically based on the cross-entropy criterion. The resulting method is called PointHop++. The first improvement is essential for wearable and mobile computing, while the second improvement bridges statistics-based and optimization-based machine learning methodologies. With experiments conducted on the ModelNet40 benchmark dataset, we show that the PointHop++ method performs on par with deep neural network (DNN) solutions and surpasses other unsupervised feature extraction methods. 8 | 9 | In this repository, we release code and data for training a PointHop++ classification model on point clouds sampled from 3D shapes. 10 | 11 | ### Spark version 12 | This implementation has a high memory requirement. If you only have 16/32GB of memory, please use our [new distributed version](https://github.com/minzhang-1/PointHop-PointHop2_Spark), which is built upon Apache Spark. The new version runs the baseline within 40 minutes using less than 14GB of memory. 13 | 14 | ### Citation 15 | If you find our work useful in your research, please consider citing: 16 | 17 | @article{zhang2020pointhop++, 18 | title={PointHop++: A Lightweight Learning Model on Point Sets for 3D Classification}, 19 | author={Zhang, Min and Wang, Yifan and Kadam, Pranav and Liu, Shan and Kuo, C-C Jay}, 20 | journal={arXiv preprint arXiv:2002.03281}, 21 | year={2020} 22 | } 23 | 24 | ### Installation 25 | 26 | The code has been tested with Python 3.5. You may need to install the h5py, PyTorch and scikit-learn packages (`pickle` and `threading` are part of the Python standard library). 27 | 28 | To install h5py for Python: 29 | ```bash 30 | sudo apt-get install libhdf5-dev 31 | sudo pip install h5py 32 | ``` 33 | 34 | ### Usage 35 | To train a single model, without feature selection or ensembling, that classifies point clouds sampled from 3D shapes: 36 | 37 | python3 train.py 38 | 39 | After the above training, we can evaluate the single model. You can also use the provided model `params_single_wo_fe` to run the evaluation directly. 40 | 41 | python3 evaluate.py 42 | 43 | Log files and model parameters will be saved to the `log` folder. To achieve better performance, change the argument `feature_selection` from `None` to `0.95`, change `ensemble` from `False` to `True`, or both, in `train.py` and `evaluate.py`, or use the provided models `params_single_w_fe` and `params_ensemble_w_fe`. 44 | 45 | Point clouds of ModelNet40 models in HDF5 files will be automatically downloaded (416MB) to the data folder.
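The snippet below is a minimal sketch for inspecting one of the downloaded HDF5 shards; the file name `ply_data_train0.h5` is an assumption based on the standard ModelNet40 HDF5 release, while the `data`/`label` keys match what `modelnet_data.py` reads. `modelnet_data.data_load` wraps this same logic for the full training and test splits.

```python
import h5py

# Sketch: read one shard of the downloaded archive.
# File name is assumed from the standard ModelNet40 HDF5 release;
# the 'data' and 'label' keys are the ones modelnet_data.py uses.
with h5py.File('modelnet40_ply_hdf5_2048/ply_data_train0.h5', 'r') as f:
    points = f['data'][:]   # (num_shapes, 2048, 3) xyz coordinates
    labels = f['label'][:]  # (num_shapes, 1) class indices in [0, 39]
print(points.shape, labels.shape)
```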
Each point cloud contains 2048 points uniformly sampled from a shape surface. Each cloud is zero-mean and normalized into a unit sphere. There are also text files in `data/modelnet40_ply_hdf5_2048` specifying the IDs of the shapes in the h5 files. 46 | 47 | 48 | -------------------------------------------------------------------------------- /data_utils.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | 3 | 4 | def normal_pc(pc): 5 | """ 6 | zero-center the point cloud and scale it into the unit sphere 7 | :param pc: (N, 3) numpy array 8 | :return: (N, 3) numpy array 9 | """ 10 | pc_mean = pc.mean(axis=0) 11 | pc = pc - pc_mean 12 | pc_L_max = np.max(np.sqrt(np.sum(pc ** 2, axis=-1))) 13 | pc = pc/pc_L_max 14 | return pc 15 | 16 | 17 | def rotation_point_cloud(pc): 18 | """ 19 | Randomly rotate the point clouds to augment the dataset; 20 | a single random rotation about the up (Y) axis is applied to the whole batch 21 | :param pc: BxNx3 array, original batch of point clouds 22 | :return: (B*N)x3 array of rotated points (data_augment reshapes it back to BxNx3) 23 | """ 24 | # rotated_data = np.zeros(pc.shape, dtype=np.float32) 25 | 26 | rotation_angle = np.random.uniform() * 2 * np.pi 27 | cosval = np.cos(rotation_angle) 28 | sinval = np.sin(rotation_angle) 29 | rotation_matrix = np.array([[cosval, 0, sinval], 30 | [0, 1, 0], 31 | [-sinval, 0, cosval]]) 32 | rotated_data = np.dot(pc.reshape((-1, 3)), rotation_matrix) 33 | 34 | return rotated_data 35 | 36 | 37 | def rotate_point_cloud_by_angle(pc, rotation_angle): 38 | """ 39 | Rotate the point clouds by a given angle to augment the dataset; 40 | the rotation is about the up (Y) axis 41 | :param pc: BxNx3 array, original batch of point clouds 42 | :param rotation_angle: rotation angle in radians 43 | :return: (B*N)x3 array of rotated points (data_augment reshapes it back to BxNx3) 44 | """ 45 | # rotated_data = np.zeros(pc.shape, dtype=np.float32) 46 | 47 | # rotation_angle = np.random.uniform() * 2 * np.pi 48 | cosval = np.cos(rotation_angle) 49 | sinval = np.sin(rotation_angle) 50 | rotation_matrix = np.array([[cosval, 0, sinval], 51 | [0, 1, 0], 52 | [-sinval, 0, cosval]]) 53 | rotated_data = np.dot(pc.reshape((-1, 3)), rotation_matrix) 54 | 55 | return rotated_data 56 | 57 | 58 | def jitter_point_cloud(pc, sigma=0.01, clip=0.05): 59 | """ 60 | Randomly jitter points. Jittering is per point. 61 | :param pc: BxNx3 array, original batch of point clouds 62 | :param sigma: standard deviation of the Gaussian jitter 63 | :param clip: absolute bound used to clip the jitter 64 | :return: BxNx3 array, jittered batch of point clouds 65 | """ 66 | jittered_data = np.clip(sigma * np.random.randn(*pc.shape), -1 * clip, clip) 67 | jittered_data += pc 68 | return jittered_data 69 | 70 | 71 | def shift_point_cloud(pc, shift_range=0.1): 72 | """ Randomly shift point cloud. Shift is per point cloud. 73 | Input: 74 | BxNx3 array, original batch of point clouds 75 | Return: 76 | BxNx3 array, shifted batch of point clouds 77 | """ 78 | N, C = pc.shape 79 | shifts = np.random.uniform(-shift_range, shift_range, 3) 80 | pc += shifts 81 | return pc 82 | 83 | 84 | def random_scale_point_cloud(pc, scale_low=0.8, scale_high=1.25): 85 | """ Randomly scale the point cloud. Scale is per point cloud.
86 | Input: 87 | BxNx3 array, original batch of point clouds 88 | Return: 89 | BxNx3 array, scaled batch of point clouds 90 | """ 91 | N, C = pc.shape 92 | scales = np.random.uniform(scale_low, scale_high, 1) 93 | pc *= scales 94 | return pc 95 | 96 | 97 | def rotate_perturbation_point_cloud(pc, angle_sigma=0.06, angle_clip=0.18): 98 | """ Randomly perturb the point clouds by small rotations 99 | Input: 100 | BxNx3 array, original batch of point clouds 101 | Return: 102 | BxNx3 array, rotated batch of point clouds 103 | """ 104 | # rotated_data = np.zeros(pc.shape, dtype=np.float32) 105 | angles = np.clip(angle_sigma * np.random.randn(3), -angle_clip, angle_clip) 106 | Rx = np.array([[1, 0, 0], 107 | [0, np.cos(angles[0]), -np.sin(angles[0])], 108 | [0, np.sin(angles[0]), np.cos(angles[0])]]) 109 | Ry = np.array([[np.cos(angles[1]), 0, np.sin(angles[1])], 110 | [0, 1, 0], 111 | [-np.sin(angles[1]), 0, np.cos(angles[1])]]) 112 | Rz = np.array([[np.cos(angles[2]), -np.sin(angles[2]), 0], 113 | [np.sin(angles[2]), np.cos(angles[2]), 0], 114 | [0, 0, 1]]) 115 | R = np.dot(Rz, np.dot(Ry, Rx)) 116 | shape_pc = pc 117 | rotated_data = np.dot(shape_pc.reshape((-1, 3)), R) 118 | return rotated_data 119 | 120 | 121 | def pc_augment(pc, angle): 122 | pc = rotate_point_cloud_by_angle(pc, angle) 123 | # pc = rotation_point_cloud(pc) 124 | # pc = jitter_point_cloud(pc) 125 | # pc = random_scale_point_cloud(pc) 126 | # pc = rotate_perturbation_point_cloud(pc) 127 | # pc = shift_point_cloud(pc) 128 | return pc 129 | 130 | 131 | def data_augment(train_data, angle): 132 | return pc_augment(train_data, angle).reshape(-1, train_data.shape[1], train_data.shape[2]) 133 | 134 | -------------------------------------------------------------------------------- /doc/.DS_Store: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/minzhang-1/PointHop2/3a31f1aad6702dd7cb0a3cbd71b9e8181bfa69aa/doc/.DS_Store -------------------------------------------------------------------------------- /doc/baseline.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/minzhang-1/PointHop2/3a31f1aad6702dd7cb0a3cbd71b9e8181bfa69aa/doc/baseline.png -------------------------------------------------------------------------------- /evaluate.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import pickle 3 | import modelnet_data 4 | import pointhop 5 | import numpy as np 6 | import data_utils 7 | import os 8 | import time 9 | import sklearn 10 | 11 | BASE_DIR = os.path.dirname(os.path.abspath(__file__)) 12 | 13 | parser = argparse.ArgumentParser() 14 | parser.add_argument('--initial_point', type=int, default=1024, help='Point Number [256/512/1024/2048]') 15 | parser.add_argument('--validation', default=False, help='Split train data or not') 16 | parser.add_argument('--feature_selection', default=0.95, help='Percentage of feature selection try 0.95') 17 | parser.add_argument('--ensemble', default=True, help='Ensemble or not') 18 | parser.add_argument('--rotation_angle', default=np.pi/4, help='Rotate angle') 19 | parser.add_argument('--rotation_freq', default=8, help='Rotate time') 20 | parser.add_argument('--log_dir', default='log', help='Log dir [default: log]') 21 | parser.add_argument('--num_point', default=[1024, 128, 128, 64], help='Point Number after down sampling') 22 | parser.add_argument('--num_sample', default=[64, 64, 64, 64], help='KNN query 
number') 23 | FLAGS = parser.parse_args() 24 | 25 | initial_point = FLAGS.initial_point 26 | VALID = FLAGS.validation 27 | FE = FLAGS.feature_selection 28 | ENSEMBLE = FLAGS.ensemble 29 | angle_rotation = FLAGS.rotation_angle 30 | freq_rotation = FLAGS.rotation_freq 31 | num_point = FLAGS.num_point 32 | num_sample = FLAGS.num_sample 33 | 34 | 35 | LOG_DIR = FLAGS.log_dir 36 | if not os.path.exists(LOG_DIR): 37 | os.mkdir(LOG_DIR) 38 | LOG_FOUT = open(os.path.join(LOG_DIR, 'log_train_eva.txt'), 'w') 39 | LOG_FOUT.write(str(FLAGS) + '\n') 40 | 41 | 42 | def log_string(out_str): 43 | LOG_FOUT.write(out_str+'\n') 44 | LOG_FOUT.flush() 45 | print(out_str) 46 | 47 | 48 | def main(): 49 | time_start = time.time() 50 | # load data 51 | train_data, train_label = modelnet_data.data_load(num_point=initial_point, data_dir=os.path.join(BASE_DIR, 'modelnet40_ply_hdf5_2048'), train=True) 52 | test_data, test_label = modelnet_data.data_load(num_point=initial_point, data_dir=os.path.join(BASE_DIR, 'modelnet40_ply_hdf5_2048'), train=False) 53 | 54 | # validation set 55 | if VALID: 56 | train_data, train_label, valid_data, valid_label = modelnet_data.data_separate(train_data, train_label) 57 | else: 58 | valid_data = test_data 59 | valid_label = test_label 60 | 61 | print(train_data.shape, train_label.shape, valid_data.shape, valid_label.shape) 62 | 63 | if ENSEMBLE: 64 | angle = np.repeat(angle_rotation, freq_rotation) 65 | else: 66 | angle = [0] 67 | 68 | feat_valid = [] 69 | for i in range(len(angle)): 70 | with open(os.path.join(LOG_DIR, 'params.pkl'), 'rb') as f: 71 | params_total = pickle.load(f) 72 | params = params_total['params:', i] 73 | weight = params_total['weight:', i] 74 | 75 | log_string('------------Test {} --------------'.format(i)) 76 | leaf_node_test = pointhop.pointhop_pred(False, valid_data, pca_params=params, n_newpoint=num_point, n_sample=num_sample) 77 | feature_valid = pointhop.extract(leaf_node_test) 78 | feature_valid = np.concatenate(feature_valid, axis=-1) 79 | if FE is not None: 80 | fe_ind = params_total['fe_ind:', i] 81 | feature_valid = feature_valid[:, fe_ind] 82 | feature_valid, pred_valid = pointhop.llsr_pred(feature_valid, weight) 83 | feat_valid.append(feature_valid) 84 | 85 | acc_valid = sklearn.metrics.accuracy_score(valid_label, pred_valid) 86 | acc = pointhop.average_acc(valid_label, pred_valid) 87 | log_string('test: {} , test mean: {}'.format(acc_valid, np.mean(acc))) 88 | log_string('per-class: {}'.format(acc)) 89 | valid_data = data_utils.data_augment(valid_data, angle[i]) 90 | 91 | if ENSEMBLE: 92 | weight = params_total['weight ensemble'] 93 | feat_valid = np.concatenate(feat_valid, axis=-1) 94 | feat_valid, pred_valid = pointhop.llsr_pred(feat_valid, weight) 95 | acc_valid = sklearn.metrics.accuracy_score(valid_label, pred_valid) 96 | acc = pointhop.average_acc(valid_label, pred_valid) 97 | log_string('ensemble test: {}, ensemble test mean: {}'.format(acc_valid, np.mean(acc))) 98 | log_string('ensemble per-class: {}'.format(acc)) 99 | 100 | time_end = time.time() 101 | log_string('totally time cost is {} minutes'.format((time_end - time_start)//60)) 102 | 103 | 104 | if __name__ == '__main__': 105 | main() -------------------------------------------------------------------------------- /log/params_ensemble_w_fe.pkl: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/minzhang-1/PointHop2/3a31f1aad6702dd7cb0a3cbd71b9e8181bfa69aa/log/params_ensemble_w_fe.pkl 
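The provided `.pkl` files presumably follow the layout that `train.py` writes to `log/params.pkl`: a single dictionary with one entry per rotation. Note that `evaluate.py` opens `log/params.pkl`, so a provided parameter file has to be copied or renamed to that path first. Below is a minimal sketch of inspecting such a file; the key names are taken from `train.py`/`evaluate.py`, and the `fe_ind` and `weight ensemble` entries exist only when feature selection and ensembling were enabled during training.

```python
import pickle

# Sketch: inspect a saved parameter file (layout inferred from train.py / evaluate.py).
with open('log/params.pkl', 'rb') as f:
    params_total = pickle.load(f)

params = params_total['params:', 0]                # PointHop++ PCA params for rotation 0
weight = params_total['weight:', 0]                # least-squares classifier weights for rotation 0
fe_ind = params_total['fe_ind:', 0]                # selected feature indices (feature selection on)
ensemble_weight = params_total['weight ensemble']  # ensemble classifier weights (ensemble on)
print(len(fe_ind), weight.shape, ensemble_weight.shape)
```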
-------------------------------------------------------------------------------- /log/params_single_w_fe.pkl: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/minzhang-1/PointHop2/3a31f1aad6702dd7cb0a3cbd71b9e8181bfa69aa/log/params_single_w_fe.pkl -------------------------------------------------------------------------------- /log/params_single_wo_fe.pkl: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/minzhang-1/PointHop2/3a31f1aad6702dd7cb0a3cbd71b9e8181bfa69aa/log/params_single_wo_fe.pkl -------------------------------------------------------------------------------- /modelnet_data.py: -------------------------------------------------------------------------------- 1 | import os 2 | import h5py 3 | import numpy as np 4 | from sklearn.model_selection import train_test_split 5 | 6 | 7 | def load_dir(data_dir, name='train_files.txt'): 8 | with open(os.path.join(data_dir,name),'r') as f: 9 | lines = f.readlines() 10 | return [os.path.join(data_dir, line.rstrip().split('/')[-1]) for line in lines] 11 | 12 | 13 | def shuffle_data(data): 14 | """ Shuffle data order. 15 | Input: 16 | data: B,N,... numpy array 17 | Return: 18 | shuffled data, shuffle indices 19 | """ 20 | idx = np.arange(data.shape[0]) 21 | np.random.shuffle(idx) 22 | return data[idx, ...], idx 23 | 24 | 25 | def shuffle_points(data): 26 | """ Shuffle orders of points in each point cloud -- changes FPS behavior. 27 | Input: 28 | BxNxC array 29 | Output: 30 | BxNxC array 31 | """ 32 | idx = np.arange(data.shape[1]) 33 | np.random.shuffle(idx) 34 | return data[:, idx, :], idx 35 | 36 | 37 | def xyz2sphere(data): 38 | """ 39 | Input: data(B,N,3) xyz_coordinates 40 | Return: data(B,N,3) sphere_coordinates 41 | """ 42 | r = np.sqrt(np.sum(data**2, axis=2, keepdims=False)) 43 | theta = np.arccos(data[...,2]*1.0/r) 44 | phi = np.arctan(data[...,1]*1.0/data[...,0]) 45 | 46 | if len(r.shape) == 2: 47 | r = np.expand_dims(r, 2) 48 | if len(theta.shape) == 2: 49 | theta = np.expand_dims(theta, 2) 50 | if len(phi.shape) == 2: 51 | phi = np.expand_dims(phi, 2) 52 | 53 | data_sphere = np.concatenate([r, theta, phi], axis=2) 54 | return data_sphere 55 | 56 | 57 | def xyz2cylind(data): 58 | """ 59 | Input: data(B,N,3) xyz_coordinates 60 | Return: data(B,N,3) cylindrical_coordinates 61 | """ 62 | r = np.sqrt(np.sum(data[...,:2]**2, axis=2, keepdims=False)) 63 | phi = np.arctan(data[...,1]*1.0/data[...,0]) 64 | z = data[...,2] 65 | 66 | if len(r.shape) == 2: 67 | r = np.expand_dims(r, 2) 68 | if len(z.shape) == 2: 69 | z = np.expand_dims(z, 2) 70 | if len(phi.shape) == 2: 71 | phi = np.expand_dims(phi, 2) 72 | 73 | data_sphere = np.concatenate([r, z, phi], axis=2) 74 | return data_sphere 75 | 76 | 77 | def data_load(num_point=None, data_dir=None, train=True): 78 | if not os.path.exists(data_dir): 79 | www = 'https://shapenet.cs.stanford.edu/media/modelnet40_ply_hdf5_2048.zip' 80 | zipfile = os.path.basename(www) 81 | os.system('wget --no-check-certificate %s; unzip %s' % (www, zipfile)) 82 | os.system('rm %s' % (zipfile)) 83 | 84 | if train: 85 | data_pth = load_dir(data_dir, name='train_files.txt') 86 | else: 87 | data_pth = load_dir(data_dir, name='test_files.txt') 88 | 89 | point_list = [] 90 | label_list = [] 91 | for pth in data_pth: 92 | data_file = h5py.File(pth, 'r') 93 | point = data_file['data'][:] 94 | label = data_file['label'][:] 95 | point_list.append(point) 96 | label_list.append(label) 97 | data = 
np.concatenate(point_list, axis=0) 98 | label = np.concatenate(label_list, axis=0) 99 | # data, idx = shuffle_data(data) 100 | # data, ind = shuffle_points(data) 101 | 102 | if not num_point: 103 | return data[:, :, :], label 104 | else: 105 | return data[:, :num_point, :], label 106 | 107 | 108 | def data_separate(data, label): 109 | seed = 7 110 | np.random.seed(seed) 111 | train_data, valid_data, train_label, valid_label = train_test_split(data, label, test_size=0.1, random_state=seed) 112 | 113 | return train_data, train_label, valid_data, valid_label 114 | 115 | 116 | -------------------------------------------------------------------------------- /point_utils.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import torch 3 | import threading 4 | 5 | 6 | def calc_distances(tmp, pts): 7 | ''' 8 | 9 | :param tmp:(B, k, 3)/(B, 3) 10 | :param pts:(B, N, 3) 11 | :return:(B, N, k)/(B, N) 12 | ''' 13 | if len(tmp.shape) == 2: 14 | tmp = np.expand_dims(tmp, axis=1) 15 | tmp_trans = np.transpose(tmp, [0, 2, 1]) 16 | xy = np.matmul(pts, tmp_trans) 17 | pts_square = (pts**2).sum(axis=2, keepdims=True) 18 | tmp_square_trans = (tmp_trans**2).sum(axis=1, keepdims=True) 19 | return np.squeeze(pts_square + tmp_square_trans - 2 * xy) 20 | 21 | 22 | def index_points(points, idx): 23 | """ 24 | Input: 25 | points: input points data, [B, N, C] 26 | idx: sample index data, [B, S] 27 | Return: 28 | new_points:, indexed points data, [B, S, C] 29 | """ 30 | B = points.shape[0] 31 | view_shape = list(idx.shape) 32 | view_shape[1:] = [1] * (len(view_shape) - 1) 33 | repeat_shape = list(idx.shape) 34 | repeat_shape[0] = 1 35 | batch_indices = np.tile(np.arange(B).reshape(view_shape),repeat_shape) 36 | new_points = points[batch_indices, idx, :] 37 | return new_points 38 | 39 | 40 | def furthest_point_sample(pts, K): 41 | """ 42 | Input: 43 | pts: pointcloud data, [B, N, C] 44 | K: number of samples 45 | Return: 46 | (B, K, 3) 47 | """ 48 | B, N, C = pts.shape 49 | centroids = np.zeros((B, K), dtype=int) 50 | distance = np.ones((B, N), dtype=int) * 1e10 51 | farthest = np.random.randint(0, N, (B,)) 52 | batch_indices = np.arange(B) 53 | for i in range(K): 54 | centroids[:, i] = farthest 55 | centroid = pts[batch_indices, farthest, :].reshape(B, 1, 3) 56 | dist = np.sum((pts - centroid) ** 2, axis=-1) 57 | mask = dist < distance 58 | distance[mask] = dist[mask] 59 | farthest = np.argmax(distance, axis=-1) 60 | return index_points(pts, centroids) 61 | 62 | 63 | def knn_query(new_pts, pts, n_sample, idx): 64 | ''' 65 | new_pts:(B, K, 3) 66 | pts:(B, N, 3) 67 | n_sample:int 68 | :return: nn_idx (B, n_sample, K) 69 | ''' 70 | distance_matrix = calc_distances(new_pts, pts) 71 | nn_idx = np.argpartition(distance_matrix, (0, n_sample), axis=1)[:, :n_sample, :] 72 | idx.append(nn_idx) 73 | 74 | 75 | def knn(new_xyz, point_data, n_sample): 76 | batch_size = point_data.shape[0]//8 77 | idx1 = [] 78 | idx2 = [] 79 | idx3 = [] 80 | idx4 = [] 81 | idx5 = [] 82 | idx6 = [] 83 | idx7 = [] 84 | idx8 = [] 85 | threads = [] 86 | t1 = threading.Thread(target=knn_query, args=(new_xyz[:batch_size], point_data[:batch_size], n_sample, idx1)) 87 | threads.append(t1) 88 | t2 = threading.Thread(target=knn_query, args=(new_xyz[batch_size:2*batch_size], point_data[batch_size:2*batch_size], n_sample, idx2)) 89 | threads.append(t2) 90 | t3 = threading.Thread(target=knn_query, args=(new_xyz[2*batch_size:3*batch_size], point_data[2*batch_size:3*batch_size], n_sample, idx3)) 91 | 
threads.append(t3) 92 | t4 = threading.Thread(target=knn_query, args=(new_xyz[3*batch_size:4*batch_size], point_data[3*batch_size:4*batch_size], n_sample, idx4)) 93 | threads.append(t4) 94 | t5 = threading.Thread(target=knn_query, args=(new_xyz[4*batch_size:5*batch_size], point_data[4*batch_size:5*batch_size], n_sample, idx5)) 95 | threads.append(t5) 96 | t6 = threading.Thread(target=knn_query, args=(new_xyz[5*batch_size:6*batch_size], point_data[5*batch_size:6*batch_size], n_sample, idx6)) 97 | threads.append(t6) 98 | t7 = threading.Thread(target=knn_query, args=(new_xyz[6*batch_size:7*batch_size], point_data[6*batch_size:7*batch_size], n_sample, idx7)) 99 | threads.append(t7) 100 | t8 = threading.Thread(target=knn_query, args=(new_xyz[7*batch_size:], point_data[7*batch_size:], n_sample, idx8)) 101 | threads.append(t8) 102 | 103 | for t in threads: 104 | t.setDaemon(False) 105 | t.start() 106 | for t in threads: 107 | if t.isAlive(): 108 | t.join() 109 | idx = idx1 + idx2 + idx3 + idx4 + idx5 + idx6 + idx7 + idx8 110 | idx_tmp = np.concatenate(idx, axis=0) 111 | 112 | return idx_tmp 113 | 114 | 115 | def gather_ops(nn_idx, pts): 116 | """ 117 | nn_idx:(B, n_sample, K) 118 | pts:(B, N, dim) 119 | :return: pc_n(B, n_sample, K, dim) 120 | """ 121 | num_newpts = nn_idx.shape[2] 122 | num_dim = pts.shape[2] 123 | pts_expand = torch.from_numpy(pts).type(torch.FloatTensor).unsqueeze(2).expand(-1, -1, num_newpts, -1) 124 | nn_idx_expand = torch.from_numpy(nn_idx).type(torch.LongTensor).unsqueeze(3).expand(-1, -1, -1, num_dim) 125 | pc_n = torch.gather(pts_expand, 1, nn_idx_expand) 126 | return pc_n.numpy() 127 | 128 | 129 | def calc_feature(pc_temp, pc_bin, pc_gather): 130 | value = np.multiply(pc_temp, pc_bin) 131 | value = np.sum(value, axis=2, keepdims=True) 132 | num = np.sum(pc_bin, axis=2, keepdims=True) 133 | final = np.squeeze(value/num, axis=(2,)) 134 | # index.append(t) 135 | pc_gather.append(final) 136 | 137 | 138 | def calc_feature_single(pc_temp, pc_bin): 139 | value = np.multiply(pc_temp, pc_bin) 140 | value = np.sum(value, axis=2, keepdims=True) 141 | num = np.sum(pc_bin, axis=2, keepdims=True) 142 | final = np.squeeze(value/num, axis=(2,)) 143 | return final 144 | 145 | 146 | def gather_fea(nn_idx, point_data, fea): 147 | """ 148 | nn_idx:(B, n_sample, K) 149 | pts:(B, N, dim) 150 | :return: pc_n(B, K, dim_fea) 151 | """ 152 | pts_fea = np.concatenate([point_data, fea], axis=-1) 153 | 154 | pts_fea_expand = index_points(pts_fea, nn_idx) 155 | pts_fea_expand = pts_fea_expand.transpose(0, 2, 1, 3) # (B, K, n_sample, dim) 156 | pc_n = pts_fea_expand[..., :3] 157 | pc_temp = pts_fea_expand[..., 3:] 158 | 159 | pc_n_center = np.expand_dims(pc_n[:, :, 0, :], axis=2) 160 | pc_n_uncentered = pc_n - pc_n_center 161 | 162 | pc_idx = [] 163 | pc_idx.append(pc_n_uncentered[:, :, :, 0] >= 0) 164 | pc_idx.append(pc_n_uncentered[:, :, :, 0] <= 0) 165 | pc_idx.append(pc_n_uncentered[:, :, :, 1] >= 0) 166 | pc_idx.append(pc_n_uncentered[:, :, :, 1] <= 0) 167 | pc_idx.append(pc_n_uncentered[:, :, :, 2] >= 0) 168 | pc_idx.append(pc_n_uncentered[:, :, :, 2] <= 0) 169 | 170 | pc_bin = [] 171 | pc_bin.append(np.expand_dims((pc_idx[0] * pc_idx[2] * pc_idx[4])*1.0, axis=3)) 172 | pc_bin.append(np.expand_dims((pc_idx[0] * pc_idx[2] * pc_idx[5])*1.0, axis=3)) 173 | pc_bin.append(np.expand_dims((pc_idx[0] * pc_idx[3] * pc_idx[4])*1.0, axis=3)) 174 | pc_bin.append(np.expand_dims((pc_idx[0] * pc_idx[3] * pc_idx[5])*1.0, axis=3)) 175 | pc_bin.append(np.expand_dims((pc_idx[1] * pc_idx[2] * pc_idx[4])*1.0, 
axis=3)) 176 | pc_bin.append(np.expand_dims((pc_idx[1] * pc_idx[2] * pc_idx[5])*1.0, axis=3)) 177 | pc_bin.append(np.expand_dims((pc_idx[1] * pc_idx[3] * pc_idx[4])*1.0, axis=3)) 178 | pc_bin.append(np.expand_dims((pc_idx[1] * pc_idx[3] * pc_idx[5])*1.0, axis=3)) 179 | 180 | pc_gather1 = [] 181 | pc_gather2 = [] 182 | pc_gather3 = [] 183 | pc_gather4 = [] 184 | pc_gather5 = [] 185 | pc_gather6 = [] 186 | pc_gather7 = [] 187 | pc_gather8 = [] 188 | threads = [] 189 | t1 = threading.Thread(target=calc_feature, args=(pc_temp, pc_bin[0], pc_gather1)) 190 | threads.append(t1) 191 | t2 = threading.Thread(target=calc_feature, args=(pc_temp, pc_bin[1], pc_gather2)) 192 | threads.append(t2) 193 | t3 = threading.Thread(target=calc_feature, args=(pc_temp, pc_bin[2], pc_gather3)) 194 | threads.append(t3) 195 | t4 = threading.Thread(target=calc_feature, args=(pc_temp, pc_bin[3], pc_gather4)) 196 | threads.append(t4) 197 | t5 = threading.Thread(target=calc_feature, args=(pc_temp, pc_bin[4], pc_gather5)) 198 | threads.append(t5) 199 | t6 = threading.Thread(target=calc_feature, args=(pc_temp, pc_bin[5], pc_gather6)) 200 | threads.append(t6) 201 | t7 = threading.Thread(target=calc_feature, args=(pc_temp, pc_bin[6], pc_gather7)) 202 | threads.append(t7) 203 | t8 = threading.Thread(target=calc_feature, args=(pc_temp, pc_bin[7], pc_gather8)) 204 | threads.append(t8) 205 | 206 | for t in threads: 207 | t.setDaemon(False) 208 | t.start() 209 | for t in threads: 210 | if t.isAlive(): 211 | t.join() 212 | 213 | pc_gather = pc_gather1 + pc_gather2 + pc_gather3 + pc_gather4 + pc_gather5 + pc_gather6 + pc_gather7 + pc_gather8 214 | pc_fea = np.concatenate(pc_gather, axis=2) 215 | 216 | return pc_fea 217 | 218 | 219 | def gather_global_fea(feature, xyz, part=5): 220 | ''' 221 | 222 | :param feature: (B, n_point, dim) 223 | :param xyz: (B, n_point, 3) 224 | :param part:int 225 | :return: (B, dim*part) 226 | ''' 227 | 228 | pts_square = (xyz**2).sum(axis=2, keepdims=False) 229 | dis = np.sqrt(pts_square) # (B, n_point) 230 | total_fea = [] 231 | for i in range(part): 232 | idx = (dis >= (i/float(part))) * (dis <= ((i+1)/float(part)))*1.0 233 | part_fea = (feature*np.expand_dims(idx, axis=2)).max(axis=1, keepdims=False) 234 | total_fea.append(part_fea) 235 | return np.concatenate(total_fea, axis=1) -------------------------------------------------------------------------------- /pointhop.py: -------------------------------------------------------------------------------- 1 | import math 2 | import sklearn 3 | import numpy as np 4 | from sklearn.decomposition import PCA 5 | from numpy import linalg as LA 6 | import point_utils 7 | import threading 8 | from sklearn.cluster import KMeans 9 | 10 | 11 | def sample_knn(point_data, n_newpoint, n_sample): 12 | point_num = point_data.shape[1] 13 | if n_newpoint == point_num: 14 | new_xyz = point_data 15 | else: 16 | new_xyz = point_utils.furthest_point_sample(point_data, n_newpoint) 17 | idx = point_utils.knn(new_xyz, point_data, n_sample) 18 | return new_xyz, idx 19 | 20 | 21 | def tree(Train, Bias, point_data, data, grouped_feature, idx, pre_energy, threshold, params): 22 | if grouped_feature is None: 23 | grouped_feature = data 24 | grouped_feature = point_utils.gather_fea(idx, point_data, grouped_feature) 25 | s1 = grouped_feature.shape[0] 26 | s2 = grouped_feature.shape[1] 27 | grouped_feature = grouped_feature.reshape(s1 * s2, -1) 28 | 29 | if Train is True: 30 | kernels, mean, energy = find_kernels_pca(grouped_feature) 31 | bias = LA.norm(grouped_feature, axis=1) 32 | 
bias = np.max(bias) 33 | if pre_energy is not None: 34 | energy = energy * pre_energy 35 | num_node = np.sum(energy > threshold) 36 | params = {} 37 | params['bias'] = bias 38 | params['kernel'] = kernels 39 | params['pca_mean'] = mean 40 | params['energy'] = energy 41 | params['num_node'] = num_node 42 | else: 43 | kernels = params['kernel'] 44 | mean = params['pca_mean'] 45 | bias = params['bias'] 46 | 47 | if Bias is True: 48 | grouped_feature = grouped_feature + bias 49 | 50 | transformed = np.matmul(grouped_feature, np.transpose(kernels)) 51 | 52 | if Bias is True: 53 | e = np.zeros((1, kernels.shape[0])) 54 | e[0, 0] = 1 55 | transformed -= bias * e 56 | 57 | transformed = transformed.reshape(s1, s2, -1) 58 | output = [] 59 | for i in range(transformed.shape[-1]): 60 | output.append(transformed[:, :, i].reshape(s1, s2, 1)) 61 | return params, output 62 | 63 | 64 | def tree_multi(Train, Bias, point_data, data, grouped_feature, idx, pre_energy, threshold, params, j, index, params_t, out): 65 | if grouped_feature is None: 66 | grouped_feature = data 67 | grouped_feature = point_utils.gather_fea(idx, point_data, grouped_feature) 68 | s1 = grouped_feature.shape[0] 69 | s2 = grouped_feature.shape[1] 70 | grouped_feature = grouped_feature.reshape(s1 * s2, -1) 71 | 72 | if Train is True: 73 | kernels, mean, energy = find_kernels_pca(grouped_feature) 74 | bias = LA.norm(grouped_feature, axis=1) 75 | bias = np.max(bias) 76 | if pre_energy is not None: 77 | energy = energy * pre_energy 78 | num_node = np.sum(energy > threshold) 79 | params = {} 80 | params['bias'] = bias 81 | params['kernel'] = kernels 82 | params['pca_mean'] = mean 83 | params['energy'] = energy 84 | params['num_node'] = num_node 85 | else: 86 | kernels = params['kernel'] 87 | mean = params['pca_mean'] 88 | bias = params['bias'] 89 | 90 | if Bias is True: 91 | grouped_feature = grouped_feature + bias 92 | 93 | transformed = np.matmul(grouped_feature, np.transpose(kernels)) 94 | 95 | if Bias is True: 96 | e = np.zeros((1, kernels.shape[0])) 97 | e[0, 0] = 1 98 | transformed -= bias * e 99 | 100 | transformed = transformed.reshape(s1, s2, -1) 101 | output = [] 102 | for i in range(transformed.shape[-1]): 103 | output.append(transformed[:, :, i].reshape(s1, s2, 1)) 104 | index.append(j) 105 | params_t.append(params) 106 | out.append(output) 107 | 108 | 109 | def pointhop_train(Train, data, n_newpoint, n_sample, threshold): 110 | ''' 111 | Train based on the provided samples. 
112 | :param train_data: [num_samples, num_point, feature_dimension] 113 | :param n_newpoint: point numbers used in every stage 114 | :param n_sample: k nearest neighbors 115 | :param layer_num: num kernels to be preserved 116 | :param energy_percent: the percent of energy to be preserved 117 | :return: idx, new_idx, final stage feature, feature, pca_params 118 | ''' 119 | 120 | point_data = data 121 | Bias = [False, True, True, True] 122 | info = {} 123 | pca_params = {} 124 | leaf_node = [] 125 | leaf_node_energy = [] 126 | 127 | for i in range(len(n_newpoint)): 128 | new_xyz, idx = sample_knn(point_data, n_newpoint[i], n_sample[i]) 129 | if i == 0: 130 | print(i) 131 | pre_energy = 1 132 | params, output = tree(Train, Bias[i], point_data, data, None, idx, pre_energy, threshold, None) 133 | pca_params['Layer_{:d}_pca_params'.format(i)] = params 134 | num_node = params['num_node'] 135 | energy = params['energy'] 136 | info['Layer_{:d}_feature'.format(i)] = output[:num_node] 137 | info['Layer_{:d}_energy'.format(i)] = energy 138 | info['Layer_{:d}_num_node'.format(i)] = num_node 139 | if num_node != len(output): 140 | for m in range(num_node, len(output), 1): 141 | leaf_node.append(output[m]) 142 | leaf_node_energy.append(energy[m]) 143 | elif i == 1: 144 | output = info['Layer_{:d}_feature'.format(i - 1)] 145 | pre_energy = info['Layer_{:d}_energy'.format(i - 1)] 146 | num_node = info['Layer_{:d}_num_node'.format(i - 1)] 147 | s1 = 0 148 | index = [] 149 | params_t = [] 150 | out = [] 151 | threads = [] 152 | for j in range(num_node): 153 | threads.append(threading.Thread(target=tree_multi, args=(Train, Bias[i], point_data, data, output[j], idx, 154 | pre_energy[j], threshold, None, j, index, params_t, out))) 155 | for t in threads: 156 | t.setDaemon(False) 157 | t.start() 158 | for t in threads: 159 | if t.isAlive(): 160 | t.join() 161 | 162 | for j in range(num_node): 163 | print(i, j) 164 | ind = np.where(np.array(index) == j)[0] 165 | params = params_t[ind[0]] 166 | output_t = out[ind[0]] 167 | pca_params['Layer_{:d}_{:d}_pca_params'.format(i, j)] = params 168 | num_node_t = params['num_node'] 169 | energy = params['energy'] 170 | info['Layer_{:d}_{:d}_feature'.format(i, j)] = output_t[:num_node_t] 171 | info['Layer_{:d}_{:d}_energy'.format(i, j)] = energy 172 | info['Layer_{:d}_{:d}_num_node'.format(i, j)] = num_node_t 173 | s1 = s1 + num_node_t 174 | if num_node_t != len(output_t): 175 | for m in range(num_node_t, len(output_t), 1): 176 | leaf_node.append(output_t[m]) 177 | leaf_node_energy.append(energy[m]) 178 | elif i == 2: 179 | num_node = info['Layer_{:d}_num_node'.format(i - 2)] 180 | for j in range(num_node): 181 | output = info['Layer_{:d}_{:d}_feature'.format(i - 1, j)] 182 | pre_energy = info['Layer_{:d}_{:d}_energy'.format(i - 1, j)] 183 | num_node_t = info['Layer_{:d}_{:d}_num_node'.format(i - 1, j)] 184 | 185 | index = [] 186 | params_t = [] 187 | out = [] 188 | threads = [] 189 | for k in range(num_node_t): 190 | threads.append( 191 | threading.Thread(target=tree_multi, args=(Train, Bias[i], point_data, data, output[k], idx, 192 | pre_energy[k], threshold, None, k, index, params_t, out))) 193 | for t in threads: 194 | t.setDaemon(False) 195 | t.start() 196 | for t in threads: 197 | if t.isAlive(): 198 | t.join() 199 | 200 | for k in range(num_node_t): 201 | print(i, j, k) 202 | ind = np.where(np.array(index) == k)[0] 203 | params = params_t[ind[0]] 204 | output_t = out[ind[0]] 205 | pca_params['Layer_{:d}_{:d}_{:d}_pca_params'.format(i, j, k)] = params 206 | 
num_node_tt = params['num_node'] 207 | energy = params['energy'] 208 | info['Layer_{:d}_{:d}_{:d}_feature'.format(i, j, k)] = output_t[:num_node_tt] 209 | info['Layer_{:d}_{:d}_{:d}_energy'.format(i, j, k)] = energy 210 | info['Layer_{:d}_{:d}_{:d}_num_node'.format(i, j, k)] = num_node_tt 211 | if num_node_tt != len(output_t): 212 | for m in range(num_node_tt, len(output_t), 1): 213 | leaf_node.append(output_t[m]) 214 | leaf_node_energy.append(energy[m]) 215 | elif i == 3: 216 | num_node = info['Layer_{:d}_num_node'.format(i - 3)] 217 | for j in range(num_node): 218 | num_node_t = info['Layer_{:d}_{:d}_num_node'.format(i - 2, j)] 219 | for k in range(num_node_t): 220 | output = info['Layer_{:d}_{:d}_{:d}_feature'.format(i - 1, j, k)] 221 | pre_energy = info['Layer_{:d}_{:d}_{:d}_energy'.format(i - 1, j, k)] 222 | num_node_tt = info['Layer_{:d}_{:d}_{:d}_num_node'.format(i - 1, j, k)] 223 | 224 | index = [] 225 | params_t = [] 226 | out = [] 227 | threads = [] 228 | for t in range(num_node_tt): 229 | threads.append( 230 | threading.Thread(target=tree_multi, args=(Train, Bias[i], point_data, data, output[t], idx, 231 | pre_energy[t], threshold, None, t, index, params_t, out))) 232 | for t in threads: 233 | t.setDaemon(False) 234 | t.start() 235 | for t in threads: 236 | if t.isAlive(): 237 | t.join() 238 | 239 | for t in range(num_node_tt): 240 | print(i, j, k, t) 241 | ind = np.where(np.array(index) == t)[0] 242 | params = params_t[ind[0]] 243 | output_t = out[ind[0]] 244 | pca_params['Layer_{:d}_{:d}_{:d}_{:d}_pca_params'.format(i, j, k, t)] = params 245 | num_node_ttt = params['num_node'] 246 | energy = params['energy'] 247 | info['Layer_{:d}_{:d}_{:d}_{:d}_feature'.format(i, j, k, t)] = output_t[:num_node_ttt] 248 | info['Layer_{:d}_{:d}_{:d}_{:d}_energy'.format(i, j, k, t)] = energy 249 | info['Layer_{:d}_{:d}_{:d}_{:d}_num_node'.format(i, j, k, t)] = num_node_ttt 250 | for m in range(len(output_t)): 251 | leaf_node.append(output_t[m]) 252 | leaf_node_energy.append(energy[m]) 253 | point_data = new_xyz 254 | # print(len(leaf_node)) 255 | return pca_params, leaf_node, leaf_node_energy 256 | 257 | 258 | def pointhop_pred(Train, data, pca_params, n_newpoint, n_sample): 259 | ''' 260 | Test based on the provided samples. 
261 | :param test_data: [num_samples, num_point, feature_dimension] 262 | :param pca_params: pca kernel and mean 263 | :param n_newpoint: point numbers used in every stage 264 | :param n_sample: k nearest neighbors 265 | :param layer_num: num kernels to be preserved 266 | :param idx_save: knn index 267 | :param new_xyz_save: down sample index 268 | :return: final stage feature, feature, pca_params 269 | ''' 270 | 271 | point_data = data 272 | Bias = [False, True, True, True] 273 | info_test = {} 274 | leaf_node = [] 275 | 276 | for i in range(len(n_newpoint)): 277 | new_xyz, idx = sample_knn(point_data, n_newpoint[i], n_sample[i]) 278 | if i == 0: 279 | print(i) 280 | params = pca_params['Layer_{:d}_pca_params'.format(i)] 281 | num_node = params['num_node'] 282 | params_t, output = tree(Train, Bias[i], point_data, data, None, idx, None, None, params) 283 | info_test['Layer_{:d}_feature'.format(i)] = output[:num_node] 284 | info_test['Layer_{:d}_num_node'.format(i)] = num_node 285 | if num_node != len(output): 286 | for m in range(num_node, len(output), 1): 287 | leaf_node.append(output[m]) 288 | elif i == 1: 289 | output = info_test['Layer_{:d}_feature'.format(i - 1)] 290 | num_node = info_test['Layer_{:d}_num_node'.format(i - 1)] 291 | 292 | index = [] 293 | params_t = [] 294 | out = [] 295 | threads = [] 296 | for j in range(num_node): 297 | threads.append( 298 | threading.Thread(target=tree_multi, args=(Train, Bias[i], point_data, data, output[j], idx, 299 | None, None, pca_params['Layer_{:d}_{:d}_pca_params'.format(i, j)], j, index, params_t, out))) 300 | for t in threads: 301 | t.setDaemon(False) 302 | t.start() 303 | for t in threads: 304 | if t.isAlive(): 305 | t.join() 306 | 307 | for j in range(num_node): 308 | print(i, j) 309 | ind = np.where(np.array(index) == j)[0] 310 | output_t = out[ind[0]] 311 | params = pca_params['Layer_{:d}_{:d}_pca_params'.format(i, j)] 312 | num_node_t = params['num_node'] 313 | info_test['Layer_{:d}_{:d}_feature'.format(i, j)] = output_t[:num_node_t] 314 | info_test['Layer_{:d}_{:d}_num_node'.format(i, j)] = num_node_t 315 | if num_node_t != len(output_t): 316 | for m in range(num_node_t, len(output_t), 1): 317 | leaf_node.append(output_t[m]) 318 | elif i == 2: 319 | num_node = info_test['Layer_{:d}_num_node'.format(i - 2)] 320 | for j in range(num_node): 321 | output = info_test['Layer_{:d}_{:d}_feature'.format(i - 1, j)] 322 | num_node_t = info_test['Layer_{:d}_{:d}_num_node'.format(i - 1, j)] 323 | 324 | index = [] 325 | params_t = [] 326 | out = [] 327 | threads = [] 328 | for k in range(num_node_t): 329 | threads.append( 330 | threading.Thread(target=tree_multi, args=(Train, Bias[i], point_data, data, output[k], idx, 331 | None, None, pca_params['Layer_{:d}_{:d}_{:d}_pca_params'.format(i, j, k)], k, index, params_t, out))) 332 | for t in threads: 333 | t.setDaemon(False) 334 | t.start() 335 | for t in threads: 336 | if t.isAlive(): 337 | t.join() 338 | for k in range(num_node_t): 339 | print(i, j, k) 340 | params = pca_params['Layer_{:d}_{:d}_{:d}_pca_params'.format(i, j, k)] 341 | num_node_tt = params['num_node'] 342 | ind = np.where(np.array(index) == k)[0] 343 | output_t = out[ind[0]] 344 | info_test['Layer_{:d}_{:d}_{:d}_feature'.format(i, j, k)] = output_t[:num_node_tt] 345 | info_test['Layer_{:d}_{:d}_{:d}_num_node'.format(i, j, k)] = num_node_tt 346 | if num_node_tt != len(output_t): 347 | for m in range(num_node_tt, len(output_t), 1): 348 | leaf_node.append(output_t[m]) 349 | elif i == 3: 350 | num_node = 
info_test['Layer_{:d}_num_node'.format(i - 3)] 351 | for j in range(num_node): 352 | num_node_t = info_test['Layer_{:d}_{:d}_num_node'.format(i - 2, j)] 353 | for k in range(num_node_t): 354 | output = info_test['Layer_{:d}_{:d}_{:d}_feature'.format(i - 1, j, k)] 355 | num_node_tt = info_test['Layer_{:d}_{:d}_{:d}_num_node'.format(i - 1, j, k)] 356 | 357 | index = [] 358 | params_t = [] 359 | out = [] 360 | threads = [] 361 | for t in range(num_node_tt): 362 | threads.append( 363 | threading.Thread(target=tree_multi, args=(Train, Bias[i], point_data, data, output[t], idx, 364 | None, None, pca_params['Layer_{:d}_{:d}_{:d}_{:d}_pca_params'.format(i, j, k, t)], t, index, params_t, out))) 365 | for t in threads: 366 | t.setDaemon(False) 367 | t.start() 368 | for t in threads: 369 | if t.isAlive(): 370 | t.join() 371 | for t in range(num_node_tt): 372 | print(i, j, k, t) 373 | params = pca_params['Layer_{:d}_{:d}_{:d}_{:d}_pca_params'.format(i, j, k, t)] 374 | num_node_ttt = params['num_node'] 375 | ind = np.where(np.array(index) == t)[0] 376 | output_t = out[ind[0]] 377 | info_test['Layer_{:d}_{:d}_{:d}_{:d}_feature'.format(i, j, k, t)] = output_t[:num_node_ttt] 378 | info_test['Layer_{:d}_{:d}_{:d}_{:d}_num_node'.format(i, j, k, t)] = num_node_ttt 379 | for m in range(len(output_t)): 380 | leaf_node.append(output_t[m]) 381 | point_data = new_xyz 382 | # print(len(leaf_node)) 383 | return leaf_node 384 | 385 | 386 | def remove_mean(features, axis): 387 | ''' 388 | Remove the dataset mean. 389 | :param features [num_samples,...] 390 | :param axis the axis to compute mean 391 | 392 | ''' 393 | feature_mean = np.mean(features, axis=axis, keepdims=True) 394 | feature_remove_mean = features-feature_mean 395 | return feature_remove_mean, feature_mean 396 | 397 | 398 | def remove_zero_patch(samples): 399 | std_var = (np.std(samples, axis=1)).reshape(-1, 1) 400 | ind_bool = (std_var == 0) 401 | ind = np.where(ind_bool==True)[0] 402 | samples_new = np.delete(samples, ind, 0) 403 | return samples_new 404 | 405 | 406 | def find_kernels_pca(sample_patches): 407 | ''' 408 | Do the PCA based on the provided samples. 409 | If num_kernels is not set, will use energy_percent. 410 | If neither is set, will preserve all kernels. 
411 | :param samples: [num_samples, feature_dimension] 412 | :param num_kernels: num kernels to be preserved 413 | :param energy_percent: the percent of energy to be preserved 414 | :return: kernels, sample_mean 415 | ''' 416 | # Remove patch mean 417 | sample_patches_centered, dc = remove_mean(sample_patches, axis=1) 418 | sample_patches_centered = remove_zero_patch(sample_patches_centered) 419 | # Remove feature mean (Set E(X)=0 for each dimension) 420 | training_data, feature_expectation = remove_mean(sample_patches_centered, axis=0) 421 | 422 | pca = PCA(n_components=training_data.shape[1], svd_solver='full', whiten=True) 423 | pca.fit(training_data) 424 | 425 | num_channels = sample_patches.shape[-1] 426 | largest_ev = [np.var(dc*np.sqrt(num_channels))] 427 | dc_kernel = 1/np.sqrt(num_channels)*np.ones((1, num_channels))/np.sqrt(largest_ev) 428 | 429 | kernels = pca.components_[:, :] 430 | mean = pca.mean_ 431 | kernels = np.concatenate((dc_kernel, kernels), axis=0)[:kernels.shape[0], :] 432 | 433 | energy = np.concatenate((largest_ev, pca.explained_variance_[:kernels.shape[0]-1]), axis=0) \ 434 | / (np.sum(pca.explained_variance_[:kernels.shape[0]-1]) + largest_ev) 435 | return kernels, mean, energy 436 | 437 | 438 | def extract(feat): 439 | ''' 440 | Do feature extraction based on the provided feature. 441 | :param feat: [num_layer, num_samples, feature_dimension] 442 | # :param pooling: pooling method to be used 443 | :return: feature 444 | ''' 445 | mean = [] 446 | maxi = [] 447 | l1 = [] 448 | l2 = [] 449 | 450 | for i in range(len(feat)): 451 | mean.append(feat[i].mean(axis=1, keepdims=False)) 452 | maxi.append(feat[i].max(axis=1, keepdims=False)) 453 | l1.append(np.linalg.norm(feat[i], ord=1, axis=1, keepdims=False)) 454 | l2.append(np.linalg.norm(feat[i], ord=2, axis=1, keepdims=False)) 455 | mean = np.concatenate(mean, axis=-1) 456 | maxi = np.concatenate(maxi, axis=-1) 457 | l1 = np.concatenate(l1, axis=-1) 458 | l2 = np.concatenate(l2, axis=-1) 459 | 460 | return [mean, maxi, l1, l2] 461 | 462 | 463 | def aggregate(feat, pool): 464 | feature = [] 465 | for j in range(len(feat)): 466 | feature.append(feat[j] * pool[j]) 467 | feature = np.concatenate(feature, axis=-1) 468 | return feature 469 | 470 | 471 | def average_acc(label, pred_label): 472 | 473 | classes = np.arange(40) 474 | acc = np.zeros(len(classes)) 475 | for i in range(len(classes)): 476 | ind = np.where(label == classes[i])[0] 477 | pred_test_special = pred_label[ind] 478 | acc[i] = len(np.where(pred_test_special == classes[i])[0])/float(len(ind)) 479 | return acc 480 | 481 | 482 | def onehot_encoding(n_class, labels): 483 | 484 | targets = labels.reshape(-1) 485 | one_hot_targets = np.eye(n_class)[targets] 486 | return one_hot_targets 487 | 488 | 489 | def KMeans_Cross_Entropy(X, Y, num_class, num_bin=32): 490 | if np.unique(Y).shape[0] == 1: 491 | return 0 492 | if X.shape[0] < num_bin: 493 | return -1 494 | kmeans = KMeans(n_clusters=num_bin, random_state=0).fit(X) 495 | prob = np.zeros((num_bin, num_class)) 496 | for i in range(num_bin): 497 | idx = (kmeans.labels_ == i) 498 | tmp = Y[idx] 499 | for j in range(num_class): 500 | prob[i, j] = float(tmp[tmp == j].shape[0]) / (float(Y[Y == j].shape[0]) + 1e-5) 501 | prob = (prob) / (np.sum(prob, axis=1).reshape(-1, 1) + 1e-5) 502 | true_indicator = onehot_encoding(num_class, Y) 503 | probab = prob[kmeans.labels_] 504 | return sklearn.metrics.log_loss(true_indicator, probab)/math.log(num_class) 505 | 506 | 507 | def CE(X, Y, num_class): 508 | H = [] 509 | for i 
in range(X.shape[1]): 510 | H.append(KMeans_Cross_Entropy(X[:, i].reshape(-1, 1), Y, num_class, num_bin=40)) 511 | return np.array(H) 512 | 513 | 514 | def llsr_train(feature, label): 515 | A = np.ones((feature.shape[0], 1)) 516 | feature = np.concatenate((A, feature), axis=1) 517 | y = onehot_encoding(40, label) 518 | weight = np.matmul(LA.pinv(feature), y) 519 | return weight 520 | 521 | 522 | def llsr_pred(feature, weight): 523 | A = np.ones((feature.shape[0], 1)) 524 | feature = np.concatenate((A, feature), axis=1) 525 | feature = np.matmul(feature, weight) 526 | pred = np.argmax(feature, axis=1) 527 | return feature, pred -------------------------------------------------------------------------------- /train.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import pickle 3 | import modelnet_data 4 | import pointhop 5 | import numpy as np 6 | import data_utils 7 | import os 8 | import time 9 | import sklearn 10 | 11 | BASE_DIR = os.path.dirname(os.path.abspath(__file__)) 12 | 13 | parser = argparse.ArgumentParser() 14 | parser.add_argument('--initial_point', type=int, default=1024, help='Point Number [256/512/1024/2048]') 15 | parser.add_argument('--validation', default=False, help='Split train data or not') 16 | parser.add_argument('--feature_selection', default=0.95, help='Percentage of feature selection try 0.95') 17 | parser.add_argument('--ensemble', default=True, help='Ensemble or not') 18 | parser.add_argument('--rotation_angle', default=np.pi/4, help='Rotate angle') 19 | parser.add_argument('--rotation_freq', default=8, help='Rotate time') 20 | parser.add_argument('--log_dir', default='log', help='Log dir [default: log]') 21 | parser.add_argument('--num_point', default=[1024, 128, 128, 64], help='Point Number after down sampling') 22 | parser.add_argument('--num_sample', default=[64, 64, 64, 64], help='KNN query number') 23 | parser.add_argument('--threshold', default=0.0001, help='threshold') 24 | FLAGS = parser.parse_args() 25 | 26 | initial_point = FLAGS.initial_point 27 | VALID = FLAGS.validation 28 | FE = FLAGS.feature_selection 29 | ENSEMBLE = FLAGS.ensemble 30 | angle_rotation = FLAGS.rotation_angle 31 | freq_rotation = FLAGS.rotation_freq 32 | num_point = FLAGS.num_point 33 | num_sample = FLAGS.num_sample 34 | threshold = FLAGS.threshold 35 | 36 | 37 | LOG_DIR = FLAGS.log_dir 38 | if not os.path.exists(LOG_DIR): 39 | os.mkdir(LOG_DIR) 40 | LOG_FOUT = open(os.path.join(LOG_DIR, 'log_train.txt'), 'w') 41 | LOG_FOUT.write(str(FLAGS) + '\n') 42 | 43 | 44 | def log_string(out_str): 45 | LOG_FOUT.write(out_str+'\n') 46 | LOG_FOUT.flush() 47 | print(out_str) 48 | 49 | 50 | def main(): 51 | time_start = time.time() 52 | # load data 53 | train_data, train_label = modelnet_data.data_load(num_point=initial_point, data_dir=os.path.join(BASE_DIR, 'modelnet40_ply_hdf5_2048'), train=True) 54 | test_data, test_label = modelnet_data.data_load(num_point=initial_point, data_dir=os.path.join(BASE_DIR, 'modelnet40_ply_hdf5_2048'), train=False) 55 | 56 | # validation set 57 | if VALID: 58 | train_data, train_label, valid_data, valid_label = modelnet_data.data_separate(train_data, train_label) 59 | else: 60 | valid_data = test_data 61 | valid_label = test_label 62 | 63 | print(train_data.shape, train_label.shape, valid_data.shape, valid_label.shape) 64 | 65 | if ENSEMBLE: 66 | angle = np.repeat(angle_rotation, freq_rotation) 67 | else: 68 | angle = [0] 69 | 70 | params_total = {} 71 | feat_train = [] 72 | feat_valid = [] 73 | for i in 
range(len(angle)): 74 | log_string('------------Train {} --------------'.format(i)) 75 | params, leaf_node, leaf_node_energy = pointhop.pointhop_train(True, train_data, n_newpoint=num_point, 76 | n_sample=num_sample, threshold=threshold) 77 | feature_train = pointhop.extract(leaf_node) 78 | feature_train = np.concatenate(feature_train, axis=-1) 79 | if FE is not None: 80 | entropy = pointhop.CE(feature_train, train_label, 40) 81 | ind = np.argsort(entropy) 82 | fe_ind = ind[:int(len(ind)*FE)] 83 | feature_train = feature_train[:, fe_ind] 84 | params_total['fe_ind:', i] = fe_ind 85 | weight = pointhop.llsr_train(feature_train, train_label) 86 | feature_train, pred_train = pointhop.llsr_pred(feature_train, weight) 87 | feat_train.append(feature_train) 88 | acc_train = sklearn.metrics.accuracy_score(train_label, pred_train) 89 | log_string('train accuracy: {}'.format(acc_train)) 90 | params_total['params:', i] = params 91 | params_total['weight:', i] = weight 92 | train_data = data_utils.data_augment(train_data, angle[i]) 93 | 94 | if VALID: 95 | log_string('------------Validation {} --------------'.format(i)) 96 | leaf_node_test = pointhop.pointhop_pred(False, valid_data, pca_params=params, n_newpoint=num_point, 97 | n_sample=num_sample) 98 | feature_valid = pointhop.extract(leaf_node_test) 99 | feature_valid = np.concatenate(feature_valid, axis=-1) 100 | if FE is not None: 101 | feature_valid = feature_valid[:, fe_ind] 102 | feature_valid, pred_valid = pointhop.llsr_pred(feature_valid, weight) 103 | acc_valid = sklearn.metrics.accuracy_score(valid_label, pred_valid) 104 | acc = pointhop.average_acc(valid_label, pred_valid) 105 | feat_valid.append(feature_valid) 106 | log_string('val: {} , val mean: {}'.format(acc_valid, np.mean(acc))) 107 | log_string('per-class: {}'.format(acc)) 108 | valid_data = data_utils.data_augment(valid_data, angle[i]) 109 | 110 | if ENSEMBLE: 111 | feat_train = np.concatenate(feat_train, axis=-1) 112 | weight = pointhop.llsr_train(feat_train, train_label) 113 | feat_train, pred_train = pointhop.llsr_pred(feat_train, weight) 114 | acc_train = sklearn.metrics.accuracy_score(train_label, pred_train) 115 | params_total['weight ensemble'] = weight 116 | log_string('ensemble train accuracy: {}'.format(acc_train)) 117 | 118 | if VALID: 119 | feat_valid = np.concatenate(feat_valid, axis=-1) 120 | feat_valid, pred_valid = pointhop.llsr_pred(feat_valid, weight) 121 | acc_valid = sklearn.metrics.accuracy_score(valid_label, pred_valid) 122 | acc = pointhop.average_acc(valid_label, pred_valid) 123 | log_string('ensemble val: {}, ensemble val mean: {}'.format(acc_valid, np.mean(acc))) 124 | log_string('ensemble per-class: {}'.format(acc)) 125 | 126 | time_end = time.time() 127 | log_string('totally time cost is {} minutes'.format((time_end - time_start)//60)) 128 | 129 | with open(os.path.join(LOG_DIR, 'params.pkl'), 'wb') as f: 130 | pickle.dump(params_total, f) 131 | 132 | 133 | if __name__ == '__main__': 134 | main() --------------------------------------------------------------------------------