├── ARNet_material_classification ├── README.md ├── compute_baseline.py ├── configs │ ├── config_data_split_1.yaml │ ├── config_data_split_2.yaml │ ├── config_data_split_3.yaml │ └── config_objectfolder.yaml ├── dataset │ ├── AR_material_dataset.py │ ├── data_preparation.py │ ├── data_preparation_objectfolder.py │ └── material_simple_categories.json ├── images │ ├── data_split_1_test_confusion_matrix.png │ ├── data_split_1_test_refine_confusion_matrix.png │ ├── data_split_1_train_confusion_matrix.png │ ├── data_split_1_valid_confusion_matrix.png │ ├── data_split_1_valid_refine_confusion_matrix.png │ ├── data_split_2_test_confusion_matrix.png │ ├── data_split_2_test_refine_confusion_matrix.png │ ├── data_split_2_train_confusion_matrix.png │ ├── data_split_2_valid_confusion_matrix.png │ ├── data_split_2_valid_refine_confusion_matrix.png │ ├── data_split_3_test_confusion_matrix.png │ ├── data_split_3_test_refine_confusion_matrix.png │ ├── data_split_3_train_confusion_matrix.png │ ├── data_split_3_valid_confusion_matrix.png │ ├── data_split_3_valid_refine_confusion_matrix.png │ ├── object_folder_train_confusion_matrix.png │ └── object_folder_valid_confusion_matrix.png ├── main.py ├── models │ └── ARNet_material.py ├── refine_material_predication.py ├── requirements.txt ├── results │ ├── data_split_1 │ │ ├── test │ │ │ ├── label_gt_test.npy │ │ │ └── label_prediction_test.npy │ │ ├── train │ │ │ ├── label_gt_train.npy │ │ │ └── label_prediction_train.npy │ │ └── valid │ │ │ ├── label_gt_valid.npy │ │ │ └── label_prediction_valid.npy │ ├── data_split_2 │ │ ├── test │ │ │ ├── label_gt_test.npy │ │ │ └── label_prediction_test.npy │ │ ├── train │ │ │ ├── label_gt_train.npy │ │ │ └── label_prediction_train.npy │ │ └── valid │ │ │ ├── label_gt_valid.npy │ │ │ └── label_prediction_valid.npy │ ├── data_split_3 │ │ ├── test │ │ │ ├── label_gt_test.npy │ │ │ └── label_prediction_test.npy │ │ ├── train │ │ │ ├── label_gt_train.npy │ │ │ └── label_prediction_train.npy │ │ └── valid │ │ │ ├── label_gt_valid.npy │ │ │ └── label_prediction_valid.npy │ └── object_folder │ │ ├── train │ │ ├── label_gt_train.npy │ │ └── label_prediction_train.npy │ │ └── valid │ │ ├── label_gt_valid.npy │ │ └── label_prediction_valid.npy └── test.py ├── ARNet_object_classification ├── README.md ├── baseline.py ├── configs │ ├── config_data_split_1_audio+points.yaml │ ├── config_data_split_1_audio.yaml │ ├── config_data_split_1_points.yaml │ ├── config_data_split_2_audio+points.yaml │ ├── config_data_split_2_audio.yaml │ ├── config_data_split_2_points.yaml │ ├── config_data_split_3_audio+points.yaml │ ├── config_data_split_3_audio.yaml │ └── config_data_split_3_points.yaml ├── dataset │ └── AR_object_dataset.py ├── images │ ├── data_15_dataset_82_objects_points_test_confusion_matrix.png │ ├── data_15_dataset_82_objects_points_test_confusion_matrix.xlsx │ ├── data_15_dataset_82_objects_points_valid_confusion_matrix.png │ └── data_15_dataset_82_objects_points_valid_confusion_matrix.xlsx ├── main.py ├── main_sweep.py ├── models │ └── ARNet_object.py ├── refine_material_predication.py ├── show_dataset_statistic.py └── test.py ├── ARNet_shape_reconstruction ├── README.md ├── chamfer.py ├── compute_L1_cd.py ├── compute_baseline.py ├── configs │ ├── config_data_split_1.yaml │ ├── config_data_split_2.yaml │ └── config_data_split_3.yaml ├── dataset │ ├── AR_recon_dataset.py │ ├── __pycache__ │ │ ├── AR_recon_dataset.cpython-38.pyc │ │ ├── data_preparation_real.cpython-38.pyc │ │ └── data_preparation_synthetic.cpython-38.pyc │ ├── 
data_preparation.py │ ├── data_split_1 │ │ ├── real_test_gt.npy │ │ ├── real_test_tapping.npy │ │ ├── real_train_gt.npy │ │ ├── real_train_tapping.npy │ │ ├── real_valid_gt.npy │ │ ├── real_valid_tapping.npy │ │ ├── syn_train_gt.npy │ │ ├── syn_train_tapping.npy │ │ ├── test_object_list.npy │ │ └── valid_object_list.npy │ ├── data_split_2 │ │ ├── real_test_gt.npy │ │ ├── real_test_tapping.npy │ │ ├── real_train_gt.npy │ │ ├── real_train_tapping.npy │ │ ├── real_valid_gt.npy │ │ ├── real_valid_tapping.npy │ │ ├── syn_train_gt.npy │ │ ├── syn_train_tapping.npy │ │ ├── test_object_list.npy │ │ └── valid_object_list.npy │ └── data_split_3 │ │ ├── real_test_gt.npy │ │ ├── real_test_tapping.npy │ │ ├── real_train_gt.npy │ │ ├── real_train_tapping.npy │ │ ├── real_valid_gt.npy │ │ ├── real_valid_tapping.npy │ │ ├── syn_train_gt.npy │ │ ├── syn_train_tapping.npy │ │ ├── test_object_list.npy │ │ └── valid_object_list.npy ├── main.py ├── models │ └── ARnet_recon.py ├── requirements.txt ├── test.py └── visualization │ └── visualization.py ├── Hardware_instruction ├── CAD_model │ ├── Assemble_whole_hand.f3z │ ├── Assembly.png │ ├── LX-224_servo_motor.f3d │ ├── assembly_fingertip.f3z │ ├── assembly_one_finger.f3z │ ├── base.f3d │ ├── base_audio_jack_cover.f3d │ ├── base_cover.f3d │ ├── base_finger.f3d │ ├── base_finger_2.f3d │ ├── fingertip.f3d │ ├── fingertip_arm_1.f3d │ ├── fingertip_arm_2.f3d │ └── fingertip_counterweight_mount.f3d └── hardware_document.pdf ├── LICENSE ├── README.md ├── figures ├── Fig 1. Teaser_figure.jpg ├── Fig 6. Learning_results_version.jpg └── teaser.gif └── supplementary └── Object Re-identification Confusion Matrices.zip /ARNet_material_classification/README.md: -------------------------------------------------------------------------------- 1 | # ARNet_material_classification 2 | 3 | ## Installation 4 | The code has been tested on Ubuntu 20.04 with CUDA 12.0. 5 | ``` 6 | virtualenv -p /usr/bin/python venv_classification 7 | source venv_classification/bin/activate 8 | cd ARNet_material_classification 9 | pip install -r requirements.txt 10 | ``` 11 | 12 | ## Data Preparation 13 | 14 | Download the dataset from the [Google Drive](https://drive.google.com/file/d/121ZZw-_Bd2QLxrFwHMaK20bsRWV-05OF/view?usp=drive_link) and unzip it under this folder. 15 | 16 | ## About Configs 17 | - `exp_name` - used to specify the data split 18 | - `ckpt_path` - used to specify the saved model for testing 19 | - `init_dataset` - if set to True, the program will randomly rebalance the dataset, which creates a different dataset. Set it to False to reproduce the results. 20 | 21 | ## Training 22 | 23 | Run the following command for training: 24 | ``` 25 | CUDA_VISIBLE_DEVICES=6 python main.py data_split_<1-3> 26 | ``` 27 | During training, the training and validation datasets are balanced, so the best model is selected based on the best validation accuracy. 28 | ## Evaluation 29 | Run the following command for testing: 30 | ``` 31 | CUDA_VISIBLE_DEVICES=6 python test.py data_split_<1-3> 32 | ``` 33 | During testing, the test dataset is unbalanced, so the F1 score is used to evaluate performance, and the parameters of the refinement method are searched on the unbalanced validation dataset. 34 | 35 | To test the model trained on the ObjectFolder dataset, run the following command: 36 | ``` 37 | CUDA_VISIBLE_DEVICES=6 python test_object_folder.py object_folder 38 | ``` 39 | The confusion matrices will be saved under the `images` folder, and the numerical results are saved under the `results` folder.
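For a quick sanity check of the saved numerical results, a minimal sketch along these lines can be used (it assumes, as in `test.py`, that `label_prediction_*.npy` stores one row of per-class scores per tap and `label_gt_*.npy` stores the corresponding integer labels; adjust the data split and testing split as needed):
```
import numpy as np

# Minimal inspection sketch; paths follow the results/<data_split>/<split> layout shown in the tree above.
pred = np.load("results/data_split_1/test/label_prediction_test.npy", allow_pickle=True)
gt = np.load("results/data_split_1/test/label_gt_test.npy", allow_pickle=True)

pred_labels = np.asarray(pred).argmax(axis=1)  # per-class scores -> predicted class ids
print("number of taps:", len(gt))
print("overall accuracy (unbalanced test set):", (pred_labels == np.asarray(gt)).mean())
```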
40 | 41 | 42 | -------------------------------------------------------------------------------- /ARNet_material_classification/compute_baseline.py: -------------------------------------------------------------------------------- 1 | import json 2 | import yaml 3 | import torch 4 | import random 5 | import statistics 6 | import numpy as np 7 | import pandas as pd 8 | import seaborn as sns 9 | from munch import munchify 10 | from sewar.full_ref import mse 11 | import matplotlib.pyplot as plt 12 | from torchmetrics.classification import MulticlassConfusionMatrix 13 | 14 | def load_config(filepath): 15 | with open(filepath, 'r') as stream: 16 | try: 17 | trainer_params = yaml.safe_load(stream) 18 | return trainer_params 19 | except yaml.YAMLError as exc: 20 | print(exc) 21 | 22 | def read_simplified_label_from_npy(path): 23 | path, idx = path 24 | if 'objectfolder' in path: 25 | label=1 26 | else: 27 | idx=idx[1]+(idx[0]-1)*4 28 | data = [np.load(f'{path}', allow_pickle=True)[idx]] 29 | label=data[0] 30 | for key in simplified_label_mapping: 31 | if label in simplified_label_mapping[key]: 32 | simplified_label= list(simplified_label_mapping.keys()).index(key) 33 | return np.array(int(simplified_label)) 34 | 35 | def read_audio_from_npy(path): 36 | path = path[0] 37 | audio_data = np.load(f'{path}', allow_pickle=True) 38 | return np.array([audio_data], np.float32) 39 | 40 | def save_confusion_matrix(y_hat,y): 41 | metric = MulticlassConfusionMatrix(num_classes=params.num_label) 42 | cm=metric(y_hat,y) 43 | confusion_matrix_computed = cm.detach().cpu().numpy().astype(int) 44 | uniformed_confusion_matrix=[] 45 | for idx,i in enumerate(confusion_matrix_computed): 46 | uniformed_confusion_matrix.append([val/sum(i) for val in i]) 47 | final_acc_list=[] 48 | for idx in range(len(uniformed_confusion_matrix)): 49 | final_acc_list.append(uniformed_confusion_matrix[idx][idx]) 50 | final_acc=sum(final_acc_list)/len(final_acc_list) 51 | 52 | df_cm = pd.DataFrame(uniformed_confusion_matrix,index=params.label_name,columns=params.label_name) 53 | plt.figure(figsize = (10,8)) 54 | fig_ = sns.heatmap(df_cm, annot=True, cmap='Reds').get_figure() 55 | plt.xlabel('Predicted labels') 56 | plt.ylabel('True lables') 57 | plt.savefig(f'images/{params.exp_name}_{params.testing_split}_baseline_confusion_matrix', dpi=300) 58 | plt.close(fig_) 59 | 60 | 61 | def get_average_f1_score(y_hat,y): 62 | metric = MulticlassConfusionMatrix(num_classes=params.num_label) 63 | cm=metric(y_hat,y) 64 | confusion_matrix_computed = cm.detach().cpu().numpy().astype(int) 65 | precision_list = [] 66 | for idx, row in enumerate(confusion_matrix_computed): 67 | p = row[idx]/sum(row) 68 | precision_list.append(p) 69 | recall_list = [] 70 | for idx in range(params.num_label): 71 | i_column=[] 72 | for row in confusion_matrix_computed: 73 | i_column.append(row[idx]) 74 | r = confusion_matrix_computed[idx][idx]/sum(i_column) 75 | recall_list.append(r) 76 | f1_score_list = [] 77 | for i in range(params.num_label): 78 | f1_score_list.append(statistics.harmonic_mean([precision_list[i],recall_list[i]])) 79 | return np.mean(f1_score_list) 80 | 81 | with open('dataset/material_simple_categories.json') as f: 82 | simplified_label_mapping = json.load(f) 83 | 84 | for split in ['data_split_1', 'data_split_2','data_split_3']: 85 | test_audio_list = np.load(f'data/ARdataset/{split}/original_test_audio_list.npy', allow_pickle = True) 86 | train_audio_list = np.load(f'data/ARdataset/{split}/train_audio_list.npy', allow_pickle = True) 87 | test_label_list = 
np.load(f'data/ARdataset/{split}/original_test_label_list.npy', allow_pickle = True) 88 | train_label_list = np.load(f'data/ARdataset/{split}/train_label_list.npy', allow_pickle = True) 89 | 90 | print(len(test_audio_list)) 91 | print(len(train_audio_list)) 92 | print(len(test_label_list)) 93 | print(len(train_label_list)) 94 | 95 | #random baseline 96 | correct=0 97 | params = load_config(filepath='configs/config.yaml') 98 | params = munchify(params) 99 | current_label_list=params.label_mapping 100 | test_labels=[] 101 | train_labels=[] 102 | for test_label_path in test_label_list: 103 | random_train_label_path = random.choice(train_label_list) 104 | test_label = np.array(current_label_list[int(read_simplified_label_from_npy(test_label_path))]) 105 | train_label = np.array(current_label_list[int(read_simplified_label_from_npy(random_train_label_path))]) 106 | test_labels.append(test_label) 107 | train_labels.append(train_label) 108 | #get confusion matrix 109 | save_confusion_matrix(torch.from_numpy(np.array(train_labels)),torch.from_numpy(np.array(test_labels))) 110 | #get f1 score 111 | average_f1_score = get_average_f1_score(torch.from_numpy(np.array(train_labels)),torch.from_numpy(np.array(test_labels))) 112 | print(split, 'avg f1 score:', average_f1_score) 113 | 114 | #nearest neighbor baseline 115 | correct=0 116 | params = load_config(filepath='configs/config.yaml') 117 | params = munchify(params) 118 | current_label_list=params.label_mapping 119 | test_labels=[] 120 | train_labels=[] 121 | for test_idx, test_audio_path in enumerate(test_audio_list): 122 | test_audio = read_audio_from_npy(test_audio_path) 123 | min_dis=float('inf') 124 | for train_idx, train_audio_path in enumerate(train_audio_list): 125 | train_audio = read_audio_from_npy(train_audio_path) 126 | distance = mse(train_audio,test_audio) 127 | if distance Tensor: 75 | x = self.conv1(x) 76 | x = self.bn1(x) 77 | x = self.relu(x) 78 | x = self.maxpool(x) 79 | x = self.conv2(x) 80 | x = self.bn2(x) 81 | x = self.dropout1(x) 82 | x = self.relu2(x) 83 | x = self.maxpool2(x) 84 | x = self.conv3(x) 85 | x = self.bn3(x) 86 | x = self.relu3(x) 87 | x = torch.flatten(x, 1) 88 | x = self.dropout2(x) 89 | x = self.fc1(x) 90 | x = self.dropout2(x) 91 | x = self.fc2(x) 92 | return x 93 | 94 | def configure_optimizers(self): 95 | optimizer = optim.SGD(self.parameters(), lr=self.learning_rate) 96 | # optimizer = optim.Adam(self.parameters(), lr=self.params.lr,betas=(0.9, 0.999)) 97 | scheduler = Optim.lr_scheduler.StepLR(optimizer, step_size=200, gamma=0.1) 98 | return [optimizer],[scheduler] 99 | 100 | def training_step(self,batch,batch_idx): 101 | image, labels = batch 102 | image = image.cuda() 103 | labels = labels.cuda() 104 | outputs = self.forward(image) 105 | criterion = nn.CrossEntropyLoss() 106 | train_loss = criterion(outputs, labels) 107 | self.train_acc(outputs, labels) 108 | self.log('train_loss', train_loss, on_step=True, on_epoch=True, prog_bar=True, logger=True) 109 | self.log('train_acc', self.train_acc, on_epoch=True,prog_bar=True) 110 | return train_loss 111 | 112 | def validation_step(self,batch,batch_idx): 113 | image, labels = batch 114 | image = image.cuda() 115 | labels = labels.cuda() 116 | outputs = self.forward(image) 117 | criterion = nn.CrossEntropyLoss() 118 | val_loss = criterion(outputs, labels) 119 | self.valid_acc(outputs, labels) 120 | self.log('val_loss', val_loss, on_step=True, on_epoch=True, prog_bar=True, logger=True) 121 | self.log('val_acc', self.valid_acc, on_epoch=True,prog_bar=True) 122 | 
return val_loss 123 | 124 | def test_step(self, batch,batch_idx): 125 | image, labels = batch 126 | image = image.cuda() 127 | labels = labels.cuda() 128 | outputs = self.forward(image) 129 | self.test_acc(outputs, labels) 130 | #confusion matrix 131 | self.test_step_y_hats.append(outputs) 132 | self.test_step_ys.append(labels) 133 | y_hat = torch.cat(self.test_step_y_hats) 134 | y = torch.cat(self.test_step_ys) 135 | metric = MulticlassConfusionMatrix(num_classes=self.num_classes).cuda() 136 | cm=metric(y_hat,y) 137 | confusion_matrix_computed = cm.detach().cpu().numpy().astype(int) 138 | uniformed_confusion_matrix=[] 139 | for idx,i in enumerate(confusion_matrix_computed): 140 | uniformed_confusion_matrix.append([val/sum(i) for val in i]) 141 | final_acc_list=[] 142 | for idx in range(len(uniformed_confusion_matrix)): 143 | final_acc_list.append(uniformed_confusion_matrix[idx][idx]) 144 | final_acc=sum(final_acc_list)/len(final_acc_list) 145 | print('final acc among class = ',final_acc) 146 | df_cm = pd.DataFrame(uniformed_confusion_matrix,index=self.label_name,columns=self.label_name) 147 | plt.figure(figsize = (10,8)) 148 | fig_ = sns.heatmap(df_cm, annot=True, cmap='Reds').get_figure() 149 | plt.xlabel('Predicted labels') 150 | plt.ylabel('True lables') 151 | plt.savefig(f'images/{self.params.exp_name}_{self.params.testing_split}_confusion_matrix', dpi=300) 152 | plt.close(fig_) 153 | self.loggers[0].experiment.add_figure("Test confusion matrix", fig_, self.current_epoch) 154 | #save the evaluation results 155 | label_prediction=[i.detach().cpu().numpy() for i in self.test_step_y_hats] 156 | label_gt=[i.detach().cpu().numpy() for i in self.test_step_ys] 157 | np.save(f'results/{self.params.exp_name}/{self.params.testing_split}/label_prediction_{self.params.testing_split}.npy',label_prediction[0]) 158 | np.save(f'results/{self.params.exp_name}/{self.params.testing_split}/label_gt_{self.params.testing_split}.npy',label_gt[0]) 159 | #compute metric 160 | recall_metric = MulticlassRecall(num_classes=self.num_classes, average='none').cuda() 161 | precision_metric = MulticlassPrecision(num_classes=self.num_classes, average='none').cuda() 162 | F1_score_metric = MulticlassF1Score(num_classes=self.num_classes,average='none').cuda() 163 | F1_score=F1_score_metric(y_hat,y) 164 | F1_score_average_metric = MulticlassF1Score(num_classes=self.num_classes,average='macro').cuda() 165 | F1_score_average=F1_score_average_metric(y_hat,y) 166 | recall = recall_metric(y_hat,y) 167 | precision = precision_metric(y_hat,y) 168 | print('torch recall', recall) 169 | print('torch precision ',precision) 170 | print('torch F1',F1_score) 171 | print('torch F1 average',F1_score_average) 172 | self.log('F1_score_average', F1_score_average,on_step=False, on_epoch=True,prog_bar=True) 173 | 174 | return {'preds' : outputs, 'targets' : labels} 175 | 176 | 177 | 178 | -------------------------------------------------------------------------------- /ARNet_material_classification/refine_material_predication.py: -------------------------------------------------------------------------------- 1 | import os 2 | import math 3 | import copy 4 | import yaml 5 | import torch 6 | import pickle 7 | import natsort 8 | import statistics 9 | import numpy as np 10 | import pandas as pd 11 | import open3d as o3d 12 | import seaborn as sns 13 | from munch import munchify 14 | import matplotlib.pyplot as plt 15 | from torchmetrics.classification import MulticlassConfusionMatrix 16 | 17 | class get_overall_accuracy(object): 18 | def 
__init__(self,params,config_path): 19 | self.num_class = params.num_label 20 | self.params = params 21 | self.config_path = config_path 22 | 23 | def calculate_accuracy(self,split,k_neighbor,num_loop,mim_occurence): 24 | filepath = self.config_path 25 | with open(filepath, 'r') as stream: 26 | params = yaml.safe_load(stream) 27 | params = munchify(params) 28 | correct_label_list = params.label_mapping 29 | d_swap = {v: k for k, v in correct_label_list.items()} 30 | correct_label_list = d_swap 31 | npy_files = natsort.natsorted((np.load(f'data/ARdataset/{params.exp_name}/{split}_object_list.npy', allow_pickle=True))) 32 | file_name = np.load(f'data/ARdataset/{params.exp_name}/original_{split}_label_list.npy', allow_pickle=True) 33 | data = np.load(f'results/{params.exp_name}/{split}/label_prediction_{split}.npy', allow_pickle=True) 34 | gtdata = np.load(f'results/{params.exp_name}/{split}/label_gt_{split}.npy', allow_pickle=True) 35 | list_accuracy = [] 36 | correct_prediction = [] 37 | y_hat=[] 38 | y=[] 39 | num_correct_prediction = [0 for _ in range(len(params.label_name))] 40 | total_num_prediction = [0 for _ in range(len(params.label_name))] 41 | for file in npy_files: 42 | object_name = file.replace('.npy', '') 43 | raw_contact_points = self.read_raw_contact_from_txt( 44 | os.path.join('data/ARdataset/contact_position', file.replace('.npy', '.txt'))) 45 | object_name_prediction = [] 46 | label_prediction = [] 47 | # not all of the contact point have valid audio data included in training, so here we extract the contact points with valid material prediction 48 | contact_points_idx_prediction = [] 49 | ground_truth_label = [] 50 | for idx, i in enumerate(file_name): 51 | # print( object_name) 52 | if object_name in i[0]: 53 | object_name_prediction.append(i) 54 | contact_points_idx_prediction.append((i[1][0] - 1) * 4 + i[1][1]) 55 | label_prediction.append( 56 | correct_label_list[int(np.where(data[idx] == np.max(data[idx]))[0])]) 57 | ground_truth_label.append(correct_label_list[int(gtdata[idx])]) 58 | predicted_contact_points = [] 59 | for idx, value in enumerate(contact_points_idx_prediction): 60 | predicted_contact_points.append(raw_contact_points[value]) 61 | 62 | # get statistic of occurence of labels and filter out labels with low occurance 63 | label_prediction,maxlabel = self.filter_out_labels_with_low_occurence(k_neighbor,mim_occurence, label_prediction,predicted_contact_points) 64 | for i in range(num_loop): 65 | voted_label_list = [] 66 | for idx, value in enumerate(predicted_contact_points): 67 | voted_lable = self.get_labels_of_k_neighbor_points(k_neighbor, value,maxlabel, label_prediction,predicted_contact_points) 68 | voted_label_list.append(voted_lable) 69 | label_prediction = voted_label_list.copy() 70 | 71 | #save confusion matrix 72 | for idx in range(len(label_prediction)): 73 | y_hat.append(params.label_mapping[int(label_prediction[idx])]) 74 | y.append(params.label_mapping[int(ground_truth_label[idx])]) 75 | 76 | # check number of correct prediction 77 | for idx, value in enumerate(label_prediction): 78 | pred_index = params.label_mapping[value] 79 | gt_index = params.label_mapping[ground_truth_label[idx]] 80 | if value == ground_truth_label[idx]: 81 | num_correct_prediction[pred_index] += 1 82 | total_num_prediction[gt_index]+=1 83 | 84 | #save confusion matrix 85 | if split == 'test': 86 | self.save_confusion_matrix(torch.from_numpy(np.array(y_hat)),torch.from_numpy(np.array(y))) 87 | 88 | #get f1 score 89 | average_f1_score = 
self.get_average_f1_score(torch.from_numpy(np.array(y_hat)),torch.from_numpy(np.array(y))) 90 | acc=[] 91 | for idx, i in enumerate(num_correct_prediction): 92 | acc.append(num_correct_prediction[idx]/total_num_prediction[idx]) 93 | # calculate overall prediction accuracy 94 | accuracy_balanced = sum(acc) / len(acc) 95 | accuracy = sum(num_correct_prediction) / sum(total_num_prediction) 96 | list_accuracy.append(accuracy) 97 | return accuracy,average_f1_score 98 | def save_confusion_matrix(self,y_hat,y): 99 | metric = MulticlassConfusionMatrix(num_classes=self.num_class) 100 | cm=metric(y_hat,y) 101 | confusion_matrix_computed = cm.detach().cpu().numpy().astype(int) 102 | uniformed_confusion_matrix=[] 103 | for idx,i in enumerate(confusion_matrix_computed): 104 | uniformed_confusion_matrix.append([val/sum(i) for val in i]) 105 | final_acc_list=[] 106 | for idx in range(len(uniformed_confusion_matrix)): 107 | final_acc_list.append(uniformed_confusion_matrix[idx][idx]) 108 | df_cm = pd.DataFrame(uniformed_confusion_matrix,index=self.params.label_name,columns=self.params.label_name) 109 | plt.figure(figsize = (10,8)) 110 | fig_ = sns.heatmap(df_cm, annot=True, cmap='Reds').get_figure() 111 | plt.xlabel('Predicted labels') 112 | plt.ylabel('True lables') 113 | plt.savefig(f'images/{self.params.exp_name}_{self.params.testing_split}_refine_confusion_matrix', dpi=300) 114 | plt.close(fig_) 115 | 116 | def get_average_f1_score(self,y_hat,y): 117 | metric = MulticlassConfusionMatrix(num_classes=self.num_class) 118 | cm=metric(y_hat,y) 119 | confusion_matrix_computed = cm.detach().cpu().numpy().astype(int) 120 | precision_list = [] 121 | for idx, row in enumerate(confusion_matrix_computed): 122 | p = row[idx]/sum(row) 123 | precision_list.append(p) 124 | recall_list = [] 125 | for idx in range(self.num_class): 126 | i_column=[] 127 | for row in confusion_matrix_computed: 128 | i_column.append(row[idx]) 129 | r = confusion_matrix_computed[idx][idx]/sum(i_column) 130 | recall_list.append(r) 131 | f1_score_list = [] 132 | for i in range(self.num_class): 133 | f1_score_list.append(statistics.harmonic_mean([precision_list[i],recall_list[i]])) 134 | return np.mean(f1_score_list) 135 | 136 | def read_raw_contact_from_txt(self,path): 137 | # extract [x,y,z] data from {contact_points}.txt file 138 | data = [] 139 | with open(f'{path}', "r") as f: 140 | for line in f: 141 | ls = line.strip().split() 142 | ls = [float(i) for i in ls] 143 | data.append(ls) 144 | points = [] 145 | for i in range(len(data)): 146 | points.append(data[i][0:3]) 147 | #turn unit into meter and apply transformation for aligning contact points with 3D point cloud model 148 | pcd = o3d.geometry.PointCloud() 149 | pcd.points = o3d.utility.Vector3dVector(np.array(points, dtype=float)) 150 | pcd.scale(0.01, [0, 0, 0]) 151 | transform_path = 'data/ARdataset/transformation_matrix/tapping_position_transformation_matrix.pkl' 152 | with open(transform_path, 'rb') as f: 153 | T = pickle.load(f)[0] 154 | pcd_t = copy.deepcopy(pcd).transform(T) 155 | transform_path = 'data/ARdataset/transformation_matrix/tapping_position_transformation_matrix_1.pkl' 156 | with open(transform_path, 'rb') as f: 157 | T = pickle.load(f)[0] 158 | pcd_t = copy.deepcopy(pcd_t).transform(T) 159 | return np.asarray(pcd_t.points) 160 | 161 | def get_labels_of_k_neighbor_points(self,k, value, maxlabel,label_list,predicted_contact_points): 162 | neighbors_labels = [] 163 | neighbors_points = [] 164 | contact_points_list = predicted_contact_points.copy() 165 | 
contact_points_list = [list(i) for i in contact_points_list] 166 | for i in range(k): 167 | min = float('inf') 168 | for idx, points in enumerate(contact_points_list): 169 | #compute Euclidean Distance of two points 170 | distance = math.dist(value, points) 171 | if distance < min: 172 | min = distance 173 | key = idx 174 | closest_point = points 175 | neighbors_points.append(contact_points_list[key]) 176 | contact_points_list.remove(closest_point) 177 | contact_points_list = predicted_contact_points.copy() 178 | contact_points_list = [list(i) for i in contact_points_list] 179 | for idx, points in enumerate(contact_points_list): 180 | for value in neighbors_points: 181 | if list(value) == list(points): 182 | neighbors_labels.append(label_list[idx]) 183 | if len(statistics.multimode(neighbors_labels)) != 1: 184 | voted_label = maxlabel 185 | else: 186 | voted_label = statistics.multimode(neighbors_labels)[0] 187 | return voted_label 188 | 189 | def filter_out_labels_with_low_occurence(self,k_neighbor,mim_occurence,list_of_label,predicted_contact_points): 190 | labels = np.unique(list_of_label) 191 | labels_statistic=[] 192 | for i in labels: 193 | labels_statistic.append(list_of_label.count(i)) 194 | maxidx=labels_statistic.index(max(labels_statistic)) 195 | maxlabel=labels[maxidx] 196 | for i in range(len(labels_statistic)): 197 | if labels_statistic[i] 23 | ``` 24 | 25 | ## Evaluation 26 | Choose your ckpt file and change it in test_ckpt_path in config.yaml, then run the following command: 27 | ``` 28 | CUDA_VISIBLE_DEVICES=6 python test.py 29 | ``` 30 | * choice of datasplit : data_split_1/data_split_2/data_split_3 31 | * choice of input type : audio+points/audio/points 32 | The confusion matrices and numerical results are saved under `images` folder. 
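For a quick look at the exported spreadsheets mentioned above, a small sketch like the following may help (the sheet layout is an assumption here: a square object-by-object matrix with object names in the first column; adjust `index_col` and the filename to match the actual export):
```
import pandas as pd

# Hypothetical inspection of one exported confusion matrix under images/;
# assumes a square matrix with object names in the first column.
cm = pd.read_excel(
    "images/data_15_dataset_82_objects_points_test_confusion_matrix.xlsx",
    index_col=0,
)
per_class_acc = cm.to_numpy().diagonal() / cm.sum(axis=1).to_numpy()
print("mean per-class accuracy:", per_class_acc.mean())
```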
-------------------------------------------------------------------------------- /ARNet_object_classification/baseline.py: -------------------------------------------------------------------------------- 1 | import os 2 | import math 3 | import copy 4 | import yaml 5 | import pickle 6 | import random 7 | import statistics 8 | import numpy as np 9 | import pandas as pd 10 | import seaborn as sns 11 | from munch import munchify 12 | from natsort import natsorted 13 | from sewar.full_ref import mse 14 | import matplotlib.pyplot as plt 15 | from torchmetrics.classification import MulticlassConfusionMatrix 16 | 17 | def load_config(filepath): 18 | with open(filepath, 'r') as stream: 19 | try: 20 | trainer_params = yaml.safe_load(stream) 21 | return trainer_params 22 | except yaml.YAMLError as exc: 23 | print(exc) 24 | params = load_config(filepath='configs/config.yaml') 25 | params = munchify(params) 26 | def get_data_list(split): 27 | contact_points_paths = [] 28 | audio_paths = [] 29 | label = [] 30 | for file_name in natsorted(os.listdir(f'./data/contact_points/{split}')): 31 | contact_points_paths.append(f'./data/contact_points/{split}/{file_name}') 32 | for label_name in params.label_mapping: 33 | if f'_{label_name}.npy' in file_name: 34 | label.append(params.label_mapping[label_name]) 35 | for path in contact_points_paths: 36 | path = path.replace('contact_points','audio') 37 | audio_paths.append(path) 38 | return contact_points_paths, audio_paths, label 39 | 40 | def read_audio_from_npy(path): 41 | # path = path[0] 42 | audio_data = np.load(f'{path}', allow_pickle=True) 43 | return np.array([audio_data], np.float32) 44 | 45 | def save_confusion_matrix(y_hat,y): 46 | metric = MulticlassConfusionMatrix(num_classes=params.num_label) 47 | cm=metric(y_hat,y) 48 | confusion_matrix_computed = cm.detach().cpu().numpy().astype(int) 49 | uniformed_confusion_matrix=[] 50 | for idx,i in enumerate(confusion_matrix_computed): 51 | uniformed_confusion_matrix.append([val/sum(i) for val in i]) 52 | final_acc_list=[] 53 | for idx in range(len(uniformed_confusion_matrix)): 54 | final_acc_list.append(uniformed_confusion_matrix[idx][idx]) 55 | final_acc=sum(final_acc_list)/len(final_acc_list) 56 | # print('final acc = ',final_acc) 57 | df_cm = pd.DataFrame(uniformed_confusion_matrix,index=params.label_name,columns=params.label_name) 58 | plt.figure(figsize = (10,8)) 59 | fig_ = sns.heatmap(df_cm, annot=True, cmap='Reds').get_figure() 60 | plt.xlabel('Predicted labels') 61 | plt.ylabel('True lables') 62 | plt.savefig(f'images/{params.exp_name}_{params.testing_split}_baseline_confusion_matrix', dpi=300) 63 | plt.close(fig_) 64 | 65 | def get_average_f1_score(y_hat,y): 66 | metric = MulticlassConfusionMatrix(num_classes=params.num_label) 67 | cm=metric(y_hat,y) 68 | confusion_matrix_computed = cm.detach().cpu().numpy().astype(int) 69 | precision_list = [] 70 | for idx, row in enumerate(confusion_matrix_computed): 71 | p = row[idx]/sum(row) 72 | precision_list.append(p) 73 | recall_list = [] 74 | for idx in range(params.num_label): 75 | i_column=[] 76 | for row in confusion_matrix_computed: 77 | i_column.append(row[idx]) 78 | r = confusion_matrix_computed[idx][idx]/sum(i_column) 79 | recall_list.append(r) 80 | f1_score_list = [] 81 | for i in range(params.num_label): 82 | f1_score_list.append(statistics.harmonic_mean([precision_list[i],recall_list[i]])) 83 | return np.mean(f1_score_list) 84 | 85 | #Radom baseline 86 | _,_,train_label_list = get_data_list('train') 87 | _,_,test_label_list = get_data_list('test') 
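# Random baseline: for every test sample, predict the label of a uniformly sampled
# training sample, so the reported accuracy only reflects the label distribution.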
88 | correct = 0 89 | for i in test_label_list: 90 | prediction = random.choice(train_label_list) 91 | if i == prediction: 92 | correct+=1 93 | print('random baseline accuracy',correct/len(test_label_list)) 94 | 95 | #Nearest neighbor baseline 96 | train_contact_points,train_audio_path,train_label_list = get_data_list('train') 97 | test_contact_points,test_audio_path,test_label_list = get_data_list('test') 98 | print(len(train_contact_points),len(train_audio_path),len(train_label_list)) 99 | print(len(test_contact_points),len(test_audio_path),len(test_label_list)) 100 | correct = 0 101 | print(len(test_label_list)/4) 102 | for test_idx in range(int(len(test_label_list)/2),int(3*len(test_label_list)/4)): 103 | test_audio = read_audio_from_npy(test_audio_path[test_idx]) 104 | test_point_cloud = np.load(test_contact_points[test_idx], allow_pickle = True) 105 | min_loss=float('inf') 106 | train_index = 0 107 | print(test_idx) 108 | for train_idx in range(len(train_label_list)): 109 | train_audio = read_audio_from_npy(train_audio_path[train_idx]) 110 | train_point_cloud = np.load(train_contact_points[train_idx], allow_pickle = True) 111 | audio_loss = mse(train_audio,test_audio) 112 | distanct_1=[] 113 | for i in test_point_cloud: 114 | min_dis=float('inf') 115 | for j in train_point_cloud: 116 | distance = distance=math.dist(i,j) 117 | if distance 28 | ``` 29 | 30 | ## Evaluation 31 | Run the following command for evaluation of the trained models: 32 | ``` 33 | CUDA_VISIBLE_DEVICES=6 python test.py data_split_<1-3> 34 | ``` 35 | The visual reconstruction results are saved under `image` folder.The numerical reconstruction results will be saved in .npy file under `reconstruction_results` folder. 36 | -------------------------------------------------------------------------------- /ARNet_shape_reconstruction/compute_L1_cd.py: -------------------------------------------------------------------------------- 1 | import os 2 | import math 3 | import numpy as np 4 | 5 | split='data_split_1' 6 | loss_list=[] 7 | file_list=[] 8 | for file_name in os.listdir(f'reconstruction_results/{split}'): 9 | reconstruction_results=np.load(f'reconstruction_results/{split}/{file_name}',allow_pickle=True)[0] 10 | gt=np.load(f'data/ARDataset/ground_truth_5000_correct_unit/{file_name}',allow_pickle=True) 11 | distanct_1=[] 12 | for i in reconstruction_results: 13 | min_dis=float('inf') 14 | for j in gt: 15 | distance = distance=math.dist(i,j) 16 | if distance 5 | Duke University 6 |
7 | 8 | ### [Project Website](http://generalroboticslab.com/SonicSense) | [Video](https://www.youtube.com/watch?v=MvSYdLMsvx4) | [Paper](https://arxiv.org/abs/2406.17932) 9 | 10 | ## Overview 11 | We introduce SonicSense, a holistic design of hardware and software to enable rich robot object perception through in-hand acoustic vibration sensing. While previous studies have shown promising results with acoustic sensing for object perception, current solutions are constrained to a handful of objects with simple geometries and homogeneous materials, single-finger sensing, and mixing training and testing on the same objects. SonicSense enables container inventory status differentiation, heterogeneous material prediction, 3D shape reconstruction, and object re-identification from a diverse set of 83 real-world objects. Our system employs a simple but effective heuristic exploration policy to interact with the objects as well as end-to-end learning-based algorithms to fuse vibration signals to infer object properties. Our framework underscores the significance of in-hand acoustic vibration sensing in advancing robot tactile perception. 12 | 13 |

14 | ![teaser](figures/teaser.gif)

16 | 17 | ## Code Structure 18 | 19 | We provide detailed instructions on running our code for [material classification](https://github.com/generalroboticslab/SonicSense/tree/main/ARNet_material_classification), [shape reconstruction](https://github.com/generalroboticslab/SonicSense/tree/main/ARNet_shape_reconstruction) and [object re-identification](https://github.com/generalroboticslab/SonicSense/tree/main/ARNet_object_classification) under each subdirectory. Please refer to the README file under each directory. 20 | 21 | The full CAD models and instructions for our hardware design are under the [Hardware_instruction](https://github.com/generalroboticslab/SonicSense/tree/main/Hardware_instruction) subdirectory. 22 | 23 | ## Citation 24 | 25 | If you find our paper or codebase helpful, please consider citing: 26 | 27 | ``` 28 | @inproceedings{ 29 | liu2024sonicsense, 30 | title={SonicSense: Object Perception from In-Hand Acoustic Vibration}, 31 | author={Jiaxun Liu and Boyuan Chen}, 32 | booktitle={8th Annual Conference on Robot Learning}, 33 | year={2024}, 34 | url={https://openreview.net/forum?id=CpXiqz6qf4} 35 | } 36 | ``` 37 | 38 | ## License 39 | 40 | This repository is released under the Apache License 2.0. See [LICENSE](LICENSE) for additional details. 41 | 42 | ## Acknowledgement 43 | [Point Cloud Renderer](https://github.com/zekunhao1995/PointFlowRenderer), [PyLX-16A](https://github.com/ethanlipson/PyLX-16A) 44 | 45 | 46 | `This work is supported by the ARL STRONG program under awards W911NF2320182 and W911NF2220113, by the DARPA FoundSci program under award HR00112490372, and the DARPA TIAMAT program under award HR00112490419.` 47 | -------------------------------------------------------------------------------- /figures/Fig 1. Teaser_figure.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/generalroboticslab/SonicSense/816be5a79149d0664fe4139ca93719bf6652be06/figures/Fig 1. Teaser_figure.jpg -------------------------------------------------------------------------------- /figures/Fig 6. Learning_results_version.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/generalroboticslab/SonicSense/816be5a79149d0664fe4139ca93719bf6652be06/figures/Fig 6. Learning_results_version.jpg -------------------------------------------------------------------------------- /figures/teaser.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/generalroboticslab/SonicSense/816be5a79149d0664fe4139ca93719bf6652be06/figures/teaser.gif -------------------------------------------------------------------------------- /supplementary/Object Re-identification Confusion Matrices.zip: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/generalroboticslab/SonicSense/816be5a79149d0664fe4139ca93719bf6652be06/supplementary/Object Re-identification Confusion Matrices.zip --------------------------------------------------------------------------------