├── .DS_Store ├── README.MD ├── img ├── filtering_comparison.png ├── filtration_barcodes_chazal_funda.jpg ├── final_pd_0.png ├── final_pd_1.png ├── homology_classification.jpg ├── pd_0_clean.png └── pd_0_naive.png ├── py ├── pd_segmentation_0.py ├── pd_segmentation_0_main.py ├── pd_segmentation_1.py ├── pd_segmentation_1_main.py ├── segment_with.py └── tree.py └── rapport_biblio-pages-15-21.pdf /.DS_Store: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/salimandre/unsupervised-image-segmentation-persistent-homology/9de7e14e30e0d8dbe970354454ea535285e91d79/.DS_Store -------------------------------------------------------------------------------- /README.MD: -------------------------------------------------------------------------------- 1 | # Unsupervised image segmentation using persistent homology theory 2 | 3 | ## Topological Data Analysis 4 | 5 | In the early 20th century, algebraic topology provided, thanks to Poincaré, a general framework for classifying shapes. Indeed the **Euler characteristic**, equal to the alternating sum of the Betti numbers, is a **topological invariant**. Roughly, these numbers count the number of distinct objects in the domain, the number of holes they contain, the number of voids they enclose, and so on. 6 | 7 |
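For example, a (hollow) torus has Betti numbers β0 = 1 (one connected component), β1 = 2 (two independent loops) and β2 = 1 (one enclosed void), so its Euler characteristic is χ = 1 − 2 + 1 = 0, while a sphere has β0 = 1, β1 = 0, β2 = 1 and hence χ = 2.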

8 | 9 |

10 | 11 | **Topological Data Analysis** (TDA) is the field that applies these theoretical tools to data analysis. These characteristics cannot be used directly, however, because of the uncertainty in the data and the sensitivity of Betti numbers to even minor outliers in the data set. To tackle this issue, the main tool TDA relies on is **persistent homology**, in which the invariants take the form of **persistence diagrams**, also called **barcodes**. Topological invariants then quantify the stability of geometric features with respect to degradations such as noise or artefacts. 12 | 13 |
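As a toy illustration of these notions, independent of the repository code (assuming only that NumPy, Matplotlib and the Gudhi library are available), one can compute the persistence diagram of a noisy circle: the circle yields one long-lived 0-dimensional class (a single connected component) and one long-lived 1-dimensional class (the loop), while the noise only produces short-lived classes close to the diagonal.

```python
import numpy as np
import matplotlib.pyplot as plt
import gudhi as gd

# 200 points sampled on a unit circle, perturbed by a little Gaussian noise
theta = np.random.uniform(0.0, 2.0 * np.pi, 200)
points = np.c_[np.cos(theta), np.sin(theta)] + 0.05 * np.random.randn(200, 2)

# Rips-Vietoris filtration up to a maximum edge length, and its persistence diagram
rips = gd.RipsComplex(points=points, max_edge_length=2.0)
simplex_tree = rips.create_simplex_tree(max_dimension=2)
diagram = simplex_tree.persistence()  # list of (dimension, (birth, death)) pairs

gd.plot_persistence_diagram(diagram)
plt.show()
```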

14 | 15 |

16 |

17 | Credits @ Frédéric Chazal, Bertrand Michel 18 |

19 | 20 | ## Our Method 21 | 22 | We used the TDA framework to perform unsupervised image segmentation, working on the set of images provided by the Python skimage library and using the Python library Gudhi (@ INRIA) to produce simplicial complexes and persistence diagrams. 23 | 24 | The main **procedure** was as follows (a simplified code sketch of its 0-dimensional part is given in the Results section below): 25 | 26 | + Apply a small **Gaussian blur** to the image to remove isolated pixels (outliers) 27 | + Take a random sample of points from the image, called **superpixels**. We obtain a 3D point cloud if the image is grayscale and a 5D point cloud if the image is in color. 28 | + Compute the **nested Rips-Vietoris complexes** from those points 29 | + Compute the nested RV complexes for radius ε from 0 to infinity 30 | + Set the value of each edge to the distance between its two vertices; the value of each vertex is set to 0. 31 | + Set the value of each higher-dimensional simplex to the maximum value of its edges (a method called the age filter) 32 | + Compute the **persistence pairs** for homology groups of dimensions 0 and 1 and of dimensions 1 and 2. 33 | + For dimensions 0 and 1 these are pairs (c, e) where e is an edge that kills a connected component c (represented by its first vertex). 34 | + For dimensions 1 and 2 these are pairs (c, t) where c is the edge whose insertion gives birth to a 1-loop and t is the triangle that kills it. 35 | + We compute the graph formed by the edges of all persistence pairs of dimensions 0-1. This is equivalent to computing the **covering tree** of the 1-skeleton of our RV complex, that is to say the covering tree of minimum value (minimum spanning tree) over our point cloud. 36 | + In order to compute the most persistent connected components and loops we then apply different procedures: 37 | + For connected components, **removing** n − 1 edges from this covering tree in decreasing order of value gives us the n most persistent connected components. 38 | + For cycles, we **add** the edges which give birth to the most persistent cycles of the filtration, then find the corresponding loops with a **traversal algorithm**. 39 | + The most persistent connected components and the most persistent cycles each give a segmentation of our images. 40 | 41 | *Please note that in practice, to compute the most relevant and persistent 0-homology groups, we followed two more steps that we do not develop here:* 42 | + to select the most relevant connected components we actually use a **tree** where each split represents a persistence pair, and we use the **Gini criterion** to keep the most relevant ones 43 | + we use a **sampling method** based on the empirical distribution of superpixel labels to infer labels for every pixel of the image 44 | 45 | ## Our Results 46 | 47 | **Note:** since the computation was heavy for our computer, we used only 5000 to 10000 superpixels, that is, 2% to 4% of all pixels. It is interesting that we still managed to get decent results, while TDA researchers actually use all pixels in their computations. 48 | 49 | Here is one example of image segmentation produced using the **naive procedure**. It produced the 250 most persistent 0th homology groups, but they are irrelevant: these are isolated pixels. 50 | 51 |
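For reference, here is a minimal, self-contained sketch of the core 0-dimensional computation described in the Method section (covering tree built from the 0-1 persistence pairs, then cutting the longest edges). It assumes Gudhi, NetworkX and NumPy are installed; the function name `persistent_components` is introduced here for illustration only, and the actual implementation, with the tree / Gini filtering and the sampling step, is in `py/pd_segmentation_0.py`.

```python
import numpy as np
import networkx as nx
import gudhi as gd

def persistent_components(points, max_edge_length, n_segments):
    # Rips-Vietoris filtration restricted to its 1-skeleton (enough for dimension 0)
    rips = gd.RipsComplex(points=points, max_edge_length=max_edge_length)
    simplex_tree = rips.create_simplex_tree(max_dimension=1)
    simplex_tree.persistence()  # must be computed before asking for persistence pairs
    # 0-1 persistence pairs: (vertex creating a component, edge killing it);
    # essential classes, whose death simplex is empty, are skipped
    edges = [tuple(p[1]) for p in simplex_tree.persistence_pairs()
             if len(p[0]) == 1 and len(p[1]) == 2]
    # These edges form the covering tree of minimum value over the point cloud
    tree = nx.Graph()
    tree.add_nodes_from(range(len(points)))
    tree.add_edges_from(edges)
    # Cutting the (n_segments - 1) edges of largest filtration value (edge length)
    # leaves the n_segments most persistent connected components
    def edge_length(e):
        return float(np.linalg.norm(np.asarray(points[e[0]]) - np.asarray(points[e[1]])))
    for e in sorted(tree.edges(), key=edge_length, reverse=True)[:n_segments - 1]:
        tree.remove_edge(*e)
    return [sorted(c) for c in nx.connected_components(tree)]
```

Called on the superpixel cloud (pixel coordinates plus intensities), this amounts to single-linkage clustering on the covering tree and returns the index sets of the n most persistent connected components.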

52 | 53 |

54 | 55 | Here we removed the isolated pixels. We then applied our **sampling method** to recover labels for them as well. We obtain a more parsimonious segmentation with 21 segments, but they are not homogeneous, as a large part of the image remains a single segment. 56 | 57 |
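To illustrate this label-recovery step, here is a simplified sketch; it is not the exact code from `py/pd_segmentation_0.py` (which searches growing neighbourhoods around each superpixel), but it shows the idea: each unlabeled cell draws a label from its already labeled 4-neighbours, weighted by an inverse power of the segment sizes so that large segments do not absorb everything.

```python
import numpy as np

def sample_missing_labels(labels, rng=None):
    # labels: 2D integer array, 0 = unlabeled cell, values > 0 = segment labels
    rng = rng or np.random.default_rng()
    sizes = np.bincount(labels.ravel())  # empirical size of each segment
    out = labels.copy()
    for i, j in zip(*np.where(labels == 0)):
        neighbours = []
        for di, dj in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
            if 0 <= i + di < labels.shape[0] and 0 <= j + dj < labels.shape[1]:
                label = labels[i + di, j + dj]
                if label > 0:
                    neighbours.append(label)
        if neighbours:  # cells with no labeled neighbour stay at 0 in this one-pass sketch
            # inverse-size weighting (exponent -0.12, as in pd_segmentation_0.py)
            weights = np.array([float(sizes[l]) ** -0.12 for l in neighbours])
            out[i, j] = rng.choice(neighbours, p=weights / weights.sum())
    return out
```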

58 | 59 |

60 | 61 | Here are our **final results** for 0th homology groups, where we applied the **full procedure**. We used the same parameters for all images. The results are parsimonious, as we get between 10 and 30 segments, and they are more homogeneous. 62 | 63 | 64 | 65 | Here are our final results for 1st homology groups, where we applied the full procedure. We noticed that most often only the most persistent cycle is relevant to the image. The main drawback with 1st homology groups is that a relevant cycle in the 3D or 5D RV complex may become irrelevant when projected into 2D. 66 | 67 | 68 | 69 | **To conclude**, we found that topological persistence of 0-dimensional elements is an effective and robust (no parameter tuning needed) method for image segmentation, and more generally for unsupervised data processing. It is also interesting to see that this method is close to Felzenszwalb's algorithm; it provides a theoretical framework which explains why that algorithm is so powerful. On the contrary, we found that topological persistence of 1-dimensional elements is not useful in the case of images, due to this projection issue. 70 | -------------------------------------------------------------------------------- /img/filtering_comparison.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/salimandre/unsupervised-image-segmentation-persistent-homology/9de7e14e30e0d8dbe970354454ea535285e91d79/img/filtering_comparison.png -------------------------------------------------------------------------------- /img/filtration_barcodes_chazal_funda.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/salimandre/unsupervised-image-segmentation-persistent-homology/9de7e14e30e0d8dbe970354454ea535285e91d79/img/filtration_barcodes_chazal_funda.jpg -------------------------------------------------------------------------------- /img/final_pd_0.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/salimandre/unsupervised-image-segmentation-persistent-homology/9de7e14e30e0d8dbe970354454ea535285e91d79/img/final_pd_0.png -------------------------------------------------------------------------------- /img/final_pd_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/salimandre/unsupervised-image-segmentation-persistent-homology/9de7e14e30e0d8dbe970354454ea535285e91d79/img/final_pd_1.png -------------------------------------------------------------------------------- /img/homology_classification.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/salimandre/unsupervised-image-segmentation-persistent-homology/9de7e14e30e0d8dbe970354454ea535285e91d79/img/homology_classification.jpg -------------------------------------------------------------------------------- /img/pd_0_clean.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/salimandre/unsupervised-image-segmentation-persistent-homology/9de7e14e30e0d8dbe970354454ea535285e91d79/img/pd_0_clean.png -------------------------------------------------------------------------------- /img/pd_0_naive.png: --------------------------------------------------------------------------------
https://raw.githubusercontent.com/salimandre/unsupervised-image-segmentation-persistent-homology/9de7e14e30e0d8dbe970354454ea535285e91d79/img/pd_0_naive.png -------------------------------------------------------------------------------- /py/pd_segmentation_0.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import matplotlib.pyplot as plt 3 | import sys 4 | # python gudhi library path 5 | sys.path.append('/Users/Salim_Andre/Desktop/GUDHI_2.3.0/build/cython'); 6 | import gudhi as gd 7 | from scipy.ndimage import gaussian_filter 8 | import networkx as nx 9 | from sklearn.neighbors.kde import KernelDensity 10 | import time 11 | 12 | def pd_segmentation_0(mode_, n_superpixels_,img_, RV_epsilon_=30, gauss_sigma_=0.5, list_events_=[350], n_pxl_min_=10, entropy_thresh_=0.05, density_excl_=0.05, plot_pd_=False): 13 | # required: Networkx as nx, Gudhi as gd, from sklearn.neighbors.kde import KernelDensity 14 | # n_superpixels_: wanted number of superpixels, algo will take closest N_suppxl s.t. N_suppxl * square_size = N_pixels 15 | # RV_epsilon_ : maximum edge length of the Rips-Vietoris complex 16 | # sigma_ : 0 < param <= 1 coeff for gaussian blur the bigger the more blur 17 | # dim_ : order of homology, also dimension of the R-V complex (=0) 18 | # density_excl_ : percent of points to be exclude using a gaussian kernel density filter 19 | # n_pxl_min_ : minimum number of pixels per segments 20 | # list_events_ : sequence of cuts in the covering tree of minimum value e.g. [n1, n2, n3] will produce n1 cuts, then n2 - n1 cuts, then n3 - n2 cuts. where n12: 27 | n_col_chan = 3; 28 | else: 29 | n_col_chan=1; 30 | 31 | n_pixels = np.prod(img_.shape[:2]); 32 | list_squares=[4, 9, 16, 25, 36, 49, 64] 33 | list_pos=[(1,0), (1,1), (2,1), (2,2), (3,2), (3,3), (4,3)]; 34 | list_steps=[2, 3, 4, 5, 6, 7, 8]; 35 | 36 | if mode_=='standard': 37 | gauss_sigma_ = 0.5; 38 | n_pxl_min_= 15; 39 | entropy_thresh_= .15; 40 | density_excl_= 0.; 41 | plot_pd_=False; 42 | list_squares=[4, 9, 16, 25, 36, 49, 64]#[25, 36, 49, 64]; 43 | list_pos=[(1,0), (1,1), (2,1), (2,2), (3,2), (3,3), (4,3)]#[(2,2), (3,2), (3,3), (4,3)]; 44 | list_steps=[2, 3, 4, 5, 6, 7, 8]#[5, 6, 7, 8]; 45 | 46 | i_step = np.argmin([np.abs(n_pixels/s - n_superpixels_) for s in list_squares]); 47 | step = list_steps[i_step]; 48 | 49 | step_up_j=list_pos[i_step][0] 50 | step_down_j=list_pos[i_step][1] 51 | 52 | step_up_i=list_pos[i_step][1] 53 | step_down_i=list_pos[i_step][0] 54 | 55 | dh = int(np.ceil(height/step))*step - height; 56 | dw = int(np.ceil(width/step))*step - width; 57 | 58 | if dh>0: 59 | dhI=img_[-dh:,:]; 60 | img_=np.concatenate((img_, dhI[::-1,:]), axis=0); 61 | if dw>0: 62 | dwI=img_[:,-dw:]; 63 | img_=np.concatenate((img_, dwI[:,::-1]), axis=1); 64 | 65 | grid_y, grid_x = np.mgrid[:img_.shape[0], :img_.shape[1]]; 66 | means_y = grid_y[list_pos[i_step][0]::step, list_pos[i_step][1]::step]; 67 | means_x = grid_x[list_pos[i_step][0]::step, list_pos[i_step][1]::step]; 68 | 69 | if gauss_sigma_ > 0: 70 | # gaussian blur 71 | Iblur=gaussian_filter(img_, sigma=gauss_sigma_*np.floor(0.5*step)/4); 72 | else: 73 | Iblur=img_; 74 | del img_; 75 | # from image to cloud point data 76 | pcd = np.dstack((means_y,means_x,Iblur[means_y, means_x]*255)).reshape((-1,n_col_chan+2)); 77 | 78 | nb_points = pcd.shape[0]; # real number of data points 79 | 80 | if density_excl_ > 0: 81 | print('\nNumber of initial superpixels: {:}'.format(nb_points)) # real number of data points after density 
filtering 82 | # apply density filtering 83 | kde = KernelDensity(kernel='gaussian', bandwidth=20).fit(pcd); 84 | pcd_density=kde.score_samples(pcd); 85 | ranked_density= sorted(pcd_density, reverse=True); 86 | n_excl=int(nb_points*density_excl_); 87 | thresh_density= ranked_density[-n_excl:][0]; 88 | # filter point cloud data with density threshold 89 | excl_pcd=pcd[pcd_density<=thresh_density,:2]; 90 | pcd=pcd[pcd_density>thresh_density,:]; 91 | # update number of data points 92 | nb_points = pcd.shape[0]; # updated real number of data points 93 | print('\nNumber of superpixels after density filtering: {:}'.format(nb_points)) # real number of data points after density filtering 94 | else: 95 | excl_pcd=np.zeros((0,2)); 96 | 97 | if mode_=='standard': # compute RV_epsilon from ratio 98 | ratio=np.prod(height_0*width_0)/nb_points; 99 | print('ratio = ', ratio) 100 | RV_epsilon_ = np.ceil(0.5*ratio+10); 101 | 102 | # print input variables 103 | print('\nInput variables: :') 104 | print('size of superpixels : ', step,' * ', step) 105 | print('Number of superpixels: {:}'.format(nb_points)) 106 | print('RV epsilon = ', RV_epsilon_) 107 | print('Blur sigma = ', gauss_sigma_*np.floor(0.5*step)/4) 108 | print('% of removed pixels by density filtering = ', density_excl_) 109 | print('min size of segments = ', n_pxl_min_*step*step, 'pixels ') 110 | print('max % of removed pixels per cut = ', entropy_thresh_) 111 | print('plot persistence diagram: ',plot_pd_); 112 | 113 | # from PCD to Rips-Vietoris complex 114 | Rips_complex_sample = gd.RipsComplex(points = pcd,max_edge_length=RV_epsilon_); 115 | Rips_simplex_tree_sample = Rips_complex_sample.create_simplex_tree(max_dimension=1); 116 | 117 | # compute persistence diagram on simplex tree structure 118 | diag_Rips = Rips_simplex_tree_sample.persistence(); # (dim, (birth_date, death_date)) 119 | 120 | if plot_pd_==True: 121 | # compute persistence diagram for dimension dim_ 122 | diag_Rips_0=Rips_simplex_tree_sample.persistence_intervals_in_dimension(0); 123 | print('lamost plot') 124 | plt=gd.plot_persistence_diagram([(0, interval) for interval in diag_Rips_0], max_plots=0, alpha=0.1,legend=True) 125 | plt.show() 126 | 127 | # stock persistent pairs -> key topological events 128 | ppairs=Rips_simplex_tree_sample.persistence_pairs(); 129 | 130 | betti_0 = Rips_simplex_tree_sample.betti_numbers()[0]; 131 | print('\nBetti number beta_0 at oo: {:}\n'.format(betti_0)) 132 | 133 | # stock persistent pairs -> key topological events 134 | key_edges_0=[tuple(pair[1]) for pair in Rips_simplex_tree_sample.persistence_pairs() if len(pair[0])==1][:-betti_0]; 135 | 136 | # build covering tree with minimum value using 0-1 persistence pairs 137 | G=nx.Graph() 138 | G.add_nodes_from(range(nb_points)); 139 | G.add_edges_from(key_edges_0); 140 | 141 | if mode_=='standard': 142 | 143 | if n_pxl_min_*step*step<=200 and ratio<10: 144 | list_events_=[500, 1000]; 145 | if n_pxl_min_*step*step>200 and ratio<10: 146 | list_events_=[200, 500, 1000]; 147 | 148 | if n_pxl_min_*step*step<=100 and ratio>=10 and ratio<18: 149 | list_events_=[600]; 150 | if n_pxl_min_*step*step>100 and n_pxl_min_*step*step<=200 and ratio>=10 and ratio<18: 151 | list_events_=[200, 600]; 152 | if n_pxl_min_*step*step>200 and ratio>=10 and ratio<18: 153 | list_events_=[125, 300, 700]; 154 | 155 | if n_pxl_min_*step*step<=100 and ratio>18: 156 | list_events_=[500]; 157 | if n_pxl_min_*step*step>=100 and n_pxl_min_*step*step<200 and ratio>18: 158 | list_events_=[200, 500]; 159 | if 
n_pxl_min_*step*step>=200 and n_pxl_min_*step*step<300 and ratio>18: 160 | list_events_=[150, 500]; 161 | if n_pxl_min_*step*step>=300 and ratio>18: 162 | list_events_=[75, 275, 500]; 163 | 164 | c0=(list_events_[0]-betti_0)*(betti_0=list_events_[0]); 165 | list_events_[0]=c0; 166 | 167 | print('list of cuts: ',list_events_, '\n') 168 | list_events=[None]+[-nc for nc in list_events_] 169 | cuts=[] 170 | for i in range(len(list_events_)): 171 | cuts = cuts + [[tuple(np.sort(edge)) for edge in key_edges_0[list_events[i+1]:list_events[i]]] ]; 172 | 173 | tree = Tree(G); 174 | 175 | n_expand=len(list_events_); 176 | for i in range(n_expand): 177 | tree.expand(cuts[i], size = n_pxl_min_, proba = entropy_thresh_); 178 | 179 | #print(tree.as_str()) 180 | 181 | #print('depth = ', tree.get_depth(),'\n') 182 | 183 | #print('number of nodes = ', tree.count,'\n') 184 | 185 | in_segments = [leaf.pixels for leaf in tree.get_leaves()]; 186 | 187 | n_kept_pixels = sum([len(leaf_pxl) for leaf_pxl in in_segments]); 188 | 189 | n_leaves = len(in_segments); 190 | #print('nb leaves = ', n_leaves,'\n') 191 | 192 | #print('PCD = ', n_kept_pixels, ' pixels | ', round(100.*n_kept_pixels/len(tree.pixels),2), ' %\n') 193 | 194 | n_removed_pixels = len(tree.pixels)-n_kept_pixels; 195 | print('loss = ', n_removed_pixels, ' pixels | ', round(100.*n_removed_pixels/len(tree.pixels),2), ' %\n') 196 | 197 | out_segments = [0]*n_removed_pixels; 198 | i=0; 199 | for cut in tree.out_pixels: #start sampling for least persistent removed pixels ! 200 | for seg in cut: 201 | for pxl in seg: 202 | out_segments[i]=pxl; 203 | i+=1; 204 | 205 | # image with labels 206 | img_labels=np.zeros(Iblur.shape[:2],dtype='int64'); 207 | height, width = Iblur.shape[:2]; 208 | 209 | # segments with pixel positions 210 | pcd_in_segments = [pcd[seg,:2] for seg in in_segments]; 211 | pcd_out_segments = pcd[out_segments,:2]; 212 | 213 | # label all superpixels which have not been removed by kernel density filter 214 | for l, segment_l in enumerate(pcd_in_segments): 215 | for pxl in segment_l: 216 | i=int(pxl[0]); 217 | j=int(pxl[1]); 218 | img_labels[i-step_down_i:i+step_up_i+1,j-step_down_j:j+step_up_j+1]=l+1; 219 | 220 | # label all superpixels which have been removed by kernel density filter by with uniform distribution on neigbhors 221 | 222 | percent_out = 100.*len(out_segments)/pcd.shape[0]; 223 | #print('sampling clusters for {:} removed pixels, {:2.2f} % of all pixels\n'.format(len(out_segments),percent_out)); 224 | distrib=[np.power(np.sum(img_labels==l), -0.12) for l in range(1, np.max(img_labels)+1)]; 225 | #step_i=[step_down_i,step_up_i]; 226 | #step_j=[step_down_j,step_up_j]; 227 | for pxl in pcd_out_segments[::-1,:]: 228 | i=int(pxl[0]); 229 | j=int(pxl[1]); 230 | label_V_ij=[] 231 | k=1; 232 | while len(label_V_ij)<1: 233 | #V_k=[(i*2*step_i[i>0], j*2*step_j[j>0]) for i in range(-k,k+1) for j in range(-k,k+1) if abs(i)==k or abs(j)==k]; 234 | for delta in [(-2*k*step_down_i,-2*k*step_down_j), (-2*k*step_down_i,0), (-2*k*step_down_i,+2*k*step_up_j), (0,-2*k*step_down_j), (0,+2*k*step_up_j), (+2*k*step_up_i,-2*k*step_down_j), (+2*k*step_up_i,0), (+2*k*step_up_i,+2*k*step_up_j)]: 235 | if 0<=i+delta[0] and i+delta[0]1: 239 | distrib_V_ij=[0]*len(label_V_ij); 240 | for ind, l in enumerate(label_V_ij): 241 | distrib_V_ij[ind]=distrib[l-1]; 242 | distrib_V_ij=distrib_V_ij/sum(distrib_V_ij); 243 | 244 | img_labels[i-step_down_i:i+step_up_i+1,j-step_down_j:j+step_up_j+1]=np.random.choice(label_V_ij, p=distrib_V_ij); # sample label based on 
neigbhors and label distribution 245 | 246 | else: 247 | 248 | img_labels[i-step_down_i:i+step_up_i+1,j-step_down_j:j+step_up_j+1]=label_V_ij[0]; # closest neighbor 249 | label_V_ij=[]; 250 | 251 | # remove symmetric expansion 252 | img_labels=img_labels[:height_0,:width_0]; 253 | 254 | return Iblur, pcd_in_segments, img_labels 255 | 256 | 257 | 258 | 259 | 260 | 261 | 262 | 263 | 264 | 265 | 266 | -------------------------------------------------------------------------------- /py/pd_segmentation_0_main.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import os 3 | import matplotlib.pyplot as plt 4 | from skimage import data 5 | from skimage.util import img_as_float 6 | from skimage.color import rgb2gray, gray2rgb 7 | from skimage.segmentation import mark_boundaries 8 | import time 9 | import matplotlib.image as mpimg 10 | exec(open('/Users/Salim_Andre/Desktop/IMA/PRAT/code/pd_segmentation_0.py').read()) 11 | exec(open('/Users/Salim_Andre/Desktop/IMA/PRAT/code/tree.py').read()) 12 | 13 | ### DATASET 14 | 15 | PATH_img = '/Users/Salim_Andre/Desktop/IMA/PRAT/' # path to my own images 16 | 17 | swans=mpimg.imread(PATH_img+'swans.jpg'); 18 | baby=mpimg.imread(PATH_img+'baby.jpg'); 19 | 20 | img_set = [data.astronaut(), data.camera(), data.coins(), data.checkerboard(), data.chelsea(), \ 21 | data.coffee(), data.clock(), data.hubble_deep_field(), data.horse(), data.immunohistochemistry(), \ 22 | data.moon(), data.page(), data.rocket(), swans, baby] 23 | 24 | ### IMAGE 25 | 26 | I=img_as_float(img_set[0]); 27 | 28 | ### PARAMETERS FOR 0-HOMOLOGY GROUPS 29 | 30 | mode='customized'; 31 | n_superpixels=10000; 32 | RV_epsilon=30; 33 | gauss_sigma=0.5; 34 | list_events=[800]; 35 | n_pxl_min_ = 30; 36 | density_excl=0.0; 37 | entropy_thresh_=1.1; 38 | plot_pd=False; 39 | 40 | ''' 41 | mode = 'standard' 42 | n_superpixels=10000; 43 | ''' 44 | ### RESULTS FOR 0-HOMOLOGY GROUPS 45 | 46 | start = time.time() 47 | if mode=='standard': 48 | img_sym, segments_pd, img_labels = pd_segmentation_0(mode, n_superpixels, I); 49 | else: 50 | img_sym, segments_pd, img_labels = pd_segmentation_0(mode, n_superpixels, I, RV_epsilon, gauss_sigma, list_events, n_pxl_min_, entropy_thresh_, density_excl, plot_pd); 51 | end = time.time() 52 | 53 | ### PLOTS 54 | 55 | #stock colors 56 | 57 | my_colors = [(0,0,0)]*len(segments_pd); 58 | for i, seg in enumerate(segments_pd): 59 | my_colors[i]=tuple(np.random.rand(1,3)[0]); 60 | 61 | # SEGMENTATION 62 | 63 | # image 64 | plt.rcParams["figure.figsize"] = [10,10] 65 | plt.imshow(I) 66 | plt.axis('off') 67 | plt.show() 68 | 69 | # image segmentation with pd 70 | plt.rcParams["figure.figsize"] = [10,10] 71 | plt.imshow(np.ones(img_sym.shape), cmap='gray', vmin=0, vmax=1) 72 | for i, seg in enumerate(segments_pd): 73 | plt.plot(seg[:,1], seg[:,0], 's', ms=5.1, color=my_colors[i]) 74 | plt.axis('off') 75 | #plt.title('PCD\'s most persistent 0-homology groups') 76 | plt.show() 77 | 78 | # image segmentation with pd 79 | plt.rcParams["figure.figsize"] = [10,10] 80 | plt.imshow(rgb2gray(img_sym), cmap='gray', interpolation='nearest') 81 | for i, seg in enumerate(segments_pd): 82 | plt.plot(seg[:,1], seg[:,0], 's', ms=5.1, color=my_colors[i]) 83 | plt.axis('off') 84 | #plt.title('Pixels removed are near boundaries and complex regions') 85 | plt.show() 86 | 87 | # SAMPLING 88 | 89 | Isampling=np.zeros((I.shape[0],I.shape[1],3)); 90 | for l in range(1,np.max(img_labels)+1): 91 | Isampling[img_labels==l]=my_colors[l-1]; 92 | 
plt.imshow(Isampling) 93 | plt.axis('off') 94 | #plt.title('Image segmention after sampling') 95 | plt.show() 96 | ''' 97 | plt.imshow(mark_boundaries(I, img_labels)) 98 | for i, seg in enumerate(segments_pd): 99 | plt.plot(seg[:,1], seg[:,0], 's', ms=5.1, color=my_colors[i]) 100 | plt.axis('off') 101 | plt.title('Image segmention after sampling') 102 | plt.show() 103 | ''' 104 | # IMAGE SEGMENTATION 105 | 106 | plt.rcParams["figure.figsize"] = [10,10] 107 | plt.axis('off') 108 | plt.imshow(mark_boundaries(I, img_labels)) 109 | #plt.title('Image segmentation') 110 | plt.show() 111 | 112 | # IMAGE RECOVERING 113 | 114 | Iregions=np.zeros(I.shape); 115 | for l in range(1,np.max(img_labels)+1): 116 | Iregions[img_labels==l]=np.mean(I[img_labels==l],axis=0); 117 | 118 | if len(I.shape)==2: 119 | plt.imshow(Iregions, cmap='gray', interpolation='nearest'); 120 | else: 121 | plt.imshow(Iregions); 122 | #plt.title('Image recovering from segments') 123 | plt.axis('off') 124 | plt.show() 125 | 126 | #print(np.sum(img_labels==0)) 127 | 128 | print('\nExecution time: {:2.2f} seconds \n'.format(end - start)) 129 | 130 | print('\nNumber of segments: {:} \n'.format(np.max(img_labels))) 131 | 132 | err_1 = np.sum(np.abs((Iregions-I)*255.))/np.prod(Iregions.shape); 133 | print('Mean error norm 1 per pixel: {:2.2f}'.format(err_1)); 134 | 135 | err_2 = np.sqrt(np.sum(((Iregions-I)*255.)**2)/np.prod(Iregions.shape)); 136 | print('Mean error norm 2 per pixel: {:2.2f}'.format(err_2)); 137 | 138 | ''' 139 | collage photos 140 | 349 * 344 141 | ''' 142 | -------------------------------------------------------------------------------- /py/pd_segmentation_1.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import matplotlib.pyplot as plt 3 | import sys 4 | # python gudhi library path 5 | sys.path.append('/Users/Salim_Andre/Desktop/GUDHI_2.3.0/build/cython'); 6 | import gudhi as gd 7 | from scipy.ndimage import gaussian_filter 8 | import networkx as nx 9 | from sklearn.neighbors.kde import KernelDensity 10 | from random import choice 11 | from scipy import ndimage 12 | import collections 13 | 14 | def pd_segmentation_1(n_superpixels_,img_,RV_epsilon_, gauss_sigma_, n_events_, density_excl_=0.05, plot_pd_=False): 15 | # required: Networkx as nx, Gudhi as gd, from sklearn.neighbors.kde import KernelDensity 16 | # RV_epsilon_ : maximum edge length of the Rips-Vietoris complex 17 | # sigma_ : 0 < param <= 1 coeff for gaussian blur the bigger the more blur 18 | # dim_ : order of homology, also dimension of the R-V complex 19 | # density_param_ : percent of points to be exclude using a gaussian kernel density filter 20 | 21 | dim_ = 1; 22 | 23 | height, width = img_.shape[:2]; 24 | if len(img_.shape) >2: 25 | n_col_chan = 3; 26 | else: 27 | n_col_chan=1; 28 | 29 | step = int(round(0.5*(np.sqrt(height * width / n_superpixels_)-1))); # make sure to get n_segments superpixels 30 | # extend img by symmetry to have a perfect cover with square patches 31 | dh=(2*step+1)-(height%(2*step+1)); 32 | dw=(2*step+1)-(width%(2*step+1)); 33 | dhI=img_[-dh:,:]; 34 | img_=np.concatenate((img_,dhI[::-1,:]),axis=0); 35 | dwI=img_[:,-dw:]; 36 | img_=np.concatenate((img_,dwI[:,::-1]),axis=1); 37 | # subsampling 38 | grid_y, grid_x = np.mgrid[:img_.shape[0], :img_.shape[1]]; 39 | means_y = grid_y[step::2*step+1, step::2*step+1]; 40 | means_x = grid_x[step::2*step+1, step::2*step+1]; 41 | 42 | if gauss_sigma_ > 0: 43 | # gaussian blur 44 | Iblur=gaussian_filter(img_, 
sigma=gauss_sigma_*step/4); 45 | else: 46 | Iblur=img_; 47 | del img_; 48 | # from image to cloud point data 49 | pcd = np.dstack((means_y,means_x,Iblur[means_y, means_x]*255)).reshape((-1,n_col_chan+2)); 50 | 51 | nb_points = pcd.shape[0]; # real number of data points 52 | print('\nNumber of initial superpixels: {:}'.format(nb_points)) # real number of data points after density filtering 53 | 54 | #ndimage.sobel(rgb2gray(I)) 55 | 56 | if density_excl_ > 0: 57 | # apply density filtering 58 | kde = KernelDensity(kernel='gaussian', bandwidth=20).fit(pcd); 59 | pcd_density=kde.score_samples(pcd); 60 | ranked_density= sorted(pcd_density, reverse=True); 61 | n_excl=int(nb_points*density_excl_); 62 | thresh_density= ranked_density[-n_excl:][0]; 63 | # filter point cloud data with density threshold 64 | excl_pcd=pcd[pcd_density<=thresh_density,:2]; 65 | pcd=pcd[pcd_density>thresh_density,:]; 66 | # update number of data points 67 | nb_points = pcd.shape[0]; # updated real number of data points 68 | else: 69 | excl_pcd=np.zeros((0,2)); 70 | 71 | print('\nNumber of superpixels after density filtering: {:}'.format(nb_points)) # real number of data points after density filtering 72 | 73 | # from PCD to Rips-Vietoris complex 74 | Rips_complex_sample = gd.RipsComplex(points = pcd,max_edge_length=RV_epsilon_); 75 | Rips_simplex_tree_sample = Rips_complex_sample.create_simplex_tree(max_dimension=2); 76 | # compute persistence diagram on simplex tree structure 77 | diag_Rips = Rips_simplex_tree_sample.persistence(); # (dim, (birth_date, death_date)) 78 | 79 | if plot_pd_==True: 80 | # compute persistence diagram for dimension dim_ 81 | plt=gd.plot_persistence_diagram(diag_Rips,legend=True); 82 | plt.show() 83 | 84 | betti_nb_0 = Rips_simplex_tree_sample.betti_numbers()[0]; 85 | print('\nBetti number beta_0 at oo: {:}'.format(betti_nb_0)); 86 | betti_nb_1 = Rips_simplex_tree_sample.betti_numbers()[1]; 87 | print('\nBetti number beta_1 at oo: {:}'.format(betti_nb_1)); 88 | 89 | # build covering tree with minimum value 90 | G=nx.Graph() 91 | ppairs_0=[pair for pair in Rips_simplex_tree_sample.persistence_pairs() if len(pair[0])==1]; 92 | list_edges_0=[tuple(pair[1]) for pair in ppairs_0 if pair[1]!=[]]; 93 | G.add_nodes_from(range(nb_points)); 94 | G.add_edges_from(list_edges_0); 95 | 96 | ppairs_1=[pair for pair in Rips_simplex_tree_sample.persistence_pairs() if len(pair[0])==2]; 97 | list_key_edges_1=[tuple(pair[0]) for pair in ppairs_1]; 98 | death_by_interv=[a[1][1] for a in Rips_simplex_tree_sample.persistence() if a[0]==1] 99 | ind_sorted=np.argsort(np.argsort(death_by_interv)); 100 | key_edges = [pair for pair in [list_key_edges_1[ind] for ind in ind_sorted[:n_events_] ]]; 101 | G_cycles=[]; 102 | # add edge which creates a new class of cycle 103 | # search cycles with dft in O(|V|) 104 | for edge in key_edges: 105 | G.add_edges_from([edge]); 106 | G_cycles=G_cycles+nx.cycle_basis(G) #nx.find_cycle(G)]; 107 | G.remove_edges_from([edge]); 108 | 109 | segments_pxl=[pcd[nodes,:2] for nodes in G_cycles]; 110 | 111 | return Iblur, segments_pxl 112 | -------------------------------------------------------------------------------- /py/pd_segmentation_1_main.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import os 3 | import matplotlib.pyplot as plt 4 | from skimage import data 5 | from skimage.util import img_as_float 6 | from skimage.color import rgb2gray, gray2rgb 7 | from skimage.segmentation import mark_boundaries 8 | import time 9 | 
import matplotlib.image as mpimg 10 | exec(open('/Users/Salim_Andre/Desktop/IMA/PRAT/code/pd_segmentation_1.py').read()) 11 | 12 | ### DATASET 13 | 14 | PATH_img = '/Users/Salim_Andre/Desktop/IMA/PRAT/' # path to my own images 15 | 16 | swans=mpimg.imread(PATH_img+'swans.jpg'); 17 | baby=mpimg.imread(PATH_img+'baby.jpg'); 18 | 19 | img_set = [data.astronaut(), data.camera(), data.coins(), data.checkerboard(), data.chelsea(), \ 20 | data.coffee(), data.clock(), data.hubble_deep_field(), data.horse(), data.immunohistochemistry(), \ 21 | data.moon(), data.page(), data.rocket(), swans, baby] 22 | 23 | ### IMAGE 24 | 25 | I=img_as_float(img_set[4]); 26 | 27 | ### PARAMETERS FOR 1-HOMOLOGY GROUPS 28 | 29 | n_superpixels=400; 30 | RV_epsilon=180; 31 | gauss_sigma=0.5; 32 | n_events=6; 33 | #n_pxl_min_ = 10; 34 | density_excl=0.0; 35 | plot_pd=False; 36 | 37 | ### RESULTS FOR 1-HOMOLOGY GROUPS 38 | 39 | dim=1; 40 | 41 | start_exc = time.time() 42 | img_sym, segments_pxl = pd_segmentation_1(n_superpixels, I, RV_epsilon, gauss_sigma, n_events, density_excl, plot_pd); 43 | end_exc = time.time() 44 | 45 | ### SHOW RESULTS FOR 1-HOMOLOGY GROUPS 46 | 47 | # 1-holes in image 48 | plt.rcParams["figure.figsize"] = [7,7] 49 | #listplots=[321,322,323,324,325,326]; 50 | for i, seg in enumerate(segments_pxl): 51 | #plt.subplot(listplots[i]); 52 | plt.axis('off') 53 | plt.imshow(rgb2gray(I), cmap='gray', interpolation='nearest') 54 | plt.scatter(seg[:,1],seg[:,0], c='r')#, ms=5) 55 | start=seg[0,:]; 56 | end=seg[-1,:]; 57 | end_loop=np.array([end,start]); 58 | plt.plot(seg[:,1],seg[:,0], '-r')#, ms=5) 59 | plt.plot(end_loop[:,1],end_loop[:,0], '-r')#, ms=5) 60 | #plt.savefig('/Users/Salim_Andre/Desktop/IMA/PRAT/projet/figures_experiments/cascades/coffee_'+str(i+1)+'.png') 61 | plt.show() 62 | 63 | ''' 64 | # 1-holes in image 65 | plt.rcParams["figure.figsize"] = [7,7] 66 | plt.imshow(rgb2gray(I), cmap='gray', interpolation='nearest') 67 | #plt.imshow(np.abs(ndimage.sobel(rgb2gray(I)))) 68 | #plt.imshow(np.abs(gaussian_filter(ndimage.sobel(rgb2gray(I)), sigma=5.))) 69 | #listplots=[321,322,323,324,325,326]; 70 | for i, seg in enumerate(segments_pxl): 71 | #plt.subplot(listplots[i]); 72 | plt.axis('off') 73 | plt.scatter(seg[:,1],seg[:,0], c='r', alpha=0.01)#, ms=5) 74 | start=seg[0,:]; 75 | end=seg[-1,:]; 76 | end_loop=np.array([end,start]); 77 | plt.plot(seg[:,1],seg[:,0], '-r')#, ms=5) 78 | plt.plot(end_loop[:,1],end_loop[:,0], '-r')#, ms=5) 79 | #plt.savefig('/Users/Salim_Andre/Desktop/IMA/PRAT/projet/figures_experiments/cascades/coffee_'+str(i+1)+'.png') 80 | plt.show() 81 | ''' 82 | print('\nExecution time: {:2.2f} seconds \n'.format(end_exc - start_exc)) 83 | -------------------------------------------------------------------------------- /py/segment_with.py: -------------------------------------------------------------------------------- 1 | 2 | import matplotlib.pyplot as plt 3 | 4 | from skimage.filters import threshold_otsu, sobel 5 | from skimage.segmentation import felzenszwalb, slic, quickshift, watershed 6 | from skimage.segmentation import mark_boundaries 7 | from skimage import exposure 8 | from skimage.util import img_as_float 9 | from skimage.color import rgb2gray, gray2rgb 10 | 11 | def segment_with(I, seg_method): 12 | 13 | # PARAMETERS 14 | 15 | # Felzenswalb 16 | scale_=400; 17 | sigma_=.5; 18 | min_size_=500; 19 | 20 | # SLIC 21 | n_segments_=100; #15 22 | compactness_=10; 23 | sigma_=1 24 | 25 | # Quickshift 26 | kernel_size_=20 27 | max_dist_=45 28 | ratio_=0.5 29 | 30 | # Watershed 31 | 
markers_=10 32 | compactness_=0.001 33 | 34 | # SEGMENTATION METHODS 35 | 36 | if seg_method=='Otsu Thresholding': 37 | 38 | I=rgb2gray(I); 39 | thd=threshold_otsu(I); 40 | Ib=Idepth: 43 | depth=leaf.depth; 44 | return depth 45 | 46 | def expand(self,list_edges_to_be_removed, size=10, proba=0.05): 47 | out_pixels=[] 48 | H=self.graph; 49 | for leaf in self.get_leaves(): 50 | 51 | if leaf.state==0: 52 | 53 | subH=nx.Graph(H.subgraph(leaf.pixels)); 54 | 55 | if set(subH.edges()) & set(list_edges_to_be_removed): 56 | 57 | subH.remove_edges_from(list_edges_to_be_removed) 58 | cc = [list(a) for a in nx.connected_components( subH ) ]; 59 | 60 | if len(cc)>1: 61 | 62 | if size>0: 63 | 64 | nodes_toberemoved=[-1]*len(cc); 65 | loss = 0; 66 | n_nodes_tbr=0 67 | max_size_c=size; 68 | for j, c in enumerate(cc): 69 | size_c = len(c); 70 | if size_c <= size: #small cc 71 | nodes_toberemoved[n_nodes_tbr]=j 72 | loss+=size_c; 73 | n_nodes_tbr+=1; 74 | if size_c > max_size_c: #big cc 75 | max_size_c = size_c; 76 | nodes_toberemoved=nodes_toberemoved[:n_nodes_tbr]; 77 | cpt_not_small = len(cc)-len(nodes_toberemoved); 78 | bool_expand = loss/len(leaf.pixels) <= proba or self.get_depth()==0; 79 | bool_pause = (max_size_c/len(leaf.pixels) >= 1.-proba and not self.get_depth()==0 and max_size_c/len(self.get_root().pixels)<0.3); 80 | else: 81 | nodes_toberemoved=[]; 82 | bool_expand=True; 83 | bool_pause=False; 84 | cpt_not_small=1000; 85 | 86 | if bool_expand and cpt_not_small>0 and not bool_pause: #expand 87 | if len(nodes_toberemoved)>0: 88 | out_pixels=out_pixels+[cc[i] for i in nodes_toberemoved]; 89 | cc = [cc[i] for i in range(len(cc)) if i not in nodes_toberemoved]; 90 | leaf.children = [Tree() for c in cc]; 91 | for i, child in enumerate(leaf.children): 92 | child.pixels=cc[i]; 93 | child.parent=leaf; 94 | child.depth = leaf.depth + 1; 95 | child.key = child.get_root(count='yes').count-1; 96 | elif bool_pause: 97 | leaf.state=0; # continue 98 | 99 | else: 100 | leaf.state=1; #block 101 | if not self.get_root().out_pixels: 102 | self.get_root().out_pixels=[out_pixels]; 103 | else: 104 | self.get_root().out_pixels+=[out_pixels]; 105 | 106 | def as_str(self, level=0): 107 | ret = "\t"*level+repr(len(self.pixels))+"\n" 108 | if not not self.children: 109 | for child in self.children: 110 | ret += child.as_str(level+1) 111 | return ret 112 | 113 | 114 | ''' 115 | H= nx.Graph(); 116 | H.add_path([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]); 117 | 118 | tree = Tree(H) 119 | 120 | re_1=[(3,4),(5,6)]; 121 | 122 | tree.expand(re_1) 123 | 124 | print(tree.as_str()) 125 | 126 | re_2=[(2,3),(3,4),(5,6),(7,8),(9,10)]; 127 | 128 | tree.expand(re_2) 129 | 130 | print(tree.as_str()) 131 | ''' 132 | -------------------------------------------------------------------------------- /rapport_biblio-pages-15-21.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/salimandre/unsupervised-image-segmentation-persistent-homology/9de7e14e30e0d8dbe970354454ea535285e91d79/rapport_biblio-pages-15-21.pdf --------------------------------------------------------------------------------