├── Final.project.outputs.pdf
├── README.md
├── meanshift.with.boundary.extra.20.py
└── meanshift.without.boundary.py
--------------------------------------------------------------------------------
/Final.project.outputs.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/agamdeep/Mean-Shift-Segmentation-using-Python/dae181a89a6c5e16b39da5626e062e6f9189047b/Final.project.outputs.pdf
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Mean-Shift-Segmentation-using-Python
Performed mean shift segmentation to track objects over image sequences.
Mean Shift Segmentation
Implemented by:
Agam Deep Arora (50169805)
&
Debika Dutt (50170009)
Submission date: 14-Dec-15
2. Literature review
a. Review of the technologies in the project
While working on this project we learned to program in Python. We learned the functionality of many Python functions, such as imread for reading an input image and imshow for displaying the output, among others. We also learned about various Python libraries such as scipy, numpy, sys, random and opencv, and came to appreciate the subtle differences between Python and contemporary languages such as MATLAB.
b. Basic technologies learned in class:
1. Boundary-preserving filtering by applying a median filter to the input image
2. Mean shift filtering
3. Thresholding
4. Calculating the Euclidean distance from one pixel to another
5. Applications of normal or Epanechnikov kernels
6. The mean shift vector
7. Cluster analysis
8. Image segmentation
3. Project Final Report
Introduction
i) Mainly from literature reviews: Mean shift is a simple iterative procedure that shifts each data point to the average of the data points in its neighborhood. The points of convergence are the local maxima (density modes) of the probability density implied by the samples. Mean shift segmentation therefore avoids estimating a probability density function explicitly: no parameters are estimated and no specific functional form is assumed. The technique was proposed by Fukunaga and Hostetler in 1975 and was later generalized by Cheng; the generalization yields k-means-like clustering algorithms as special cases. For Gaussian kernels, mean shift is a gradient mapping, and convergence of the mean shift iterations can be established. Cluster analysis is then treated as a deterministic problem of finding the fixed points of mean shift that characterize the data.
ii) Overview of the project carried out: Mean shift is a non-parametric feature-space analysis technique that partitions an image into semantically meaningful regions by clustering its pixels. We map the image into a feature space and apply the mean shift procedure to the plotted points, shifting each point towards a high-density region until convergence. Lastly, we reconstruct the image from the converged points in the feature space.
The algorithms implemented are given below:
Mode Detection
• Using multiple initializations covering the entire feature space, employ the mean shift procedure to identify the stationary points of the density estimate f(h,K), i.e. the estimate computed with bandwidth h and kernel K.
• Prune these points to retain only the local maxima, which correspond to the density modes.
Discontinuity Preserving Filtering
• For each image pixel x(i), initialize step j=1 and y(i,1)=x(i).
• Compute y(i,j+1) until convergence at y(i,con).
• The filtered pixel values are defined as z(i) = (x(i)^s, y(i,con)^r), i.e. the spatial part of x(i) combined with the range part of the converged point.
Mean Shift Image Segmentation
• Employ mean shift discontinuity-preserving filtering and store all the d-dimensional convergence points y(i,con).
• Determine the clusters {C(p)}, p=1,…,m, by grouping all z(i) which are closer than hs in the spatial domain and hr in the range domain.
• Assign L(i) = {p | z(i) ∈ C(p)} for each pixel i=1,…,n.
• If required, eliminate regions smaller than P pixels.
Our Approach:
Working of the overall system:
1. The input image has r rows and c columns, with each pixel holding R, G and B intensity values.
2. The image features are extracted into a vector matrix M of dimensions [r*c][5], containing both range (colour) and spatial information.
3. We set a single threshold value h for this spatial-range feature matrix and a convergence criterion value iter.
4. A random row of M is selected as the initial seed point; this is the current mean.
5. We then calculate the Euclidean distance from the current mean to every other row of M.
6. Each distance is compared with the threshold; rows within the threshold are stored in a list.
7. We find the new mean by averaging each column of the selected rows.
8. If the distance between the new mean and the current mean is less than iter, the new mean is assigned to all the selected pixels. After this convergence, we eliminate the indexes of all the pixels which are already marked.
9. Otherwise, the new mean becomes the current mean and the iteration is repeated until all points are exhausted.
i) Software used to implement: Enthought Canopy 1.6.1
Outcome and deviations:
i. Presentation of the project outcome:
The outcome of this project depends on the following criteria:
1. If the input image has dimensions 512*512, so that the extracted feature space is a 262144×5 matrix, we must ensure that our mean shift algorithm traverses all the rows until every point is exhausted.
2. If the Euclidean distance is within the threshold, we ensure that we store the row index in a list.
3. If the distance between the new mean and the current mean is less than the iter value, we assign the new mean value to all the selected pixel points; otherwise, the new mean is set as the current mean.
ii. Discussion of the outcome:
• Initially, when we considered only the RGB values without the pixel coordinates/locations, the output image was missing many colours. The prominent features were lost and only a rough output remained, very different from the desired output. Therefore, we must include the image pixel index values in our Euclidean distance formula to get the desired output.
Incorrect outputs
• There was also a case where we misunderstood the input format: the image has r rows and c columns with each pixel carrying three colour values (R, G and B), whereas we had assumed a 2D array in which each pixel holds either an R, G or B value.
iii. Lessons learned from algorithmic development:
If we apply a uniform kernel that assigns a weight of 1 to all pixels, we get a rough outcome with the image boundaries not properly outlined. Using a kernel (preferably a normal or Epanechnikov kernel), the boundaries are properly outlined: we obtain two bandwidths, hs and hr, apply mean shift edge-preserving filtering, and store all the converged mean points, giving segmented boundaries over the original image.
Summary and Discussion
The mean shift segmentation routine is implemented in two phases:
1) Mode detection and discontinuity-preserving filtering
2) Mean shift clustering
• During the implementation, we made sure that image noise was removed while the boundaries were preserved. As shown in the image above, when the feature space is created, the Red, Green and Blue values of each image pixel must be extracted and placed in a new array. A wide variety of clustering strategies were available, such as agglomerative clustering, divisive clustering, K-means clustering and K-medoids, of which K-means clustering was used.
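The nine steps of "Working of the overall system" above can be sketched as a compact, self-contained function. This is an illustrative sketch, not the submitted scripts: the function name `mean_shift_labels`, the parameters `h` and `iter_tol`, and the toy feature matrix are all assumptions introduced here for clarity.

```python
import numpy as np

def mean_shift_labels(M, h=30.0, iter_tol=0.01, seed=0):
    """Flat-kernel mean shift over the rows of an [n,5] feature matrix M
    (R, G, B, i, j): pick a random seed row, average all rows within
    distance h of the current mean, repeat until the mean moves by less
    than iter_tol, then label the window and remove it from play."""
    rng = np.random.default_rng(seed)
    n = len(M)
    labels = np.full(n, -1)          # -1 marks rows not yet converged
    modes = []
    remaining = np.arange(n)
    while remaining.size:
        mean = M[rng.choice(remaining)].astype(float)   # step 4: random seed
        while True:
            # Step 5-6: Euclidean distance to every remaining row, keep those within h.
            d = np.linalg.norm(M[remaining] - mean, axis=1)
            window = remaining[d < h]
            if window.size == 0:                         # degenerate case: fall back
                window = remaining[np.argmin(d):np.argmin(d) + 1]
            # Step 7: new mean is the column-wise average of the window.
            new_mean = M[window].mean(axis=0)
            # Step 8: converged when the mean shift is below iter_tol.
            if np.linalg.norm(new_mean - mean) < iter_tol:
                break
            mean = new_mean                              # step 9: iterate
        labels[window] = len(modes)                      # assign converged pixels
        modes.append(new_mean)
        remaining = remaining[labels[remaining] == -1]   # eliminate marked rows
    return labels, np.array(modes)
```

On two well-separated blobs of rows this assigns one label per blob and returns one mode per blob, mirroring how the scripts colour each converged window with its mean value.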
The mean shift segmentation has the following applications:
• Clustering
• Smoothing
• Tracking
Following are the strengths and weaknesses of the implemented algorithm:
Strengths
• The algorithm doesn't assume any prior shape of the data clusters.
• It does not require estimating the probability density function, which reduces complexity by a huge margin.
• The code can handle arbitrary feature spaces.
• The code, when adapted to run in real time, would be platform independent.
Weaknesses
• The window size (bandwidth selection) is not trivial.
• An inappropriate window size can cause modes to be merged, or generate additional "shallow" modes → use an adaptive window size.
Lessons learned
We learned about various concepts in Computer Vision and Image Processing, such as image formation and its properties, basic image understanding techniques, mathematical representation of images, image processing fundamentals, basic image segmentation techniques, and mathematical morphology for shape analysis. From the homework problems we learned various image processing techniques such as image convolution and image histogram equalization.
While working on this project we learned to program in Python, including functions such as imread for reading an input image and imshow for displaying the output, libraries such as scipy, numpy, sys, random and opencv, and the subtle differences between Python and contemporary languages such as MATLAB. Working on the project made our understanding of mean shift segmentation, mode detection and discontinuity-preserving filtering crystal clear.
Acknowledgement
We are very thankful to our Professor Dr. Chang Wen Cheng for teaching us Computer Vision and Image Processing and how to do mean shift.
This project could not have been successfully completed without the help of our teaching assistant Radhakrishna Dasari, who was always ready to help with any doubts related to our project.
References:
https://en.wikipedia.org/wiki/Mean_shift#Applications
http://luthuli.cs.uiuc.edu/~daf/courses/CS-498-DAF-PS/Segmentation.pdf
http://comaniciu.net/Papers/RobustAnalysisFeatureSpaces.pdf
http://home.ku.edu.tr/mehyilmaz/public_html/mean-shift/00400568.pdf
https://saravananthirumuruganathan.wordpress.com/2010/04/01/introduction-to-mean-shift-algorithm/
Lecture slides.
--------------------------------------------------------------------------------
/meanshift.with.boundary.extra.20.py:
--------------------------------------------------------------------------------
import numpy as np
import cv2
import random
import sys

if len(sys.argv) < 2:
    print('Error: no input image given')
    sys.exit(1)

Input_Image = sys.argv[1]

# Load the input image as a 3-channel colour image.
K = cv2.imread(Input_Image, 1)

row = K.shape[0]
col = K.shape[1]

J = row * col
Size = row, col, 3
R = np.zeros(Size, dtype=np.uint8)  # output (segmented) image
D = np.zeros((J, 5))                # feature space: colour (3) + position (2) per pixel

# Epanechnikov kernel bandwidths (range and spatial); defined but not used below.
Hr = 30
Hs = 2

iter = 0.01       # convergence criterion for the mean shift
threshold = 30    # window radius in the joint feature space

counter = 0
current_mean_random = True
current_mean_arr = np.zeros((1, 5))
below_threshold_arr = []

# Convert the image K[row][col] into a feature space D of dimensions [row*col][5].
for i in range(row):
    for j in range(col):
        arr = K[i][j]
        for k in range(5):
            if k <= 2:
                D[counter][k] = arr[k]   # colour channels
            elif k == 3:
                D[counter][k] = i        # row index
            else:
                D[counter][k] = j        # column index
        counter += 1

while len(D) > 0:
    print(len(D))
    # Select a random row from the feature space and assign it as the current mean.
    if current_mean_random:
        current_mean = random.randint(0, len(D) - 1)
        for i in range(5):
            current_mean_arr[0][i] = D[current_mean][i]
    below_threshold_arr = []
    for i in range(len(D)):
        ecl_dist = 0
        # Weighted distance of the current mean to every row: rows adjacent to the
        # seed index are down-weighted by 0.75; for all other rows the pixel value
        # is multiplied by 0, so only the mean's own magnitude contributes.
        for j in range(5):
            if abs(i - current_mean) == 1:
                ecl_dist += (current_mean_arr[0][j] - D[i][j] * 0.75) ** 2
            else:
                ecl_dist += (current_mean_arr[0][j] - D[i][j] * 0) ** 2
        ecl_dist = ecl_dist ** 0.5
        # Rows whose distance falls within the threshold join the current window.
        if ecl_dist < threshold:
            below_threshold_arr.append(i)

    if not below_threshold_arr:
        # Guard: the window can be empty under this weighting scheme; re-seed
        # instead of dividing by zero below.
        current_mean_random = True
        continue

    mean_R = 0
    mean_G = 0
    mean_B = 0
    mean_i = 0
    mean_j = 0
    current_mean = 0

    # Average the colour values and index positions of all rows in the window.
    for i in range(len(below_threshold_arr)):
        mean_R += D[below_threshold_arr[i]][0]
        mean_G += D[below_threshold_arr[i]][1]
        mean_B += D[below_threshold_arr[i]][2]
        mean_i += D[below_threshold_arr[i]][3]
        mean_j += D[below_threshold_arr[i]][4]

    mean_R = mean_R / len(below_threshold_arr)
    mean_G = mean_G / len(below_threshold_arr)
    mean_B = mean_B / len(below_threshold_arr)
    mean_i = mean_i / len(below_threshold_arr)
    mean_j = mean_j / len(below_threshold_arr)

    # Distance of the new mean from the current mean, compared against iter.
    mean_e_distance = ((mean_R - current_mean_arr[0][0]) ** 2 +
                       (mean_G - current_mean_arr[0][1]) ** 2 +
                       (mean_B - current_mean_arr[0][2]) ** 2 +
                       (mean_i - current_mean_arr[0][3]) ** 2 +
                       (mean_j - current_mean_arr[0][4]) ** 2) ** 0.5

    # mean_i and mean_j can be fractional and need not correspond to an actual
    # pixel, so the converged mean colour is painted onto every pixel in the window.
    if mean_e_distance < iter:
        new_arr = np.array([mean_R, mean_G, mean_B])
        for i in range(len(below_threshold_arr)):
            R[int(D[below_threshold_arr[i]][3])][int(D[below_threshold_arr[i]][4])] = new_arr
            # Mark the rows that have been coloured so they are not used again.
            D[below_threshold_arr[i]][0] = -1
        current_mean_random = True

        # Rebuild D without the rows that have just been coloured.
        new_D = np.zeros((len(D), 5))
        counter_i = 0
        for i in range(len(D)):
            if D[i][0] != -1:
                new_D[counter_i] = D[i]
                counter_i += 1
        D = new_D[:counter_i]
    else:
        # Not converged: the new mean becomes the current mean.
        current_mean_random = False
        current_mean_arr[0][0] = mean_R
        current_mean_arr[0][1] = mean_G
        current_mean_arr[0][2] = mean_B
        current_mean_arr[0][3] = mean_i
        current_mean_arr[0][4] = mean_j

cv2.imshow("finalImage", R)
cv2.waitKey(0)
--------------------------------------------------------------------------------
/meanshift.without.boundary.py:
--------------------------------------------------------------------------------
import numpy as np
import cv2
import random
import sys

if len(sys.argv) < 2:
    print('Error: no input image given')
    sys.exit(1)

Input_Image = sys.argv[1]

# Load the input image as a 3-channel colour image.
K = cv2.imread(Input_Image, 1)

row = K.shape[0]
col = K.shape[1]

J = row * col
Size = row, col, 3
R = np.zeros(Size, dtype=np.uint8)  # output (segmented) image
D = np.zeros((J, 5))                # feature space: colour (3) + position (2) per pixel

counter = 0
iter = 1.0        # convergence criterion for the mean shift
threshold = 30    # window radius in the joint feature space
current_mean_random = True
current_mean_arr = np.zeros((1, 5))
below_threshold_arr = []

# Convert the image K[row][col] into a feature space D of dimensions [row*col][5].
for i in range(row):
    for j in range(col):
        arr = K[i][j]
        for k in range(5):
            if k <= 2:
                D[counter][k] = arr[k]   # colour channels
            elif k == 3:
                D[counter][k] = i        # row index
            else:
                D[counter][k] = j        # column index
        counter += 1

while len(D) > 0:
    print(len(D))
    # Select a random row from the feature space and assign it as the current mean.
    if current_mean_random:
        current_mean = random.randint(0, len(D) - 1)
        for i in range(5):
            current_mean_arr[0][i] = D[current_mean][i]
    below_threshold_arr = []
    for i in range(len(D)):
        ecl_dist = 0
        # Euclidean distance of the current mean to every other row.
        for j in range(5):
            ecl_dist += (current_mean_arr[0][j] - D[i][j]) ** 2
        ecl_dist = ecl_dist ** 0.5
        # Rows whose distance falls within the threshold join the current window.
        if ecl_dist < threshold:
            below_threshold_arr.append(i)

    if not below_threshold_arr:
        # Guard: after the mean has moved, the window can in rare cases be
        # empty; re-seed instead of dividing by zero below.
        current_mean_random = True
        continue

    mean_R = 0
    mean_G = 0
    mean_B = 0
    mean_i = 0
    mean_j = 0
    current_mean = 0

    # Average the colour values and index positions of all rows in the window.
    for i in range(len(below_threshold_arr)):
        mean_R += D[below_threshold_arr[i]][0]
        mean_G += D[below_threshold_arr[i]][1]
        mean_B += D[below_threshold_arr[i]][2]
        mean_i += D[below_threshold_arr[i]][3]
        mean_j += D[below_threshold_arr[i]][4]

    mean_R = mean_R / len(below_threshold_arr)
    mean_G = mean_G / len(below_threshold_arr)
    mean_B = mean_B / len(below_threshold_arr)
    mean_i = mean_i / len(below_threshold_arr)
    mean_j = mean_j / len(below_threshold_arr)

    # Distance of the new mean from the current mean, compared against iter.
    mean_e_distance = ((mean_R - current_mean_arr[0][0]) ** 2 +
                       (mean_G - current_mean_arr[0][1]) ** 2 +
                       (mean_B - current_mean_arr[0][2]) ** 2 +
                       (mean_i - current_mean_arr[0][3]) ** 2 +
                       (mean_j - current_mean_arr[0][4]) ** 2) ** 0.5

    # mean_i and mean_j can be fractional and need not correspond to an actual
    # pixel, so the converged mean colour is painted onto every pixel in the window.
    if mean_e_distance < iter:
        new_arr = np.array([mean_R, mean_G, mean_B])
        for i in range(len(below_threshold_arr)):
            R[int(D[below_threshold_arr[i]][3])][int(D[below_threshold_arr[i]][4])] = new_arr
            # Mark the rows that have been coloured so they are not used again.
            D[below_threshold_arr[i]][0] = -1
        current_mean_random = True

        # Rebuild D without the rows that have just been coloured.
        new_D = np.zeros((len(D), 5))
        counter_i = 0
        for i in range(len(D)):
            if D[i][0] != -1:
                new_D[counter_i] = D[i]
                counter_i += 1
        D = new_D[:counter_i]
    else:
        # Not converged: the new mean becomes the current mean.
        current_mean_random = False
        current_mean_arr[0][0] = mean_R
        current_mean_arr[0][1] = mean_G
        current_mean_arr[0][2] = mean_B
        current_mean_arr[0][3] = mean_i
        current_mean_arr[0][4] = mean_j

cv2.imshow("finalImage", R)
cv2.waitKey(0)
--------------------------------------------------------------------------------
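A note on the distance pass in the scripts above: the inner loop visits each of the row*col feature rows in Python, which is the main bottleneck on a 512*512 image (262144 rows per mean update). A sketch of a vectorised equivalent using NumPy broadcasting; the helper name `rows_within_threshold` is hypothetical and not part of either script:

```python
import numpy as np

def rows_within_threshold(D, current_mean, threshold=30.0):
    """Return the indices of all rows of the [n,5] feature matrix D whose
    Euclidean distance to current_mean is below threshold, computed in one
    broadcasted array operation instead of a per-row Python loop."""
    dists = np.sqrt(((D - current_mean) ** 2).sum(axis=1))
    return np.where(dists < threshold)[0]
```

The returned index array plays the same role as the `below_threshold_arr` list in the scripts.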