├── Final.project.outputs.pdf
├── README.md
├── meanshift.with.boundary.extra.20.py
└── meanshift.without.boundary.py
--------------------------------------------------------------------------------
/Final.project.outputs.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/agamdeep/Mean-Shift-Segmentation-using-Python/dae181a89a6c5e16b39da5626e062e6f9189047b/Final.project.outputs.pdf
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Mean-Shift-Segmentation-using-Python
Performed mean shift segmentation to track objects over image sequences.
Mean Shift Segmentation
Implemented by:
Agam Deep Arora (50169805)
&
Debika Dutt (50170009)
Submission date: 14-Dec-15
2. Literature review
a. Review of the technologies in the project
While working on this project we learned to program in Python. We learned the functionality of many Python functions, such as imread for reading an input image and imshow for displaying the output, among others. We also learned about various Python libraries such as scipy, numpy, sys, random and opencv, and came to appreciate the subtle differences between Python and contemporary languages such as MATLAB.
b. Basic technologies learned in class:
1. Boundary-preserving filtering by applying a median filter to the input image
2. Mean shift filtering
3. Thresholding
4. Calculating the Euclidean distance from one pixel to another
5. Applications of normal or Epanechnikov kernels
6. The mean shift vector
7. Cluster analysis
8. Image segmentation
3. Project Final Report
Introduction
i) Mainly from literature reviews: Mean shift is a simple iterative procedure that shifts each data point to the average of the data points in its neighborhood. The points of convergence are the local maxima (density modes) of the probability density implied by the samples. Mean shift segmentation therefore avoids estimating a probability density function explicitly: no parameters are estimated and no specific functional form is assumed. The technique was proposed by Fukunaga and Hostetler in 1975 and was later generalized by Cheng; the generalization yields k-means-like clustering algorithms as special cases. For Gaussian kernels, mean shift is a gradient mapping, and convergence of the mean shift iterations can be established. Cluster analysis is then treated as a deterministic problem of finding the fixed points of mean shift that characterize the data.
ii) Overview of the project carried out: Mean shift is a non-parametric feature-space analysis technique that partitions an image into semantically meaningful regions by clustering its pixels. We map the image into a feature space and apply the mean shift procedure to the plotted points, shifting each point towards a high-density region until convergence. Lastly, we reconstruct the image from the converged points in the feature space.
The algorithms implemented are given below:
Mode Detection
• Using multiple initializations covering the entire feature space, employ the mean shift procedure to identify the stationary points of the density estimate f(h,K), i.e. the estimate computed with bandwidth h and kernel K.
• Prune these points to retain only the local maxima, which correspond to the density modes.
Discontinuity Preserving Filtering
• For each image pixel x(i), initialize step j=1 and y(i,1)=x(i).
• Compute y(i,j+1) until convergence at y(i,con).
• The filtered pixel values are defined as z(i) = (x(i)^s, y(i,con)^r), i.e. the spatial part of x(i) combined with the range part of the converged point.
Mean Shift Image Segmentation
• Employ mean shift discontinuity-preserving filtering and store all the d-dimensional convergence points y(i,con).
• Determine the clusters {C(p)}, p=1,…,m, by grouping all z(i) which are closer than hs in the spatial domain and hr in the range domain.
• Assign L(i) = {p | z(i) ∈ C(p)} for each pixel i=1,…,n.
• If required, eliminate regions smaller than P pixels.
Our Approach:
Working of the overall system:
1. The input image has r rows and c columns, with each pixel holding R, G and B intensity values.
2. The image features are extracted into a vector matrix M of dimensions [r*c][5], containing both range (colour) and spatial information.
3. We set a single threshold value h for this spatial-range feature matrix and a convergence criterion value iter.
4. A random row of M is selected as the initial seed point; this is the current mean.
5. We then calculate the Euclidean distance from the current mean to every other row of M.
6. Each distance is compared with the threshold; rows within the threshold are stored in a list.
7. We find the new mean by averaging each column of the selected rows.
8. If the distance between the new mean and the current mean is less than iter, the new mean is assigned to all the selected pixels. After this convergence, we eliminate the indexes of all the pixels which are already marked.
9. Otherwise, the new mean becomes the current mean and the iteration is repeated until all points are exhausted.
i) Software used to implement: Enthought Canopy 1.6.1
Outcome and deviations:
i. Presentation of the project outcome:
The outcome of this project depends on the following criteria:
1. If the input image has dimensions 512*512, so that the extracted feature space is a 262144×5 matrix, we must ensure that our mean shift algorithm traverses all the rows until every point is exhausted.
2. If the Euclidean distance is within the threshold, we ensure that we store the row index in a list.
3. If the distance between the new mean and the current mean is less than the iter value, we assign the new mean value to all the selected pixel points; otherwise, the new mean is set as the current mean.
ii. Discussion of the outcome:
• Initially, when we considered only the RGB values without the pixel coordinates/locations, the output image was missing many colours. The prominent features were lost and only a rough output remained, very different from the desired output. Therefore, we must include the image pixel index values in our Euclidean distance formula to get the desired output.
Incorrect outputs
• There was also a case where we misunderstood the input format: the image has r rows and c columns with each pixel carrying three colour values (R, G and B), whereas we had assumed a 2D array in which each pixel holds either an R, G or B value.
iii. Lessons learned from algorithmic development:
If we apply a uniform kernel that assigns a weight of 1 to all pixels, we get a rough outcome with the image boundaries not properly outlined. Using a kernel (preferably a normal or Epanechnikov kernel), the boundaries are properly outlined: we obtain two bandwidths, hs and hr, apply mean shift edge-preserving filtering, and store all the converged mean points, giving segmented boundaries over the original image.
Summary and Discussion
The mean shift segmentation routine is implemented in two phases:
1) Mode detection and discontinuity-preserving filtering
2) Mean shift clustering
• During the implementation, we made sure that image noise was removed while the boundaries were preserved. As shown in the image above, when the feature space is created, the Red, Green and Blue values of each image pixel must be extracted and placed in a new array. A wide variety of clustering strategies were available, such as agglomerative clustering, divisive clustering, K-means clustering and K-medoids, of which K-means clustering was used.
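The nine steps of "Working of the overall system" above can be sketched as a compact, self-contained function. This is an illustrative sketch, not the submitted scripts: the function name `mean_shift_labels`, the parameters `h` and `iter_tol`, and the toy feature matrix are all assumptions introduced here for clarity.

```python
import numpy as np

def mean_shift_labels(M, h=30.0, iter_tol=0.01, seed=0):
    """Flat-kernel mean shift over the rows of an [n,5] feature matrix M
    (R, G, B, i, j): pick a random seed row, average all rows within
    distance h of the current mean, repeat until the mean moves by less
    than iter_tol, then label the window and remove it from play."""
    rng = np.random.default_rng(seed)
    n = len(M)
    labels = np.full(n, -1)          # -1 marks rows not yet converged
    modes = []
    remaining = np.arange(n)
    while remaining.size:
        mean = M[rng.choice(remaining)].astype(float)   # step 4: random seed
        while True:
            # Step 5-6: Euclidean distance to every remaining row, keep those within h.
            d = np.linalg.norm(M[remaining] - mean, axis=1)
            window = remaining[d < h]
            if window.size == 0:                         # degenerate case: fall back
                window = remaining[np.argmin(d):np.argmin(d) + 1]
            # Step 7: new mean is the column-wise average of the window.
            new_mean = M[window].mean(axis=0)
            # Step 8: converged when the mean shift is below iter_tol.
            if np.linalg.norm(new_mean - mean) < iter_tol:
                break
            mean = new_mean                              # step 9: iterate
        labels[window] = len(modes)                      # assign converged pixels
        modes.append(new_mean)
        remaining = remaining[labels[remaining] == -1]   # eliminate marked rows
    return labels, np.array(modes)
```

On two well-separated blobs of rows this assigns one label per blob and returns one mode per blob, mirroring how the scripts colour each converged window with its mean value.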
The mean shift segmentation has the following applications:
• Clustering
• Smoothing
• Tracking
Following are the strengths and weaknesses of the implemented algorithm:
Strengths
• The algorithm doesn't assume any prior shape of the data clusters.
• It does not require estimating the probability density function, which reduces complexity by a huge margin.
• The code can handle arbitrary feature spaces.
• The code, when adapted to run in real time, would be platform independent.
Weaknesses
• The window size (bandwidth selection) is not trivial.
• An inappropriate window size can cause modes to be merged, or generate additional "shallow" modes → use an adaptive window size.
Lessons learned
We learned about various concepts in Computer Vision and Image Processing, such as image formation and its properties, basic image understanding techniques, mathematical representation of images, image processing fundamentals, basic image segmentation techniques, and mathematical morphology for shape analysis. From the homework problems we learned various image processing techniques such as image convolution and image histogram equalization.
While working on this project we learned to program in Python, including functions such as imread for reading an input image and imshow for displaying the output, libraries such as scipy, numpy, sys, random and opencv, and the subtle differences between Python and contemporary languages such as MATLAB. Working on the project made our understanding of mean shift segmentation, mode detection and discontinuity-preserving filtering crystal clear.
Acknowledgement
We are very thankful to our Professor Dr. Chang Wen Cheng for teaching us Computer Vision and Image Processing and how to do mean shift.
This project could not have been successfully completed without the help of our teaching assistant Radhakrishna Dasari, who was always ready to help with any doubts related to our project.
References:
https://en.wikipedia.org/wiki/Mean_shift#Applications
http://luthuli.cs.uiuc.edu/~daf/courses/CS-498-DAF-PS/Segmentation.pdf
http://comaniciu.net/Papers/RobustAnalysisFeatureSpaces.pdf
http://home.ku.edu.tr/mehyilmaz/public_html/mean-shift/00400568.pdf
https://saravananthirumuruganathan.wordpress.com/2010/04/01/introduction-to-mean-shift-algorithm/
Lecture slides.
--------------------------------------------------------------------------------
/meanshift.with.boundary.extra.20.py:
--------------------------------------------------------------------------------
import numpy as np
import cv2
import random
import sys

if len(sys.argv) < 2:
    print('Error: no input image given')
    sys.exit(1)

Input_Image = sys.argv[1]

# Load the input image as a 3-channel colour image.
K = cv2.imread(Input_Image, 1)

row = K.shape[0]
col = K.shape[1]

J = row * col
Size = row, col, 3
R = np.zeros(Size, dtype=np.uint8)  # output (segmented) image
D = np.zeros((J, 5))                # feature space: colour (3) + position (2) per pixel

# Epanechnikov kernel bandwidths (range and spatial); defined but not used below.
Hr = 30
Hs = 2

iter = 0.01       # convergence criterion for the mean shift
threshold = 30    # window radius in the joint feature space

counter = 0
current_mean_random = True
current_mean_arr = np.zeros((1, 5))
below_threshold_arr = []

# Convert the image K[row][col] into a feature space D of dimensions [row*col][5].
for i in range(row):
    for j in range(col):
        arr = K[i][j]
        for k in range(5):
            if k <= 2:
                D[counter][k] = arr[k]   # colour channels
            elif k == 3:
                D[counter][k] = i        # row index
            else:
                D[counter][k] = j        # column index
        counter += 1

while len(D) > 0:
    print(len(D))
    # Select a random row from the feature space and assign it as the current mean.
    if current_mean_random:
        current_mean = random.randint(0, len(D) - 1)
        for i in range(5):
            current_mean_arr[0][i] = D[current_mean][i]
    below_threshold_arr = []
    for i in range(len(D)):
        ecl_dist = 0
        # Weighted distance of the current mean to every row: rows adjacent to the
        # seed index are down-weighted by 0.75; for all other rows the pixel value
        # is multiplied by 0, so only the mean's own magnitude contributes.
        for j in range(5):
            if abs(i - current_mean) == 1:
                ecl_dist += (current_mean_arr[0][j] - D[i][j] * 0.75) ** 2
            else:
                ecl_dist += (current_mean_arr[0][j] - D[i][j] * 0) ** 2
        ecl_dist = ecl_dist ** 0.5
        # Rows whose distance falls within the threshold join the current window.
        if ecl_dist < threshold:
            below_threshold_arr.append(i)

    if not below_threshold_arr:
        # Guard: the window can be empty under this weighting scheme; re-seed
        # instead of dividing by zero below.
        current_mean_random = True
        continue

    mean_R = 0
    mean_G = 0
    mean_B = 0
    mean_i = 0
    mean_j = 0
    current_mean = 0

    # Average the colour values and index positions of all rows in the window.
    for i in range(len(below_threshold_arr)):
        mean_R += D[below_threshold_arr[i]][0]
        mean_G += D[below_threshold_arr[i]][1]
        mean_B += D[below_threshold_arr[i]][2]
        mean_i += D[below_threshold_arr[i]][3]
        mean_j += D[below_threshold_arr[i]][4]

    mean_R = mean_R / len(below_threshold_arr)
    mean_G = mean_G / len(below_threshold_arr)
    mean_B = mean_B / len(below_threshold_arr)
    mean_i = mean_i / len(below_threshold_arr)
    mean_j = mean_j / len(below_threshold_arr)

    # Distance of the new mean from the current mean, compared against iter.
    mean_e_distance = ((mean_R - current_mean_arr[0][0]) ** 2 +
                       (mean_G - current_mean_arr[0][1]) ** 2 +
                       (mean_B - current_mean_arr[0][2]) ** 2 +
                       (mean_i - current_mean_arr[0][3]) ** 2 +
                       (mean_j - current_mean_arr[0][4]) ** 2) ** 0.5

    # mean_i and mean_j can be fractional and need not correspond to an actual
    # pixel, so the converged mean colour is painted onto every pixel in the window.
    if mean_e_distance < iter:
        new_arr = np.array([mean_R, mean_G, mean_B])
        for i in range(len(below_threshold_arr)):
            R[int(D[below_threshold_arr[i]][3])][int(D[below_threshold_arr[i]][4])] = new_arr
            # Mark the rows that have been coloured so they are not used again.
            D[below_threshold_arr[i]][0] = -1
        current_mean_random = True

        # Rebuild D without the rows that have just been coloured.
        new_D = np.zeros((len(D), 5))
        counter_i = 0
        for i in range(len(D)):
            if D[i][0] != -1:
                new_D[counter_i] = D[i]
                counter_i += 1
        D = new_D[:counter_i]
    else:
        # Not converged: the new mean becomes the current mean.
        current_mean_random = False
        current_mean_arr[0][0] = mean_R
        current_mean_arr[0][1] = mean_G
        current_mean_arr[0][2] = mean_B
        current_mean_arr[0][3] = mean_i
        current_mean_arr[0][4] = mean_j

cv2.imshow("finalImage", R)
cv2.waitKey(0)
--------------------------------------------------------------------------------
/meanshift.without.boundary.py:
--------------------------------------------------------------------------------
import numpy as np
import cv2
import random
import sys

if len(sys.argv) < 2:
    print('Error: no input image given')
    sys.exit(1)

Input_Image = sys.argv[1]

# Load the input image as a 3-channel colour image.
K = cv2.imread(Input_Image, 1)

row = K.shape[0]
col = K.shape[1]

J = row * col
Size = row, col, 3
R = np.zeros(Size, dtype=np.uint8)  # output (segmented) image
D = np.zeros((J, 5))                # feature space: colour (3) + position (2) per pixel

counter = 0
iter = 1.0        # convergence criterion for the mean shift
threshold = 30    # window radius in the joint feature space
current_mean_random = True
current_mean_arr = np.zeros((1, 5))
below_threshold_arr = []

# Convert the image K[row][col] into a feature space D of dimensions [row*col][5].
for i in range(row):
    for j in range(col):
        arr = K[i][j]
        for k in range(5):
            if k <= 2:
                D[counter][k] = arr[k]   # colour channels
            elif k == 3:
                D[counter][k] = i        # row index
            else:
                D[counter][k] = j        # column index
        counter += 1

while len(D) > 0:
    print(len(D))
    # Select a random row from the feature space and assign it as the current mean.
    if current_mean_random:
        current_mean = random.randint(0, len(D) - 1)
        for i in range(5):
            current_mean_arr[0][i] = D[current_mean][i]
    below_threshold_arr = []
    for i in range(len(D)):
        ecl_dist = 0
        # Euclidean distance of the current mean to every other row.
        for j in range(5):
            ecl_dist += (current_mean_arr[0][j] - D[i][j]) ** 2
        ecl_dist = ecl_dist ** 0.5
        # Rows whose distance falls within the threshold join the current window.
        if ecl_dist < threshold:
            below_threshold_arr.append(i)

    if not below_threshold_arr:
        # Guard: after the mean has moved, the window can in rare cases be
        # empty; re-seed instead of dividing by zero below.
        current_mean_random = True
        continue

    mean_R = 0
    mean_G = 0
    mean_B = 0
    mean_i = 0
    mean_j = 0
    current_mean = 0

    # Average the colour values and index positions of all rows in the window.
    for i in range(len(below_threshold_arr)):
        mean_R += D[below_threshold_arr[i]][0]
        mean_G += D[below_threshold_arr[i]][1]
        mean_B += D[below_threshold_arr[i]][2]
        mean_i += D[below_threshold_arr[i]][3]
        mean_j += D[below_threshold_arr[i]][4]

    mean_R = mean_R / len(below_threshold_arr)
    mean_G = mean_G / len(below_threshold_arr)
    mean_B = mean_B / len(below_threshold_arr)
    mean_i = mean_i / len(below_threshold_arr)
    mean_j = mean_j / len(below_threshold_arr)

    # Distance of the new mean from the current mean, compared against iter.
    mean_e_distance = ((mean_R - current_mean_arr[0][0]) ** 2 +
                       (mean_G - current_mean_arr[0][1]) ** 2 +
                       (mean_B - current_mean_arr[0][2]) ** 2 +
                       (mean_i - current_mean_arr[0][3]) ** 2 +
                       (mean_j - current_mean_arr[0][4]) ** 2) ** 0.5

    # mean_i and mean_j can be fractional and need not correspond to an actual
    # pixel, so the converged mean colour is painted onto every pixel in the window.
    if mean_e_distance < iter:
        new_arr = np.array([mean_R, mean_G, mean_B])
        for i in range(len(below_threshold_arr)):
            R[int(D[below_threshold_arr[i]][3])][int(D[below_threshold_arr[i]][4])] = new_arr
            # Mark the rows that have been coloured so they are not used again.
            D[below_threshold_arr[i]][0] = -1
        current_mean_random = True

        # Rebuild D without the rows that have just been coloured.
        new_D = np.zeros((len(D), 5))
        counter_i = 0
        for i in range(len(D)):
            if D[i][0] != -1:
                new_D[counter_i] = D[i]
                counter_i += 1
        D = new_D[:counter_i]
    else:
        # Not converged: the new mean becomes the current mean.
        current_mean_random = False
        current_mean_arr[0][0] = mean_R
        current_mean_arr[0][1] = mean_G
        current_mean_arr[0][2] = mean_B
        current_mean_arr[0][3] = mean_i
        current_mean_arr[0][4] = mean_j

cv2.imshow("finalImage", R)
cv2.waitKey(0)
--------------------------------------------------------------------------------
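A note on the distance pass in the scripts above: the inner loop visits each of the row*col feature rows in Python, which is the main bottleneck on a 512*512 image (262144 rows per mean update). A sketch of a vectorised equivalent using NumPy broadcasting; the helper name `rows_within_threshold` is hypothetical and not part of either script:

```python
import numpy as np

def rows_within_threshold(D, current_mean, threshold=30.0):
    """Return the indices of all rows of the [n,5] feature matrix D whose
    Euclidean distance to current_mean is below threshold, computed in one
    broadcasted array operation instead of a per-row Python loop."""
    dists = np.sqrt(((D - current_mean) ** 2).sum(axis=1))
    return np.where(dists < threshold)[0]
```

The returned index array plays the same role as the `below_threshold_arr` list in the scripts.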