├── boykovKolmogorov.py ├── .gitignore ├── images ├── baby.jpg ├── test1.jpg ├── test2.jpg ├── test3.jpg ├── babycut.jpg ├── babyseeded.jpg ├── test1cut.jpg ├── test2cut.jpg ├── test2seeds.jpg ├── test3cut.jpg ├── usageBKG.png ├── usageOBJ.png ├── imagetograph.png ├── test1seeded.jpg ├── test2seeded.jpg ├── test3seeded.jpg └── usageSegmentation.png ├── README.md ├── augmentingPath.py ├── style.css ├── pushRelabel.py ├── imagesegmentation.py └── index.html /boykovKolmogorov.py: -------------------------------------------------------------------------------- 1 | def boykovKolmogorov(): 2 | pass -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | *.cpp 2 | *.pyc 3 | *.npy 4 | BSDS300/ 5 | deprecated/ -------------------------------------------------------------------------------- /images/baby.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/julie-jiang/image-segmentation/HEAD/images/baby.jpg -------------------------------------------------------------------------------- /images/test1.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/julie-jiang/image-segmentation/HEAD/images/test1.jpg -------------------------------------------------------------------------------- /images/test2.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/julie-jiang/image-segmentation/HEAD/images/test2.jpg -------------------------------------------------------------------------------- /images/test3.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/julie-jiang/image-segmentation/HEAD/images/test3.jpg -------------------------------------------------------------------------------- 
/images/babycut.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/julie-jiang/image-segmentation/HEAD/images/babycut.jpg -------------------------------------------------------------------------------- /images/babyseeded.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/julie-jiang/image-segmentation/HEAD/images/babyseeded.jpg -------------------------------------------------------------------------------- /images/test1cut.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/julie-jiang/image-segmentation/HEAD/images/test1cut.jpg -------------------------------------------------------------------------------- /images/test2cut.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/julie-jiang/image-segmentation/HEAD/images/test2cut.jpg -------------------------------------------------------------------------------- /images/test2seeds.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/julie-jiang/image-segmentation/HEAD/images/test2seeds.jpg -------------------------------------------------------------------------------- /images/test3cut.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/julie-jiang/image-segmentation/HEAD/images/test3cut.jpg -------------------------------------------------------------------------------- /images/usageBKG.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/julie-jiang/image-segmentation/HEAD/images/usageBKG.png -------------------------------------------------------------------------------- /images/usageOBJ.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/julie-jiang/image-segmentation/HEAD/images/usageOBJ.png -------------------------------------------------------------------------------- /images/imagetograph.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/julie-jiang/image-segmentation/HEAD/images/imagetograph.png -------------------------------------------------------------------------------- /images/test1seeded.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/julie-jiang/image-segmentation/HEAD/images/test1seeded.jpg -------------------------------------------------------------------------------- /images/test2seeded.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/julie-jiang/image-segmentation/HEAD/images/test2seeded.jpg -------------------------------------------------------------------------------- /images/test3seeded.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/julie-jiang/image-segmentation/HEAD/images/test3seeded.jpg -------------------------------------------------------------------------------- /images/usageSegmentation.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/julie-jiang/image-segmentation/HEAD/images/usageSegmentation.png -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Image Segmentation 2 | 3 | ## Website 4 | 5 | For a detailed report, please visit https://julie-jiang.github.io/image-segmentation/. 
6 | 7 | ## Usage 8 | ``` 9 | python imagesegmentation.py filename 10 | ``` 11 | 12 | A new window will pop up showing your image. Use your cursor to mark object seeds, which would be shown in red. Once you're done, press `esc`. Then do the same to mark background seeds, which would be shown in green. 13 | 14 | ## Dependencies 15 | 16 | - Python 2 17 | - OpenCV 3.3 18 | - NumPy 19 | 20 | ## Examples 21 | 22 | 1. `test1.jpg` 23 | 24 | Original, seeded, and segmented image 25 | 26 |    27 | 28 | 2. `test2.jpg` 29 | 30 | Original, seeded, and segmented image 31 | 32 |  33 | 34 | 3. `test3.jpg` 35 | 36 | Original, seeded, and segmented image 37 | 38 |  39 | 40 | 41 | 4. `baby.jpg` 42 | 43 | Original, seeded, and segmented image 44 | 45 |  46 | 47 | 48 | 49 | -------------------------------------------------------------------------------- /augmentingPath.py: -------------------------------------------------------------------------------- 1 | from Queue import Queue 2 | import numpy as np 3 | 4 | def bfs(rGraph, V, s, t, parent): 5 | q = Queue() 6 | visited = np.zeros(V, dtype=bool) 7 | q.put(s) 8 | visited[s] = True 9 | parent[s] = -1 10 | 11 | while not q.empty(): 12 | u = q.get() 13 | for v in xrange(V): 14 | if (not visited[v]) and rGraph[u][v] > 0: 15 | q.put(v) 16 | parent[v] = u 17 | visited[v] = True 18 | return visited[v] 19 | 20 | def dfs(rGraph, V, s, visited): 21 | stack = [s] 22 | while stack: 23 | v = stack.pop() 24 | if not visited[v]: 25 | visited[v] = True 26 | stack.extend([u for u in xrange(V) if rGraph[v][u]]) 27 | 28 | 29 | 30 | def augmentingPath(graph, s, t): 31 | print "Running augmenting path algorithm" 32 | rGraph = graph.copy() 33 | V = len(graph) 34 | parent = np.zeros(V, dtype='int32') 35 | 36 | while bfs(rGraph, V, s, t, parent): 37 | pathFlow = float("inf") 38 | v = t 39 | while v != s: 40 | u = parent[v] 41 | pathFlow = min(pathFlow, rGraph[u][v]) 42 | v = parent[v] 43 | 44 | v = t 45 | while v != s: 46 | u = parent[v] 47 | rGraph[u][v] -= 
pathFlow 48 | rGraph[v][u] += pathFlow 49 | v = parent[v] 50 | 51 | visited = np.zeros(V, dtype=bool) 52 | dfs(rGraph, V, s, visited) 53 | 54 | cuts = [] 55 | 56 | for i in xrange(V): 57 | for j in xrange(V): 58 | if visited[i] and not visited[j] and graph[i][j]: 59 | cuts.append((i, j)) 60 | return cuts 61 | -------------------------------------------------------------------------------- /style.css: -------------------------------------------------------------------------------- 1 | @import url('https://fonts.googleapis.com/css?family=Open+Sans:700|Lato:400|Lora:400,700|Inconsolata|Oswald'); 2 | body { 3 | margin: auto; 4 | padding: 0; 5 | /*width: 100;*/ 6 | } 7 | .contents{ 8 | /*width: 90%;*/ 9 | margin-left: 25%; 10 | margin-right: 15%; 11 | width: 850px; 12 | margin-bottom: 500px; 13 | } 14 | 15 | .contents h1 { 16 | font-family: "Open Sans", Helvetica, sans-serif; 17 | font-size: 3em; 18 | border-bottom: solid; 19 | border-bottom-color: #afafaf; 20 | } 21 | 22 | .contents h2 { 23 | font-family: "Lato", Helvetica, sans-serif; 24 | font-size: 2em; 25 | color:#282828; 26 | } 27 | 28 | .contents h3 { 29 | font-family: "PT Sans", sans-serif; 30 | color: #6d6d6d; 31 | font-size: 0.85em; 32 | line-height: 0%; 33 | padding: 0; 34 | margin: 0; 35 | font-weight: normal; 36 | text-decoration: none; 37 | } 38 | 39 | .contents p { 40 | font-family: "Lora", "PT Serif", serif; 41 | font-size: 18px; 42 | line-height: 150%; 43 | color: #282828; 44 | } 45 | 46 | .contents ul li { 47 | font-family: "Lora", "PT Serif", serif; 48 | font-size: 0.9em; 49 | line-height: 150%; 50 | color: #282828; 51 | } 52 | 53 | code { 54 | /*display: block;*/ 55 | background-color: #f2f2f2; 56 | font-family: "Inconsolata", monospace; 57 | font-size: 18px; 58 | } 59 | pre code { 60 | display: block; 61 | padding: 0.5em; 62 | font-family: "Inconsolata", monospace; 63 | font-size: 18px; 64 | } 65 | figure { 66 | display: block; 67 | margin:auto; 68 | } 69 | figure img{ 70 | display: block; 71 | 
margin:auto; 72 | } 73 | figure figcaption{ 74 | display: block; 75 | margin: auto; 76 | font-size:0.8em; 77 | color: #282828; 78 | text-align: center; 79 | width: 70%; 80 | font-family:"Lora", "PT Serif", serif; 81 | } 82 | 83 | 84 | .nav{ 85 | width: 20%; 86 | max-width: 150px; 87 | height: 100%; 88 | position: fixed; 89 | z-index: 1; 90 | top: 0; 91 | padding-top: 10%; 92 | padding-left: 5%; 93 | padding-right: 2%; 94 | } 95 | 96 | 97 | .nav a { 98 | display: block; 99 | transition: 0.3s; 100 | font-family: "Oswald", Helvetica, sans-serif; 101 | font-size: 1em; 102 | color:#282828; 103 | line-height:50px; 104 | text-decoration: none; 105 | } 106 | 107 | .nav a::hover{ 108 | color:#999999; 109 | text-decoration: underline; 110 | } 111 | 112 | #usagePics img{ 113 | float: left; 114 | padding: 1%; 115 | width: 28%; 116 | min-width: 250px; 117 | } 118 | .results figure { 119 | display: block; 120 | position: absolute; 121 | } 122 | 123 | .results img { 124 | float: left; 125 | width: 270px; 126 | padding: 5%; 127 | } 128 | 129 | 130 | 131 | 132 | 133 | -------------------------------------------------------------------------------- /pushRelabel.py: -------------------------------------------------------------------------------- 1 | from Queue import Queue 2 | import numpy as np 3 | import sys 4 | 5 | def preFlows(C, F, heights, eflows, s): 6 | # vertices[s,0] = len(vertices) 7 | heights[s] = len(heights) 8 | # Height of the source vertex is equal to the total # of vertices 9 | 10 | # edges[s,:,1] = edges[s,:,0] 11 | F[s,:] = C[s,:] 12 | # Flow of edges from source is equal to their respective capacities 13 | 14 | 15 | for v in xrange(len(C)): 16 | # For every vertex v that has an incoming edge from s 17 | if C[s,v] > 0: 18 | eflows[v] += C[s,v] 19 | # Initialize excess flow for v 20 | C[v,s] = 0 21 | F[v,s] = -C[s,v] 22 | # Set capacity of edge from v to s in residual graph to 0 23 | 24 | # Returns the first vertex that is not the source and not the sink and 25 | # 
has a nonzero excess flow 26 | # If non exists return None 27 | def overFlowVertex(vertices, s, t): 28 | for v in xrange(len(vertices)): 29 | if v != s and v != t and vertices[v,1] > 0 : 30 | return v 31 | return None 32 | 33 | # For a vertex v adjacent to u, we can push if: 34 | # (1) the flow of the edge u -> v is less than its capacity 35 | # (2) height of u > height of v 36 | # Flow is the minimum of the remaining possible flow on this edge 37 | # and the excess flow of u 38 | def push(edges, vertices, u): 39 | for v in xrange(len(edges[u])): 40 | if edges[u,v,1] != edges[u,v,0]: 41 | if vertices[u,0] > vertices[v,0]: 42 | flow = min(edges[u,v,0] - edges[u,v,1], vertices[u,1]) 43 | # print "pushing flow", flow, "from", u, "to", v 44 | vertices[u,1] -= flow 45 | vertices[v,1] += flow 46 | edges[u,v,1] += flow 47 | edges[v,u,1] -= flow 48 | 49 | return True 50 | 51 | return False 52 | 53 | # For a vertex v adjacent to u, we can relabel if 54 | # (1) the flow of the edge u -> v is less than its capacity 55 | # (2) the height of v is less than the minimum height 56 | def relabel(edges, vertices, u): 57 | mh = float("inf") # Minimum height 58 | for v in xrange(len(edges[u])): 59 | if edges[u,v,1] != edges[u,v,0] and vertices[v,0] < mh: 60 | mh = vertices[v,0] 61 | vertices[u,0] = mh + 1 62 | # print "relabeling", u, "with mh", mh + 1 63 | 64 | def dfs(rGraph, V, s, visited): 65 | 66 | stack = [s] 67 | while stack: 68 | v = stack.pop() 69 | if not visited[v]: 70 | visited[v] = True 71 | stack.extend([u for u in xrange(V) if rGraph[v][u] > 0]) 72 | 73 | 74 | 75 | def pushRelabel(C, s, t): 76 | print "Running push relabel algorithm" 77 | def preFlows(): 78 | heights[s] = V 79 | F[s,:] = C[s,:] 80 | for v in xrange(V): 81 | if C[s,v] > 0: 82 | excess[v] = C[s,v] 83 | excess[s] -= C[s,v] 84 | # C[v,s] = 0 85 | F[v,s] = -C[s,v] 86 | def overFlowVertex(): 87 | for v in xrange(V): 88 | if v != s and v != t and excess[v] > 0: 89 | return v 90 | return None 91 | def push(u): 
92 | # print "pushing", u 93 | # assert(excess[u] > 0) 94 | for v in xrange(V): 95 | if C[u,v] > F[u,v] and heights[u] == heights[v] + 1: 96 | flow = min(C[u,v] - F[u,v], excess[u]) 97 | # if C[u,v] > 0: 98 | F[u,v] += flow 99 | 100 | # if C[u,v] == 0: 101 | # F[v,u] -= flow 102 | if C[v,u] > F[v,u]: 103 | F[v,u] -= flow 104 | else: 105 | F[v,u] = 0 106 | C[v,u] = flow 107 | excess[u] -= flow 108 | excess[v] += flow 109 | # F[u,v] += flow 110 | # F[v,u] -= flow 111 | return True 112 | return False 113 | def relabel(u): 114 | # assert(excess[u] > 0) 115 | # print "relabling", u, heights 116 | assert([heights[u] <= heights[v] for v in xrange(V) if C[u,v] > F[u,v]]) 117 | heights[u] = 1 + min([heights[v] for v in xrange(V) if C[u,v] > F[u,v]]) 118 | 119 | 120 | 121 | V = len(C) 122 | F = np.zeros((V, V)) 123 | heights = np.zeros(V) 124 | excess = np.zeros(V) 125 | 126 | 127 | preFlows() 128 | 129 | while True: 130 | u = overFlowVertex() 131 | # print "overflowing vertex is", u 132 | if u == None: break 133 | if not push(u): 134 | relabel(u) 135 | # Max flow is equal to the excess flow of the sink 136 | #return vertices[t,1] 137 | print "Max flow", excess[t] 138 | # print C 139 | # print F 140 | # print C-F 141 | # print heights 142 | # print excess 143 | 144 | 145 | 146 | visited = np.zeros(V, dtype=bool) 147 | dfs(C - F, V, s, visited) 148 | 149 | cuts = [] 150 | 151 | 152 | for u in xrange(V): 153 | for v in xrange(V): 154 | if visited[u] and not visited[v] and C[u,v]: 155 | cuts.append((u, v)) 156 | return cuts 157 | 158 | if __name__ == "__main__": 159 | 160 | # graph = [[0, 16, 13, 0, 0, 0], 161 | # [0, 0, 10, 12, 0, 0], 162 | # [0, 4, 0, 0, 14, 0], 163 | # [0, 0, 9, 0, 0, 20], 164 | # [0, 0, 0, 7, 0, 4], 165 | # [0, 0, 0, 0, 0, 0]] 166 | 167 | graph = [[0, 4, 0, 5, 1, 0, 0], 168 | [4, 0, 4, 0, 10, 0, 0], 169 | [0, 4, 0, 0, 0, 10, 6], 170 | [5, 0, 0, 0, 5, 0, 0], 171 | [1, 0, 0, 5, 0, 5, 0], 172 | [0, 0, 10, 0, 5, 0, 4], 173 | [0, 0, 6, 0, 0, 4, 0]] 174 | 175 | 
print pushRelabel(np.asarray(graph), 0, 6) 176 | -------------------------------------------------------------------------------- /imagesegmentation.py: -------------------------------------------------------------------------------- 1 | from __future__ import division 2 | import cv2 3 | import numpy as np 4 | import os 5 | import sys 6 | import argparse 7 | from math import exp, pow 8 | from augmentingPath import augmentingPath 9 | from pushRelabel import pushRelabel 10 | from boykovKolmogorov import boykovKolmogorov 11 | 12 | # np.set_printoptions(threshold=np.inf) 13 | graphCutAlgo = {"ap": augmentingPath, 14 | "pr": pushRelabel, 15 | "bk": boykovKolmogorov} 16 | SIGMA = 30 17 | # LAMBDA = 1 18 | OBJCOLOR, BKGCOLOR = (0, 0, 255), (0, 255, 0) 19 | OBJCODE, BKGCODE = 1, 2 20 | OBJ, BKG = "OBJ", "BKG" 21 | 22 | CUTCOLOR = (0, 0, 255) 23 | 24 | SOURCE, SINK = -2, -1 25 | SF = 10 26 | LOADSEEDS = False 27 | # drawing = False 28 | 29 | def show_image(image): 30 | windowname = "Segmentation" 31 | cv2.namedWindow(windowname, cv2.WINDOW_NORMAL) 32 | cv2.startWindowThread() 33 | cv2.imshow(windowname, image) 34 | cv2.waitKey(0) 35 | cv2.destroyAllWindows() 36 | 37 | def plantSeed(image): 38 | 39 | def drawLines(x, y, pixelType): 40 | if pixelType == OBJ: 41 | color, code = OBJCOLOR, OBJCODE 42 | else: 43 | color, code = BKGCOLOR, BKGCODE 44 | cv2.circle(image, (x, y), radius, color, thickness) 45 | cv2.circle(seeds, (x // SF, y // SF), radius // SF, code, thickness) 46 | 47 | def onMouse(event, x, y, flags, pixelType): 48 | global drawing 49 | if event == cv2.EVENT_LBUTTONDOWN: 50 | drawing = True 51 | drawLines(x, y, pixelType) 52 | elif event == cv2.EVENT_MOUSEMOVE and drawing: 53 | drawLines(x, y, pixelType) 54 | elif event == cv2.EVENT_LBUTTONUP: 55 | drawing = False 56 | 57 | def paintSeeds(pixelType): 58 | print "Planting", pixelType, "seeds" 59 | global drawing 60 | drawing = False 61 | windowname = "Plant " + pixelType + " seeds" 62 | cv2.namedWindow(windowname, 
cv2.WINDOW_AUTOSIZE) 63 | cv2.setMouseCallback(windowname, onMouse, pixelType) 64 | while (1): 65 | cv2.imshow(windowname, image) 66 | if cv2.waitKey(1) & 0xFF == 27: 67 | break 68 | cv2.destroyAllWindows() 69 | 70 | 71 | seeds = np.zeros(image.shape, dtype="uint8") 72 | image = cv2.cvtColor(image, cv2.COLOR_GRAY2RGB) 73 | image = cv2.resize(image, (0, 0), fx=SF, fy=SF) 74 | 75 | radius = 10 76 | thickness = -1 # fill the whole circle 77 | global drawing 78 | drawing = False 79 | 80 | 81 | paintSeeds(OBJ) 82 | paintSeeds(BKG) 83 | return seeds, image 84 | 85 | 86 | 87 | # Large when ip - iq < sigma, and small otherwise 88 | def boundaryPenalty(ip, iq): 89 | bp = 100 * exp(- pow(int(ip) - int(iq), 2) / (2 * pow(SIGMA, 2))) 90 | return bp 91 | 92 | def buildGraph(image): 93 | V = image.size + 2 94 | graph = np.zeros((V, V), dtype='int32') 95 | K = makeNLinks(graph, image) 96 | seeds, seededImage = plantSeed(image) 97 | makeTLinks(graph, seeds, K) 98 | return graph, seededImage 99 | 100 | def makeNLinks(graph, image): 101 | K = -float("inf") 102 | r, c = image.shape 103 | for i in xrange(r): 104 | for j in xrange(c): 105 | x = i * c + j 106 | if i + 1 < r: # pixel below 107 | y = (i + 1) * c + j 108 | bp = boundaryPenalty(image[i][j], image[i + 1][j]) 109 | graph[x][y] = graph[y][x] = bp 110 | K = max(K, bp) 111 | if j + 1 < c: # pixel to the right 112 | y = i * c + j + 1 113 | bp = boundaryPenalty(image[i][j], image[i][j + 1]) 114 | graph[x][y] = graph[y][x] = bp 115 | K = max(K, bp) 116 | return K 117 | 118 | 119 | 120 | def makeTLinks(graph, seeds, K): 121 | r, c = seeds.shape 122 | 123 | for i in xrange(r): 124 | for j in xrange(c): 125 | x = i * c + j 126 | if seeds[i][j] == OBJCODE: 127 | # graph[x][source] = K 128 | graph[SOURCE][x] = K 129 | elif seeds[i][j] == BKGCODE: 130 | graph[x][SINK] = K 131 | # graph[sink][x] = K 132 | # else: 133 | # graph[x][source] = LAMBDA * regionalPenalty(image[i][j], BKG) 134 | # graph[x][sink] = LAMBDA * 
regionalPenalty(image[i][j], OBJ) 135 | 136 | 137 | 138 | def displayCut(image, cuts): 139 | def colorPixel(i, j): 140 | image[i][j] = CUTCOLOR 141 | 142 | r, c = image.shape 143 | image = cv2.cvtColor(image, cv2.COLOR_GRAY2RGB) 144 | for c in cuts: 145 | if c[0] != SOURCE and c[0] != SINK and c[1] != SOURCE and c[1] != SINK: 146 | colorPixel(c[0] // r, c[0] % r) 147 | colorPixel(c[1] // r, c[1] % r) 148 | return image 149 | 150 | 151 | 152 | def imageSegmentation(imagefile, size=(30, 30), algo="ff"): 153 | pathname = os.path.splitext(imagefile)[0] 154 | image = cv2.imread(imagefile, cv2.IMREAD_GRAYSCALE) 155 | image = cv2.resize(image, size) 156 | graph, seededImage = buildGraph(image) 157 | cv2.imwrite(pathname + "seeded.jpg", seededImage) 158 | 159 | global SOURCE, SINK 160 | SOURCE += len(graph) 161 | SINK += len(graph) 162 | 163 | cuts = graphCutAlgo[algo](graph, SOURCE, SINK) 164 | print "cuts:" 165 | print cuts 166 | image = displayCut(image, cuts) 167 | image = cv2.resize(image, (0, 0), fx=SF, fy=SF) 168 | show_image(image) 169 | savename = pathname + "cut.jpg" 170 | cv2.imwrite(savename, image) 171 | print "Saved image as", savename 172 | 173 | 174 | def parseArgs(): 175 | def algorithm(string): 176 | if string in graphCutAlgo: 177 | return string 178 | raise argparse.ArgumentTypeError( 179 | "Algorithm should be one of the following:", graphCutAlgo.keys()) 180 | 181 | parser = argparse.ArgumentParser() 182 | parser.add_argument("imagefile") 183 | parser.add_argument("--size", "-s", 184 | default=30, type=int, 185 | help="Defaults to 30x30") 186 | parser.add_argument("--algo", "-a", default="ap", type=algorithm) 187 | return parser.parse_args() 188 | 189 | if __name__ == "__main__": 190 | 191 | args = parseArgs() 192 | imageSegmentation(args.imagefile, (args.size, args.size), args.algo) 193 | 194 | 195 | 196 | 197 | 198 | 199 | -------------------------------------------------------------------------------- /index.html: 
-------------------------------------------------------------------------------- 1 | 2 | 3 |
4 |Update in 2022: This project was done as a class project for an algorithms class when I was an undergrad at Tufts University in 2017, back when I knew next to nothing about rigorous research methodologies and academic writing. Over the years, this project has surprisingly attracted some attention, so I've minimally edited this page to correct grammar mistakes and typos. Enjoy! -Julie
28 | 29 |Graph algorithms have been successfully applied to a number of computer vision and image processing problems. Our interest is in the application of graph cut algorithms to the problem of image segmentation.
32 |33 | This project focuses on using graph cuts to divide an image into background and foreground segments. The framework consists of two parts. First, a network flow graph is built based on the input image. Then a max-flow algorithm is run on the graph in order to find the min-cut, which produces the optimal segmentation. 34 |
35 |A network flow $G=(V,E)$ is a graph where each edge has a capacity and a flow. Two vertices in the network flow are designated to be the source vertex $s$ and the sink vertex $t$, respectively. The goal is to find the maximum amount of flow that could be delivered from $s$ to $t$, while satisfying the following constraints.
37 |38 |
 The value of a flow is the net amount of flow leaving $s$, which, by conservation of flow, is equal to the net amount of flow entering $t$. An $s/t$ cut is a partitioning of the vertices into two disjoint subsets $S$ and $T$ such that $S$ contains $s$ and $T$ contains $t$. The value (capacity) of an $s/t$ cut is the total capacity of the edges that go from $S$ to $T$. 49 |
50 |51 | As stated by the max-flow min-cut theorem, the maximum amount of flow that can be sent from the source to the sink is equal to the total capacity of the edges crossing a minimum cut. So by solving the max-flow problem, we also solve the min-cut problem. We will discuss algorithms for finding the max-flow or min-cut in a later section. 52 |
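To make these definitions concrete, the following is a minimal sketch that checks a flow and compares its value with the capacity of a few cuts. The four-vertex network, the helper names, and the choice to store flows as non-negative numbers per directed edge are illustrative assumptions; none of this comes from the repository.

```python
def is_valid_flow(capacity, flow, s, t):
    """Check the capacity constraint and flow conservation."""
    n = len(capacity)
    for u in range(n):
        for v in range(n):
            if not 0 <= flow[u][v] <= capacity[u][v]:
                return False
    for v in range(n):
        if v in (s, t):
            continue
        inflow = sum(flow[u][v] for u in range(n))
        outflow = sum(flow[v][w] for w in range(n))
        if inflow != outflow:
            return False
    return True


def flow_value(flow, s):
    """Net amount of flow leaving the source."""
    n = len(flow)
    return sum(flow[s][v] for v in range(n)) - sum(flow[u][s] for u in range(n))


def cut_capacity(capacity, source_side):
    """Total capacity of the edges going from the source side to the rest."""
    n = len(capacity)
    return sum(capacity[u][v]
               for u in source_side
               for v in range(n) if v not in source_side)


# Hypothetical network: vertex 0 is the source, vertex 3 is the sink.
capacity = [[0, 3, 2, 0],
            [0, 0, 1, 2],
            [0, 0, 0, 3],
            [0, 0, 0, 0]]
flow = [row[:] for row in capacity]  # here every edge happens to be saturated

valid = is_valid_flow(capacity, flow, 0, 3)
value = flow_value(flow, 0)
min_cut = cut_capacity(capacity, {0})        # a minimum cut
other_cut = cut_capacity(capacity, {0, 2})   # a larger cut
```

In this example the flow value matches the capacity of the cut `{0}`, while the cut `{0, 2}` has a larger capacity, as the theorem predicts: no cut can be smaller than the value of a valid flow.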
53 | 54 |56 | One of the most challenging parts of this project is transforming an image into a graph. In Graph cuts and efficient N-D image segmentation by Boykov and Funka-Lea, the authors describe in great detail how to define a graph based on an image. Our implementation closely follows their construction. For simplicity, we use grayscale square images, although the same idea could be extended to color images with a suitable inter-pixel similarity measure, and also to rectangular images.
57 |58 | In this image-graph transformation scheme, a pixel in the image is a vertex in the graph labeled in row-major order. These vertices are called pixel vertices. In addition, there are two extra vertices that serve as the (invisible) source and the sink. 59 |
60 |61 | There are two types of edges in our graph. The first type is an $n$-link, which connects neighboring pixel vertices in a 4-neighboring system. The second type of edge is called a $t$-link. $t$-links connect the source or sink vertex with the pixel vertices. 62 |
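The vertex numbering and the 4-neighbor $n$-links described above can be sketched as follows. This is a simplified illustration that mirrors the indexing used in `makeNLinks`; the function names `pixel_vertex` and `n_link_pairs` are assumptions introduced here, and the sketch is written for Python 3.

```python
def pixel_vertex(i, j, cols):
    """Row-major vertex index of the pixel at row i, column j."""
    return i * cols + j


def n_link_pairs(rows, cols):
    """List the unordered pixel pairs joined by an n-link (4-neighborhood).

    Each pixel is paired with the pixel below it and the pixel to its right,
    so every neighboring pair appears exactly once.
    """
    pairs = []
    for i in range(rows):
        for j in range(cols):
            x = pixel_vertex(i, j, cols)
            if i + 1 < rows:   # pixel below
                pairs.append((x, pixel_vertex(i + 1, j, cols)))
            if j + 1 < cols:   # pixel to the right
                pairs.append((x, pixel_vertex(i, j + 1, cols)))
    return pairs


pairs_2x3 = n_link_pairs(2, 3)
pairs_30x30 = n_link_pairs(30, 30)
```

For an image with `rows` rows and `cols` columns this yields `rows * (cols - 1) + (rows - 1) * cols` pairs. The source and sink then take the two indices after the last pixel vertex.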
63 |The $n$-link edges must have weights carefully computed in order to reflect inter-pixel similarities. Concretely, we want the weight of an edge to be large when the two pixels are similar, and small when they are quite different. One idea is to compute the weight from a boundary penalty function that maps two pixel intensities to a non-negative integer. Let $I_p$ be the brightness, or intensity, of the pixel vertex $p$. For any edge $(p,q)\in E$, the boundary penalty $B(I_p, I_q)$ is defined as: 64 | $$ B(I_p, I_q) = 100\cdot \exp\Bigg(\frac{-(I_p-I_q)^2}{2\sigma ^2}\Bigg)$$ 65 | 66 |
67 |The value of $\sigma$ was chosen by trial and error; from empirical results, we use $\sigma=30$. The penalty is high when $|I_p-I_q|<\sigma$ and decays rapidly toward zero as $|I_p-I_q|$ grows beyond $\sigma$. Finally, the result is scaled by the factor of 100 and cast to an integer, because our implementation stores edge capacities as integers. 68 |
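As a worked example of the formula, the sketch below evaluates the penalty for a few intensity pairs. It follows `boundaryPenalty` from `imagesegmentation.py`, except that the integer cast is made explicit here (in the repository it happens when the value is stored in an `int32` array); the name `boundary_penalty` and the sample intensities are illustrative assumptions.

```python
from __future__ import division  # keeps "/" a true division on Python 2 as well
from math import exp

SIGMA = 30


def boundary_penalty(ip, iq, sigma=SIGMA):
    """Boundary penalty between two pixel intensities, truncated to an integer."""
    diff = int(ip) - int(iq)
    return int(100 * exp(-(diff ** 2) / (2 * sigma ** 2)))


same = boundary_penalty(100, 100)      # identical pixels
one_sigma = boundary_penalty(100, 130) # difference equal to sigma
two_sigma = boundary_penalty(100, 160) # difference equal to 2 * sigma
extreme = boundary_penalty(0, 255)     # black next to white
```

Identical pixels get the maximum weight of 100, a difference of one $\sigma$ still gives 60, two $\sigma$ gives 13, and the black/white pair is truncated to 0. A cut across a strong intensity edge is therefore much cheaper than a cut through a uniform region.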
69 |To create suitable $t$-links, the user is prompted to mark at least one pixel vertex as a foreground (object) pixel and at least one as a background pixel. These pixel vertices are called seeds. For every foreground seed, an edge is added from the source vertex to the seed with capacity $\mathcal{K}$ defined as follows. 70 | $$ \mathcal {K} = \max(\{B(I_p, I_q)|(p, q)\in E\}) $$ 71 | In a similar fashion, an edge with capacity $\mathcal{K}$ is added from every background seed to the sink vertex. 72 | 73 |
74 |75 | As these seeds share an edge with the source or sink vertices, they are hard-coded to be either the foreground or the background.
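A minimal sketch of the $t$-link step is shown below. It follows `makeTLinks` in `imagesegmentation.py`, where object seeds are tied to the source and background seeds to the sink. The dictionary-based graph, the function name `add_t_links`, and the three-pixel example are assumptions made for illustration; the repository uses a NumPy adjacency matrix instead.

```python
def add_t_links(capacity, seeds, source, sink):
    """Add t-links of capacity K, the largest n-link weight, for every seed.

    capacity maps directed edges (u, v) to integer capacities and must
    already contain the n-links. seeds maps a pixel vertex to "OBJ" or "BKG".
    """
    k = max(capacity.values())
    for pixel, label in seeds.items():
        if label == "OBJ":
            capacity[(source, pixel)] = k   # source -> object seed
        elif label == "BKG":
            capacity[(pixel, sink)] = k     # background seed -> sink
    return k


# Hypothetical 1x3 image with intensities 100, 100, 160:
# the n-link weights are 100 and 13 (stored in both directions).
capacity = {(0, 1): 100, (1, 0): 100, (1, 2): 13, (2, 1): 13}
k = add_t_links(capacity, {0: "OBJ", 2: "BKG"}, source=3, sink=4)
```

With these seeds the cheapest way to separate the source from the sink is to cut the weak $n$-link between pixels 1 and 2, which is exactly the intensity edge.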
76 |
78 | 81 | Now with the graph fully defined, we can run a graph cut algorithm to find the maximum flow/minimum cut. 82 |
84 | There are several algorithms for finding the maximum flow/minimum cut. This project explores the efficiency of the Edmonds-Karp algorithm and the push-relabel algorithm. Though these are very standard and straightforward algorithms, we caution that they are often slow in practice.
85 |Edmonds-Karp Algorithm
86 |87 | The Edmonds-Karp algorithm is an implementation of the Ford-Fulkerson method. First, we define a residual network $G_f=(V,E_f)$ to be the same network but with capacity $c_f(u,v)=c(u,v)-f(u,v)$ and no flow. The idea behind the Ford-Fulkerson method is that if there exists a path from $s$ to $t$ in the residual network, then we can augment the current flow by this path. This path is called the augmenting path. The augmentation of this path is equal to the smallest residual capacity along this path. Once no augmenting path can be found, the current flow must be the max-flow.
88 |while there is a path from s to t in the residual network:
89 | flow = min([residual capacity of every edge along this path])
90 | for every edge (u,v) in this path:
91 | residual capacity of (u,v) -= flow
92 | residual capacity of (v,u) += flow
93 | 94 | The Ford-Fulkerson method is called a method and not an algorithm because it does not specify how one should go about finding the augmenting path. The Edmonds-Karp algorithm specifies that the Ford-Fulkerson method be carried out via a breadth-first search to find a viable path in every iteration.
95 |96 | Since we are interested in the min-cut, we add one additional step after the main loop. Once no augmenting path remains, let $S$ be the set of vertices reachable from $s$ in the residual network and let $T$ contain the rest of the vertices; $(S,T)$ is then a minimum $s/t$ cut. Therefore, we can run a depth-first search from $s$ in the residual network and report every edge $(u,v)$ of the original graph with $u\in S$ and $v\in T$ as a cut edge. 97 |
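The pseudocode and the extra reachability step can be sketched in runnable form as follows. This is a simplified Python 3 illustration on a plain list-of-lists adjacency matrix, not the code from `augmentingPath.py`; the function names are assumptions, and the six-vertex test graph is the one that appears commented out in `pushRelabel.py`.

```python
from collections import deque


def bfs_augmenting_path(residual, s, t):
    """Return a parent list describing a shortest s-t path, or None."""
    parent = [-1] * len(residual)
    parent[s] = s
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v, cap in enumerate(residual[u]):
            if cap > 0 and parent[v] == -1:
                parent[v] = u
                if v == t:
                    return parent
                queue.append(v)
    return None


def edmonds_karp(capacity, s, t):
    """Return (max flow value, list of cut edges) for an adjacency matrix."""
    n = len(capacity)
    residual = [row[:] for row in capacity]
    max_flow = 0
    while True:
        parent = bfs_augmenting_path(residual, s, t)
        if parent is None:
            break
        # Bottleneck: smallest residual capacity along the path.
        bottleneck = float("inf")
        v = t
        while v != s:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Augment along the path.
        v = t
        while v != s:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        max_flow += bottleneck
    # Depth-first search from s in the final residual network.
    reachable = [False] * n
    stack = [s]
    while stack:
        u = stack.pop()
        if not reachable[u]:
            reachable[u] = True
            stack.extend(v for v in range(n) if residual[u][v] > 0)
    cut = [(u, v) for u in range(n) for v in range(n)
           if reachable[u] and not reachable[v] and capacity[u][v] > 0]
    return max_flow, cut


graph = [[0, 16, 13, 0, 0, 0],
         [0, 0, 10, 12, 0, 0],
         [0, 4, 0, 0, 14, 0],
         [0, 0, 9, 0, 0, 20],
         [0, 0, 0, 7, 0, 4],
         [0, 0, 0, 0, 0, 0]]
max_flow, cut = edmonds_karp(graph, 0, 5)
```

For this graph the maximum flow is 23, and the cut edges `(1, 3)`, `(4, 3)`, and `(4, 5)` have capacities 12, 7, and 4, which also sum to 23.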
98 |99 | The Edmonds-Karp algorithm is an $O(VE^2)$ algorithm. Each augmenting path is found by a breadth-first search in $O(E)$ time and has length $O(V)$. Every augmentation saturates at least one edge on the path, and a given edge can be the saturated edge at most $O(V)$ times, because the shortest-path distance from $s$ to its tail increases between consecutive times. This bounds the total number of augmentations by $O(VE)$. The body of the while loop runs in $O(E)$ time, so in total the time complexity is $O(VE^2)$. An accessible proof can be found in CLRS. 100 |
101 |Push Relabel Algorithm
102 |103 | In the Push Relabel algorithm, we maintain a "preflow", which is a flow sent through the network that does not necessarily satisfy flow conservation. In a preflow, the flow entering a vertex can be larger than the flow exiting the vertex. Such a vertex has an excess flow equal to the difference between the flow entering it and the flow exiting it. A vertex with positive excess flow is called an overflowing vertex.
104 |105 | In addition, we augment each vertex with a height attribute. Let $h(u)$ denote the height of a vertex $u$. The heights determine how flow can be pushed: we can only push flow from a higher vertex to a lower vertex (specifically, from $u$ to $v$ when $h(u)=h(v)+1$), and not the other way around. 106 |
107 |108 | We start off by saturating all outgoing edges from $s$. This produces a valid preflow. We also set the height of $s$ to be the number of vertices. All other heights are set to 0. The algorithm centers around two main operations: push and relabel. For a given overflowing vertex $u$, we can push by finding an unsaturated outgoing edge $(u,v)$ in the residual network with $h(u)=h(v)+1$. The amount pushed along this edge is the minimum of the residual capacity of the edge and the excess flow of $u$. Once the flow is pushed, the excess flow at $u$ falls by the amount of the pushed flow, and the excess flow at $v$ increases by the same amount. If no such edge exists, no push is possible, and we instead relabel $u$ by setting the height of $u$ to be 109 | $$h(u)=1+\min\{h(v)|(u,v)\in E_f\}$$ 110 |
111 |while there exists an overflowing vertex u:
112 | push(u) or relabel(u)
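113 | 114 | The Push Relabel algorithm runs in $O(V^2E)$ time, which improves upon the $O(VE^2)$ bound of the Edmonds-Karp algorithm whenever the graph has more edges than vertices. The bound is proved by limiting the number of relabel and push operations that can occur; we omit the proof here.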
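The generic loop above can be sketched in runnable form as follows. This is a simplified Python 3 illustration that stores the flow in skew-symmetric form on a list-of-lists adjacency matrix; it is not the code from `pushRelabel.py`, and the function name `push_relabel` is an assumption. The test graph is the six-vertex example that appears commented out in that file.

```python
def push_relabel(capacity, s, t):
    """Return the maximum flow value from s to t (generic push-relabel)."""
    n = len(capacity)
    flow = [[0] * n for _ in range(n)]   # skew-symmetric: flow[v][u] == -flow[u][v]
    height = [0] * n
    excess = [0] * n

    # Initial preflow: saturate every edge leaving the source.
    height[s] = n
    for v in range(n):
        if capacity[s][v] > 0:
            flow[s][v] = capacity[s][v]
            flow[v][s] = -capacity[s][v]
            excess[v] += capacity[s][v]
            excess[s] -= capacity[s][v]

    def residual(u, v):
        return capacity[u][v] - flow[u][v]

    def push(u):
        """Push along the first admissible edge; report whether one existed."""
        for v in range(n):
            if residual(u, v) > 0 and height[u] == height[v] + 1:
                delta = min(excess[u], residual(u, v))
                flow[u][v] += delta
                flow[v][u] -= delta
                excess[u] -= delta
                excess[v] += delta
                return True
        return False

    def relabel(u):
        # An overflowing vertex always has at least one residual edge.
        height[u] = 1 + min(height[v] for v in range(n) if residual(u, v) > 0)

    while True:
        u = next((v for v in range(n)
                  if v != s and v != t and excess[v] > 0), None)
        if u is None:
            break
        if not push(u):
            relabel(u)
    return excess[t]


graph = [[0, 16, 13, 0, 0, 0],
         [0, 0, 10, 12, 0, 0],
         [0, 4, 0, 0, 14, 0],
         [0, 0, 9, 0, 0, 20],
         [0, 0, 0, 7, 0, 4],
         [0, 0, 0, 0, 0, 0]]
result = push_relabel(graph, 0, 5)
```

When no overflowing vertex remains, the preflow has become a valid flow and the excess collected at the sink is the maximum flow value. Picking the first overflowing vertex, as done here, is the simplest selection rule; other rules give better bounds.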
115 |117 | The source code is available for download at https://github.com/julie-jiang/image-segmentation/. 118 | Make sure you have the following dependencies installed: Python 2, OpenCV 3.3, and NumPy.
119 |126 | To run the program, provide a JPG image and run
127 |$ python imagesegmentation.py yourImage.jpg
128 | 129 | A window will soon pop up prompting you to plant object (foreground) seeds. Use your cursor to mark points or draw lines on the window to label any part of the image as foreground. Once you're done, press ESC to continue. A second window will then pop up prompting you to plant background seeds. As before, press ESC to continue. In about half a minute, the segmented image will be shown. Press ESC when you're ready to quit the program. The seeded image and the segmented image will both be saved.
130 |
132 |
133 |
134 |
137 | The program uses the Edmonds-Karp algorithm by default. To use the Push-Relabel algorithm, add the flag --algo pr (or -a pr).
138 |
141 | As the size of the network flow graph grows quadratically with the number of pixels in the input image, the program should only be used on very small images. Empirically, we found that the program runs reasonably fast (under a minute) on an image of size 30x30 pixels. Therefore, all input images are resized to 30x30 by default at the beginning of the program; the --size flag changes this. The following example images look big because bigger images are easier on the eyes and also because marking pixels using OpenCV is only accurate when the image is big enough. All images are scaled up by a factor of 10 before being displayed or saved.
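The growth can be checked with a little arithmetic. The sketch below assumes the representation used in `buildGraph`: one vertex per pixel plus a source and a sink, stored in a dense `V x V` matrix of 4-byte integers. The function name `graph_memory` is an assumption introduced for this illustration.

```python
def graph_memory(side, bytes_per_entry=4):
    """Vertices, matrix entries, and bytes for a side x side image."""
    pixels = side * side
    vertices = pixels + 2            # pixel vertices plus source and sink
    entries = vertices * vertices    # dense adjacency matrix
    return vertices, entries, entries * bytes_per_entry


small = graph_memory(30)
large = graph_memory(100)
```

A 30x30 image needs 902 vertices and about 3 MB for the matrix, while a 100x100 image already needs 10,002 vertices and about 400 MB, before any algorithm has run.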
142 | 143 |Furthermore, the program may fail to identify the correct cuts. For example, if there aren't enough seeds, then the program may only cut between the source/sink vertex and the pixel vertices--corresponding to no visible cuts in our image--or it may try to cut only around the seeds. One solution is to increase the number of seeds. Another method, introduced by Shi and Malik in Normalized Cuts and Image Segmentation, uses normalized cuts to penalize cuts that only partition a small localized region of the image.
144 | 145 |The following is a series of seeded images and the resulting segmentations.
148 | 149 |
151 |
152 |
155 |
156 |
159 |
160 |