├── assets ├── test.jpg └── noidea.jpg ├── LICENSE ├── README.md ├── utils └── common.py ├── .gitignore └── notebooks ├── 2-Image_stats_and_image_processing.ipynb ├── 3-Features.ipynb └── 1-Fundamentals.ipynb /assets/test.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/computationalcore/introduction-to-opencv/HEAD/assets/test.jpg -------------------------------------------------------------------------------- /assets/noidea.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/computationalcore/introduction-to-opencv/HEAD/assets/noidea.jpg -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2016 Hannah 4 | Copyright (c) 2020 Vin Busquet 5 | 6 | Permission is hereby granted, free of charge, to any person obtaining a copy 7 | of this software and associated documentation files (the "Software"), to deal 8 | in the Software without restriction, including without limitation the rights 9 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 10 | copies of the Software, and to permit persons to whom the Software is 11 | furnished to do so, subject to the following conditions: 12 | 13 | The above copyright notice and this permission notice shall be included in all 14 | copies or substantial portions of the Software. 15 | 16 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 17 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 18 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 19 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 20 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 21 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 22 | SOFTWARE. 23 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Introduction to OpenCV 2 | 3 | This is a collection of Jupyter notebooks that is intended to provide an introduction to OpenCV's Python interface. All notebooks were initially developed and released by [Hannah](https://github.com/handee/opencv-gettingstarted), with some minors changes, code update for python3, and some other customizations provided by me. 4 | 5 | The target audience is broad and includes 6 | 7 | * People who have done computer science (maybe to graduate level) but who have not looked at OpenCV before 8 | * People who are studying other subjects and want to play with computer vision 9 | 10 | ![No idea](https://raw.githubusercontent.com/computationalcore/introduction-to-opencv/master/assets/noidea.jpg "I have no idea") 11 | 12 | ## Notebooks 13 | 14 | The notebooks are divided by the following lessons. 15 | I also provided the estimated time required to complete each lesson, a link to the source code, and the Google Colab link where anyone can use to follow the lessons and run the examples. 
16 | 17 | 18 | | Lesson | Estimated time needed | Source Code | Colab | 19 | | ------------- |:---------------------:| :-----------:| -----:| 20 | | OpenCV fundamentals | 20 min | [Open](https://github.com/computationalcore/introduction-to-opencv/blob/master/notebooks/1-Fundamentals.ipynb) | [Open](https://colab.research.google.com/github/computationalcore/introduction-to-opencv/blob/master/notebooks/1-Fundamentals.ipynb) | 21 | | Image stats and image processing | 20 min | [Open](https://github.com/computationalcore/introduction-to-opencv/blob/master/notebooks/2-Image_stats_and_image_processing.ipynb) | [Open](https://colab.research.google.com/github/computationalcore/introduction-to-opencv/blob/master/notebooks/2-Image_stats_and_image_processing.ipynb) | 22 | | Features in computer vision | 20 min | [Open](https://github.com/computationalcore/introduction-to-opencv/blob/master/notebooks/3-Features.ipynb) | [Open](https://colab.research.google.com/github/computationalcore/introduction-to-opencv/blob/master/notebooks/3-Features.ipynb) | 23 | | Cascade Classification | 20 min | [Open](https://github.com/computationalcore/introduction-to-opencv/blob/master/notebooks/4-Cascade_classification.ipynb) | [Open](https://colab.research.google.com/github/computationalcore/introduction-to-opencv/blob/master/notebooks/4-Cascade_classification.ipynb) | 24 | | **Total** | **80 min** | | | 25 | 26 | ## License 27 | 28 | This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details. 29 | 30 | 31 | ## Acknowledgments 32 | * [Hannah](https://github.com/handee) 33 | * [HCC Summer School in Bremen](http://hcc.uni-bremen.de/school2016/) 34 | * [OpenCV](https://opencv.org/) 35 | * [Raph Trajano](https://github.com/raphtrajano) 36 | * [Nelson Perez](https://github.com/bilthon) 37 | 38 | And a special thanks to [Raph Trajano](https://github.com/raphtrajano) for reviewing and fixing the materials. 
39 | -------------------------------------------------------------------------------- /utils/common.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import cv2 3 | import os 4 | from contextlib import contextmanager 5 | import itertools as it 6 | from functools import reduce 7 | 8 | image_extensions = ['.bmp', '.jpg', '.jpeg', '.png', '.tif', '.tiff', '.pbm', '.pgm', '.ppm'] 9 | 10 | class Bunch(object): 11 | def __init__(self, **kw): 12 | self.__dict__.update(kw) 13 | def __str__(self): 14 | return str(self.__dict__) 15 | 16 | def splitfn(fn): 17 | path, fn = os.path.split(fn) 18 | name, ext = os.path.splitext(fn) 19 | return path, name, ext 20 | 21 | def anorm2(a): 22 | return (a*a).sum(-1) 23 | def anorm(a): 24 | return np.sqrt( anorm2(a) ) 25 | 26 | def homotrans(H, x, y): 27 | xs = H[0, 0]*x + H[0, 1]*y + H[0, 2] 28 | ys = H[1, 0]*x + H[1, 1]*y + H[1, 2] 29 | s = H[2, 0]*x + H[2, 1]*y + H[2, 2] 30 | return xs/s, ys/s 31 | 32 | def to_rect(a): 33 | a = np.ravel(a) 34 | if len(a) == 2: 35 | a = (0, 0, a[0], a[1]) 36 | return np.array(a, np.float64).reshape(2, 2) 37 | 38 | def rect2rect_mtx(src, dst): 39 | src, dst = to_rect(src), to_rect(dst) 40 | cx, cy = (dst[1] - dst[0]) / (src[1] - src[0]) 41 | tx, ty = dst[0] - src[0] * (cx, cy) 42 | M = np.float64([[ cx, 0, tx], 43 | [ 0, cy, ty], 44 | [ 0, 0, 1]]) 45 | return M 46 | 47 | 48 | def lookat(eye, target, up = (0, 0, 1)): 49 | fwd = np.asarray(target, np.float64) - eye 50 | fwd /= anorm(fwd) 51 | right = np.cross(fwd, up) 52 | right /= anorm(right) 53 | down = np.cross(fwd, right) 54 | R = np.float64([right, down, fwd]) 55 | tvec = -np.dot(R, eye) 56 | return R, tvec 57 | 58 | def mtx2rvec(R): 59 | w, u, vt = cv2.SVDecomp(R - np.eye(3)) 60 | p = vt[0] + u[:,0]*w[0] # same as np.dot(R, vt[0]) 61 | c = np.dot(vt[0], p) 62 | s = np.dot(vt[1], p) 63 | axis = np.cross(vt[0], vt[1]) 64 | return axis * np.arctan2(s, c) 65 | 66 | def draw_str(dst, xxx_todo_changeme, s): 67 | (x, y) = xxx_todo_changeme 68 | cv2.putText(dst, s, (x+1, y+1), cv2.FONT_HERSHEY_PLAIN, 1.0, (0, 0, 0), thickness = 2, lineType=cv2.CV_AA) 69 | cv2.putText(dst, s, (x, y), cv2.FONT_HERSHEY_PLAIN, 1.0, (255, 255, 255), lineType=cv2.CV_AA) 70 | 71 | class Sketcher: 72 | def __init__(self, windowname, dests, colors_func): 73 | self.prev_pt = None 74 | self.windowname = windowname 75 | self.dests = dests 76 | self.colors_func = colors_func 77 | self.dirty = False 78 | self.show() 79 | cv2.setMouseCallback(self.windowname, self.on_mouse) 80 | 81 | def show(self): 82 | cv2.imshow(self.windowname, self.dests[0]) 83 | 84 | def on_mouse(self, event, x, y, flags, param): 85 | pt = (x, y) 86 | if event == cv2.EVENT_LBUTTONDOWN: 87 | self.prev_pt = pt 88 | if self.prev_pt and flags & cv2.EVENT_FLAG_LBUTTON: 89 | for dst, color in zip(self.dests, self.colors_func()): 90 | cv2.line(dst, self.prev_pt, pt, color, 5) 91 | self.dirty = True 92 | self.prev_pt = pt 93 | self.show() 94 | else: 95 | self.prev_pt = None 96 | 97 | 98 | # palette data from matplotlib/_cm.py 99 | _jet_data = {'red': ((0., 0, 0), (0.35, 0, 0), (0.66, 1, 1), (0.89,1, 1), 100 | (1, 0.5, 0.5)), 101 | 'green': ((0., 0, 0), (0.125,0, 0), (0.375,1, 1), (0.64,1, 1), 102 | (0.91,0,0), (1, 0, 0)), 103 | 'blue': ((0., 0.5, 0.5), (0.11, 1, 1), (0.34, 1, 1), (0.65,0, 0), 104 | (1, 0, 0))} 105 | 106 | cmap_data = { 'jet' : _jet_data } 107 | 108 | def make_cmap(name, n=256): 109 | data = cmap_data[name] 110 | xs = np.linspace(0.0, 1.0, n) 111 | channels = [] 112 | eps 
= 1e-6 113 | for ch_name in ['blue', 'green', 'red']: 114 | ch_data = data[ch_name] 115 | xp, yp = [], [] 116 | for x, y1, y2 in ch_data: 117 | xp += [x, x+eps] 118 | yp += [y1, y2] 119 | ch = np.interp(xs, xp, yp) 120 | channels.append(ch) 121 | return np.uint8(np.array(channels).T*255) 122 | 123 | def nothing(*arg, **kw): 124 | pass 125 | 126 | def clock(): 127 | return cv2.getTickCount() / cv2.getTickFrequency() 128 | 129 | @contextmanager 130 | def Timer(msg): 131 | print(msg, '...', end=' ') 132 | start = clock() 133 | try: 134 | yield 135 | finally: 136 | print("%.2f ms" % ((clock()-start)*1000)) 137 | 138 | class StatValue: 139 | def __init__(self, smooth_coef = 0.5): 140 | self.value = None 141 | self.smooth_coef = smooth_coef 142 | def update(self, v): 143 | if self.value is None: 144 | self.value = v 145 | else: 146 | c = self.smooth_coef 147 | self.value = c * self.value + (1.0-c) * v 148 | 149 | class RectSelector: 150 | def __init__(self, win, callback): 151 | self.win = win 152 | self.callback = callback 153 | cv2.setMouseCallback(win, self.onmouse) 154 | self.drag_start = None 155 | self.drag_rect = None 156 | self.boo=False 157 | def onmouse(self, event, x, y, flags, param): 158 | x, y = np.int16([x, y]) # BUG 159 | if (self.boo): 160 | if event == cv2.EVENT_LBUTTONDOWN: 161 | print("click2boo") 162 | print(x,y) 163 | print(self.drag_start) 164 | xo, yo = self.drag_start 165 | x0, y0 = np.minimum([xo, yo], [x, y]) 166 | x1, y1 = np.maximum([xo, yo], [x, y]) 167 | self.drag_rect = None 168 | if x1-x0 > 0 and y1-y0 > 0: 169 | self.drag_rect = (x0, y0, x1, y1) 170 | print("got a rect") 171 | rect = self.drag_rect 172 | self.drag_start = None 173 | self.drag_rect = None 174 | if rect: 175 | self.callback(rect) 176 | self.boo=False 177 | elif event == cv2.EVENT_LBUTTONDOWN: 178 | print("click1") 179 | self.drag_start = (x, y) 180 | print(self.drag_start) 181 | self.boo=True 182 | def draw(self, vis): 183 | if not self.drag_rect: 184 | return False 185 | x0, y0, x1, y1 = self.drag_rect 186 | cv2.rectangle(vis, (x0, y0), (x1, y1), (0, 255, 0), 2) 187 | return True 188 | @property 189 | def dragging(self): 190 | return self.drag_rect is not None 191 | 192 | 193 | def grouper(n, iterable, fillvalue=None): 194 | '''grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx''' 195 | args = [iter(iterable)] * n 196 | return it.izip_longest(fillvalue=fillvalue, *args) 197 | 198 | def mosaic(w, imgs): 199 | '''Make a grid from images. 200 | 201 | w -- number of grid columns 202 | imgs -- images (must have same size and format) 203 | ''' 204 | imgs = iter(imgs) 205 | img0 = next(imgs) 206 | pad = np.zeros_like(img0) 207 | imgs = it.chain([img0], imgs) 208 | rows = grouper(w, imgs, pad) 209 | return np.vstack(list(map(np.hstack, rows))) 210 | 211 | def getsize(img): 212 | h, w = img.shape[:2] 213 | return w, h 214 | 215 | def mdot(*args): 216 | return reduce(np.dot, args) 217 | 218 | def draw_keypoints(vis, keypoints, color = (0, 255, 255)): 219 | for kp in keypoints: 220 | x, y = kp.pt 221 | cv2.circle(vis, (int(x), int(y)), 2, color) 222 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | ## Ignore Visual Studio temporary files, build results, and 2 | ## files generated by popular Visual Studio add-ons. 
3 | ## 4 | ## Get latest from https://github.com/github/gitignore/blob/master/VisualStudio.gitignore 5 | 6 | # User-specific files 7 | *.rsuser 8 | *.suo 9 | *.user 10 | *.userosscache 11 | *.sln.docstates 12 | *.vscode 13 | 14 | # User-specific files (MonoDevelop/Xamarin Studio) 15 | *.userprefs 16 | 17 | # Mono auto generated files 18 | mono_crash.* 19 | 20 | # Build results 21 | [Dd]ebug/ 22 | [Dd]ebugPublic/ 23 | [Rr]elease/ 24 | [Rr]eleases/ 25 | x64/ 26 | x86/ 27 | [Aa][Rr][Mm]/ 28 | [Aa][Rr][Mm]64/ 29 | bld/ 30 | [Bb]in/ 31 | [Oo]bj/ 32 | [Ll]og/ 33 | [Ll]ogs/ 34 | 35 | # Visual Studio 2015/2017 cache/options directory 36 | .vs/ 37 | # Uncomment if you have tasks that create the project's static files in wwwroot 38 | #wwwroot/ 39 | 40 | # Visual Studio 2017 auto generated files 41 | Generated\ Files/ 42 | 43 | # MSTest test Results 44 | [Tt]est[Rr]esult*/ 45 | [Bb]uild[Ll]og.* 46 | 47 | # NUnit 48 | *.VisualState.xml 49 | TestResult.xml 50 | nunit-*.xml 51 | 52 | # Build Results of an ATL Project 53 | [Dd]ebugPS/ 54 | [Rr]eleasePS/ 55 | dlldata.c 56 | 57 | # Benchmark Results 58 | BenchmarkDotNet.Artifacts/ 59 | 60 | # .NET Core 61 | project.lock.json 62 | project.fragment.lock.json 63 | artifacts/ 64 | 65 | # StyleCop 66 | StyleCopReport.xml 67 | 68 | # Files built by Visual Studio 69 | *_i.c 70 | *_p.c 71 | *_h.h 72 | *.ilk 73 | *.meta 74 | *.obj 75 | *.iobj 76 | *.pch 77 | *.pdb 78 | *.ipdb 79 | *.pgc 80 | *.pgd 81 | *.rsp 82 | *.sbr 83 | *.tlb 84 | *.tli 85 | *.tlh 86 | *.tmp 87 | *.tmp_proj 88 | *_wpftmp.csproj 89 | *.log 90 | *.vspscc 91 | *.vssscc 92 | .builds 93 | *.pidb 94 | *.svclog 95 | *.scc 96 | 97 | # Chutzpah Test files 98 | _Chutzpah* 99 | 100 | # Visual C++ cache files 101 | ipch/ 102 | *.aps 103 | *.ncb 104 | *.opendb 105 | *.opensdf 106 | *.sdf 107 | *.cachefile 108 | *.VC.db 109 | *.VC.VC.opendb 110 | 111 | # Visual Studio profiler 112 | *.psess 113 | *.vsp 114 | *.vspx 115 | *.sap 116 | 117 | # Visual Studio Trace Files 118 | *.e2e 119 | 120 | # TFS 2012 Local Workspace 121 | $tf/ 122 | 123 | # Guidance Automation Toolkit 124 | *.gpState 125 | 126 | # ReSharper is a .NET coding add-in 127 | _ReSharper*/ 128 | *.[Rr]e[Ss]harper 129 | *.DotSettings.user 130 | 131 | # TeamCity is a build add-in 132 | _TeamCity* 133 | 134 | # DotCover is a Code Coverage Tool 135 | *.dotCover 136 | 137 | # AxoCover is a Code Coverage Tool 138 | .axoCover/* 139 | !.axoCover/settings.json 140 | 141 | # Coverlet is a free, cross platform Code Coverage Tool 142 | coverage*[.json, .xml, .info] 143 | 144 | # Visual Studio code coverage results 145 | *.coverage 146 | *.coveragexml 147 | 148 | # NCrunch 149 | _NCrunch_* 150 | .*crunch*.local.xml 151 | nCrunchTemp_* 152 | 153 | # MightyMoose 154 | *.mm.* 155 | AutoTest.Net/ 156 | 157 | # Web workbench (sass) 158 | .sass-cache/ 159 | 160 | # Installshield output folder 161 | [Ee]xpress/ 162 | 163 | # DocProject is a documentation generator add-in 164 | DocProject/buildhelp/ 165 | DocProject/Help/*.HxT 166 | DocProject/Help/*.HxC 167 | DocProject/Help/*.hhc 168 | DocProject/Help/*.hhk 169 | DocProject/Help/*.hhp 170 | DocProject/Help/Html2 171 | DocProject/Help/html 172 | 173 | # Click-Once directory 174 | publish/ 175 | 176 | # Publish Web Output 177 | *.[Pp]ublish.xml 178 | *.azurePubxml 179 | # Note: Comment the next line if you want to checkin your web deploy settings, 180 | # but database connection strings (with potential passwords) will be unencrypted 181 | *.pubxml 182 | *.publishproj 183 | 184 | # Microsoft Azure Web App publish 
settings. Comment the next line if you want to 185 | # checkin your Azure Web App publish settings, but sensitive information contained 186 | # in these scripts will be unencrypted 187 | PublishScripts/ 188 | 189 | # NuGet Packages 190 | *.nupkg 191 | # NuGet Symbol Packages 192 | *.snupkg 193 | # The packages folder can be ignored because of Package Restore 194 | **/[Pp]ackages/* 195 | # except build/, which is used as an MSBuild target. 196 | !**/[Pp]ackages/build/ 197 | # Uncomment if necessary however generally it will be regenerated when needed 198 | #!**/[Pp]ackages/repositories.config 199 | # NuGet v3's project.json files produces more ignorable files 200 | *.nuget.props 201 | *.nuget.targets 202 | 203 | # Microsoft Azure Build Output 204 | csx/ 205 | *.build.csdef 206 | 207 | # Microsoft Azure Emulator 208 | ecf/ 209 | rcf/ 210 | 211 | # Windows Store app package directories and files 212 | AppPackages/ 213 | BundleArtifacts/ 214 | Package.StoreAssociation.xml 215 | _pkginfo.txt 216 | *.appx 217 | *.appxbundle 218 | *.appxupload 219 | 220 | # Visual Studio cache files 221 | # files ending in .cache can be ignored 222 | *.[Cc]ache 223 | # but keep track of directories ending in .cache 224 | !?*.[Cc]ache/ 225 | 226 | # Others 227 | ClientBin/ 228 | ~$* 229 | *~ 230 | *.dbmdl 231 | *.dbproj.schemaview 232 | *.jfm 233 | *.pfx 234 | *.publishsettings 235 | orleans.codegen.cs 236 | 237 | # Including strong name files can present a security risk 238 | # (https://github.com/github/gitignore/pull/2483#issue-259490424) 239 | #*.snk 240 | 241 | # Since there are multiple workflows, uncomment next line to ignore bower_components 242 | # (https://github.com/github/gitignore/pull/1529#issuecomment-104372622) 243 | #bower_components/ 244 | 245 | # RIA/Silverlight projects 246 | Generated_Code/ 247 | 248 | # Backup & report files from converting an old project file 249 | # to a newer Visual Studio version. Backup files are not needed, 250 | # because we have git ;-) 251 | _UpgradeReport_Files/ 252 | Backup*/ 253 | UpgradeLog*.XML 254 | UpgradeLog*.htm 255 | ServiceFabricBackup/ 256 | *.rptproj.bak 257 | 258 | # SQL Server files 259 | *.mdf 260 | *.ldf 261 | *.ndf 262 | 263 | # Business Intelligence projects 264 | *.rdl.data 265 | *.bim.layout 266 | *.bim_*.settings 267 | *.rptproj.rsuser 268 | *- [Bb]ackup.rdl 269 | *- [Bb]ackup ([0-9]).rdl 270 | *- [Bb]ackup ([0-9][0-9]).rdl 271 | 272 | # Microsoft Fakes 273 | FakesAssemblies/ 274 | 275 | # GhostDoc plugin setting file 276 | *.GhostDoc.xml 277 | 278 | # Node.js Tools for Visual Studio 279 | .ntvs_analysis.dat 280 | node_modules/ 281 | 282 | # Visual Studio 6 build log 283 | *.plg 284 | 285 | # Visual Studio 6 workspace options file 286 | *.opt 287 | 288 | # Visual Studio 6 auto-generated workspace file (contains which files were open etc.) 
289 | *.vbw 290 | 291 | # Visual Studio LightSwitch build output 292 | **/*.HTMLClient/GeneratedArtifacts 293 | **/*.DesktopClient/GeneratedArtifacts 294 | **/*.DesktopClient/ModelManifest.xml 295 | **/*.Server/GeneratedArtifacts 296 | **/*.Server/ModelManifest.xml 297 | _Pvt_Extensions 298 | 299 | # Paket dependency manager 300 | .paket/paket.exe 301 | paket-files/ 302 | 303 | # FAKE - F# Make 304 | .fake/ 305 | 306 | # CodeRush personal settings 307 | .cr/personal 308 | 309 | # Python Tools for Visual Studio (PTVS) 310 | __pycache__/ 311 | *.pyc 312 | 313 | # Cake - Uncomment if you are using it 314 | # tools/** 315 | # !tools/packages.config 316 | 317 | # Tabs Studio 318 | *.tss 319 | 320 | # Telerik's JustMock configuration file 321 | *.jmconfig 322 | 323 | # BizTalk build output 324 | *.btp.cs 325 | *.btm.cs 326 | *.odx.cs 327 | *.xsd.cs 328 | 329 | # OpenCover UI analysis results 330 | OpenCover/ 331 | 332 | # Azure Stream Analytics local run output 333 | ASALocalRun/ 334 | 335 | # MSBuild Binary and Structured Log 336 | *.binlog 337 | 338 | # NVidia Nsight GPU debugger configuration file 339 | *.nvuser 340 | 341 | # MFractors (Xamarin productivity tool) working folder 342 | .mfractor/ 343 | 344 | # Local History for Visual Studio 345 | .localhistory/ 346 | 347 | # BeatPulse healthcheck temp database 348 | healthchecksdb 349 | 350 | # Backup folder for Package Reference Convert tool in Visual Studio 2017 351 | MigrationBackup/ 352 | 353 | # Ionide (cross platform F# VS Code tools) working folder 354 | .ionide/ 355 | -------------------------------------------------------------------------------- /notebooks/2-Image_stats_and_image_processing.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "kernelspec": { 6 | "display_name": "Python 3", 7 | "language": "python", 8 | "name": "python3" 9 | }, 10 | "language_info": { 11 | "codemirror_mode": { 12 | "name": "ipython", 13 | "version": 3 14 | }, 15 | "file_extension": ".py", 16 | "mimetype": "text/x-python", 17 | "name": "python", 18 | "nbconvert_exporter": "python", 19 | "pygments_lexer": "ipython3", 20 | "version": "3.6.0" 21 | }, 22 | "colab": { 23 | "name": "2-Image-stats-and-image-processing.ipynb", 24 | "provenance": [], 25 | "collapsed_sections": [] 26 | } 27 | }, 28 | "cells": [ 29 | { 30 | "cell_type": "markdown", 31 | "metadata": { 32 | "colab_type": "text", 33 | "deletable": true, 34 | "editable": true, 35 | "id": "2TcCJuat3fWI" 36 | }, 37 | "source": [ 38 | "# Image stats and image processing\n", 39 | "This notebook follows on from the fundamentals notebook.\n", 40 | "\n", 41 | "This will introduce some simple stats, smoothing, and basic image processing.\n", 42 | "\n", 43 | "But first let us include what we need to include and load up our test image.\n", 44 | "\n", 45 | "
\n", 46 | " Estimated time needed: 20 min\n", 47 | "

" 48 | ] 49 | }, 50 | { 51 | "cell_type": "code", 52 | "metadata": { 53 | "colab_type": "code", 54 | "deletable": true, 55 | "editable": true, 56 | "id": "sWiMMUll3fWL", 57 | "colab": {} 58 | }, 59 | "source": [ 60 | "# Download the test image and utils files\n", 61 | "!wget --no-check-certificate \\\n", 62 | " https://raw.githubusercontent.com/computationalcore/introduction-to-opencv/master/assets/noidea.jpg \\\n", 63 | " -O noidea.jpg\n", 64 | "!wget --no-check-certificate \\\n", 65 | " https://raw.githubusercontent.com/computationalcore/introduction-to-opencv/master/utils/common.py \\\n", 66 | " -O common.py\n", 67 | "# these imports let you use opencv\n", 68 | "import cv2 #opencv itself\n", 69 | "import common #some useful opencv functions\n", 70 | "import numpy as np # matrix manipulations\n", 71 | "\n", 72 | "#the following are to do with this interactive notebook code\n", 73 | "%matplotlib inline \n", 74 | "from matplotlib import pyplot as plt # this lets you draw inline pictures in the notebooks\n", 75 | "import pylab # this allows you to control figure size \n", 76 | "pylab.rcParams['figure.figsize'] = (10.0, 8.0) # this controls figure size in the notebook\n", 77 | "\n", 78 | "input_image=cv2.imread('noidea.jpg')" 79 | ], 80 | "execution_count": 0, 81 | "outputs": [] 82 | }, 83 | { 84 | "cell_type": "markdown", 85 | "metadata": { 86 | "colab_type": "text", 87 | "deletable": true, 88 | "editable": true, 89 | "id": "b8PY3kZ63fWO" 90 | }, 91 | "source": [ 92 | "## Basic manipulations\n", 93 | "\n", 94 | "Rotate, flip... " 95 | ] 96 | }, 97 | { 98 | "cell_type": "code", 99 | "metadata": { 100 | "colab_type": "code", 101 | "deletable": true, 102 | "editable": true, 103 | "id": "4LHzdNvt3fWP", 104 | "colab": {} 105 | }, 106 | "source": [ 107 | "flipped_code_0=cv2.flip(input_image,0) # vertical flip\n", 108 | "plt.imshow(flipped_code_0)" 109 | ], 110 | "execution_count": 0, 111 | "outputs": [] 112 | }, 113 | { 114 | "cell_type": "code", 115 | "metadata": { 116 | "colab_type": "code", 117 | "deletable": true, 118 | "editable": true, 119 | "id": "9SOq_oD-3fWR", 120 | "colab": {} 121 | }, 122 | "source": [ 123 | "flipped_code_1=cv2.flip(input_image,1) # horizontal flip\n", 124 | "plt.imshow(flipped_code_1)" 125 | ], 126 | "execution_count": 0, 127 | "outputs": [] 128 | }, 129 | { 130 | "cell_type": "code", 131 | "metadata": { 132 | "colab_type": "code", 133 | "deletable": true, 134 | "editable": true, 135 | "id": "7zapvC1p3fWU", 136 | "colab": {} 137 | }, 138 | "source": [ 139 | "transposed=cv2.transpose(input_image)\n", 140 | "plt.imshow(transposed)" 141 | ], 142 | "execution_count": 0, 143 | "outputs": [] 144 | }, 145 | { 146 | "cell_type": "markdown", 147 | "metadata": { 148 | "colab_type": "text", 149 | "deletable": true, 150 | "editable": true, 151 | "id": "n8yTtUGq3fWX" 152 | }, 153 | "source": [ 154 | "## Minimum, maximum\n", 155 | "\n", 156 | "To find the min or max of a matrix, you can use minMaxLoc. This takes a single channel image (it doesn't make much sense to take the max of a 3 channel image). So in the next code snippet you see a for loop, using python style image slicing, to look at each channel of the input image separately. 
" 157 | ] 158 | }, 159 | { 160 | "cell_type": "code", 161 | "metadata": { 162 | "colab_type": "code", 163 | "deletable": true, 164 | "editable": true, 165 | "id": "S7nLP0QL3fWY", 166 | "colab": {} 167 | }, 168 | "source": [ 169 | "for i in range(0,3):\n", 170 | " min_value, max_value, min_location, max_location=cv2.minMaxLoc(input_image[:,:,i])\n", 171 | " print(\"min {} is at {}, and max {} is at {}\".format(min_value, min_location, max_value, max_location))\n" 172 | ], 173 | "execution_count": 0, 174 | "outputs": [] 175 | }, 176 | { 177 | "cell_type": "markdown", 178 | "metadata": { 179 | "colab_type": "text", 180 | "deletable": true, 181 | "editable": true, 182 | "id": "qF_AsupK3fWa" 183 | }, 184 | "source": [ 185 | "## Arithmetic operations on images\n", 186 | "\n", 187 | "OpenCV has a lot of functions for doing mathematics on images. Some of these have \"analogous\" numpy alternatives, but it is nearly always better to use the OpenCV version. The reason for this that OpenCV is designed to work on images and so handles overflow better (OpenCV add, for example, truncates to 255 if the datatype is image-like and 8 bit; Numpy's alternative wraps around).\n", 188 | "\n", 189 | "Useful arithmetic operations include add and addWeighted, which combine two images that are the same size. " 190 | ] 191 | }, 192 | { 193 | "cell_type": "code", 194 | "metadata": { 195 | "colab_type": "code", 196 | "deletable": true, 197 | "editable": true, 198 | "id": "Gg5M59Lt3fWa", 199 | "colab": {} 200 | }, 201 | "source": [ 202 | "#First create an image the same size as our input\n", 203 | "blank_image = np.zeros((input_image.shape), np.uint8)\n", 204 | "\n", 205 | "blank_image[100:200,100:200,1]=100; #give it a green square\n", 206 | "\n", 207 | "new_image=cv2.add(blank_image,input_image) # add the two images together\n", 208 | "\n", 209 | "plt.imshow(cv2.cvtColor(new_image, cv2.COLOR_BGR2RGB))" 210 | ], 211 | "execution_count": 0, 212 | "outputs": [] 213 | }, 214 | { 215 | "cell_type": "markdown", 216 | "metadata": { 217 | "colab_type": "text", 218 | "deletable": true, 219 | "editable": true, 220 | "id": "ke-afLFp3fWd" 221 | }, 222 | "source": [ 223 | "## Noise reduction\n", 224 | "Noise reduction usually involves blurring/smoothing an image using a Gaussian kernel.\n", 225 | "The width of the kernel determines the amount of smoothing." 
226 | ] 227 | }, 228 | { 229 | "cell_type": "code", 230 | "metadata": { 231 | "colab_type": "code", 232 | "deletable": true, 233 | "editable": true, 234 | "id": "EyxmwP0E3fWd", 235 | "colab": {} 236 | }, 237 | "source": [ 238 | "d=3\n", 239 | "img_blur3 = cv2.GaussianBlur(input_image, (2*d+1, 2*d+1), -1)[d:-d,d:-d]\n", 240 | "\n", 241 | "plt.imshow(cv2.cvtColor(img_blur3, cv2.COLOR_BGR2RGB))" 242 | ], 243 | "execution_count": 0, 244 | "outputs": [] 245 | }, 246 | { 247 | "cell_type": "code", 248 | "metadata": { 249 | "colab_type": "code", 250 | "deletable": true, 251 | "editable": true, 252 | "id": "GjGY7Dl33fWg", 253 | "colab": {} 254 | }, 255 | "source": [ 256 | "d=5\n", 257 | "img_blur5 = cv2.GaussianBlur(input_image, (2*d+1, 2*d+1), -1)[d:-d,d:-d]\n", 258 | "\n", 259 | "plt.imshow(cv2.cvtColor(img_blur5, cv2.COLOR_BGR2RGB))" 260 | ], 261 | "execution_count": 0, 262 | "outputs": [] 263 | }, 264 | { 265 | "cell_type": "code", 266 | "metadata": { 267 | "colab_type": "code", 268 | "deletable": true, 269 | "editable": true, 270 | "id": "AaJ7zd1w3fWi", 271 | "colab": {} 272 | }, 273 | "source": [ 274 | "d=15\n", 275 | "img_blur15 = cv2.GaussianBlur(input_image, (2*d+1, 2*d+1), -1)[d:-d,d:-d]\n", 276 | "\n", 277 | "plt.imshow(cv2.cvtColor(img_blur15, cv2.COLOR_BGR2RGB))" 278 | ], 279 | "execution_count": 0, 280 | "outputs": [] 281 | }, 282 | { 283 | "cell_type": "markdown", 284 | "metadata": { 285 | "colab_type": "text", 286 | "deletable": true, 287 | "editable": true, 288 | "id": "Hp50ZT_A3fWk" 289 | }, 290 | "source": [ 291 | "## Edges\n", 292 | "\n", 293 | "Edge detection is the final image processing technique we're going to look at in this tutorial.\n", 294 | "\n", 295 | "For a lot of what we think of as \"modern\" computer vision techniques, edge detection functions as a building block. Much edge detection actually works by **convolution**, and indeed **convolutional neural networks** are absolutely the flavour of the month in some parts of computer vision. Sobel's edge detector was one of the first truly successful edge detection (enhancement) technique and that involves convolution at its core. You can read more about the background to Sobel here in the OpenCV docs [here](https://docs.opencv.org/3.0-beta/doc/py_tutorials/py_imgproc/py_gradients/py_gradients.html). " 296 | ] 297 | }, 298 | { 299 | "cell_type": "code", 300 | "metadata": { 301 | "colab_type": "code", 302 | "deletable": true, 303 | "editable": true, 304 | "id": "d7ceQSv13fWk", 305 | "colab": {} 306 | }, 307 | "source": [ 308 | "sobelimage=cv2.cvtColor(input_image,cv2.COLOR_BGR2GRAY)\n", 309 | "\n", 310 | "sobelx = cv2.Sobel(sobelimage,cv2.CV_64F,1,0,ksize=9)\n", 311 | "sobely = cv2.Sobel(sobelimage,cv2.CV_64F,0,1,ksize=9)\n", 312 | "plt.imshow(sobelx,cmap = 'gray') \n", 313 | "# Sobel works in x and in y, change sobelx to sobely in the olt line above to see the difference\n" 314 | ], 315 | "execution_count": 0, 316 | "outputs": [] 317 | }, 318 | { 319 | "cell_type": "markdown", 320 | "metadata": { 321 | "colab_type": "text", 322 | "deletable": true, 323 | "editable": true, 324 | "id": "RWxFMGQe3fWm" 325 | }, 326 | "source": [ 327 | "Canny edge detection is another winnning technique - it takes two thresholds.\n", 328 | "The first one determines how likely Canny is to find an edge, and the second determines how likely it is to follow that edge once it's found. Investigate the effect of these thresholds by altering the values below." 
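, "\n", "For example (a hedged variation, assuming the gray image computed in the cell below), raising both values while keeping the 1:3 ratio keeps only the strongest edges:\n", "\n", "```python\n", "edge_strong = cv2.Canny(gray, 90, 270)  # hypothetical thresholds; compare with the 30/60 used below\n", "```"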
329 | ] 330 | }, 331 | { 332 | "cell_type": "code", 333 | "metadata": { 334 | "colab_type": "code", 335 | "deletable": true, 336 | "editable": true, 337 | "id": "1MJQRgXL3fWn", 338 | "colab": {} 339 | }, 340 | "source": [ 341 | "th1=30\n", 342 | "th2=60 # Canny recommends threshold 2 is 3 times threshold 1 - you could try experimenting with this...\n", 343 | "d=3 # gaussian blur\n", 344 | "\n", 345 | "edgeresult=input_image.copy()\n", 346 | "edgeresult = cv2.GaussianBlur(edgeresult, (2*d+1, 2*d+1), -1)[d:-d,d:-d]\n", 347 | "\n", 348 | "gray = cv2.cvtColor(edgeresult, cv2.COLOR_BGR2GRAY)\n", 349 | "\n", 350 | "edge = cv2.Canny(gray, th1, th2)\n", 351 | "\n", 352 | "edgeresult[edge != 0] = (0, 255, 0) # this takes pixels in edgeresult where edge non-zero colours them bright green\n", 353 | "\n", 354 | "plt.imshow(cv2.cvtColor(edgeresult, cv2.COLOR_BGR2RGB))" 355 | ], 356 | "execution_count": 0, 357 | "outputs": [] 358 | }, 359 | { 360 | "cell_type": "markdown", 361 | "metadata": { 362 | "colab_type": "text", 363 | "collapsed": true, 364 | "deletable": true, 365 | "editable": true, 366 | "id": "AV4RbCru3fWs" 367 | }, 368 | "source": [ 369 | "[Previous](1-Fundamentals.ipynb) [Next](3-Features.ipynb)" 370 | ] 371 | } 372 | ] 373 | } -------------------------------------------------------------------------------- /notebooks/3-Features.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "kernelspec": { 6 | "display_name": "Python 3", 7 | "language": "python", 8 | "name": "python3" 9 | }, 10 | "language_info": { 11 | "codemirror_mode": { 12 | "name": "ipython", 13 | "version": 3 14 | }, 15 | "file_extension": ".py", 16 | "mimetype": "text/x-python", 17 | "name": "python", 18 | "nbconvert_exporter": "python", 19 | "pygments_lexer": "ipython3", 20 | "version": "3.6.0" 21 | }, 22 | "colab": { 23 | "name": "3-Features.ipynb", 24 | "provenance": [] 25 | } 26 | }, 27 | "cells": [ 28 | { 29 | "cell_type": "markdown", 30 | "metadata": { 31 | "colab_type": "text", 32 | "deletable": true, 33 | "editable": true, 34 | "id": "X7HD5UmG5NY-" 35 | }, 36 | "source": [ 37 | "## Features in computer vision\n", 38 | "\n", 39 | "Features are image locations that are \"easy\" to find in the future. Indeed, one of the early feature detection techniques Lucas-Kanade, sometimes called Kanade-Lucas-Tomasi or KLT features come from a seminal paper called \"Good features to track\".\n", 40 | "\n", 41 | "Edges find brightness discontinuities in an image, features find distinctive regions. There are a bunch of different feature detectors and these all have some characteristics in common: they should be quick to find, and things that are close in image-space are close in feature-space (that is, the feature representation of an object looks like the feature representation of objects that look like that object).\n", 42 | "\n", 43 | "There is a more in depth *features in OpenCV* set of tutorials [here](https://docs.opencv.org/master/db/d27/tutorial_py_table_of_contents_feature2d.html) and I'll link to various parts of that as appropriate: for more background though, go and work through the whole thing.\n", 44 | "\n", 45 | "
\n", 46 | " Estimated time needed: 20 min\n", 47 | "

" 48 | ] 49 | }, 50 | { 51 | "cell_type": "code", 52 | "metadata": { 53 | "colab_type": "code", 54 | "deletable": true, 55 | "editable": true, 56 | "id": "_8B7e-Rn5NY_", 57 | "colab": {} 58 | }, 59 | "source": [ 60 | "# Download the test image and utils files\n", 61 | "!wget --no-check-certificate \\\n", 62 | " https://raw.githubusercontent.com/computationalcore/introduction-to-opencv/master/assets/noidea.jpg \\\n", 63 | " -O noidea.jpg\n", 64 | "!wget --no-check-certificate \\\n", 65 | " https://raw.githubusercontent.com/computationalcore/introduction-to-opencv/master/utils/common.py \\\n", 66 | " -O common.py\n", 67 | "\n", 68 | "# our usual set of includes\n", 69 | "# these imports let you use opencv\n", 70 | "import cv2 #opencv itself\n", 71 | "import common #some useful opencv functions\n", 72 | "import numpy as np # matrix manipulations\n", 73 | "\n", 74 | "#the following are to do with this interactive notebook code\n", 75 | "%matplotlib inline \n", 76 | "from matplotlib import pyplot as plt # this lets you draw inline pictures in the notebooks\n", 77 | "import pylab # this allows you to control figure size \n", 78 | "pylab.rcParams['figure.figsize'] = (10.0, 8.0) # this controls figure size in the notebook\n", 79 | "\n", 80 | "\n", 81 | "input_image=cv2.imread('noidea.jpg')" 82 | ], 83 | "execution_count": 0, 84 | "outputs": [] 85 | }, 86 | { 87 | "cell_type": "markdown", 88 | "metadata": { 89 | "colab_type": "text", 90 | "deletable": true, 91 | "editable": true, 92 | "id": "yW11cMb15NZB" 93 | }, 94 | "source": [ 95 | "## Corner detectors\n", 96 | "If you think of edges as being lines, then corners are an obvious choice for features as they represent the intersection of two lines. One of the earlier corner detectors was introduced by Harris, and it is still a very effective corner detector that gets used quite a lot: it's reliable and it's fast. There's a tutorial explaining how Harris works on the OpenCV site [here](https://docs.opencv.org/3.0-beta/doc/py_tutorials/py_feature2d/py_features_harris/py_features_harris.html)" 97 | ] 98 | }, 99 | { 100 | "cell_type": "code", 101 | "metadata": { 102 | "colab_type": "code", 103 | "deletable": true, 104 | "editable": true, 105 | "id": "JdU2qm635NZC", 106 | "colab": {} 107 | }, 108 | "source": [ 109 | "harris_test=input_image.copy()\n", 110 | "#greyscale it\n", 111 | "gray = cv2.cvtColor(harris_test,cv2.COLOR_BGR2GRAY)\n", 112 | "\n", 113 | "gray = np.float32(gray)\n", 114 | "blocksize=4 # \n", 115 | "kernel_size=3 # sobel kernel: must be odd and fairly small\n", 116 | "\n", 117 | "# run the harris corner detector\n", 118 | "dst = cv2.cornerHarris(gray,blocksize,kernel_size,0.05) # parameters are blocksize, Sobel parameter and Harris threshold\n", 119 | "\n", 120 | "#result is dilated for marking the corners, this is visualisation related and just makes them bigger\n", 121 | "dst = cv2.dilate(dst,None)\n", 122 | "#we then plot these on the input image for visualisation purposes, using bright red\n", 123 | "harris_test[dst>0.01*dst.max()]=[0,0,255]\n", 124 | "plt.imshow(cv2.cvtColor(harris_test, cv2.COLOR_BGR2RGB))" 125 | ], 126 | "execution_count": 0, 127 | "outputs": [] 128 | }, 129 | { 130 | "cell_type": "markdown", 131 | "metadata": { 132 | "colab_type": "text", 133 | "collapsed": true, 134 | "deletable": true, 135 | "editable": true, 136 | "id": "Rd8R15G45NZE" 137 | }, 138 | "source": [ 139 | "Properly speaking the Harris Corner detection is more like a Sobel operator - indeed it is very much like a sobel operator. 
It doesn't really return a set of features, instead it is a filter which gives a strong response on corner-like regions of the image. We can see this more clearly if we look at the Harris output from the cell above (dst is the Harris response, before thresholding). Well we can kind-of see. You should be able to see that there are slightly light places in the image where there are corner like features, and that there are really light parts of the image around the black and white corners of the writing " 140 | ] 141 | }, 142 | { 143 | "cell_type": "code", 144 | "metadata": { 145 | "colab_type": "code", 146 | "deletable": true, 147 | "editable": true, 148 | "id": "7z0RTbtO5NZF", 149 | "colab": {} 150 | }, 151 | "source": [ 152 | "plt.imshow(dst,cmap = 'gray') \n" 153 | ], 154 | "execution_count": 0, 155 | "outputs": [] 156 | }, 157 | { 158 | "cell_type": "markdown", 159 | "metadata": { 160 | "colab_type": "text", 161 | "collapsed": true, 162 | "deletable": true, 163 | "editable": true, 164 | "id": "0_6wW-nY5NZH" 165 | }, 166 | "source": [ 167 | "## Moving towards feature space\n", 168 | "When we consider modern feature detectors there are a few things we need to mention. What makes a good feature includes the following: \n", 169 | "\n", 170 | "* Repeatability (got to be able to find it again)\n", 171 | "* Distinctiveness/informativeness (features representing different things need to be different)\n", 172 | "* Locality (they need to be local to the image feature and not, like, the whole image)\n", 173 | "* Quantity (you need to be able to find enough of them for them to be properly useful)\n", 174 | "* Accuracy (they need to accurately locate the image feature)\n", 175 | "* Efficiency (they've got to be computable in reasonable time)\n", 176 | "\n", 177 | "This comes from a good survey which you can find here (and which I'd thoroughly recommend reading if you're doing feature detection work) [here](https://www.slideshare.net/AhmedOne1/survey-1-project-overview)\n", 178 | "\n", 179 | "**Note:** some of the very famous feature detectors (SIFT/SURF and so on) are around, but aren't in OpenCV by default due to patent issues. You can build them for OpenCV if you want - or you can find other implementations (David Lowe's SIFT implementation works just fine). Just google for instructions. For the purposes of this tutorial (and to save time) we're only going to look at those which are actually in OpenCV." 180 | ] 181 | }, 182 | { 183 | "cell_type": "code", 184 | "metadata": { 185 | "colab_type": "code", 186 | "deletable": true, 187 | "editable": true, 188 | "id": "DXu8K_FJ5NZI", 189 | "colab": {} 190 | }, 191 | "source": [ 192 | "orbimg=input_image.copy()\n", 193 | "\n", 194 | "orb = cv2.ORB_create()\n", 195 | "# find the keypoints with ORB\n", 196 | "kp = orb.detect(orbimg,None)\n", 197 | "# compute the descriptors with ORB\n", 198 | "kp, des = orb.compute(orbimg, kp)\n", 199 | "# draw keypoints\n", 200 | "cv2.drawKeypoints(orbimg,kp,orbimg)\n", 201 | "\n", 202 | "plt.imshow(cv2.cvtColor(orbimg, cv2.COLOR_BGR2RGB))" 203 | ], 204 | "execution_count": 0, 205 | "outputs": [] 206 | }, 207 | { 208 | "cell_type": "markdown", 209 | "metadata": { 210 | "colab_type": "text", 211 | "collapsed": true, 212 | "deletable": true, 213 | "editable": true, 214 | "id": "tynAouFx5NZK" 215 | }, 216 | "source": [ 217 | "## Matching features\n", 218 | "Finding features is one thing but actually we want to use them for matching. 
\n", 219 | "First let's get something where we know there's going to be a match\n" 220 | ] 221 | }, 222 | { 223 | "cell_type": "code", 224 | "metadata": { 225 | "colab_type": "code", 226 | "deletable": true, 227 | "editable": true, 228 | "id": "xyjEKvnK5NZK", 229 | "colab": {} 230 | }, 231 | "source": [ 232 | "img2match=np.zeros(input_image.shape,np.uint8)\n", 233 | "dogface=input_image[60:250, 70:350] # copy out a bit\n", 234 | "img2match[60:250,70:350]=[0,0,0] # blank that region\n", 235 | "dogface=cv2.flip(dogface,0) #flip the copy\n", 236 | "img2match[200:200+dogface.shape[0], 200:200+dogface.shape[1]]=dogface # paste it back somewhere else\n", 237 | "plt.imshow(cv2.cvtColor(img2match, cv2.COLOR_BGR2RGB))" 238 | ], 239 | "execution_count": 0, 240 | "outputs": [] 241 | }, 242 | { 243 | "cell_type": "markdown", 244 | "metadata": { 245 | "colab_type": "text", 246 | "deletable": true, 247 | "editable": true, 248 | "id": "-pzjjdcD5NZN" 249 | }, 250 | "source": [ 251 | "## Matching keypoints\n", 252 | "\n", 253 | "The feature matching function (in this case Orb) detects and then computes keypoint descriptors. These are a higher dimensional representation of the image region immediately around a point of interest (sometimes literally called \"interest points\"). \n", 254 | "\n", 255 | "These higher-dimensional representations can then be matched; the strength you gain from matching these descriptors rather than image regions directly is that they have a certain invariance to transformations (like rotation, or scaling). OpenCV providers matcher routines to do this, in which you can specify the distance measure to use." 256 | ] 257 | }, 258 | { 259 | "cell_type": "code", 260 | "metadata": { 261 | "colab_type": "code", 262 | "deletable": true, 263 | "editable": true, 264 | "id": "a8pk6NeP5NZN", 265 | "colab": {} 266 | }, 267 | "source": [ 268 | "\n", 269 | "kp2 = orb.detect(img2match,None)\n", 270 | "# compute the descriptors with ORB\n", 271 | "kp2, des2 = orb.compute(img2match, kp2)\n", 272 | "# create BFMatcher object: this is a Brute Force matching object\n", 273 | "bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)\n", 274 | "# Match descriptors.\n", 275 | "matches = bf.match(des,des2)\n", 276 | " \n", 277 | "# Sort them by distance between matches in feature space - so the best matches are first.\n", 278 | "matches = sorted(matches, key = lambda x:x.distance)\n", 279 | " \n", 280 | "# Draw first 50 matches.\n", 281 | "oimg = cv2.drawMatches(orbimg,kp,img2match,kp2,matches[:50], orbimg)\n", 282 | " \n", 283 | "plt.imshow(cv2.cvtColor(oimg, cv2.COLOR_BGR2RGB))\n" 284 | ], 285 | "execution_count": 0, 286 | "outputs": [] 287 | }, 288 | { 289 | "cell_type": "markdown", 290 | "metadata": { 291 | "colab_type": "text", 292 | "deletable": true, 293 | "editable": true, 294 | "id": "E2qdpBTV5NZP" 295 | }, 296 | "source": [ 297 | "As you can see there are some false matches, but it's fairly clear that most of the matched keypoints found are actual matches between image regions on the dogface.\n", 298 | "\n", 299 | "To be more precise about our matching we could choose to enforce **homography** constraints, which looks for features than sit on the same plane. 
If you want to investigate that check out this tutorial online\n", 300 | "[here](https://docs.opencv.org/3.0-beta/doc/py_tutorials/py_feature2d/py_feature_homography/py_feature_homography.html)" 301 | ] 302 | }, 303 | { 304 | "cell_type": "markdown", 305 | "metadata": { 306 | "colab_type": "text", 307 | "id": "5PUQBH5z5NZS" 308 | }, 309 | "source": [ 310 | "[Previous](2-Image_stats_and_image_processing.ipynb) [Next](4-Cascade_classification.ipynb)" 311 | ] 312 | } 313 | ] 314 | } -------------------------------------------------------------------------------- /notebooks/1-Fundamentals.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "kernelspec": { 6 | "display_name": "Python 3", 7 | "language": "python", 8 | "name": "python3" 9 | }, 10 | "language_info": { 11 | "codemirror_mode": { 12 | "name": "ipython", 13 | "version": 3 14 | }, 15 | "file_extension": ".py", 16 | "mimetype": "text/x-python", 17 | "name": "python", 18 | "nbconvert_exporter": "python", 19 | "pygments_lexer": "ipython3", 20 | "version": "3.6.0" 21 | }, 22 | "colab": { 23 | "name": "1-Fundamentals.ipynb", 24 | "provenance": [], 25 | "collapsed_sections": [] 26 | } 27 | }, 28 | "cells": [ 29 | { 30 | "cell_type": "markdown", 31 | "metadata": { 32 | "colab_type": "text", 33 | "deletable": true, 34 | "editable": true, 35 | "id": "eP9VcOVoebew" 36 | }, 37 | "source": [ 38 | "# OpenCV fundamentals\n", 39 | "\n", 40 | "This notebook covers opening files, looking at pixels, and some simple image processing techniques.\n", 41 | "\n", 42 | "We'll use the following sample image, stolen from the Internet. But you can use whatever image you like.\n", 43 | "\n", 44 | "![No idea](https://raw.githubusercontent.com/computationalcore/introduction-to-opencv/master/assets/noidea.jpg \"I have no idea\")\n", 45 | "\n", 46 | "
\n", 47 | " Estimated time needed: 20 min\n", 48 | "

\n", 49 | "\n" 50 | ] 51 | }, 52 | { 53 | "cell_type": "markdown", 54 | "metadata": { 55 | "colab_type": "text", 56 | "deletable": true, 57 | "editable": true, 58 | "id": "cIU7W8Wmebey" 59 | }, 60 | "source": [ 61 | "## Python getting started\n", 62 | "\n", 63 | "First we need to import the relevant libraries: OpenCV itself, Numpy, and a couple of others. Common and Video are simple data handling and opening routines that you can find in the OpenCV Python Samples directory or from the github repo linked above. We'll start each notebook with the same includes - you don't need all of them every time (so this is bad form, really) but it's easier to just copy and paste. " 64 | ] 65 | }, 66 | { 67 | "cell_type": "code", 68 | "metadata": { 69 | "colab_type": "code", 70 | "deletable": true, 71 | "editable": true, 72 | "id": "TXKxw8iJebez", 73 | "colab": {} 74 | }, 75 | "source": [ 76 | "\n", 77 | "# Download the test image and utils files\n", 78 | "!wget --no-check-certificate \\\n", 79 | " https://raw.githubusercontent.com/computationalcore/introduction-to-opencv/master/assets/noidea.jpg \\\n", 80 | " -O noidea.jpg\n", 81 | "!wget --no-check-certificate \\\n", 82 | " https://raw.githubusercontent.com/computationalcore/introduction-to-opencv/master/utils/common.py \\\n", 83 | " -O common.py\n", 84 | "\n", 85 | "# These imports let you use opencv\n", 86 | "import cv2 #opencv itself\n", 87 | "import common #some useful opencv functions\n", 88 | "import numpy as np # matrix manipulations\n", 89 | "\n", 90 | "#the following are to do with this interactive notebook code\n", 91 | "%matplotlib inline \n", 92 | "from matplotlib import pyplot as plt # this lets you draw inline pictures in the notebooks\n", 93 | "import pylab # this allows you to control figure size \n", 94 | "pylab.rcParams['figure.figsize'] = (10.0, 8.0) # this controls figure size in the notebook" 95 | ], 96 | "execution_count": 0, 97 | "outputs": [] 98 | }, 99 | { 100 | "cell_type": "markdown", 101 | "metadata": { 102 | "colab_type": "text", 103 | "deletable": true, 104 | "editable": true, 105 | "id": "htK6mm-Gebe2" 106 | }, 107 | "source": [ 108 | "Now we can open an image:" 109 | ] 110 | }, 111 | { 112 | "cell_type": "code", 113 | "metadata": { 114 | "colab_type": "code", 115 | "deletable": true, 116 | "editable": true, 117 | "id": "Ah762ATHebe3", 118 | "colab": {} 119 | }, 120 | "source": [ 121 | "input_image=cv2.imread('noidea.jpg')" 122 | ], 123 | "execution_count": 0, 124 | "outputs": [] 125 | }, 126 | { 127 | "cell_type": "markdown", 128 | "metadata": { 129 | "colab_type": "text", 130 | "deletable": true, 131 | "editable": true, 132 | "id": "ut1_Lwdgebe5" 133 | }, 134 | "source": [ 135 | "We can find out various things about that image" 136 | ] 137 | }, 138 | { 139 | "cell_type": "code", 140 | "metadata": { 141 | "colab_type": "code", 142 | "deletable": true, 143 | "editable": true, 144 | "id": "awdTYn4Gebe6", 145 | "scrolled": true, 146 | "colab": {} 147 | }, 148 | "source": [ 149 | "print(input_image.size)" 150 | ], 151 | "execution_count": 0, 152 | "outputs": [] 153 | }, 154 | { 155 | "cell_type": "code", 156 | "metadata": { 157 | "colab_type": "code", 158 | "deletable": true, 159 | "editable": true, 160 | "id": "af7iQyhqebe8", 161 | "colab": {} 162 | }, 163 | "source": [ 164 | "print(input_image.shape)\n", 165 | "\n" 166 | ], 167 | "execution_count": 0, 168 | "outputs": [] 169 | }, 170 | { 171 | "cell_type": "code", 172 | "metadata": { 173 | "colab_type": "code", 174 | "deletable": true, 175 | "editable": true, 176 | "id": 
"UhxrodZrebe_", 177 | "colab": {} 178 | }, 179 | "source": [ 180 | "print(input_image.dtype)" 181 | ], 182 | "execution_count": 0, 183 | "outputs": [] 184 | }, 185 | { 186 | "cell_type": "markdown", 187 | "metadata": { 188 | "colab_type": "text", 189 | "deletable": true, 190 | "editable": true, 191 | "id": "stSDqhuBebfA" 192 | }, 193 | "source": [ 194 | "**gotcha** that last one (datatype) is one of the tricky things about working in Python. As it's not strongly typed, Python will allow you to have arrays of different types but the same size, and some functions will return arrays of types that you probably don't want. Being able to check and inspect the datatype like this is very useful and is one of the things I often find myself doing in debugging." 195 | ] 196 | }, 197 | { 198 | "cell_type": "code", 199 | "metadata": { 200 | "colab_type": "code", 201 | "deletable": true, 202 | "editable": true, 203 | "id": "woP9RhyCebfB", 204 | "colab": {} 205 | }, 206 | "source": [ 207 | "plt.imshow(input_image)" 208 | ], 209 | "execution_count": 0, 210 | "outputs": [] 211 | }, 212 | { 213 | "cell_type": "markdown", 214 | "metadata": { 215 | "colab_type": "text", 216 | "collapsed": true, 217 | "deletable": true, 218 | "editable": true, 219 | "id": "6VFxWhvUebfD" 220 | }, 221 | "source": [ 222 | "What this illustrates is something key about OpenCV: it doesn't store images in RGB format, but in BGR format." 223 | ] 224 | }, 225 | { 226 | "cell_type": "code", 227 | "metadata": { 228 | "colab_type": "code", 229 | "deletable": true, 230 | "editable": true, 231 | "id": "zgEQX0isebfD", 232 | "colab": {} 233 | }, 234 | "source": [ 235 | "# split channels\n", 236 | "b,g,r=cv2.split(input_image)\n", 237 | "# show one of the channels (this is red - see that the sky is kind of dark. try changing it to b)\n", 238 | "plt.imshow(r, cmap='gray')\n" 239 | ], 240 | "execution_count": 0, 241 | "outputs": [] 242 | }, 243 | { 244 | "cell_type": "markdown", 245 | "metadata": { 246 | "colab_type": "text", 247 | "deletable": true, 248 | "editable": true, 249 | "id": "XqE1jCKaebfG" 250 | }, 251 | "source": [ 252 | "## converting between colour spaces, merging and splitting channels\n", 253 | "\n", 254 | "We can convert between various colourspaces in OpenCV easily. We've seen how to split, above. We can also merge channels:" 255 | ] 256 | }, 257 | { 258 | "cell_type": "code", 259 | "metadata": { 260 | "colab_type": "code", 261 | "deletable": true, 262 | "editable": true, 263 | "id": "Ev_3hJKLebfH", 264 | "colab": {} 265 | }, 266 | "source": [ 267 | "merged=cv2.merge([r,g,b])\n", 268 | "# merge takes an array of single channel matrices\n", 269 | "plt.imshow(merged)\n" 270 | ], 271 | "execution_count": 0, 272 | "outputs": [] 273 | }, 274 | { 275 | "cell_type": "markdown", 276 | "metadata": { 277 | "colab_type": "text", 278 | "deletable": true, 279 | "editable": true, 280 | "id": "cJ-UCAynebfJ" 281 | }, 282 | "source": [ 283 | "OpenCV also has a function specifically for dealing with image colorspaces, so rather than split and merge channels by hand you can use this instead. It is usually marginally faster...\n", 284 | "\n", 285 | "There are something like 250 color related flags in OpenCV for conversion and display. The ones you are most likely to use are COLOR_BGR2RGB for RGB conversion, COLOR_BGR2GRAY for conversion to greyscale, and COLOR_BGR2HSV for conversion to Hue,Saturation,Value colour space. 
[http://docs.opencv.org/trunk/de/d25/imgproc_color_conversions.html] has more information on how these colour conversions are done. " 286 | ] 287 | }, 288 | { 289 | "cell_type": "code", 290 | "metadata": { 291 | "colab_type": "code", 292 | "deletable": true, 293 | "editable": true, 294 | "id": "egPmVUvYebfK", 295 | "colab": {} 296 | }, 297 | "source": [ 298 | "COLORflags = [flag for flag in dir(cv2) if flag.startswith('COLOR') ]\n", 299 | "print(len(COLORflags))\n", 300 | "\n", 301 | "# If you want to see them all, rather than just a count uncomment the following line\n", 302 | "#print(COLORflags)" 303 | ], 304 | "execution_count": 0, 305 | "outputs": [] 306 | }, 307 | { 308 | "cell_type": "code", 309 | "metadata": { 310 | "colab_type": "code", 311 | "deletable": true, 312 | "editable": true, 313 | "id": "INRZEZdvebfM", 314 | "colab": {} 315 | }, 316 | "source": [ 317 | "opencv_merged=cv2.cvtColor(input_image, cv2.COLOR_BGR2RGB)\n", 318 | "plt.imshow(opencv_merged)\n" 319 | ], 320 | "execution_count": 0, 321 | "outputs": [] 322 | }, 323 | { 324 | "cell_type": "markdown", 325 | "metadata": { 326 | "colab_type": "text", 327 | "deletable": true, 328 | "editable": true, 329 | "id": "lfo1Lue9ebfN" 330 | }, 331 | "source": [ 332 | "## Getting image data and setting image data\n", 333 | "\n", 334 | "Images in python OpenCV are numpy arrays. Numpy arrays are optimised for fast array operations and so there are usually fast methods for doing array calculations which don't actually involve writing all the detail yourself. So it's usually bad practice to access individual pixels, but you can." 335 | ] 336 | }, 337 | { 338 | "cell_type": "code", 339 | "metadata": { 340 | "colab_type": "code", 341 | "deletable": true, 342 | "editable": true, 343 | "id": "32AQVQ0uebfO", 344 | "colab": {} 345 | }, 346 | "source": [ 347 | "pixel = input_image[100,100]\n", 348 | "print(pixel)" 349 | ], 350 | "execution_count": 0, 351 | "outputs": [] 352 | }, 353 | { 354 | "cell_type": "code", 355 | "metadata": { 356 | "colab_type": "code", 357 | "deletable": true, 358 | "editable": true, 359 | "id": "OE1vlYo2ebfQ", 360 | "colab": {} 361 | }, 362 | "source": [ 363 | "input_image[100,100] = [0,0,0]\n", 364 | "pixelnew = input_image[100,100]\n", 365 | "print(pixelnew)" 366 | ], 367 | "execution_count": 0, 368 | "outputs": [] 369 | }, 370 | { 371 | "cell_type": "markdown", 372 | "metadata": { 373 | "colab_type": "text", 374 | "deletable": true, 375 | "editable": true, 376 | "id": "KsFd9SBzebfS" 377 | }, 378 | "source": [ 379 | "## Getting and setting regions of an image\n", 380 | "\n", 381 | "In the same way as we can get or set individual pixels, we can get or set regions of an image. This is a particularly useful way to get a region of interest to work on. 
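One caveat worth adding (an aside, not in the original text): numpy slices are views, so writing into a region changes the underlying image unless you take a copy first:\n\n```python\nroi = input_image[60:250, 70:350]              # a view - edits here modify input_image\nroi_copy = input_image[60:250, 70:350].copy()  # an independent copy\n```\n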
" 382 | ] 383 | }, 384 | { 385 | "cell_type": "code", 386 | "metadata": { 387 | "colab_type": "code", 388 | "deletable": true, 389 | "editable": true, 390 | "id": "D0rwsf8sebfS", 391 | "colab": {} 392 | }, 393 | "source": [ 394 | "dogface = input_image[60:250, 70:350]\n", 395 | "plt.imshow(dogface)" 396 | ], 397 | "execution_count": 0, 398 | "outputs": [] 399 | }, 400 | { 401 | "cell_type": "code", 402 | "metadata": { 403 | "colab_type": "code", 404 | "deletable": true, 405 | "editable": true, 406 | "id": "i1xm1L5MebfU", 407 | "colab": {} 408 | }, 409 | "source": [ 410 | "fresh_image=cv2.imread('noidea.jpg') # it's either start with a fresh read of the image, \n", 411 | " # or end up with dogfaces on dogfaces on dogfaces \n", 412 | " # as you re-run parts of the notebook but not others... \n", 413 | " \n", 414 | "fresh_image[200:200+dogface.shape[0], 200:200+dogface.shape[1]]=dogface\n", 415 | "print(dogface.shape[0])\n", 416 | "print(dogface.shape[1])\n", 417 | "plt.imshow(fresh_image)" 418 | ], 419 | "execution_count": 0, 420 | "outputs": [] 421 | }, 422 | { 423 | "cell_type": "markdown", 424 | "metadata": { 425 | "colab_type": "text", 426 | "collapsed": true, 427 | "deletable": true, 428 | "editable": true, 429 | "id": "bCVZFlDhebfW" 430 | }, 431 | "source": [ 432 | "## Matrix slicing\n", 433 | "In OpenCV python style, as I have mentioned, images are numpy arrays. There are some superb array manipulation in numpy tutorials out there: this is a great introduction if you've not done it before [http://www.scipy-lectures.org/intro/numpy/numpy.html#indexing-and-slicing]. The getting and setting of regions above uses slicing, though, and I'd like to finish this notebook with a little more detail on what is going on there. " 434 | ] 435 | }, 436 | { 437 | "cell_type": "code", 438 | "metadata": { 439 | "colab_type": "code", 440 | "deletable": true, 441 | "editable": true, 442 | "id": "9LLhUPE7ebfX", 443 | "scrolled": true, 444 | "colab": {} 445 | }, 446 | "source": [ 447 | "freshim2 = cv2.imread(\"noidea.jpg\")\n", 448 | "crop = freshim2[100:400, 130:300] \n", 449 | "plt.imshow(crop)" 450 | ], 451 | "execution_count": 0, 452 | "outputs": [] 453 | }, 454 | { 455 | "cell_type": "markdown", 456 | "metadata": { 457 | "colab_type": "text", 458 | "deletable": true, 459 | "editable": true, 460 | "id": "3pURgAtbebfZ" 461 | }, 462 | "source": [ 463 | "The key thing to note here is that the slicing works like\n", 464 | "```\n", 465 | "[top_y:bottom_y, left_x:right_x]\n", 466 | "```\n", 467 | "This can also be thought of as \n", 468 | "```\n", 469 | "[y:y+height, x:x+width]\n", 470 | "```\n", 471 | "\n", 472 | "You can also use slicing to separate out channels. In this case you want \n", 473 | "```\n", 474 | "[y:y+height, x:x+width, channel]\n", 475 | "```\n", 476 | "where channel represents the colour you're interested in - this could be 0 = blue, 1 = green or 2=red if you're dealing with a default OpenCV image, but if you've got an image that has been converted it could be something else. 
Here's an example that converts to HSV then selects the S (Saturation) channel of the same crop above:" 477 | ] 478 | }, 479 | { 480 | "cell_type": "code", 481 | "metadata": { 482 | "colab_type": "code", 483 | "deletable": true, 484 | "editable": true, 485 | "id": "9cSa7WDHebfZ", 486 | "colab": {} 487 | }, 488 | "source": [ 489 | "hsvim=cv2.cvtColor(freshim2,cv2.COLOR_BGR2HSV)\n", 490 | "bcrop =hsvim[100:400, 100:300, 1]\n", 491 | "plt.imshow(bcrop, cmap=\"gray\")" 492 | ], 493 | "execution_count": 0, 494 | "outputs": [] 495 | }, 496 | { 497 | "cell_type": "markdown", 498 | "metadata": { 499 | "colab_type": "text", 500 | "id": "-jEyTpSTebff" 501 | }, 502 | "source": [ 503 | "[Next](2-Image_stats_and_image_processing.ipynb)" 504 | ] 505 | } 506 | ] 507 | } --------------------------------------------------------------------------------