├── README.md
├── data
    ├── PETS09-S2L1-raw.webm
    ├── PETS09-S2L1-raw_back.png
    ├── baboon.tif
    ├── black_circle.png
    ├── face.png
    ├── lena.tif
    ├── peppers.tif
    ├── salt_and_pepper.png
    └── sudoku.png
├── examples
    ├── Canny_edge.py
    ├── Sobel_edge.py
    ├── alpha_blending.py
    ├── background_extraction.py
    ├── background_subtraction.py
    ├── bilateral_filter.py
    ├── color_bgr2hsv.py
    ├── contrast_stretching.py
    ├── free_drawing.py
    ├── histogram.py
    ├── histogram_equalization+color.py
    ├── histogram_equalization.py
    ├── image_converter.py
    ├── image_creation.py
    ├── image_difference.py
    ├── image_filtering.py
    ├── image_formation.py
    ├── image_resize.py
    ├── image_rotation.py
    ├── image_stitching.py
    ├── image_viewer+zoom.py
    ├── image_viewer.py
    ├── intensity_transformation.py
    ├── median_filter.py
    ├── morpology.py
    ├── negative_image_and_flip.py
    ├── shape_drawing.py
    ├── thresholding.py
    ├── video_converter.py
    ├── video_player+navigation.py
    └── video_player.py
├── requirements.txt
└── slides
    ├── 01_introduction.pdf
    ├── 02_image_editing.pdf
    ├── 03_image_processing.pdf
    ├── 04_color.pdf
    ├── 05_image_formation.pdf
    ├── 06_image_geometry.pdf
    ├── 07_solving_problems.pdf
    ├── 08_image_correspondence.pdf
    └── 14_3d_vision.pdf


/README.md:
--------------------------------------------------------------------------------
## Computer Vision Tutorial

_Computer Vision Tutorial_ covers both classical theories and techniques and recent ML/DL-based methods for computer vision. On the classical side, the tutorial covers image processing, camera projection models, camera calibration, and pose estimation. On the ML/DL side, it deals with object categorization (and backbone networks) and its extensions such as object detection and instance segmentation. It also covers further topics such as multi-object tracking, structure-from-motion, and NeRF.

This tutorial was created and is maintained for teaching undergraduate CSE students at [SEOULTECH](https://en.seoultech.ac.kr/) in the _Computer Vision_ course (109079).

This tutorial contains short code examples written in [Python](https://python.org/) with [OpenCV](https://opencv.org/) and [PyTorch](https://pytorch.org/).
* :bulb: Some of the code examples help readers understand the **inside** of algorithms (e.g. how they work).
* :wrench: Some of the code examples show **usages and applications** of OpenCV functions (e.g. how to use them).
* :camera: Some of the code examples come from my **3D Vision Tutorial**, [3dv_tutorial](https://github.com/mint-lab/3dv_tutorial).



### Lecture Slides
* [Section 1. Introduction](https://github.com/mint-lab/cv_tutorial/blob/master/slides/01_introduction.pdf)
* [Section 2. Image Editing: Learning OpenCV](https://github.com/mint-lab/cv_tutorial/blob/master/slides/02_image_editing.pdf)
* [Section 3. Image Processing](https://github.com/mint-lab/cv_tutorial/blob/master/slides/03_image_processing.pdf)
* [Section 4. Color](https://github.com/mint-lab/cv_tutorial/blob/master/slides/04_color.pdf)
* [Section 5. Image Formation](https://github.com/mint-lab/cv_tutorial/blob/master/slides/05_image_formation.pdf)
* [Section 6. Image Geometry](https://github.com/mint-lab/cv_tutorial/blob/master/slides/06_image_geometry.pdf)
* [Section 7. Solving Problems](https://github.com/mint-lab/cv_tutorial/blob/master/slides/07_solving_problems.pdf)
* [Section 8. Image Correspondence](https://github.com/mint-lab/cv_tutorial/blob/master/slides/08_image_correspondence.pdf)
* Section 9. Image Classification: CNN Backbones
  * Acknowledgement) Many slides about CNNs are adopted from [Stanford CS231n](https://cs231n.stanford.edu/).
* Section 10. Image Segmentation
* Section 11. Object Detection
* Section 12. Vision Foundation Models
* Section 13. Object Tracking
* [Section 14. 3D Vision](https://github.com/mint-lab/cv_tutorial/blob/master/slides/14_3d_vision.pdf)
  * Acknowledgement) Many slides about NeRF are adopted from [NeRF Tutorial](https://sites.google.com/berkeley.edu/nerf-tutorial/) (ECCV 2022).



### Example Codes
* **Section 1. Introduction** [[slides]](https://github.com/mint-lab/cv_tutorial/blob/master/slides/01_introduction.pdf)
  * Note) How to install prerequisite packages in Python: `pip install -r requirements.txt`

* **Section 2. Image Editing: Learning OpenCV** [[slides]](https://github.com/mint-lab/cv_tutorial/blob/master/slides/02_image_editing.pdf)
  * OpenCV Image Representation
    * Image creation: [image_creation.py](https://github.com/mint-lab/cv_tutorial/blob/master/examples/image_creation.py) :bulb:
  * OpenCV Image and Video Input/Output
    * Image file viewer: [image_viewer.py](https://github.com/mint-lab/cv_tutorial/blob/master/examples/image_viewer.py) :wrench:
    * Image format converter: [image_converter.py](https://github.com/mint-lab/cv_tutorial/blob/master/examples/image_converter.py) :wrench:
    * Video file player: [video_player.py](https://github.com/mint-lab/cv_tutorial/blob/master/examples/video_player.py) :wrench:
    * Video format converter: [video_converter.py](https://github.com/mint-lab/cv_tutorial/blob/master/examples/video_converter.py) :wrench:
  * OpenCV Drawing Functions
    * Shape drawing: [shape_drawing.py](https://github.com/mint-lab/cv_tutorial/blob/master/examples/shape_drawing.py) :wrench:
  * OpenCV High-level GUI
    * (Handling keyboard events) Video file player with frame navigation: [video_player+navigation.py](https://github.com/mint-lab/cv_tutorial/blob/master/examples/video_player%2Bnavigation.py) :wrench:
    * (Handling mouse events) Free drawing: [free_drawing.py](https://github.com/mint-lab/cv_tutorial/blob/master/examples/free_drawing.py) :wrench:
  * Image Editing
    * Negative image and flip: [negative_image_and_flip.py](https://github.com/mint-lab/cv_tutorial/blob/master/examples/negative_image_and_flip.py) :bulb:
    * Intensity transformation with contrast and brightness: [intensity_transformation.py](https://github.com/mint-lab/cv_tutorial/blob/master/examples/intensity_transformation.py) :bulb:
    * (Image addition) Alpha blending: [alpha_blending.py](https://github.com/mint-lab/cv_tutorial/blob/master/examples/alpha_blending.py) :bulb:
    * (Image addition) Background extraction: [background_extraction.py](https://github.com/mint-lab/cv_tutorial/blob/master/examples/background_extraction.py) :bulb:
    * (Image subtraction) Image difference: [image_difference.py](https://github.com/mint-lab/cv_tutorial/blob/master/examples/image_difference.py) :bulb:
    * (Image subtraction) Background subtraction: [background_subtraction.py](https://github.com/mint-lab/cv_tutorial/blob/master/examples/background_subtraction.py) :bulb:
    * (Image crop) Image file viewer with the zoom window: [image_viewer+zoom.py](https://github.com/mint-lab/cv_tutorial/blob/master/examples/image_viewer%2Bzoom.py) :bulb:
    * Image resize with backward value copy: [image_resize.py](https://github.com/mint-lab/cv_tutorial/blob/master/examples/image_resize.py) :bulb:
    * Image rotation with backward/forward value copy: [image_rotation.py](https://github.com/mint-lab/cv_tutorial/blob/master/examples/image_rotation.py) :bulb:

* **Section 3. Image Processing** [[slides]](https://github.com/mint-lab/cv_tutorial/blob/master/slides/03_image_processing.pdf)
  * Intensity Transformation
    * Image histogram: [histogram.py](https://github.com/mint-lab/cv_tutorial/blob/master/examples/histogram.py) :bulb:
    * Contrast stretching with min-max stretching: [contrast_stretching.py](https://github.com/mint-lab/cv_tutorial/blob/master/examples/contrast_stretching.py) :bulb:
    * Histogram equalization: [histogram_equalization.py](https://github.com/mint-lab/cv_tutorial/blob/master/examples/histogram_equalization.py) :wrench:
  * Image Segmentation
    * Thresholding: [thresholding.py](https://github.com/mint-lab/cv_tutorial/blob/master/examples/thresholding.py) :wrench:
  * Image Filtering
    * Image filtering with various kernels: [image_filtering.py](https://github.com/mint-lab/cv_tutorial/blob/master/examples/image_filtering.py) :bulb:
    * Median filter: [median_filter.py](https://github.com/mint-lab/cv_tutorial/blob/master/examples/median_filter.py) :wrench:
    * Sobel edge detection: [Sobel_edge.py](https://github.com/mint-lab/cv_tutorial/blob/master/examples/Sobel_edge.py) :bulb:
    * Canny edge detection: [Canny_edge.py](https://github.com/mint-lab/cv_tutorial/blob/master/examples/Canny_edge.py) :wrench:
    * Bilateral filter: [bilateral_filter.py](https://github.com/mint-lab/cv_tutorial/blob/master/examples/bilateral_filter.py) :wrench:
  * Morphological Operations
    * Morphological operations with various operations and kernels: [morpology.py](https://github.com/mint-lab/cv_tutorial/blob/master/examples/morpology.py) :wrench:
    * Application) Background subtraction (foreground extraction): [background_subtraction.py](https://github.com/mint-lab/cv_tutorial/blob/master/examples/background_subtraction.py) :wrench:

* **Section 4. Color** [[slides]](https://github.com/mint-lab/cv_tutorial/blob/master/slides/04_color.pdf)
  * Color space conversion: [color_bgr2hsv.py](https://github.com/mint-lab/cv_tutorial/blob/master/examples/color_bgr2hsv.py) :wrench:
  * Color histogram equalization: [histogram_equalization+color.py](https://github.com/mint-lab/cv_tutorial/blob/master/examples/histogram_equalization+color.py) :bulb:

* **Section 5. Image Formation** [[slides]](https://github.com/mint-lab/cv_tutorial/blob/master/slides/05_image_formation.pdf)
  * Getting Started with 2D
    * 3D rotation conversion: [3d_rotation_conversion.py](https://github.com/mint-lab/3dv_tutorial/blob/master/examples/3d_rotation_conversion.py) :camera:
  * Pinhole Camera Model
    * Object localization: [object_localization.py](https://github.com/mint-lab/3dv_tutorial/blob/master/examples/object_localization.py) :camera:
    * Image formation: [image_formation.py](https://github.com/mint-lab/3dv_tutorial/blob/master/examples/image_formation.py) :camera::bulb:
  * Geometric Distortion Models
    * Geometric distortion visualization: [distortion_visualization.py](https://github.com/mint-lab/3dv_tutorial/blob/master/examples/distortion_visualization.py) :camera:
    * Geometric distortion correction: [distortion_correction.py](https://github.com/mint-lab/3dv_tutorial/blob/master/examples/distortion_correction.py) :camera: [[result video]](https://youtu.be/HKetupWh4V8)
  * Camera Calibration
    * Camera calibration: [camera_calibration.py](https://github.com/mint-lab/3dv_tutorial/blob/master/examples/camera_calibration.py) :camera:
  * Absolute Camera Pose Estimation (a.k.a. perspective-n-point; PnP)
    * Pose estimation (chessboard): [pose_estimation_chessboard.py](https://github.com/mint-lab/3dv_tutorial/blob/master/examples/pose_estimation_chessboard.py) :camera: [[result video]](https://youtu.be/4nA1OQGL-ig)
    * Pose estimation (book): [pose_estimation_book1.py](https://github.com/mint-lab/3dv_tutorial/blob/master/examples/pose_estimation_book1.py) :camera:
    * Pose estimation (book) with camera calibration: [pose_estimation_book2.py](https://github.com/mint-lab/3dv_tutorial/blob/master/examples/pose_estimation_book2.py) :camera:
    * Pose estimation (book) with camera calibration without initial $K$: [pose_estimation_book3.py](https://github.com/mint-lab/3dv_tutorial/blob/master/examples/pose_estimation_book3.py) :camera: [[result video]](https://youtu.be/GYp4h0yyB3Y)

* **Section 6. Image Geometry** [[slides]](https://github.com/mint-lab/cv_tutorial/blob/master/slides/06_image_geometry.pdf)
  * Planar Homography
    * Perspective distortion correction: [perspective_correction.py](https://github.com/mint-lab/3dv_tutorial/blob/master/examples/perspective_correction.py) :camera:
    * Planar image stitching: [image_stitching.py](https://github.com/mint-lab/3dv_tutorial/blob/master/examples/image_stitching.py) :camera:
    * 2D video stabilization: [video_stabilization.py](https://github.com/mint-lab/3dv_tutorial/blob/master/examples/video_stabilization.py) :camera: [[result video]](https://youtu.be/be_dzYicEzI)
  * Triangulation
    * Triangulation: [triangulation.py](https://github.com/mint-lab/3dv_tutorial/blob/master/examples/triangulation.py) :camera:

* **Section 7. Solving Problems** [[slides]](https://github.com/mint-lab/cv_tutorial/blob/master/slides/07_solving_problems.pdf)
  * Solving Linear Equations in 3D Vision
    * Affine transformation estimation: [affine_estimation_implement.py](https://github.com/mint-lab/3dv_tutorial/blob/master/examples/affine_estimation_implement.py) :camera: :bulb:
    * Planar homography estimation: [homography_estimation_implement.py](https://github.com/mint-lab/3dv_tutorial/blob/master/examples/homography_estimation_implement.py) :camera: :bulb:
    * Appendix) Image warping using homography: [image_warping_implement.py](https://github.com/mint-lab/3dv_tutorial/blob/master/examples/image_warping_implement.py) :camera: :bulb:
    * Triangulation: [triangulation_implement.py](https://github.com/mint-lab/3dv_tutorial/blob/master/examples/triangulation_implement.py) :camera: :bulb:
  * Solving Nonlinear Equations in 3D Vision
    * Absolute camera pose estimation: [pose_estimation_implement.py](https://github.com/mint-lab/3dv_tutorial/blob/master/examples/pose_estimation_implement.py) :camera: :bulb:
    * Camera calibration: [camera_calibration_implement.py](https://github.com/mint-lab/3dv_tutorial/blob/master/examples/camera_calibration_implement.py) :camera: :bulb:

* **Section 8. Image Correspondence** [[slides]](https://github.com/mint-lab/cv_tutorial/blob/master/slides/08_image_correspondence.pdf)
  * Feature Points and Descriptors
    * Harris corner: [harris_corner_implement.py](https://github.com/mint-lab/3dv_tutorial/blob/master/examples/harris_corner_implement.py) :camera: :bulb:
    * SuperPoint [[GitHub]](https://github.com/magicleap/SuperPointPretrainedNetwork)
  * Feature Matching and Tracking
    * Feature matching comparison: [feature_matching.py](https://github.com/mint-lab/3dv_tutorial/blob/master/examples/feature_matching.py) :camera:
    * SuperGlue [[GitHub]](https://github.com/magicleap/SuperGluePretrainedNetwork)
    * Feature tracking with KLT tracker: [feature_tracking_klt.py](https://github.com/mint-lab/3dv_tutorial/blob/master/examples/feature_tracking_klt.py) :camera:
  * Outlier Rejection
    * Line fitting with RANSAC: [line_fitting_ransac.py](https://github.com/mint-lab/3dv_tutorial/blob/master/examples/line_fitting_ransac.py) :camera: :bulb:
    * Planar homography estimation with RANSAC: [image_stitching_implement.py](https://github.com/mint-lab/3dv_tutorial/blob/master/examples/image_stitching_implement.py) :camera: :bulb:

* **Section 9. Image Classification: CNN Backbones**
* **Section 10. Image Segmentation**
* **Section 11. Object Detection**
* **Section 12. Vision Foundation Models**
* **Section 13. Object Tracking**
* **Section 14. 3D Vision** [[slides]](https://github.com/mint-lab/cv_tutorial/blob/master/slides/14_3d_vision.pdf)
  * Structure-from-Motion
    * COLMAP [[Homepage]](https://demuc.de/colmap/) [[Documentation]](https://colmap.github.io/) [[GitHub]](https://github.com/colmap/colmap)
  * 3D Representations
    * NeRF [[Homepage]](https://www.matthewtancik.com/nerf) [[GitHub]](https://github.com/bmild/nerf)



### Authors
* [Sunglok Choi](https://github.com/sunglok)



### Acknowledgements
The authors thank the following contributors and projects.

* Youngjin Hong: He reported two bugs in [image_rotation.py](https://github.com/mint-lab/cv_tutorial/blob/master/examples/image_rotation.py) and [object_localization.py](https://github.com/mint-lab/3dv_tutorial/blob/master/examples/object_localization.py).
* [ImageProcessingPlace.com](https://www.imageprocessingplace.com/root_files_V3/image_databases.htm) for test images (`lena.tif`, `baboon.tif`, and `peppers.tif`)
* [MOTChallenge](https://motchallenge.net/vis/PETS09-S2L1) for a test video (`PETS09-S2L1-raw.webm`)
* [Wikipedia](https://en.wikipedia.org/wiki/Salt-and-pepper_noise) for a test image (`salt_and_pepper.png`)
* [OpenCV](https://github.com/opencv/opencv/tree/4.x/samples/data) for a test image (`sudoku.png`)

--------------------------------------------------------------------------------
/data/PETS09-S2L1-raw.webm:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mint-lab/cv_tutorial/e291c17eaa7b81a319244db68a49e68c4c763ca4/data/PETS09-S2L1-raw.webm

--------------------------------------------------------------------------------
/data/PETS09-S2L1-raw_back.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mint-lab/cv_tutorial/e291c17eaa7b81a319244db68a49e68c4c763ca4/data/PETS09-S2L1-raw_back.png

--------------------------------------------------------------------------------
/data/baboon.tif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mint-lab/cv_tutorial/e291c17eaa7b81a319244db68a49e68c4c763ca4/data/baboon.tif

--------------------------------------------------------------------------------
/data/black_circle.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mint-lab/cv_tutorial/e291c17eaa7b81a319244db68a49e68c4c763ca4/data/black_circle.png

--------------------------------------------------------------------------------
/data/face.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mint-lab/cv_tutorial/e291c17eaa7b81a319244db68a49e68c4c763ca4/data/face.png

--------------------------------------------------------------------------------
/data/lena.tif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mint-lab/cv_tutorial/e291c17eaa7b81a319244db68a49e68c4c763ca4/data/lena.tif

--------------------------------------------------------------------------------
/data/peppers.tif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mint-lab/cv_tutorial/e291c17eaa7b81a319244db68a49e68c4c763ca4/data/peppers.tif

--------------------------------------------------------------------------------
/data/salt_and_pepper.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mint-lab/cv_tutorial/e291c17eaa7b81a319244db68a49e68c4c763ca4/data/salt_and_pepper.png

--------------------------------------------------------------------------------
/data/sudoku.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mint-lab/cv_tutorial/e291c17eaa7b81a319244db68a49e68c4c763ca4/data/sudoku.png

--------------------------------------------------------------------------------
/examples/Canny_edge.py:
--------------------------------------------------------------------------------
import cv2 as cv
import numpy as np

img_list = [
    '../data/lena.tif',
    '../data/baboon.tif',
    '../data/peppers.tif',
    '../data/black_circle.png',
    '../data/salt_and_pepper.png',
    '../data/sudoku.png',
]

# Initialize control parameters
threshold1 = 500
threshold2 = 1200
aperture_size = 5
img_select = -1 # -1 selects the last image in 'img_list'

while True:
    # Read the given image
    img = cv.imread(img_list[img_select], cv.IMREAD_GRAYSCALE)
    assert img is not None, 'Cannot read the given image, ' + img_list[img_select]

    # Get the Canny edge image
    edge = cv.Canny(img, threshold1, threshold2, apertureSize=aperture_size)

    # Show all images
    info = f'Thresh1: {threshold1}, Thresh2: {threshold2}, KernelSize: {aperture_size}'
    cv.putText(edge, info, (10, 25), cv.FONT_HERSHEY_DUPLEX, 0.6, (255, 255, 255), thickness=2)
    cv.putText(edge, info, (10, 25), cv.FONT_HERSHEY_DUPLEX, 0.6, (0, 0, 0))
    merge = np.hstack((img, edge))
    cv.imshow('Canny Edge: Original | Result', merge)

    # Process the key event
    key = cv.waitKey()
    if key == 27: # ESC
        break
    elif key == ord('+') or key == ord('='):
        threshold1 += 2
    elif key == ord('-') or key == ord('_'):
        threshold1 -= 2
    elif key == ord(']') or key == ord('}'):
        threshold2 += 2
    elif key == ord('[') or key == ord('{'):
        threshold2 -= 2
    elif key == ord('>') or key == ord('.'):
        aperture_size = min(aperture_size + 2, 7)
    elif key == ord('<') or key == ord(','):
        aperture_size = max(aperture_size - 2, 3)
    elif key == ord('\t'):
        img_select = (img_select + 1) % len(img_list)

cv.destroyAllWindows()

--------------------------------------------------------------------------------
/examples/Sobel_edge.py:
--------------------------------------------------------------------------------
import cv2 as cv
import numpy as np

def drawText(img, text, org=(10, 25), fontFace=cv.FONT_HERSHEY_DUPLEX, fontScale=0.6, color=(0, 0, 0), colorBoundary=(255, 255, 255)):
    cv.putText(img, text, org, fontFace, fontScale, colorBoundary, thickness=2)
    cv.putText(img, text, org, fontFace, fontScale, color)


if __name__ == '__main__':
    img_list = [
        '../data/lena.tif',
        '../data/baboon.tif',
        '../data/peppers.tif',
        '../data/black_circle.png',
        '../data/salt_and_pepper.png',
        '../data/sudoku.png',
    ]

    # Initialize control parameters
    edge_threshold = 0.1
    img_select = 3

    while True:
        # Read the given image as gray scale
        img = cv.imread(img_list[img_select], cv.IMREAD_GRAYSCALE)
        assert img is not None, 'Cannot read the given image, ' + img_list[img_select]

        # Extract edges using two-directional Sobel responses
        # and normalize their values within [-1, 1] (Note: 1020 derived from 255 * (1+2+1))
        dx = cv.Sobel(img, cv.CV_64F, 1, 0) / 1020 # Sobel x-directional response
        dy = cv.Sobel(img, cv.CV_64F, 0, 1) / 1020 # Sobel y-directional response
        mag = np.sqrt(dx*dx + dy*dy) / np.sqrt(2)  # Sobel magnitude (normalized within [0, 1])
        ori = np.arctan2(dy, dx)                   # Sobel orientation
        edge = mag > edge_threshold                # Alternative) cv.threshold(), cv.adaptiveThreshold()

        # Prepare the orientation image as the BGR color
        ori[ori < 0] = ori[ori < 0] + 2*np.pi # Convert [-np.pi, np.pi] to [0, 2*np.pi)
        ori_hsv = np.dstack((ori / (2*np.pi) * 180,  # HSV color - Hue channel
                             np.full_like(ori, 255), # HSV color - Saturation channel
                             mag * 255))             # HSV color - Value channel
        ori_bgr = cv.cvtColor(ori_hsv.astype(np.uint8), cv.COLOR_HSV2BGR)

        # Prepare the original, Sobel X/Y, magnitude, and edge images as the BGR color
        img_bgr = cv.cvtColor(img, cv.COLOR_GRAY2BGR)
        dx_bgr = cv.cvtColor(abs(dx * 255).astype(np.uint8), cv.COLOR_GRAY2BGR)
        dy_bgr = cv.cvtColor(abs(dy * 255).astype(np.uint8), cv.COLOR_GRAY2BGR)
        mag_bgr = cv.cvtColor((mag * 255).astype(np.uint8), cv.COLOR_GRAY2BGR)
        edge_bgr = cv.cvtColor((edge * 255).astype(np.uint8), cv.COLOR_GRAY2BGR)

        # Show all images
        drawText(img_bgr, 'Original')
        drawText(dx_bgr, 'SobelX')
        drawText(dy_bgr, 'SobelY')
        drawText(mag_bgr, 'Magnitude')
        drawText(ori_bgr, 'Orientation')
        drawText(edge_bgr, f'EdgeThreshold: {edge_threshold:.2f}')
        merge = np.vstack((np.hstack((img_bgr, dx_bgr, dy_bgr)),
                           np.hstack((edge_bgr, mag_bgr, ori_bgr))))
        cv.imshow('Sobel Edge', merge)

        # Process the key event
        key = cv.waitKey()
        if key == 27: # ESC
            break
        elif key == ord('+') or key == ord('='):
            edge_threshold = min(edge_threshold + 0.02, 1)
        elif key == ord('-') or key == ord('_'):
            edge_threshold = max(edge_threshold - 0.02, 0)
        elif key == ord('\t'):
            img_select = (img_select + 1) % len(img_list)

    cv.destroyAllWindows()

--------------------------------------------------------------------------------
/examples/alpha_blending.py:
--------------------------------------------------------------------------------
import cv2 as cv
import numpy as np

# Read the given images
img1 = cv.imread('../data/baboon.tif')
img2 = cv.imread('../data/peppers.tif')
assert img1 is not None and img2 is not None, 'Cannot read the given images'

# Initialize a control parameter
alpha = 0.5

while True:
    # Apply alpha blending
    blend = (alpha * img1 + (1 - alpha) * img2).astype(np.uint8) # Alternative) cv.addWeighted()

    # Show all images
    info = f'Alpha: {alpha:.1f}'
    cv.putText(blend, info, (10, 25), cv.FONT_HERSHEY_DUPLEX, 0.6, (255, 255, 255), thickness=2)
    cv.putText(blend, info, (10, 25), cv.FONT_HERSHEY_DUPLEX, 0.6, (0, 0, 0))
    merge = np.hstack((img1, img2, blend))
    cv.imshow('Image Blending: Image1 | Image2 | Blended', merge)

    # Process the key event
    key = cv.waitKey()
    if key == 27: # ESC
        break
    elif key == ord('+') or key == ord('='):
        alpha = min(alpha + 0.1, 1)
    elif key == ord('-') or key == ord('_'):
        alpha = max(alpha - 0.1, 0)

cv.destroyAllWindows()

--------------------------------------------------------------------------------
/examples/background_extraction.py:
--------------------------------------------------------------------------------
import numpy as np
import cv2 as cv

# Read the given video
video = cv.VideoCapture('../data/PETS09-S2L1-raw.webm')
assert video.isOpened(), 'Cannot read the given video'

frame_count = 0
img_back = None
while True:
    # Get an image from 'video'
    valid, img = video.read()
    if not valid:
        break
    frame_count += 1

    # Show progress
    if frame_count % 100 == 0:
        print(f'Frame: {frame_count}')

    # Add the image to the averaged image (the background image)
    # Alternative) cv.createBackgroundSubtractorMOG2(), cv.bgsegm
    if img_back is None:
        img_back = np.zeros_like(img, dtype=np.float64)
    img_back += img.astype(np.float64)
img_back = img_back / frame_count
img_back = img_back.astype(np.uint8)

# Save and show the background image
cv.imwrite('../data/PETS09-S2L1-raw_back.png', img_back)
cv.imshow('Background Extraction', img_back)
cv.waitKey()
cv.destroyAllWindows()

--------------------------------------------------------------------------------
/examples/background_subtraction.py:
-------------------------------------------------------------------------------- 1 | import numpy as np 2 | import cv2 as cv 3 | 4 | # Read the given video 5 | video = cv.VideoCapture('../data/PETS09-S2L1-raw.webm') 6 | assert video.isOpened(), 'Cannot read the given video' 7 | 8 | # Initialize control parameters 9 | blur_ksize = (9, 9) 10 | blur_sigma = 3 11 | diff_threshold = 50 12 | bg_update_rate = 0.05 13 | fg_update_rate = 0.001 14 | zoom_level = 0.8 15 | 16 | # Read the background image 17 | img_back = cv.imread('../data/PETS09-S2L1-raw_back.png') 18 | assert img_back is not None, 'Cannot read the initial background image' 19 | img_back = cv.GaussianBlur(img_back, blur_ksize, blur_sigma).astype(np.float64) 20 | 21 | box = lambda ksize: np.ones((ksize, ksize), dtype=np.uint8) 22 | while True: 23 | # Get an image from 'video' 24 | valid, img = video.read() 25 | if not valid: 26 | break 27 | 28 | # Get the difference between the current image and background 29 | img_blur = cv.GaussianBlur(img, blur_ksize, blur_sigma) 30 | img_diff = img_blur - img_back 31 | 32 | # Apply thresholding 33 | img_norm = np.linalg.norm(img_diff, axis=2) 34 | img_bin = np.zeros_like(img_norm, dtype=np.uint8) 35 | img_bin[img_norm > diff_threshold] = 255 36 | 37 | # Apply morphological operations 38 | img_mask = img_bin.copy() 39 | img_mask = cv.erode(img_mask, box(3)) # Suppress small noise 40 | img_mask = cv.dilate(img_mask, box(5)) # Connect broken parts 41 | img_mask = cv.dilate(img_mask, box(3)) # Connect broken parts 42 | fg = img_mask == 255 # Keep the (thick) foreground mask 43 | img_mask = cv.erode(img_mask, box(3), iterations=2) # Restore the thick mask thin 44 | 45 | # Update the background 46 | # Alternative) cv.createBackgroundSubtractorMOG2(), cv.bgsegm 47 | bg = ~fg 48 | img_back[bg] = (bg_update_rate * img_blur[bg] + (1 - bg_update_rate) * img_back[bg]) # With the higher weight 49 | img_back[fg] = (fg_update_rate * img_blur[fg] + (1 - fg_update_rate) * img_back[fg]) # 
With the lower weight 50 | 51 | # Get the foreground image 52 | img_fore = np.zeros_like(img) 53 | img_fore[fg] = img[fg] 54 | 55 | # Show all images 56 | merge = np.vstack((np.hstack((img, img_back.astype(np.uint8))), 57 | np.hstack((cv.cvtColor(img_mask, cv.COLOR_GRAY2BGR), img_fore)))) 58 | merge = cv.resize(merge, None, None, zoom_level, zoom_level) 59 | cv.imshow('Change Detection: Original | Background | Foreground Mask | Foreground', merge) 60 | 61 | # Process the key event 62 | key = cv.waitKey(1) 63 | if key == ord(' '): 64 | key = cv.waitKey() 65 | if key == 27: # ESC 66 | break 67 | 68 | cv.destroyAllWindows() 69 | -------------------------------------------------------------------------------- /examples/bilateral_filter.py: -------------------------------------------------------------------------------- 1 | import cv2 as cv 2 | import numpy as np 3 | 4 | img_list = [ 5 | '../data/lena.tif', 6 | '../data/baboon.tif', 7 | '../data/peppers.tif', 8 | '../data/black_circle.png', 9 | '../data/salt_and_pepper.png', 10 | '../data/sudoku.png', 11 | ] 12 | 13 | # Initialize control parameters 14 | kernel_size = 9 15 | sigma_color = 150 16 | sigma_space = 2.4 17 | n_iterations = 1 18 | img_select = 0 19 | 20 | while True: 21 | # Read the given image 22 | img = cv.imread(img_list[img_select]) 23 | assert img is not None, 'Cannot read the given image, ' + img_list[img_select] 24 | 25 | # Apply the bilateral filter iteratively 26 | result = img.copy() 27 | for itr in range(n_iterations): 28 | result = cv.bilateralFilter(result, kernel_size, sigma_color, sigma_space) 29 | 30 | # Show all images 31 | info = f'KSize: {kernel_size}, SColor: {sigma_color}, SSpace: {sigma_space:.1f}, NIter: {n_iterations}' 32 | cv.putText(result, info, (10, 25), cv.FONT_HERSHEY_DUPLEX, 0.6, 255, thickness=2) 33 | cv.putText(result, info, (10, 25), cv.FONT_HERSHEY_DUPLEX, 0.6, 0) 34 | merge = np.hstack((img, result)) 35 | cv.imshow('Bilateral Filter: Original | Result', merge) 36 | 37 | # 
Process the key event 38 | key = cv.waitKey() 39 | if key == 27: # ESC 40 | break 41 | elif key == ord('+') or key == ord('='): 42 | kernel_size = kernel_size + 2 43 | elif key == ord('-') or key == ord('_'): 44 | kernel_size = max(kernel_size - 2, -1) 45 | elif key == ord(']') or key == ord('}'): 46 | sigma_color += 2 47 | elif key == ord('[') or key == ord('{'): 48 | sigma_color -= 2 49 | elif key == ord('>') or key == ord('.'): 50 | sigma_space += 0.1 51 | elif key == ord('<') or key == ord(','): 52 | sigma_space -= 0.1 53 | elif key == ord(')') or key == ord('0'): 54 | n_iterations += 1 55 | elif key == ord('(') or key == ord('9'): 56 | n_iterations = max(n_iterations - 1, 1) 57 | elif key == ord('\t'): 58 | img_select = (img_select + 1) % len(img_list) 59 | 60 | cv.destroyAllWindows() 61 | -------------------------------------------------------------------------------- /examples/color_bgr2hsv.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import cv2 as cv 3 | 4 | img = cv.imread('../data/peppers.tif') 5 | assert img is not None, 'Cannot read the given image' 6 | 7 | # Convert the BGR image to its HSV image 8 | img_hsv = cv.cvtColor(img, cv.COLOR_BGR2HSV) 9 | 10 | # Show hue, saturation, and value channels as color images 11 | img_hue = np.dstack((img_hsv[:,:,0], 12 | np.full_like(img_hsv[:,:,0], 255), 13 | np.full_like(img_hsv[:,:,0], 255))) 14 | img_hue = cv.cvtColor(img_hue, cv.COLOR_HSV2BGR) 15 | img_sat = np.dstack((img_hsv[:,:,1], ) * 3) 16 | img_val = np.dstack((img_hsv[:,:,2], ) * 3) 17 | merge = np.hstack((img, img_hue, img_sat, img_val)) 18 | cv.imshow('Color Conversion: Image | Hue | Saturation | Value', merge) 19 | cv.waitKey() 20 | cv.destroyAllWindows() 21 | -------------------------------------------------------------------------------- /examples/contrast_stretching.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import 
cv2 as cv 3 | from histogram import get_histogram, conv_hist2img 4 | 5 | # Read the given image as gray scale 6 | img = cv.imread('../data/baboon.tif', cv.IMREAD_GRAYSCALE) 7 | assert img is not None, 'Cannot read the given image' 8 | 9 | # Initialize control parameters 10 | value_range = [20, 200] # [lower limit, upper limit] 11 | 12 | while True: 13 | # Apply contrast stretching 14 | # Alternative) cv.intensity_transform.contrastStretching() (with s1=0 and s2=255) 15 | img_tran = 255 / (value_range[1] - value_range[0]) * (img.astype(np.int32) - value_range[0]) 16 | img_tran = np.clip(img_tran, 0, 255).astype(np.uint8) # Apply saturation (astype() alone does not saturate) 17 | 18 | # Get image histograms 19 | hist = conv_hist2img(get_histogram(img)) 20 | hist_tran = conv_hist2img(get_histogram(img_tran)) 21 | 22 | # Mark the intensity range, 'value_range' 23 | if value_range[0] >= 0 and value_range[0] <= 255: 24 | mark = hist[:, value_range[0]] == 255 25 | hist[mark, value_range[0]] = 200 26 | if value_range[1] >= 0 and value_range[1] <= 255: 27 | mark = hist[:, value_range[1]] == 255 28 | hist[mark, value_range[1]] = 100 29 | 30 | # Show all images 31 | row0 = np.hstack((img, img_tran)) 32 | row1 = np.hstack((hist, hist_tran)) 33 | row1 = cv.resize(row1, (row0.shape[1], 255)) 34 | merge = np.vstack((row0, row1)) 35 | cv.imshow('Contrast Stretching: Original | Stretching', merge) 36 | 37 | # Process the key event 38 | key = cv.waitKey() 39 | if key == 27: # ESC 40 | break 41 | elif key == ord('+') or key == ord('='): 42 | value_range[0] += 1 43 | elif key == ord('-') or key == ord('_'): 44 | value_range[0] -= 1 45 | elif key == ord(']') or key == ord('}'): 46 | value_range[1] += 1 47 | elif key == ord('[') or key == ord('{'): 48 | value_range[1] -= 1 49 | 50 | cv.destroyAllWindows() 51 | -------------------------------------------------------------------------------- /examples/free_drawing.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import cv2
as cv 3 | 4 | def mouse_event_handler(event, x, y, flags, param): 5 | # Change 'mouse_state' (given as 'param') according to the mouse 'event' 6 | if event == cv.EVENT_LBUTTONDOWN: 7 | param['dragged'] = True 8 | param['xy'] = (x, y) 9 | elif event == cv.EVENT_LBUTTONUP: 10 | param['dragged'] = False 11 | elif event == cv.EVENT_MOUSEMOVE and param['dragged']: 12 | param['xy'] = (x, y) 13 | 14 | def free_drawing(canvas_width=640, canvas_height=480, init_brush_radius=3): 15 | # Prepare a canvas and palette 16 | canvas = np.full((canvas_height, canvas_width, 3), 255, dtype=np.uint8) 17 | palette = [(0, 0, 0), (255, 255, 255), (0, 0, 255), (0, 255, 0), (255, 0, 0), (255, 255, 0), (255, 0, 255), (0, 255, 255)] 18 | 19 | # Initialize drawing states 20 | mouse_state = {'dragged': False, 'xy': (-1, -1)} 21 | brush_color = 0 22 | brush_radius = init_brush_radius 23 | 24 | # Instantiate a window and assign its callback function for mouse events 25 | cv.namedWindow('Free Drawing') 26 | cv.setMouseCallback('Free Drawing', mouse_event_handler, mouse_state) 27 | 28 | while True: 29 | # Draw a point if necessary 30 | if mouse_state['dragged']: 31 | cv.circle(canvas, mouse_state['xy'], brush_radius, palette[brush_color], -1) 32 | 33 | # Show the canvas 34 | canvas_copy = canvas.copy() 35 | info = f'Brush Radius: {brush_radius}' 36 | cv.putText(canvas_copy, info, (10, 25), cv.FONT_HERSHEY_DUPLEX, 0.6, (127, 127, 127), thickness=2) 37 | cv.putText(canvas_copy, info, (10, 25), cv.FONT_HERSHEY_DUPLEX, 0.6, palette[brush_color]) 38 | cv.imshow('Free Drawing', canvas_copy) 39 | 40 | # Process the key event 41 | key = cv.waitKey(1) 42 | if key == 27: # ESC 43 | break 44 | elif key == ord('\t'): 45 | brush_color = (brush_color + 1) % len(palette) 46 | elif key == ord('+') or key == ord('='): 47 | brush_radius += 1 48 | elif key == ord('-') or key == ord('_'): 49 | brush_radius = max(brush_radius - 1, 1) 50 | 51 | cv.destroyAllWindows() 52 | 53 | if __name__ == '__main__': 54 | 
free_drawing() 55 | -------------------------------------------------------------------------------- /examples/histogram.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import cv2 as cv 3 | 4 | def get_histogram(gray_img): # Alternative) cv.calcHist() 5 | # Assume a gray input image 6 | # Fix the bin range [0, 256) and the number of bins (256) 7 | hist = np.zeros((256), dtype=np.uint32) 8 | for val in range(0, 256): 9 | hist[val] = sum(sum(gray_img == val)) # Count the occurrences in 2D 10 | return hist 11 | 12 | def conv_hist2img(hist): 13 | img = np.full((256, 256), 255, dtype=np.uint8) 14 | max_freq = max(hist) 15 | for val in range(len(hist)): 16 | normalized_freq = int(hist[val] / max_freq * 255) 17 | img[0:normalized_freq, val] = 0 # Mark as black 18 | return img[::-1,:] 19 | 20 | 21 | 22 | if __name__ == '__main__': 23 | # Read the given image as gray scale 24 | img = cv.imread('../data/baboon.tif', cv.IMREAD_GRAYSCALE) 25 | assert img is not None, 'Cannot read the given image' 26 | 27 | # Get its histogram 28 | hist = get_histogram(img) 29 | print(f'* The number of bins: {len(hist)}') 30 | print(f'* The maximum frequency: {max(hist)}') 31 | print(f'* The minimum frequency: {min(hist)}') 32 | 33 | # Show the image and its histogram 34 | img_hist = conv_hist2img(hist) 35 | img_hist = cv.resize(img_hist, (len(img[0]), len(img_hist))) # Note) Be careful at (width, height) 36 | merge = np.vstack((img, img_hist)) 37 | cv.imshow('Histogram: Image | Histogram', merge) 38 | cv.waitKey() 39 | cv.destroyAllWindows() 40 | -------------------------------------------------------------------------------- /examples/histogram_equalization+color.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import cv2 as cv 3 | 4 | img_list = [ 5 | '../data/lena.tif', 6 | '../data/baboon.tif', 7 | '../data/peppers.tif', 8 | ] 9 | 10 | # Initialize a control parameter 11 |
img_select = 0 12 | 13 | while True: 14 | # Read the given image 15 | img = cv.imread(img_list[img_select]) 16 | assert img is not None, 'Cannot read the given image, ' + img_list[img_select] 17 | 18 | # Apply histogram equalization to each channel 19 | img_hist1 = np.dstack((cv.equalizeHist(img[:,:,0]), 20 | cv.equalizeHist(img[:,:,1]), 21 | cv.equalizeHist(img[:,:,2]))) 22 | 23 | # Apply histogram equalization only to the luminance channel in YCbCr 24 | img_cvt = cv.cvtColor(img, cv.COLOR_BGR2YCrCb) 25 | img_hist2 = np.dstack((cv.equalizeHist(img_cvt[:,:,0]), 26 | img_cvt[:,:,1], 27 | img_cvt[:,:,2])) 28 | img_hist2 = cv.cvtColor(img_hist2, cv.COLOR_YCrCb2BGR) 29 | 30 | # Show all images 31 | merge = np.hstack((img, img_hist1, img_hist2)) 32 | cv.imshow('Color Histogram Equalization: Image | Each Channel | Luminance Channel', merge) 33 | key = cv.waitKey() 34 | if key == 27: # ESC 35 | break 36 | elif key == ord('\t'): 37 | img_select = (img_select + 1) % len(img_list) 38 | 39 | cv.destroyAllWindows() 40 | -------------------------------------------------------------------------------- /examples/histogram_equalization.py: -------------------------------------------------------------------------------- 1 | import matplotlib.pyplot as plt 2 | import cv2 as cv 3 | 4 | # Read the given image as gray scale 5 | img = cv.imread('../data/baboon.tif', cv.IMREAD_GRAYSCALE) 6 | assert img is not None, 'Cannot read the given image' 7 | 8 | # Apply histogram equalization 9 | img_tran = cv.equalizeHist(img) 10 | 11 | # Derive the histogram 12 | bin_width = 4 # Note) The value should be a power of 2.
13 | bin_num = int(256 / bin_width) 14 | hist = cv.calcHist([img], [0], None, [bin_num], [0, 256]) # Note) The upper bound of the range is exclusive. 15 | hist_tran = cv.calcHist([img_tran], [0], None, [bin_num], [0, 256]) 16 | 17 | # Show all images and their histograms 18 | plt.subplot(2, 2, 1) 19 | plt.imshow(img, cmap='gray') 20 | plt.axis('off') 21 | plt.subplot(2, 2, 2) 22 | plt.plot(range(0, 256, bin_width), hist / 1000) 23 | plt.xlabel('Intensity [0, 255]') 24 | plt.ylabel('Frequency (1k)') 25 | plt.subplot(2, 2, 3) 26 | plt.imshow(img_tran, cmap='gray') 27 | plt.axis('off') 28 | plt.subplot(2, 2, 4) 29 | plt.plot(range(0, 256, bin_width), hist_tran / 1000) 30 | plt.xlabel('Intensity [0, 255]') 31 | plt.ylabel('Frequency (1k)') 32 | plt.show() 33 | -------------------------------------------------------------------------------- /examples/image_converter.py: -------------------------------------------------------------------------------- 1 | import cv2 as cv 2 | 3 | img_file = '../data/peppers.tif' 4 | target_format = 'png' 5 | 6 | # Read the given image file 7 | img = cv.imread(img_file) 8 | assert img is not None, 'Cannot read the given image' 9 | 10 | # Write 'img' as a file named 'target_file' 11 | target_file = img_file[:img_file.rfind('.')] + '.'
+ target_format 12 | cv.imwrite(target_file, img) 13 | -------------------------------------------------------------------------------- /examples/image_creation.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import cv2 as cv 3 | 4 | img_gray = np.full((480, 640), 255, dtype=np.uint8) # Create a gray image (white) 5 | img_gray[140:240, 220:420] = 0 # Draw the black box 6 | img_gray[240:340, 220:420] = 127 # Draw the gray box 7 | 8 | img_color = np.zeros((480, 640, 3), dtype=np.uint8) # Create a color image (black) 9 | img_color[:] = 255 # Make the color image white 10 | img_color[140:240, 220:420, :] = (0, 0, 255) # Draw the red box 11 | img_color[240:340, 220:420, :] = (255, 0, 0) # Draw the blue box 12 | 13 | cv.imshow('Gray Image', img_gray) # Show 'img_gray' in a new window named 'Gray Image' 14 | cv.imshow('Color Image', img_color) # Show 'img_color' in a new window named 'Color Image' 15 | cv.waitKey() # Wait until the user presses any key 16 | cv.destroyAllWindows() # It is necessary only for Spyder IDE.
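The indexing conventions used in `image_creation.py` can be checked without opening a window. The block below is a minimal sketch, assuming only NumPy is available; `make_color_image` is a hypothetical helper that is not part of the repository. It rebuilds the color image and reads back single pixels to illustrate that an image is indexed as `[row (y), column (x)]` and that OpenCV-style color tuples are ordered blue, green, red.

```python
import numpy as np

def make_color_image(height=480, width=640):
    # Hypothetical helper (not in the repository): rebuild the color image without any GUI call
    img = np.full((height, width, 3), 255, dtype=np.uint8)  # White canvas; shape is (rows, columns, channels)
    img[140:240, 220:420, :] = (0, 0, 255)                  # Rows 140..239, columns 220..419; BGR order, so this is red
    img[240:340, 220:420, :] = (255, 0, 0)                  # Rows 240..339, columns 220..419; BGR order, so this is blue
    return img

if __name__ == '__main__':
    img = make_color_image()
    print('Shape (height, width, channels):', img.shape)
    print('Pixel at y=150, x=300 (inside the red box): ', img[150, 300])
    print('Pixel at y=250, x=300 (inside the blue box):', img[250, 300])
    print('Pixel at y=300, x=150 (outside both boxes): ', img[300, 150])
```

Swapping the two indices by mistake (for example, reading `img[300, 150]` when the point x=300, y=150 was intended) lands outside the boxes, which is a common source of confusion when moving between `(x, y)` points in drawing functions and `[y, x]` array indexing.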
17 | -------------------------------------------------------------------------------- /examples/image_difference.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import cv2 as cv 3 | 4 | # Read the given video 5 | video = cv.VideoCapture('../data/PETS09-S2L1-raw.webm') 6 | assert video.isOpened(), 'Cannot read the given video' 7 | 8 | img_prev = None 9 | while True: 10 | # Get an image from 'video' 11 | valid, img = video.read() 12 | if not valid: 13 | break 14 | 15 | # Get the image difference 16 | if img_prev is None: 17 | img_prev = img.copy() 18 | continue 19 | img_diff = np.abs(img.astype(np.int32) - img_prev).astype(np.uint8) # Alternative) cv.absdiff() 20 | img_prev = img.copy() 21 | 22 | # Show all images 23 | merge = np.hstack((img, img_diff)) 24 | cv.imshow('Image Difference: Original | Difference', merge) 25 | 26 | # Process the key event 27 | key = cv.waitKey(1) 28 | if key == ord(' '): 29 | key = cv.waitKey() 30 | if key == 27: # ESC 31 | break 32 | 33 | cv.destroyAllWindows() 34 | -------------------------------------------------------------------------------- /examples/image_filtering.py: -------------------------------------------------------------------------------- 1 | import cv2 as cv 2 | import numpy as np 3 | 4 | # Define kernels 5 | kernel_table = [ 6 | {'name': 'Box 3x3', 'kernel': np.ones((3, 3)) / 9}, # Alternative: cv.boxFilter(), cv.blur() 7 | {'name': 'Gaussian 3x3', 'kernel': np.array([[1, 2, 1], # Alternative: cv.GaussianBlur() 8 | [2, 4, 2], 9 | [1, 2, 1]]) / 16}, 10 | {'name': 'Box 5x5', 'kernel': np.ones((5, 5)) / 25}, 11 | {'name': 'Gaussian 5x5', 'kernel': np.array([[1, 4, 6, 4, 1], 12 | [4, 16, 24, 16, 4], 13 | [6, 24, 36, 24, 6], 14 | [4, 16, 24, 16, 4], 15 | [1, 4, 6, 4, 1]]) / 256}, 16 | {'name': 'Gradient X', 'kernel': np.array([[-1, 1]])}, 17 | {'name': 'Robert DownRight','kernel': np.array([[-1, 0], 18 | [ 0, 1]])}, 19 | {'name': 'Prewitt X', 'kernel': 
np.array([[-1, 0, 1], 20 | [-1, 0, 1], 21 | [-1, 0, 1]])}, 22 | {'name': 'Sobel X', 'kernel': np.array([[-1, 0, 1], # Alternative: Sobel() 23 | [-2, 0, 2], 24 | [-1, 0, 1]])}, 25 | {'name': 'Scharr X', 'kernel': np.array([[-3, 0, 3], # Alternative: Scharr() 26 | [-10, 0, 10], 27 | [-3, 0, 3]])}, 28 | {'name': 'Gradient Y', 'kernel': np.array([[-1], [1]])}, 29 | {'name': 'Robert UpRight', 'kernel': np.array([[ 0, 1], 30 | [-1, 0]])}, 31 | {'name': 'Prewitt Y', 'kernel': np.array([[-1, -1, -1], 32 | [ 0, 0, 0], 33 | [ 1, 1, 1]])}, 34 | {'name': 'Sobel Y', 'kernel': np.array([[-1, -2, -1], 35 | [ 0, 0, 0], 36 | [ 1, 2, 1]])}, 37 | {'name': 'Scharr Y', 'kernel': np.array([[-3,-10, -3], 38 | [ 0, 0, 0], 39 | [ 3, 10, 3]])}, 40 | {'name': 'Laplacian (4)', 'kernel': np.array([[ 0, -1, 0], # Alternative: Laplacian 41 | [-1, 4, -1], 42 | [ 0, -1, 0]])}, 43 | {'name': 'Laplacian (8)', 'kernel': np.array([[-1, -1, -1], 44 | [-1, 8, -1], 45 | [-1, -1, -1]])}, 46 | {'name': 'Sharpen (5)', 'kernel': np.array([[ 0, -1, 0], 47 | [-1, 5, -1], 48 | [ 0, -1, 0]])}, 49 | {'name': 'Sharpen (9)', 'kernel': np.array([[-1, -1, -1], 50 | [-1, 9, -1], 51 | [-1, -1, -1]])}, 52 | {'name': 'Emboss (0)', 'kernel': np.array([[-2, -1, 0], 53 | [-1, 0, 1], 54 | [ 0, 1, 2]])}, 55 | {'name': 'Emboss (1)', 'kernel': np.array([[-2, -1, 0], 56 | [-1, 1, 1], 57 | [ 0, 1, 2]])}, 58 | ] 59 | 60 | img_list = [ 61 | '../data/lena.tif', 62 | '../data/baboon.tif', 63 | '../data/peppers.tif', 64 | '../data/black_circle.png', 65 | '../data/salt_and_pepper.png', 66 | '../data/sudoku.png', 67 | ] 68 | 69 | # Initialize control parameters 70 | kernel_select = 0 71 | img_select = 0 72 | 73 | while True: 74 | # Read the given image as gray scale 75 | img = cv.imread(img_list[img_select], cv.IMREAD_GRAYSCALE) 76 | assert img is not None, 'Cannot read the given image, ' + img_list[img_select] 77 | 78 | # Apply convolution to the image with the given 'kernel' 79 | name, kernel = kernel_table[kernel_select].values() # 
Make (short) alias 80 | # result = cv.filter2D(img, -1, kernel) # Note) dtype: np.uint8 (range: [0, 255]; Be careful!) 81 | result = cv.filter2D(img, cv.CV_64F, kernel) # Note) dtype: np.float64 82 | result = cv.convertScaleAbs(result) # Convert 'np.float64' to 'np.uint8' with saturation 83 | 84 | # Show the image and its filtered result 85 | cv.putText(result, name, (10, 25), cv.FONT_HERSHEY_DUPLEX, 0.6, (255, 255, 255), thickness=2) 86 | cv.putText(result, name, (10, 25), cv.FONT_HERSHEY_DUPLEX, 0.6, (0, 0, 0)) 87 | merge = np.hstack((img, result)) 88 | cv.imshow('Image Filtering: Original | Filtered', merge) 89 | 90 | # Process the key event 91 | key = cv.waitKey() 92 | if key == 27: # ESC 93 | break 94 | elif key == ord('+') or key == ord('='): 95 | kernel_select = (kernel_select + 1) % len(kernel_table) 96 | elif key == ord('-') or key == ord('_'): 97 | kernel_select = (kernel_select - 1) % len(kernel_table) 98 | elif key == ord('\t'): 99 | img_select = (img_select + 1) % len(img_list) 100 | 101 | cv.destroyAllWindows() 102 | -------------------------------------------------------------------------------- /examples/image_formation.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import cv2 as cv 3 | from scipy.spatial.transform import Rotation 4 | 5 | # The given camera configuration: Focal length, principal point, image resolution, position, and orientation 6 | f, cx, cy, noise_std = 1000, 320, 240, 1 7 | img_res = (640, 480) 8 | cam_pos = [[0, 0, 0], [-2, -2, 0], [2, 2, 0], [-2, 2, 0], [2, -2, 0]] # Unit: [m] 9 | cam_ori = [[0, 0, 0], [-15 , 15, 0], [15, -15, 0], [15, 15, 0], [-15, -15, 0]] # Unit: [deg] 10 | 11 | # Load a point cloud in the homogeneous coordinate 12 | X = np.loadtxt('../data/box.xyz') # Size: N x 3 13 | 14 | # Generate images for each camera pose 15 | K = np.array([[f, 0, cx], [0, f, cy], [0, 0, 1]]) 16 | for i, (pos, ori) in enumerate(zip(cam_pos, cam_ori)): 17 | # Derive 'R' 
and 't' 18 | Rc = Rotation.from_euler('zyx', ori[::-1], degrees=True).as_matrix() 19 | R = Rc.T 20 | t = -Rc.T @ pos 21 | 22 | # Project the points (Alternative: cv.projectPoints()) 23 | x = K @ (R @ X.T + t.reshape(-1, 1)) # Size: 3 x N 24 | x /= x[-1] 25 | 26 | # Add Gaussian noise 27 | noise = np.random.normal(scale=noise_std, size=(2, len(X))) 28 | x[0:2,:] += noise 29 | 30 | # Show and store the points 31 | img = np.zeros(img_res[::-1], dtype=np.uint8) 32 | for c in range(x.shape[1]): 33 | cv.circle(img, x[0:2,c].astype(np.int32), 2, 255, -1) 34 | cv.imshow(f'Image Formation {i}', img) 35 | np.savetxt(f'../data/image_formation{i}.xyz', x.T) # Size: N x 3 (the last column is 1) 36 | 37 | cv.waitKey() 38 | cv.destroyAllWindows() 39 | -------------------------------------------------------------------------------- /examples/image_resize.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import cv2 as cv 3 | 4 | def resize(img, scale): 5 | # Prepare the (empty) resized image 6 | img_shape = list(img.shape) 7 | img_shape[0] = int(img_shape[0] * scale) 8 | img_shape[1] = int(img_shape[1] * scale) 9 | img_resize = np.zeros(img_shape, dtype=np.uint8) 10 | 11 | # Copy each pixel from the given image 12 | for ry in range(img_shape[0]): 13 | y = min(int(ry / scale + 0.5), img.shape[0] - 1) # Note) Rounding: int(y+0.5), clamped to the last row 14 | for rx in range(img_shape[1]): 15 | x = min(int(rx / scale + 0.5), img.shape[1] - 1) # Note) Rounding: int(x+0.5), clamped to the last column 16 | img_resize[ry, rx, :] = img[y, x, :] 17 | return img_resize 18 | 19 | 20 | 21 | if __name__ == '__main__': 22 | img_file = '../data/peppers.tif' 23 | 24 | # Read the given image file 25 | img = cv.imread(img_file) 26 | assert img is not None, 'Cannot read the given image, ' + img_file 27 | 28 | # Initialize a control parameter 29 | scale = 1 30 | 31 | while True: 32 | # Resize the given image 33 | img_resize = resize(img, scale) # Alternative) cv.resize() 34 | 35 | # Show the resized image 36 | info = f'x{scale:.1f}' 37 | cv.putText(img_resize, info, (10, 25),
cv.FONT_HERSHEY_DUPLEX, 0.6, (255, 255, 255), thickness=2) 38 | cv.putText(img_resize, info, (10, 25), cv.FONT_HERSHEY_DUPLEX, 0.6, (0, 0, 0)) 39 | cv.imshow('Image Resize', img_resize) 40 | 41 | # Process the key event 42 | key = cv.waitKey() 43 | if key == 27: # ESC 44 | break 45 | elif key == ord('+') or key == ord('='): 46 | scale = min(scale + 0.1, 3) 47 | elif key == ord('-') or key == ord('_'): 48 | scale = max(scale - 0.1, 0.3) 49 | 50 | cv.destroyAllWindows() 51 | -------------------------------------------------------------------------------- /examples/image_rotation.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import cv2 as cv 3 | 4 | def rotate(img, degree): 5 | # Prepare the (empty) rotated image 6 | img_rotate = np.zeros(img.shape, dtype=np.uint8) 7 | 8 | # Prepare the inverse transformation 9 | c, s = np.cos(np.deg2rad(degree)), np.sin(np.deg2rad(degree)) 10 | R = np.array([[c, -s], [s, c]]).T # Note) Transpose is the inverse. 
11 | h, w, *_ = img.shape 12 | cx = (w - 1) / 2 13 | cy = (h - 1) / 2 14 | 15 | # Copy each pixel from the given image (backward mapping) 16 | for ry in range(h): 17 | for rx in range(w): 18 | dx, dy = R @ [rx - cx, ry - cy] 19 | x, y = int(dx + cx + 0.5), int(dy + cy + 0.5) # The nearest pixel 20 | if x >= 0 and y >= 0 and x < w and y < h: 21 | img_rotate[ry, rx, :] = img[y, x, :] 22 | return img_rotate 23 | 24 | def rotate_forward(img, degree): 25 | # Prepare the (empty) rotated image 26 | img_rotate = np.zeros(img.shape, dtype=np.uint8) 27 | 28 | # Prepare the forward transformation 29 | c, s = np.cos(np.deg2rad(degree)), np.sin(np.deg2rad(degree)) 30 | R = np.array([[c, -s], [s, c]]) 31 | h, w, *_ = img.shape 32 | cx = (w - 1) / 2 33 | cy = (h - 1) / 2 34 | 35 | # Copy each pixel from the given image (forward mapping) 36 | for y in range(h): 37 | for x in range(w): 38 | dx, dy = R @ [x - cx, y - cy] 39 | rx, ry = int(dx + cx + 0.5), int(dy + cy + 0.5) # The nearest pixel 40 | if rx >= 0 and ry >= 0 and rx < w and ry < h: 41 | img_rotate[ry, rx, :] = img[y, x, :] 42 | return img_rotate 43 | 44 | 45 | 46 | if __name__ == '__main__': 47 | # Read the given image file 48 | img = cv.imread('../data/peppers.tif') 49 | assert img is not None, 'Cannot read the given image' 50 | 51 | # Initialize a control parameter 52 | degree = 0 53 | 54 | while True: 55 | # Rotate the given image 56 | # Note) Please try 'rotate_forward()' and observe missing pixels 57 | img_rotate = rotate(img, degree) # Alternative) cv.rotate() only for 90, 180, and 270 58 | # cv.warpAffine for more general cases 59 | # Show the rotated image 60 | info = f'{degree} [deg]' 61 | cv.putText(img_rotate, info, (10, 25), cv.FONT_HERSHEY_DUPLEX, 0.6, (255, 255, 255), thickness=2) 62 | cv.putText(img_rotate, info, (10, 25), cv.FONT_HERSHEY_DUPLEX, 0.6, (0, 0, 0)) 63 | cv.imshow('Image Rotation', img_rotate) 64 | 65 | # Process the key event 66 | key = cv.waitKey() 67 | if key == 27: # ESC 68 | break 69 | 
elif key == ord('+') or key == ord('='): 70 | degree = min(degree + 10, 180) 71 | elif key == ord('-') or key == ord('_'): 72 | degree = max(degree - 10, -180) 73 | 74 | cv.destroyAllWindows() 75 | -------------------------------------------------------------------------------- /examples/image_stitching.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import cv2 as cv 3 | 4 | # Load two images 5 | img1 = cv.imread('../data/hill01.jpg') 6 | img2 = cv.imread('../data/hill02.jpg') 7 | assert (img1 is not None) and (img2 is not None), 'Cannot read the given images' 8 | 9 | # Retrieve matching points 10 | brisk = cv.BRISK_create() 11 | keypoints1, descriptors1 = brisk.detectAndCompute(img1, None) 12 | keypoints2, descriptors2 = brisk.detectAndCompute(img2, None) 13 | 14 | fmatcher = cv.DescriptorMatcher_create('BruteForce-Hamming') 15 | match = fmatcher.match(descriptors1, descriptors2) 16 | 17 | # Calculate planar homography and merge them 18 | pts1, pts2 = [], [] 19 | for i in range(len(match)): 20 | pts1.append(keypoints1[match[i].queryIdx].pt) 21 | pts2.append(keypoints2[match[i].trainIdx].pt) 22 | pts1 = np.array(pts1, dtype=np.float32) 23 | pts2 = np.array(pts2, dtype=np.float32) 24 | 25 | H, inlier_mask = cv.findHomography(pts2, pts1, cv.RANSAC) 26 | img_merged = cv.warpPerspective(img2, H, (img1.shape[1]*2, img1.shape[0])) 27 | img_merged[:,:img1.shape[1]] = img1 # Copy 28 | 29 | # Show the merged image 30 | img_matched = cv.drawMatches(img1, keypoints1, img2, keypoints2, match, None, None, None, 31 | matchesMask=inlier_mask.ravel().tolist()) # Remove 'matchesMask' if you want to show all putative matches 32 | merge = np.vstack((np.hstack((img1, img2)), img_matched, img_merged)) 33 | cv.imshow('Image Stitching', merge) 34 | cv.waitKey(0) 35 | cv.destroyAllWindows() 36 | -------------------------------------------------------------------------------- /examples/image_viewer+zoom.py: 
-------------------------------------------------------------------------------- 1 | import cv2 as cv 2 | 3 | def mouse_event_handler(event, x, y, flags, param): 4 | # Catch the mouse position when it moves 5 | if event == cv.EVENT_MOUSEMOVE: 6 | param[0] = x # Note) Please do not use 'param = [x, y]' 7 | param[1] = y 8 | 9 | def image_viewer(img_file, zoom_level=10, zoom_box_radius=5, zoom_box_margin=10): 10 | # Read the given image file 11 | img = cv.imread(img_file) 12 | if img is None: 13 | return False 14 | img_height, img_width, *_ = img.shape 15 | 16 | # Instantiate a window and register the mouse callback function 17 | cv.namedWindow('Image Viewer') 18 | mouse_xy = [-1, -1] 19 | cv.setMouseCallback('Image Viewer', mouse_event_handler, mouse_xy) 20 | 21 | while True: 22 | # Paste 'zoom_box' on 'img_copy' 23 | img_copy = img.copy() 24 | if mouse_xy[0] >= zoom_box_radius and mouse_xy[0] < (img_width - zoom_box_radius) and \ 25 | mouse_xy[1] >= zoom_box_radius and mouse_xy[1] < (img_height - zoom_box_radius): 26 | # Crop the target region 27 | img_crop = img[mouse_xy[1]-zoom_box_radius:mouse_xy[1]+zoom_box_radius, \ 28 | mouse_xy[0]-zoom_box_radius:mouse_xy[0]+zoom_box_radius, :] 29 | 30 | # Get the zoomed (resized) image 31 | zoom_box = cv.resize(img_crop, None, None, zoom_level, zoom_level) 32 | 33 | # Paste the zoomed image on 'img_copy' 34 | s = zoom_box_margin 35 | e = zoom_box_margin + len(zoom_box) 36 | img_copy[s:e,s:e,:] = zoom_box 37 | 38 | # Show the image with the zoom 39 | cv.imshow('Image Viewer', img_copy) 40 | key = cv.waitKey(10) 41 | if key == 27: # ESC 42 | break 43 | 44 | cv.destroyAllWindows() 45 | return True 46 | 47 | 48 | 49 | if __name__ == '__main__': 50 | img_file = '../data/peppers.tif' 51 | 52 | if not image_viewer(img_file): 53 | print(f'Cannot open the given file, {img_file}') 54 | -------------------------------------------------------------------------------- /examples/image_viewer.py: 
-------------------------------------------------------------------------------- 1 | import cv2 as cv 2 | 3 | img_file = '../data/peppers.tif' 4 | 5 | # Read the given image file 6 | img = cv.imread(img_file) 7 | assert img is not None, 'Cannot read the given image, ' + img_file 8 | 9 | # Show the image 10 | cv.imshow('Image Viewer', img) 11 | cv.waitKey() 12 | cv.destroyAllWindows() 13 | -------------------------------------------------------------------------------- /examples/intensity_transformation.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import cv2 as cv 3 | 4 | # Read the given image as gray scale 5 | img = cv.imread('../data/baboon.tif', cv.IMREAD_GRAYSCALE) 6 | assert img is not None, 'Cannot read the given image' 7 | 8 | # Initialize control parameters 9 | contrast = 1.6 10 | contrast_step = 0.1 11 | brightness = -40 12 | brightness_step = 1 13 | 14 | while True: 15 | # Apply contrast and brightness 16 | img_tran = contrast * img + brightness # Alternative) cv.equalizeHist(), cv.intensity_transform 17 | img_tran[img_tran < 0] = 0 18 | img_tran[img_tran > 255] = 255 19 | img_tran = img_tran.astype(np.uint8) # Alternative) cv.convertScaleAbs() for the above 3 lines 20 | 21 | # Show all images 22 | info = f'Contrast: {contrast:.1f}, Brightness: {brightness:.0f}' 23 | cv.putText(img_tran, info, (10, 25), cv.FONT_HERSHEY_DUPLEX, 0.6, 255, thickness=2) 24 | cv.putText(img_tran, info, (10, 25), cv.FONT_HERSHEY_DUPLEX, 0.6, 0) 25 | merge = np.hstack((img, img_tran)) 26 | cv.imshow('Intensity Transformation: Original | Contrast/Brightness', merge) 27 | 28 | # Process the key event 29 | key = cv.waitKey() 30 | if key == 27: # ESC 31 | break 32 | elif key == ord('+') or key == ord('='): 33 | contrast += contrast_step 34 | elif key == ord('-') or key == ord('_'): 35 | contrast -= contrast_step 36 | elif key == ord(']') or key == ord('}'): 37 | brightness += brightness_step 38 | elif key == ord('[')
or key == ord('{'): 39 | brightness -= brightness_step 40 | 41 | cv.destroyAllWindows() 42 | -------------------------------------------------------------------------------- /examples/median_filter.py: -------------------------------------------------------------------------------- 1 | import cv2 as cv 2 | import numpy as np 3 | 4 | img_list = [ 5 | '../data/lena.tif', 6 | '../data/baboon.tif', 7 | '../data/peppers.tif', 8 | '../data/black_circle.png', 9 | '../data/salt_and_pepper.png', 10 | '../data/sudoku.png', 11 | ] 12 | 13 | # Initialize control parameters 14 | kernel_size = 5 15 | img_select = 4 16 | 17 | while True: 18 | # Read the given image 19 | img = cv.imread(img_list[img_select]) 20 | assert img is not None, 'Cannot read the given image, ' + img_list[img_select] 21 | 22 | # Apply the median filter 23 | result = cv.medianBlur(img, kernel_size) 24 | 25 | # Show all images 26 | info = f'KernelSize: {kernel_size}' 27 | cv.putText(result, info, (10, 25), cv.FONT_HERSHEY_DUPLEX, 0.6, 255, thickness=2) 28 | cv.putText(result, info, (10, 25), cv.FONT_HERSHEY_DUPLEX, 0.6, 0) 29 | merge = np.hstack((img, result)) 30 | cv.imshow('Median Filter: Original | Result', merge) 31 | 32 | # Process the key event 33 | key = cv.waitKey() 34 | if key == 27: # ESC 35 | break 36 | elif key == ord('+') or key == ord('='): 37 | kernel_size = kernel_size + 2 38 | elif key == ord('-') or key == ord('_'): 39 | kernel_size = max(kernel_size - 2, 3) 40 | elif key == ord('\t'): 41 | img_select = (img_select + 1) % len(img_list) 42 | 43 | cv.destroyAllWindows() 44 | -------------------------------------------------------------------------------- /examples/morpology.py: -------------------------------------------------------------------------------- 1 | import cv2 as cv 2 | import numpy as np 3 | 4 | # Define morphological operations and kernels 5 | morph_operations = [ 6 | {'name': 'Erode', 'operation': cv.MORPH_ERODE}, # Alternative) cv.erode() 7 | {'name': 'Dilate', 'operation':
cv.MORPH_DILATE}, # Alternative) cv.dilate() 8 | {'name': 'Open', 'operation': cv.MORPH_OPEN}, 9 | {'name': 'Close', 'operation': cv.MORPH_CLOSE}, 10 | {'name': 'Gradient', 'operation': cv.MORPH_GRADIENT}, 11 | {'name': 'Tophat', 'operation': cv.MORPH_TOPHAT}, 12 | {'name': 'Blackhat', 'operation': cv.MORPH_BLACKHAT}, 13 | {'name': 'Hitmiss', 'operation': cv.MORPH_HITMISS}, 14 | ] 15 | 16 | kernel_tables = [ 17 | {'name': '3x3 Box', 'kernel': np.ones((3, 3), dtype=np.uint8)}, 18 | {'name': '5x5 Box', 'kernel': np.ones((5, 5), dtype=np.uint8)}, 19 | {'name': '5x1 Bar', 'kernel': np.ones((5, 1), dtype=np.uint8)}, 20 | {'name': '1x5 Bar', 'kernel': np.ones((1, 5), dtype=np.uint8)}, 21 | {'name': '5x5 Cross', 'kernel': np.array([[0,0,1,0,0], [0,0,1,0,0], [1,1,1,1,1], [0,0,1,0,0], [0,0,1,0,0]], dtype=np.uint8)}, 22 | ] 23 | 24 | # Read the given image as gray scale 25 | img = cv.imread('../data/face.png', cv.IMREAD_GRAYSCALE) 26 | assert img is not None, 'Cannot read the given image' 27 | 28 | # Initialize control parameters 29 | morph_select = 0 30 | kernel_select = 0 31 | n_iterations = 1 32 | 33 | while True: 34 | # Apply morphological operation to the image with the given 'kernel' 35 | m_name, operation = morph_operations[morph_select].values() # Make alias 36 | k_name, kernel = kernel_tables[kernel_select].values() # Make alias 37 | result = cv.morphologyEx(img, operation, kernel, iterations=n_iterations) 38 | 39 | # Show the image and its filtered result 40 | info = f'{m_name}({n_iterations}) with {k_name}' 41 | cv.putText(result, info, (10, 25), cv.FONT_HERSHEY_DUPLEX, 0.6, (255, 255, 255), thickness=2) 42 | cv.putText(result, info, (10, 25), cv.FONT_HERSHEY_DUPLEX, 0.6, (0, 0, 0)) 43 | merge = np.hstack((img, result)) 44 | cv.imshow('Morphological Operation: Original | Result', merge) 45 | 46 | # Process the key event 47 | key = cv.waitKey() 48 | if key == 27: # ESC 49 | break 50 | elif key == ord('+') or key == ord('='): 51 | morph_select = (morph_select + 1)
% len(morph_operations) 52 | elif key == ord('-') or key == ord('_'): 53 | morph_select = (morph_select - 1) % len(morph_operations) 54 | elif key == ord(']') or key == ord('}'): 55 | kernel_select = (kernel_select + 1) % len(kernel_tables) 56 | elif key == ord('[') or key == ord('{'): 57 | kernel_select = (kernel_select - 1) % len(kernel_tables) 58 | elif key == ord(')') or key == ord('0'): 59 | n_iterations += 1 60 | elif key == ord('(') or key == ord('9'): 61 | n_iterations = max(n_iterations - 1, 1) 62 | 63 | cv.destroyAllWindows() 64 | -------------------------------------------------------------------------------- /examples/negative_image_and_flip.py: -------------------------------------------------------------------------------- 1 | import cv2 as cv 2 | import numpy as np 3 | 4 | # Read the given image 5 | img = cv.imread('../data/peppers.tif') 6 | assert img is not None, 'Cannot read the given image' 7 | 8 | # Get its negative image 9 | img_nega = 255 - img # Alternative) cv.bitwise_xor() 10 | 11 | # Get its vertically flipped image 12 | img_flip = img[::-1,:,:] # Alternative) cv.flip() 13 | 14 | # Show all images 15 | merge = np.hstack((img, img_nega, img_flip)) # Alternative) cv.hconcat() 16 | cv.imshow('Image Editing: Original | Negative | Flip', merge) 17 | cv.waitKey() 18 | cv.destroyAllWindows() 19 | -------------------------------------------------------------------------------- /examples/shape_drawing.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import cv2 as cv 3 | 4 | # Prepare a canvas 5 | canvas = np.full((480, 640, 3), 255, dtype=np.uint8) 6 | 7 | # Draw lines with its label 8 | cv.line(canvas, ( 10, 10), (630, 470), color=(200, 200, 200), thickness=2) 9 | cv.line(canvas, (630, 10), ( 10, 470), color=(200, 200, 200), thickness=2) 10 | cv.line(canvas, (320, 10), (320, 470), color=(200, 200, 200), thickness=2) 11 | cv.line(canvas, ( 10, 240), (630, 240), color=(200, 200, 200), 
thickness=2) 12 | cv.putText(canvas, 'Line', (10, 20), cv.FONT_HERSHEY_DUPLEX, 0.5, (0, 0, 0)) 13 | 14 | # Draw a circle with its label 15 | center = (100, 240) # Note) How about real numbers? 16 | cv.circle(canvas, center, radius=60, color=(0, 0, 255), thickness=5) 17 | cv.putText(canvas, 'Circle', center, cv.FONT_HERSHEY_DUPLEX, 0.5, (255, 255, 0)) 18 | 19 | # Draw a rectangle with its label 20 | pt1, pt2 = (320-60, 240-50), (320+60, 240+50) 21 | cv.rectangle(canvas, pt1, pt2, color=(0, 255, 0), thickness=-1) 22 | cv.putText(canvas, 'Rectangle', pt1, cv.FONT_HERSHEY_DUPLEX, 0.5, (255, 0, 255)) 23 | 24 | # Draw a polygon (triangle) with its label 25 | pts = np.array([(540, 240-50), (540-55, 240+50), (540+55, 240+50)]) # Note) Why np.array? 26 | cv.polylines(canvas, [pts], True, color=(255, 0, 0), thickness=5) 27 | cv.putText(canvas, 'Polylines', pts[0].flatten(), cv.FONT_HERSHEY_DUPLEX, 0.5, (0, 200, 200)) 28 | 29 | # Show the canvas 30 | cv.imshow('Shape Drawing', canvas) 31 | cv.waitKey() 32 | cv.destroyAllWindows() 33 | -------------------------------------------------------------------------------- /examples/thresholding.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import cv2 as cv 3 | from Sobel_edge import drawText 4 | 5 | # Read the given image as gray scale 6 | img = cv.imread('../data/sudoku.png', cv.IMREAD_GRAYSCALE) 7 | assert img is not None, 'Cannot read the given image' 8 | img_threshold_type = cv.THRESH_BINARY_INV # Type: Detect pixels close to 'black' (inverse) 9 | 10 | # Initialize control parameters 11 | threshold = 127 12 | adaptive_type = cv.ADAPTIVE_THRESH_MEAN_C 13 | adaptive_blocksize = 99 14 | adaptive_C = 4 15 | 16 | while True: 17 | # Apply thresholding to the image 18 | _, binary_user = cv.threshold(img, threshold, 255, img_threshold_type) 19 | threshold_otsu, binary_otsu = cv.threshold(img, threshold, 255, img_threshold_type | cv.THRESH_OTSU) 20 | binary_adaptive = 
cv.adaptiveThreshold(img, 255, adaptive_type, img_threshold_type, adaptive_blocksize, adaptive_C) 21 | 22 | # Show the image and its thresholded result 23 | drawText(binary_user, f'Threshold: {threshold}') 24 | drawText(binary_otsu, f'Otsu Threshold: {threshold_otsu}') 25 | adaptive_type_text = 'M' if adaptive_type == cv.ADAPTIVE_THRESH_MEAN_C else 'G' 26 | drawText(binary_adaptive, f'Type: {adaptive_type_text}, BlockSize: {adaptive_blocksize}, C: {adaptive_C}') 27 | merge = np.vstack((np.hstack((img, binary_user)), 28 | np.hstack((binary_otsu, binary_adaptive)))) 29 | cv.imshow('Thresholding: Original | User | Otsu | Adaptive', merge) 30 | 31 | # Process the key event 32 | key = cv.waitKey() 33 | if key == 27: # ESC 34 | break 35 | elif key == ord('+') or key == ord('='): 36 | threshold += 1 37 | elif key == ord('-') or key == ord('_'): 38 | threshold -= 1 39 | elif key == ord('\t'): 40 | if adaptive_type == cv.ADAPTIVE_THRESH_MEAN_C: 41 | adaptive_type = cv.ADAPTIVE_THRESH_GAUSSIAN_C 42 | else: 43 | adaptive_type = cv.ADAPTIVE_THRESH_MEAN_C 44 | elif key == ord(']') or key == ord('}'): 45 | adaptive_blocksize += 2 46 | elif key == ord('[') or key == ord('{'): 47 | adaptive_blocksize = max(adaptive_blocksize - 2, 3) 48 | elif key == ord('>') or key == ord('.'): 49 | adaptive_C += 1 50 | elif key == ord('<') or key == ord(','): 51 | adaptive_C -= 1 52 | 53 | cv.destroyAllWindows() 54 | -------------------------------------------------------------------------------- /examples/video_converter.py: -------------------------------------------------------------------------------- 1 | import cv2 as cv 2 | 3 | video_file = '../data/PETS09-S2L1-raw.webm' 4 | target_format = 'avi' 5 | target_fourcc = 'XVID' # Note) FourCC: https://learn.microsoft.com/en-us/windows/win32/medfound/video-fourccs 6 | 7 | # Read the given video file 8 | video = cv.VideoCapture(video_file) 9 | assert video.isOpened(), 'Cannot read the given video, ' + video_file 10 | 11 | target = cv.VideoWriter() 
12 | while True: 13 | # Get an image from 'video' 14 | valid, img = video.read() 15 | if not valid: 16 | break 17 | 18 | if not target.isOpened(): 19 | # Open the target video file 20 | target_file = video_file[:video_file.rfind('.')] + '.' + target_format 21 | fps = video.get(cv.CAP_PROP_FPS) 22 | h, w, *_ = img.shape 23 | is_color = (img.ndim > 2) and (img.shape[2] > 1) 24 | target.open(target_file, cv.VideoWriter_fourcc(*target_fourcc), fps, (w, h), is_color) 25 | assert target.isOpened(), 'Cannot open the given video, ' + target_file + '.' 26 | 27 | # Add the image to 'target' 28 | target.write(img) 29 | 30 | target.release() 31 | -------------------------------------------------------------------------------- /examples/video_player+navigation.py: -------------------------------------------------------------------------------- 1 | import cv2 as cv 2 | 3 | video_file = '../data/PETS09-S2L1-raw.webm' 4 | 5 | # Read the given video file 6 | video = cv.VideoCapture(video_file) 7 | assert video.isOpened(), 'Cannot read the given video, ' + video_file 8 | 9 | # Get FPS and calculate the waiting time in millisecond 10 | fps = video.get(cv.CAP_PROP_FPS) 11 | wait_msec = int(1 / fps * 1000) 12 | 13 | # Configure the frame navigation 14 | frame_total = int(video.get(cv.CAP_PROP_FRAME_COUNT)) 15 | frame_shift = 10 16 | speed_table = [1/10, 1/8, 1/4, 1/2, 1, 2, 3, 4, 5, 8, 10] 17 | speed_index = 4 18 | 19 | while True: 20 | # Get an image from 'video' 21 | valid, img = video.read() 22 | if not valid: 23 | break 24 | 25 | # Show the image 26 | frame = int(video.get(cv.CAP_PROP_POS_FRAMES)) 27 | info = f'Frame: {frame}/{frame_total}, Speed: x{speed_table[speed_index]:.2g}' 28 | cv.putText(img, info, (10, 25), cv.FONT_HERSHEY_DUPLEX, 0.6, (0, 255, 0)) 29 | cv.imshow('Video Player', img) 30 | 31 | # Process the key event 32 | key = cv.waitKey(max(int(wait_msec / speed_table[speed_index]), 1)) 33 | if key == ord(' '): 34 | key = cv.waitKey() 35 | if key == 27: # ESC 36 | break 
37 | elif key == ord('\t'): 38 | speed_index = 4 39 | elif key == ord('>') or key == ord('.'): 40 | speed_index = min(speed_index + 1, len(speed_table) - 1) 41 | elif key == ord('<') or key == ord(','): 42 | speed_index = max(speed_index - 1, 0) 43 | elif key == ord(']') or key == ord('}'): 44 | video.set(cv.CAP_PROP_POS_FRAMES, frame + frame_shift) 45 | elif key == ord('[') or key == ord('{'): 46 | video.set(cv.CAP_PROP_POS_FRAMES, max(frame - frame_shift, 0)) 47 | 48 | cv.destroyAllWindows() 49 | -------------------------------------------------------------------------------- /examples/video_player.py: -------------------------------------------------------------------------------- 1 | import cv2 as cv 2 | 3 | video_file = '../data/PETS09-S2L1-raw.webm' 4 | 5 | # Read the given video file 6 | # Note) Additional argument examples 7 | # - Image sequence: video_file = '../data/PETS09-S2L1-raw_%04d.png' 8 | # - Camera : video_file = 0 (Note: The camera index) 9 | video = cv.VideoCapture(video_file) 10 | assert video.isOpened(), 'Cannot read the given video, ' + video_file 11 | 12 | # Get FPS and calculate the waiting time in millisecond 13 | fps = video.get(cv.CAP_PROP_FPS) 14 | wait_msec = int(1 / fps * 1000) 15 | 16 | while True: 17 | # Read an image from 'video' 18 | valid, img = video.read() 19 | if not valid: 20 | break 21 | 22 | # Show the image 23 | cv.imshow('Video Player', img) 24 | 25 | # Terminate if the given key is ESC 26 | key = cv.waitKey(wait_msec) 27 | if key == 27: # ESC 28 | break 29 | 30 | cv.destroyAllWindows() 31 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | numpy 2 | matplotlib 3 | scipy 4 | opencv-python 5 | opencv-contrib-python 6 | -------------------------------------------------------------------------------- /slides/01_introduction.pdf: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/mint-lab/cv_tutorial/e291c17eaa7b81a319244db68a49e68c4c763ca4/slides/01_introduction.pdf -------------------------------------------------------------------------------- /slides/02_image_editing.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mint-lab/cv_tutorial/e291c17eaa7b81a319244db68a49e68c4c763ca4/slides/02_image_editing.pdf -------------------------------------------------------------------------------- /slides/03_image_processing.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mint-lab/cv_tutorial/e291c17eaa7b81a319244db68a49e68c4c763ca4/slides/03_image_processing.pdf -------------------------------------------------------------------------------- /slides/04_color.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mint-lab/cv_tutorial/e291c17eaa7b81a319244db68a49e68c4c763ca4/slides/04_color.pdf -------------------------------------------------------------------------------- /slides/05_image_formation.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mint-lab/cv_tutorial/e291c17eaa7b81a319244db68a49e68c4c763ca4/slides/05_image_formation.pdf -------------------------------------------------------------------------------- /slides/06_image_geometry.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mint-lab/cv_tutorial/e291c17eaa7b81a319244db68a49e68c4c763ca4/slides/06_image_geometry.pdf -------------------------------------------------------------------------------- /slides/07_solving_problems.pdf: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/mint-lab/cv_tutorial/e291c17eaa7b81a319244db68a49e68c4c763ca4/slides/07_solving_problems.pdf -------------------------------------------------------------------------------- /slides/08_image_correspondence.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mint-lab/cv_tutorial/e291c17eaa7b81a319244db68a49e68c4c763ca4/slides/08_image_correspondence.pdf -------------------------------------------------------------------------------- /slides/14_3d_vision.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mint-lab/cv_tutorial/e291c17eaa7b81a319244db68a49e68c4c763ca4/slides/14_3d_vision.pdf --------------------------------------------------------------------------------