├── src
│   ├── __init__.py
│   ├── matching_cost
│   │   ├── __init__.py
│   │   ├── matching_cost.py
│   │   ├── sum_of_squared_differences.py
│   │   ├── sum_of_absolute_differences.py
│   │   └── normalised_cross_correlation.py
│   ├── matching_algorithm
│   │   ├── __init__.py
│   │   ├── winner_takes_it_all.py
│   │   ├── matching_algorithm.py
│   │   └── semi_global_matching.py
│   ├── stereo_matching.py
│   ├── utilities.py
│   └── main.py
├── test
│   ├── __init__.py
│   └── test_utilities.py
├── .gitattributes
├── doc
│   ├── Theory.pdf
│   ├── Docker.md
│   └── Theory.tex
├── data
│   ├── cones_gt.png
│   ├── bowling_left.png
│   ├── cones_left.png
│   ├── cones_mask.png
│   ├── cones_right.png
│   ├── Adirondack_gt.png
│   ├── bowling_right.png
│   ├── AdirondackE_right.png
│   ├── Adirondack_left.png
│   ├── Adirondack_mask.png
│   └── Adirondack_right.png
├── output
│   ├── bowling_NCC_SGM_D30_R3.jpg
│   ├── bowling_NCC_WTA_D30_R3.jpg
│   ├── bowling_SAD_SGM_D30_R3.jpg
│   ├── bowling_SAD_WTA_D30_R3.jpg
│   ├── bowling_SSD_SGM_D30_R3.jpg
│   ├── bowling_SSD_WTA_D30_R3.jpg
│   ├── cones_NCC_SGM_D60_R3_accX0,95.jpg
│   ├── cones_NCC_WTA_D60_R3_accX0,91.jpg
│   ├── cones_SAD_SGM_D60_R3_accX0,91.jpg
│   ├── cones_SAD_WTA_D60_R3_accX0,86.jpg
│   ├── cones_SSD_SGM_D60_R3_accX0,95.jpg
│   ├── cones_SSD_WTA_D60_R3_accX0,88.jpg
│   ├── Adirondack_NCC_SGM_D70_R3_accX0,92.jpg
│   ├── Adirondack_NCC_WTA_D70_R3_accX0,82.jpg
│   ├── Adirondack_SAD_SGM_D70_R3_accX0,48.jpg
│   ├── Adirondack_SAD_WTA_D70_R3_accX0,44.jpg
│   ├── Adirondack_SSD_SGM_D70_R3_accX0,75.jpg
│   └── Adirondack_SSD_WTA_D70_R3_accX0,49.jpg
├── .gitignore
├── docker
│   ├── docker-compose-nvidia.yml
│   ├── docker-compose-gui-nvidia.yml
│   ├── docker-compose-gui.yml
│   ├── .dockerignore
│   ├── docker-compose.yml
│   └── Dockerfile
├── .github
│   └── workflows
│       ├── run-tests.yml
│       └── update-dockerhub.yml
├── .devcontainer
│   └── devcontainer.json
├── .vscode
│   └── tasks.json
├── License.md
└── ReadMe.md
/src/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /test/__init__.py:
-------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /src/matching_cost/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /src/matching_algorithm/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /.gitattributes: -------------------------------------------------------------------------------- 1 | src/main.ipynb linguist-documentation=true 2 | -------------------------------------------------------------------------------- /doc/Theory.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/doc/Theory.pdf -------------------------------------------------------------------------------- /data/cones_gt.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/data/cones_gt.png -------------------------------------------------------------------------------- /data/bowling_left.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/data/bowling_left.png -------------------------------------------------------------------------------- /data/cones_left.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/data/cones_left.png -------------------------------------------------------------------------------- /data/cones_mask.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/data/cones_mask.png -------------------------------------------------------------------------------- /data/cones_right.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/data/cones_right.png -------------------------------------------------------------------------------- /data/Adirondack_gt.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/data/Adirondack_gt.png -------------------------------------------------------------------------------- /data/bowling_right.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/data/bowling_right.png -------------------------------------------------------------------------------- /data/AdirondackE_right.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/data/AdirondackE_right.png -------------------------------------------------------------------------------- /data/Adirondack_left.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/data/Adirondack_left.png -------------------------------------------------------------------------------- /data/Adirondack_mask.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/data/Adirondack_mask.png -------------------------------------------------------------------------------- /data/Adirondack_right.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/data/Adirondack_right.png 
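The `*_gt.png` and `*_mask.png` files above are Middlebury-style ground-truth disparities and evaluation masks, and the output file names below encode an accuracy score (e.g. `accX0,95`). A masked accuracy of this kind can be sketched as follows — the function name and the threshold are illustrative assumptions, since the repository's actual metric presumably lives in `src/utilities.py`, which is not included in this dump:

```python
import numpy as np

def masked_accuracy(disparity: np.ndarray, ground_truth: np.ndarray,
                    mask: np.ndarray, threshold: float = 3.0) -> float:
    # Fraction of pixels whose disparity error is within `threshold`,
    # evaluated only where the mask marks valid (non-occluded) pixels
    valid = mask > 0
    error = np.abs(disparity[valid] - ground_truth[valid])
    return float(np.mean(error <= threshold))
```

With `disparity` taken from `StereoMatching.result()` and the corresponding `*_gt.png`/`*_mask.png` pair loaded as arrays, this yields a value in [0, 1] in the style of the scores embedded in the output file names.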
-------------------------------------------------------------------------------- /output/bowling_NCC_SGM_D30_R3.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/bowling_NCC_SGM_D30_R3.jpg -------------------------------------------------------------------------------- /output/bowling_NCC_WTA_D30_R3.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/bowling_NCC_WTA_D30_R3.jpg -------------------------------------------------------------------------------- /output/bowling_SAD_SGM_D30_R3.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/bowling_SAD_SGM_D30_R3.jpg -------------------------------------------------------------------------------- /output/bowling_SAD_WTA_D30_R3.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/bowling_SAD_WTA_D30_R3.jpg -------------------------------------------------------------------------------- /output/bowling_SSD_SGM_D30_R3.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/bowling_SSD_SGM_D30_R3.jpg -------------------------------------------------------------------------------- /output/bowling_SSD_WTA_D30_R3.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/bowling_SSD_WTA_D30_R3.jpg -------------------------------------------------------------------------------- /output/cones_NCC_SGM_D60_R3_accX0,95.jpg: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/cones_NCC_SGM_D60_R3_accX0,95.jpg -------------------------------------------------------------------------------- /output/cones_NCC_WTA_D60_R3_accX0,91.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/cones_NCC_WTA_D60_R3_accX0,91.jpg -------------------------------------------------------------------------------- /output/cones_SAD_SGM_D60_R3_accX0,91.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/cones_SAD_SGM_D60_R3_accX0,91.jpg -------------------------------------------------------------------------------- /output/cones_SAD_WTA_D60_R3_accX0,86.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/cones_SAD_WTA_D60_R3_accX0,86.jpg -------------------------------------------------------------------------------- /output/cones_SSD_SGM_D60_R3_accX0,95.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/cones_SSD_SGM_D60_R3_accX0,95.jpg -------------------------------------------------------------------------------- /output/cones_SSD_WTA_D60_R3_accX0,88.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/cones_SSD_WTA_D60_R3_accX0,88.jpg -------------------------------------------------------------------------------- /output/Adirondack_NCC_SGM_D70_R3_accX0,92.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/Adirondack_NCC_SGM_D70_R3_accX0,92.jpg 
-------------------------------------------------------------------------------- /output/Adirondack_NCC_WTA_D70_R3_accX0,82.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/Adirondack_NCC_WTA_D70_R3_accX0,82.jpg -------------------------------------------------------------------------------- /output/Adirondack_SAD_SGM_D70_R3_accX0,48.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/Adirondack_SAD_SGM_D70_R3_accX0,48.jpg -------------------------------------------------------------------------------- /output/Adirondack_SAD_WTA_D70_R3_accX0,44.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/Adirondack_SAD_WTA_D70_R3_accX0,44.jpg -------------------------------------------------------------------------------- /output/Adirondack_SSD_SGM_D70_R3_accX0,75.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/Adirondack_SSD_SGM_D70_R3_accX0,75.jpg -------------------------------------------------------------------------------- /output/Adirondack_SSD_WTA_D70_R3_accX0,49.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/Adirondack_SSD_WTA_D70_R3_accX0,49.jpg -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | *.nbc 2 | *.nbi 3 | __pycache__/ 4 | *.py[cod] 5 | doc/**/*.aux 6 | doc/**/*.bbl 7 | doc/**/*.blg 8 | doc/**/*.log 9 | doc/**/*.out 10 | doc/**/*.run.xml 11 | doc/**/*.synctex.gz 12 | doc/**/*.bib 13 | src/.ipynb_checkpoints/** 14 | 15 | 
-------------------------------------------------------------------------------- /docker/docker-compose-nvidia.yml: -------------------------------------------------------------------------------- 1 | version: "3.9" 2 | services: 3 | stereo_matching_docker: 4 | extends: 5 | file: docker-compose.yml 6 | service: stereo_matching_docker 7 | environment: 8 | - NVIDIA_VISIBLE_DEVICES=all 9 | runtime: nvidia 10 | -------------------------------------------------------------------------------- /docker/docker-compose-gui-nvidia.yml: -------------------------------------------------------------------------------- 1 | version: "3.9" 2 | services: 3 | stereo_matching_docker: 4 | extends: 5 | file: docker-compose-gui.yml 6 | service: stereo_matching_docker 7 | environment: 8 | - NVIDIA_VISIBLE_DEVICES=all 9 | - NVIDIA_DRIVER_CAPABILITIES=all 10 | runtime: nvidia 11 | -------------------------------------------------------------------------------- /docker/docker-compose-gui.yml: -------------------------------------------------------------------------------- 1 | version: "3.9" 2 | services: 3 | stereo_matching_docker: 4 | extends: 5 | file: docker-compose.yml 6 | service: stereo_matching_docker 7 | environment: 8 | - DISPLAY=${DISPLAY} 9 | - QT_X11_NO_MITSHM=1 10 | volumes: 11 | - /tmp/.X11-unix:/tmp/.X11-unix:rw 12 | - /tmp/.docker.xauth:/tmp/.docker.xauth:rw 13 | -------------------------------------------------------------------------------- /.github/workflows/run-tests.yml: -------------------------------------------------------------------------------- 1 | name: Tests 2 | 3 | on: 4 | push 5 | 6 | jobs: 7 | unit-tests: 8 | runs-on: ubuntu-latest 9 | container: 10 | image: tobitflatscher/stereo-matching 11 | volumes: 12 | - ${{ github.workspace }}:/stereo_matching 13 | steps: 14 | - name: Checkout code 15 | uses: actions/checkout@v2 16 | - name: Run unittests in workspace 17 | run: python3 -m unittest discover 18 | 19 | 
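The workflow above simply runs `python3 -m unittest discover` inside the published image; the repository's actual tests live in `test/test_utilities.py`, which is not part of this dump. As an illustration of a test that discovery would pick up, here is a self-contained sanity check of the SSD cost idea on a synthetically shifted image pair — the class and helper names are invented for this sketch:

```python
import unittest

import numpy as np


class TestSsdCostVolume(unittest.TestCase):
    """For a purely horizontal shift the SSD cost volume should be
    minimal at the true disparity (single-pixel cost, radius 0)."""

    @staticmethod
    def ssd_cost_volume(left: np.ndarray, right: np.ndarray, max_disparity: int) -> np.ndarray:
        H, W = left.shape
        cost_volume = np.full((H, W, max_disparity), np.inf)
        for d in range(max_disparity):
            # Compare left[:, x] against right[:, x - d] where defined
            cost_volume[:, d:, d] = (left[:, d:] - right[:, :W - d]) ** 2
        return cost_volume

    def test_recovers_known_shift(self):
        rng = np.random.default_rng(0)
        right = rng.random((5, 30))
        true_disparity = 3
        left = np.roll(right, true_disparity, axis=1)  # left[:, x] == right[:, x - 3]
        cost_volume = self.ssd_cost_volume(left, right, max_disparity=8)
        disparity = np.argmin(cost_volume, axis=2)
        # Skip the wrap-around columns introduced by np.roll and the
        # columns where not all disparities are defined
        self.assertTrue(np.all(disparity[:, 8:] == true_disparity))
```

Dropped into `test/` as e.g. `test_ssd.py`, a case like this would be executed by the CI job above on every push.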
-------------------------------------------------------------------------------- /docker/.dockerignore: -------------------------------------------------------------------------------- 1 | **/.classpath 2 | **/.dockerignore 3 | **/.env 4 | **/.git 5 | **/.gitignore 6 | **/.project 7 | **/.settings 8 | **/.toolstarget 9 | **/.vs 10 | **/.vscode 11 | **/*.*proj.user 12 | **/*.dbmdl 13 | **/*.jfm 14 | **/bin 15 | **/charts 16 | **/docker-compose* 17 | **/compose* 18 | **/Dockerfile* 19 | **/node_modules 20 | **/npm-debug.log 21 | **/obj 22 | **/secrets.dev.yaml 23 | **/values.dev.yaml 24 | **/ReadMe.md 25 | -------------------------------------------------------------------------------- /.devcontainer/devcontainer.json: -------------------------------------------------------------------------------- 1 | { 2 | "name": "Stereo Matching Docker Compose", 3 | "dockerComposeFile": [ 4 | "../docker/docker-compose-gui.yml" // Alternatives: "../docker/docker-compose.yml", "../docker/docker-compose-gui-nvidia.yml", "../docker/docker-compose-nvidia.yml" 5 | ], 6 | "service": "stereo_matching_docker", 7 | "workspaceFolder": "/code/stereo_matching", 8 | "shutdownAction": "stopCompose", 9 | "extensions": [ 10 | ] 11 | } 12 | -------------------------------------------------------------------------------- /docker/docker-compose.yml: -------------------------------------------------------------------------------- 1 | version: "3.9" 2 | services: 3 | stereo_matching_docker: 4 | build: 5 | context: .
6 | dockerfile: Dockerfile 7 | #stdin_open: true # Docker run -i 8 | tty: true # Docker run -t 9 | privileged: true 10 | network_mode: "host" 11 | volumes: # Mount relevant folders into container 12 | - ../.vscode:/code/stereo_matching/.vscode # Necessary for using VS Code Tasks inside container 13 | - ../data:/code/stereo_matching/data 14 | - ../doc:/code/stereo_matching/doc 15 | - ../src:/code/stereo_matching/src 16 | - ../test:/code/stereo_matching/test 17 | 18 | -------------------------------------------------------------------------------- /.vscode/tasks.json: -------------------------------------------------------------------------------- 1 | { 2 | "version": "2.0.0", 3 | "tasks": [ 4 | { 5 | "label": "run", 6 | "detail": "Run Jupyter Notebook.", 7 | "type": "shell", 8 | "command": "jupyter notebook --ip=127.0.0.1 --port=8888 --allow-root", 9 | "group": { 10 | "kind": "build", 11 | "isDefault": true 12 | } 13 | }, 14 | { 15 | "label": "test", 16 | "detail": "Run all unit tests and show results.", 17 | "type": "shell", 18 | "command": "(cd ${workspaceFolder} && python3 -m unittest discover)", 19 | "group": { 20 | "kind": "test", 21 | "isDefault": true 22 | }, 23 | "problemMatcher": [] 24 | } 25 | ] 26 | } 27 | -------------------------------------------------------------------------------- /src/matching_algorithm/winner_takes_it_all.py: -------------------------------------------------------------------------------- 1 | # Tobit Flatscher - github.com/2b-t (2022) 2 | 3 | # @file winner_takes_it_all.py 4 | # @brief Winner-takes-it-all (WTA) stereo matching algorithm 5 | 6 | import abc 7 | import numpy as np 8 | 9 | from .matching_algorithm import MatchingAlgorithm 10 | 11 | 12 | class WinnerTakesItAll(MatchingAlgorithm): 13 | 14 | @staticmethod 15 | def match(cost_volume: np.ndarray) -> np.ndarray: 16 | # Function for selecting the best-matching pixels for the disparity image 17 | # @param[in] cost_volume: The three-dimensional cost volume to be searched for the
best matching pixel (H,W,D) 18 | # @return: The two-dimensional disparity image resulting from the best matching pixel inside the cost volume (H,W) 19 | 20 | return np.argmin(cost_volume, axis=2) -------------------------------------------------------------------------------- /.github/workflows/update-dockerhub.yml: -------------------------------------------------------------------------------- 1 | name: Dockerhub 2 | 3 | on: 4 | push: 5 | paths: 6 | - 'docker/Dockerfile' 7 | 8 | jobs: 9 | docker: 10 | runs-on: ubuntu-latest 11 | steps: 12 | - name: Set up QEMU for architectures 13 | uses: docker/setup-qemu-action@v1 14 | - name: Set up Docker Buildx 15 | uses: docker/setup-buildx-action@v1 16 | - name: Login to DockerHub 17 | uses: docker/login-action@v1 18 | with: 19 | username: ${{ secrets.DOCKERHUB_USERNAME }} 20 | password: ${{ secrets.DOCKERHUB_TOKEN }} 21 | - name: Build and push 22 | uses: docker/build-push-action@v2 23 | with: 24 | builder: ${{ steps.buildx.outputs.name }} 25 | file: ./docker/Dockerfile 26 | platforms: linux/amd64,linux/arm64 27 | push: true 28 | tags: tobitflatscher/stereo-matching:latest 29 | 30 | -------------------------------------------------------------------------------- /src/matching_algorithm/matching_algorithm.py: -------------------------------------------------------------------------------- 1 | # Tobit Flatscher - github.com/2b-t (2022) 2 | 3 | # @file matching_algorithm.py 4 | # @brief Base class for stereo matching algorithms 5 | 6 | import abc 7 | import numpy as np 8 | 9 | 10 | class MatchingAlgorithm(abc.ABC): 11 | # Base class for stereo matching algorithms which finds the best matching pixel 12 | 13 | @staticmethod 14 | @abc.abstractmethod 15 | def match(cost_volume: np.ndarray) -> np.ndarray: 16 | # Function for matching the best suiting pixels for the disparity image 17 | # @param[in] cost_volume: The three-dimensional cost volume to be searched for the best matching pixel (H,W,D) 18 | # @return: The two-dimensional 
disparity image resulting from the best matching pixel inside the cost volume (H,W) 19 | if cost_volume.ndim != 3: 20 | raise ValueError("Cost volume (" + str(cost_volume.shape) + ") must be three-dimensional!") 21 | pass -------------------------------------------------------------------------------- /docker/Dockerfile: -------------------------------------------------------------------------------- 1 | FROM ubuntu:20.04 2 | 3 | WORKDIR /code 4 | 5 | ARG DEBIAN_FRONTEND=noninteractive 6 | 7 | # General tools 8 | RUN apt-get update \ 9 | && apt-get install -y \ 10 | build-essential \ 11 | cmake \ 12 | git-all \ 13 | && rm -rf /var/lib/apt/lists/* 14 | 15 | # Python3 and libraries 16 | RUN apt-get update \ 17 | && apt-get install -y \ 18 | python3 \ 19 | python3-scipy \ 20 | python3-skimage \ 21 | python3-numpy \ 22 | python3-numba \ 23 | python3-notebook \ 24 | python3-matplotlib \ 25 | python3-parameterized \ 26 | && rm -rf /var/lib/apt/lists/* 27 | 28 | # For visualisation of Jupyter notebooks 29 | RUN apt-get update \ 30 | && apt-get install -y \ 31 | firefox \ 32 | && rm -rf /var/lib/apt/lists/* 33 | 34 | # For documentation only 35 | #RUN apt-get update \ 36 | # && apt-get install -y \ 37 | # texlive-full \ 38 | # texstudio \ 39 | # && rm -rf /var/lib/apt/lists/* 40 | 41 | ARG DEBIAN_FRONTEND=dialog 42 | 43 | -------------------------------------------------------------------------------- /src/matching_cost/matching_cost.py: -------------------------------------------------------------------------------- 1 | # Tobit Flatscher - github.com/2b-t (2022) 2 | 3 | # @file matching_cost.py 4 | # @brief Base class for stereo matching costs 5 | 6 | import abc 7 | import numpy as np 8 | 9 | 10 | class MatchingCost(abc.ABC): 11 | # Base class for stereo matching costs for calculating a cost volume 12 | 13 | @staticmethod 14 | @abc.abstractmethod 15 | def compute(left_image: np.ndarray, right_image: np.ndarray, max_disparity: int, filter_radius: int) -> np.ndarray: 16 | 
# Function for calculating the cost volume 17 | # @param[in] left_image: The left image to be used for stereo matching (H,W) 18 | # @param[in] right_image: The right image to be used for stereo matching (H,W) 19 | # @param[in] max_disparity: The maximum disparity to consider 20 | # @param[in] filter_radius: The filter radius to be considered for matching 21 | # @return: The cost volume computed according to the pre-defined matching cost (H,W,D) 22 | 23 | pass 24 | -------------------------------------------------------------------------------- /License.md: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2022 Tobit Flatscher 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE.
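The matching-cost implementations that follow (SSD, SAD, NCC) fill the (H,W,D) cost volume with explicit Numba-compiled loops. For comparison, the SSD cost admits a compact vectorized NumPy/SciPy formulation — a sketch only, not the repository's implementation, and with reflective rather than skipped border handling:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def ssd_cost_volume(left_image: np.ndarray, right_image: np.ndarray,
                    max_disparity: int, filter_radius: int) -> np.ndarray:
    # Aggregate per-pixel squared differences over a (2R+1)x(2R+1) window
    H, W = left_image.shape
    window = 2 * filter_radius + 1
    cost_volume = np.zeros((H, W, max_disparity))
    for d in range(max_disparity):
        shifted = np.zeros_like(right_image)
        shifted[:, d:] = right_image[:, :W - d]
        squared_diff = (left_image - shifted) ** 2
        # A box filter scaled by the window area equals the windowed sum
        cost_volume[:, :, d] = uniform_filter(squared_diff, size=window) * window ** 2
    return cost_volume
```

This agrees with the loop version wherever the window and the disparity-shifted window stay inside the image; the Numba loops below instead leave a `filter_radius`-wide border untouched.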
22 | -------------------------------------------------------------------------------- /src/matching_cost/sum_of_squared_differences.py: -------------------------------------------------------------------------------- 1 | # Tobit Flatscher - github.com/2b-t (2022) 2 | 3 | # @file sum_of_squared_differences.py 4 | # @brief Sum of squared differences (SSD) stereo matching cost 5 | 6 | from numba import jit 7 | import numpy as np 8 | 9 | from .matching_cost import MatchingCost 10 | 11 | 12 | class SumOfSquaredDifferences(MatchingCost): 13 | 14 | @staticmethod 15 | @jit(nopython = True, parallel = True, cache = True) 16 | def compute(left_image: np.ndarray, right_image: np.ndarray, max_disparity: int, filter_radius: int) -> np.ndarray: 17 | # Compute a cost volume with maximum disparity D considering a neighbourhood R with Sum of Squared Differences (SSD) 18 | # @param[in] left_image: The left image to be used for stereo matching (H,W) 19 | # @param[in] right_image: The right image to be used for stereo matching (H,W) 20 | # @param[in] max_disparity: The maximum disparity to consider 21 | # @param[in] filter_radius: The filter radius to be considered for matching 22 | # @return: The best matching pixel inside the cost volume according to the pre-defined criterion (H,W,D) 23 | 24 | (H,W) = left_image.shape 25 | cost_volume = np.zeros((H,W,max_disparity)) 26 | 27 | # Loop over internal image 28 | for y in range(filter_radius, H - filter_radius): 29 | for x in range(filter_radius, W - filter_radius): 30 | # Loop over window 31 | for v in range(-filter_radius, filter_radius + 1): 32 | for u in range(-filter_radius, filter_radius + 1): 33 | # Loop over all possible disparities 34 | for d in range(0, max_disparity): 35 | cost_volume[y,x,d] += (left_image[y+v, x+u] - right_image[y+v, x+u-d])**2 36 | 37 | return cost_volume -------------------------------------------------------------------------------- /src/matching_cost/sum_of_absolute_differences.py: 
-------------------------------------------------------------------------------- 1 | # Tobit Flatscher - github.com/2b-t (2022) 2 | 3 | # @file sum_of_absolute_differences.py 4 | # @brief Sum of absolute differences (SAD) stereo matching cost 5 | 6 | from numba import jit 7 | import numpy as np 8 | 9 | from .matching_cost import MatchingCost 10 | 11 | 12 | class SumOfAbsoluteDifferences(MatchingCost): 13 | 14 | @staticmethod 15 | @jit(nopython = True, parallel = True, cache = True) 16 | def compute(left_image: np.ndarray, right_image: np.ndarray, max_disparity: int, filter_radius: int) -> np.ndarray: 17 | # Compute a cost volume with maximum disparity D considering a neighbourhood R with Sum of Absolute Differences (SAD) 18 | # @param[in] left_image: The left image to be used for stereo matching (H,W) 19 | # @param[in] right_image: The right image to be used for stereo matching (H,W) 20 | # @param[in] max_disparity: The maximum disparity to consider 21 | # @param[in] filter_radius: The filter radius to be considered for matching 22 | # @return: The best matching pixel inside the cost volume according to the pre-defined criterion (H,W,D) 23 | 24 | (H,W) = left_image.shape 25 | cost_volume = np.zeros((H,W,max_disparity)) 26 | 27 | # Loop over internal image 28 | for y in range(filter_radius, H - filter_radius): 29 | for x in range(filter_radius, W - filter_radius): 30 | # Loop over window 31 | for v in range(-filter_radius, filter_radius + 1): 32 | for u in range(-filter_radius, filter_radius + 1): 33 | # Loop over all possible disparities 34 | for d in range(0, max_disparity): 35 | cost_volume[y,x,d] += np.absolute(left_image[y+v, x+u] - right_image[y+v, x+u-d]) 36 | 37 | return cost_volume -------------------------------------------------------------------------------- /src/matching_cost/normalised_cross_correlation.py: -------------------------------------------------------------------------------- 1 | # Tobit Flatscher - github.com/2b-t (2022) 2 | 3 | # @file 
normalised_cross_correlation.py 4 | # @brief Normalised cross correlation (NCC) stereo matching cost 5 | 6 | from numba import jit 7 | import numpy as np 8 | 9 | from .matching_cost import MatchingCost 10 | 11 | 12 | class NormalisedCrossCorrelation(MatchingCost): 13 | 14 | @staticmethod 15 | @jit(nopython = True, parallel = True, cache = True) 16 | def compute(left_image: np.ndarray, right_image: np.ndarray, max_disparity: int, filter_radius: int) -> np.ndarray: 17 | # Compute a cost volume with maximum disparity D considering a neighbourhood R with Normalized Cross Correlation (NCC) 18 | # @param[in] left_image: The left image to be used for stereo matching (H,W) 19 | # @param[in] right_image: The right image to be used for stereo matching (H,W) 20 | # @param[in] max_disparity: The maximum disparity to consider 21 | # @param[in] filter_radius: The filter radius to be considered for matching 22 | # @return: The best matching pixel inside the cost volume according to the pre-defined criterion (H,W,D) 23 | 24 | (H,W) = left_image.shape 25 | cost_volume = np.zeros((max_disparity,H,W)) 26 | 27 | # Loop over all possible disparities 28 | for d in range(0, max_disparity): 29 | # Loop over image 30 | for y in range(filter_radius, H - filter_radius): 31 | for x in range(filter_radius, W - filter_radius): 32 | l_mean = 0 33 | r_mean = 0 34 | n = 0 35 | 36 | # Loop over window 37 | for v in range(-filter_radius, filter_radius + 1): 38 | for u in range(-filter_radius, filter_radius + 1): 39 | # Calculate cumulative sum 40 | l_mean += left_image[y+v, x+u] 41 | r_mean += right_image[y+v, x+u-d] 42 | n += 1 43 | 44 | l_mean = l_mean/n 45 | r_mean = r_mean/n 46 | 47 | l_r = 0 48 | l_var = 0 49 | r_var = 0 50 | 51 | for v in range(-filter_radius, filter_radius + 1): 52 | for u in range(-filter_radius, filter_radius + 1): 53 | # Calculate terms 54 | l = left_image[y+v, x+u] - l_mean 55 | r = right_image[y+v, x+u-d] - r_mean 56 | 57 | l_r += l*r 58 | l_var += l**2 59 | r_var += 
r**2 60 | 61 | # Assemble terms 62 | cost_volume[d,y,x] = -l_r/np.sqrt(l_var*r_var) 63 | 64 | return np.transpose(cost_volume, (1, 2, 0)) -------------------------------------------------------------------------------- /src/stereo_matching.py: -------------------------------------------------------------------------------- 1 | # Tobit Flatscher - github.com/2b-t (2022) 2 | 3 | # @file stereo_matching.py 4 | # @brief Interface class for setting up stereo matching 5 | 6 | from enum import Enum 7 | import numpy as np 8 | 9 | from matching_algorithm.matching_algorithm import MatchingAlgorithm 10 | from matching_cost.matching_cost import MatchingCost 11 | 12 | 13 | class StereoMatching: 14 | # Recreate the disparity image from two images with a given maximum disparity to consider and given filter radius 15 | 16 | def __init__(self, left_image: np.ndarray, right_image: np.ndarray, 17 | matching_cost: MatchingCost, 18 | matching_algorithm: MatchingAlgorithm, 19 | max_disparity: int = 60, filter_radius: int = 3): 20 | # Class constructor 21 | # @param[in] left_image: The left stereo image (H,W) 22 | # @param[in] right_image: The right stereo image (H,W) 23 | # @param[in] matching_cost: The class implementing the matching cost 24 | # @param[in] matching_algorithm: The class implementing the matching algorithm 25 | # @param[in] max_disparity: The maximum disparity to consider 26 | # @param[in] filter_radius: The radius of the filter 27 | 28 | if (left_image.ndim != 2): 29 | raise ValueError("The left image has to be a grey-scale image with a single channel as its last dimension.") 30 | if (right_image.ndim != 2): 31 | raise ValueError("The right image has to be a grey-scale image with a single channel as its last dimension.") 32 | if (left_image.shape != right_image.shape): 33 | raise ValueError("Dimensions of left (" + str(left_image.shape) + ") and right image (" + str(right_image.shape) + ") do not match.") 34 | if (max_disparity <= 0): 35 | raise ValueError("Maximum disparity (" + 
str(max_disparity) + ") has to be greater than zero.") 36 | if (filter_radius <= 0): 37 | raise ValueError("Radius (" + str(filter_radius) + ") has to be greater than zero.") 38 | 39 | # Store the grey-scale images 40 | self._left_image = left_image 41 | self._right_image = right_image 42 | 43 | self._max_disparity = max_disparity 44 | self._filter_radius = filter_radius 45 | self._matching_cost = matching_cost 46 | self._matching_algorithm = matching_algorithm 47 | self._cost_volume = None 48 | self._result = None 49 | return 50 | 51 | def compute(self) -> None: 52 | # Compute the cost volume according to the given matching cost and match it with the given matching algorithm 53 | 54 | self._cost_volume = self._matching_cost.compute(self._left_image, self._right_image, self._max_disparity, self._filter_radius) 55 | self._result = self._matching_algorithm.match(self._cost_volume) 56 | return 57 | 58 | def result(self) -> np.ndarray: 59 | # Return the computed disparity image 60 | # @return: The generated result image or None if the image has not been generated yet 61 | 62 | return self._result 63 | -------------------------------------------------------------------------------- /src/matching_algorithm/semi_global_matching.py: -------------------------------------------------------------------------------- 1 | # Tobit Flatscher - github.com/2b-t (2022) 2 | 3 | # @file semi_global_matching.py 4 | # @brief Semi-global matching (SGM) stereo matching algorithm 5 | 6 | import abc 7 | from numba import jit 8 | import numpy as np 9 | from scipy.sparse import diags 10 | 11 | from .matching_algorithm import MatchingAlgorithm 12 | 13 | 14 | class SemiGlobalMatching(MatchingAlgorithm): 15 | 16 | @staticmethod 17 | def match(cost_volume: np.ndarray) -> np.ndarray: 18 | # Function for selecting the best-matching pixels for the disparity image 19 | # @param[in] cost_volume: The three-dimensional cost volume to be searched for the best matching pixel (H,W,D) 20 | # @return: The
two-dimensional disparity image resulting from the best matching pixel inside the cost volume (H,W) 21 | 22 | (_, _, max_disparity) = cost_volume.shape 23 | f = SemiGlobalMatching._get_f(max_disparity) 24 | return SemiGlobalMatching._compute_sgm(cost_volume, f) 25 | 26 | def _get_f(D: int, L1: float = 0.025, L2: float = 0.5) -> np.ndarray: 27 | # Get pairwise cost matrix for semi-global matching 28 | # @param[in] D: Maximum disparity, number of possible choices 29 | # @param[in] L1: Parameter for setting cost for jumps between two layers of depth 30 | # @param[in] L2: Cost for jumping more than one layer of depth 31 | # @return: Pairwise_costs of shape (D,D) 32 | 33 | return np.full((D, D), L2) + diags([L1 - L2, -L2, L1 - L2], [-1, 0, 1], (D, D)).toarray() 34 | 35 | # For some reason @jit(nopython = True, parallel = True, cache = True) does not work here! 36 | # See Issue #1: https://github.com/2b-t/stereo-matching/issues/1 37 | @staticmethod 38 | @jit 39 | def _compute_message(cost_volume: np.ndarray, f: np.ndarray) -> np.ndarray: 40 | # Compute the messages in one particular direction for semi-global matching 41 | # 42 | # @param[in] cost_volume: Cost volume of shape (H,W,D) 43 | # @param[in] f: Pairwise costs of shape (D,D) 44 | # @return: Messages for all H in positive direction of W with possible options D (H,W,D) 45 | 46 | (H,W,D) = cost_volume.shape 47 | mes = np.zeros((H,W,D)) 48 | # Loop over passive direction 49 | for y in range(0, H): 50 | # Loop over forward direction 51 | for x in range(0, W - 1): 52 | # Loop over all possible nodes 53 | for t in range(0, D): 54 | 55 | # Loop over all possible connections 56 | buffer = np.zeros(D) 57 | for s in range(0, D): 58 | # Input messages + unary cost + binary cost 59 | buffer[s] = mes[y,x,s] + cost_volume[y,x,s] + f[t,s] 60 | 61 | # Choose path of least effort 62 | mes[y, x+1, t] = np.min(buffer) 63 | 64 | return mes 65 | 66 | @staticmethod 67 | def _compute_sgm(cost_volume: np.ndarray, f: np.ndarray) -> 
np.ndarray: 68 | # Compute semi-global matching by message passing in four directions 69 | # @param[in] cost_volume: Cost volume of shape (H,W,D) 70 | # @param[in] f: Pairwise costs of shape (D,D) 71 | # @return: Pixel-wise disparity map of shape (H,W) 72 | 73 | # Compute the messages for every single spatial direction and collect them in a single message 74 | (H,W,D) = cost_volume.shape 75 | mes = np.zeros((H,W,D)) 76 | 77 | # Positive W 78 | mes += SemiGlobalMatching._compute_message(cost_volume, f) 79 | 80 | # Negative W 81 | mes_buffer = np.zeros((H,W,D)) 82 | mes_buffer = SemiGlobalMatching._compute_message(np.flip(cost_volume, axis=1), f) 83 | mes += np.flip(mes_buffer, axis=1) 84 | 85 | # Positive H 86 | mes_buffer = SemiGlobalMatching._compute_message(np.transpose(cost_volume, (1, 0, 2)), f) 87 | mes += np.transpose(mes_buffer, (1, 0, 2)) 88 | 89 | # Negative H 90 | mes_buffer = SemiGlobalMatching._compute_message(np.flip(np.transpose(cost_volume, (1, 0, 2)), axis=1), f) 91 | mes += np.transpose(np.flip(mes_buffer, axis=1), (1, 0, 2)) 92 | 93 | # Choose the best belief from all messages 94 | disp_map = np.zeros((H,W)) 95 | for y in range(0, H): 96 | for x in range(0, W): 97 | # Minimum argument of unary cost and messages 98 | disp_map[y,x] = np.argmin(cost_volume[y,x,:] + mes[y,x,:]) 99 | 100 | return disp_map 101 | -------------------------------------------------------------------------------- /ReadMe.md: -------------------------------------------------------------------------------- 1 | # Stereo matching 2 | 3 | Author: [Tobit Flatscher](https://github.com/2b-t) (January 2020) 4 | 5 | [![Dockerhub](https://github.com/2b-t/stereo-matching/actions/workflows/update-dockerhub.yml/badge.svg)](https://github.com/2b-t/stereo-matching/actions/workflows/update-dockerhub.yml) [![Tests](https://github.com/2b-t/stereo-matching/actions/workflows/run-tests.yml/badge.svg)](https://github.com/2b-t/stereo-matching/actions/workflows/run-tests.yml) [![Python 
3.8.10](https://img.shields.io/badge/Python-3.8-yellow.svg?style=flat&logo=python)](https://www.python.org/downloads/release/python-3810/) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) 6 | 7 | 8 | 9 | ## Overview 10 | 11 | Left image | Right image | Depth image 12 | :-------------------------:|:-------------------------:|--------------------------- 13 | ![Left image](data/Adirondack_left.png) | ![Right image](data/Adirondack_right.png) | ![Depth image](output/Adirondack_NCC_SGM_D70_R3_accX0,92.jpg) 14 | 15 | This small tool is a **manual implementation of simple stereo-matching** in Python 3. Two rectified images taken from different views are combined into a **depth image** by means of two **matching algorithms**: 16 | 17 | - a simple **winner-takes-it-all (WTA)** or 18 | - a more sophisticated **semi-global matching (SGM)** 19 | 20 | with several **matching costs**: 21 | 22 | - **Sum of Absolute Differences (SAD)**, 23 | - **Sum of Squared Differences (SSD)** or 24 | - **Normalized Cross-Correlation (NCC)**. 25 | 26 | The results are compared to a ground-truth using the accX accuracy measure, excluding occluded pixels with a mask. 27 | 28 | For the precise details of the involved formulas (matching cost, matching algorithms and accuracy measure) refer to [`doc/Theory.pdf`](./doc/Theory.pdf). 29 | 30 | The repository is structured as follows: 31 | 32 | ```bash 33 | . 
34 | ├── data/ # Directory for the input images (left and right eye) 35 | ├── doc/ # Further documentation, in particular the computational approach 36 | ├── docker/ # Contains a Dockerfile as well as Docker Compose configuration files 37 | ├── output/ # Contains the resulting depth-image output 38 | ├── src/ 39 | │ ├── main.ipynb # The Jupyter notebook that allows convenient access to the underlying Python functions 40 | │ └── stereo_matching.py # The Python 3 implementation of the core functions with SciPy, scikit-image, Numba, NumPy and Matplotlib 41 | ├── test/ # Contains parametrized unit tests for the implementations 42 | ├── .devcontainer/ # Contains configuration files for containers in Visual Studio Code 43 | └── .vscode/ # Contains configuration files for Visual Studio Code 44 | ``` 45 | 46 | 47 | 48 | ## 1. Download it 49 | Either download and copy this folder manually or directly **clone this repository** by typing 50 | ``` 51 | $ git clone https://github.com/2b-t/stereo-matching.git 52 | ``` 53 | 54 | 55 | ## 2. Launch it 56 | 57 | Now you have two options for launching the code. Either you can install all libraries on your system and launch the code there or you can use the Docker container located in [`docker/`](./docker/). 58 | 59 | ### 2.1 On your system 60 | 61 | For launching the code directly on your system make sure SciPy, Numba, NumPy and potentially also Jupyter are installed on your system. If they are not installed yet, install them - ideally with [Anaconda](https://www.anaconda.com/distribution/) - or use the supplied Docker as described below. 62 | 63 | #### 2.1.1 Jupyter notebook 64 | 65 | For debugging purposes it can be pretty helpful to launch the Jupyter notebook by typing 66 | 67 | ``` 68 | $ jupyter notebook 69 | ``` 70 | Browse and open the Jupyter notebook [`src/main.ipynb`](./src/main.ipynb) and run it by pressing the play-button. 
71 | 72 | #### 2.1.2 Command line interface 73 | 74 | Alternatively you can also edit the Python-file [`src/main.py`](./src/main.py) in your editor of choice (e.g. Visual Studio Code) and launch it from there or from the console. When launching it with `$ python3 main.py -h` it will tell you the available options that you can set. 75 | 76 | #### 2.1.3 Library 77 | 78 | Finally you can also use this package as a library. For this purpose have a look at [`src/main.py`](./src/main.py), [`src/main.ipynb`](./src/main.ipynb) as well as at the unit tests located in [`test/`](./test/) for a reference. 79 | 80 | ### 2.2 Run from Docker 81 | 82 | This is discussed in detail in the document [`doc/Docker.md`](./doc/Docker.md). 83 | -------------------------------------------------------------------------------- /src/utilities.py: -------------------------------------------------------------------------------- 1 | # Tobit Flatscher - github.com/2b-t (2022) 2 | 3 | # @file utilities.py 4 | # @brief Different utilities for AccX accuracy measure and file input and output 5 | 6 | import numpy as np 7 | import os 8 | 9 | from skimage import img_as_float, img_as_ubyte 10 | from skimage.io import imread, imsave 11 | from skimage.color import rgb2gray 12 | 13 | 14 | class AccX: 15 | # Class for the AccX accuracy measure 16 | 17 | @staticmethod 18 | def compute(prediction_image: np.ndarray, groundtruth_image: np.ndarray, mask_image: np.ndarray = None, threshold_disparity: int = 3) -> float: 19 | # Compute the accX accuracy measure [0..1] 20 | # @param[in] prediction_image: The stereo image as reconstructed by an algorithm 21 | # @param[in] groundtruth_image: The ground truth stereo image 22 | # @param[in] mask_image: The mask for excluding invalid pixels such as occluded areas 23 | # @param[in] threshold_disparity: Threshold disparity measure (X) 24 | # @return The accX measure of the reconstructed stereo image 25 | 26 | if (prediction_image.shape != groundtruth_image.shape): 27 | raise 
ValueError("Dimensions of guess (" + str(prediction_image.shape) + ") and groundtruth (" + str(groundtruth_image.shape) + ") do not match.") 28 | 29 | if (mask_image is None): 30 | mask_image = np.ones(prediction_image.shape) 31 | 32 | number_of_pixels = max(np.sum(mask_image), 1) # Catch error if no pixels selected 33 | 34 | weighted_image = mask_image*(np.absolute(prediction_image - groundtruth_image) <= threshold_disparity) 35 | return 1/number_of_pixels*np.sum(weighted_image) 36 | 37 | 38 | class IO: 39 | # Class for input output tools 40 | 41 | @staticmethod 42 | def import_image(file_name: str) -> np.ndarray: 43 | # Import image and convert it to a usable grey-scale image 44 | # @param[in] file_name: The file name of the file to be imported 45 | # @return The parsed image as a numpy array 46 | img = imread(file_name) 47 | return rgb2gray(img) 48 | 49 | @staticmethod 50 | def export_image(image: np.ndarray, directory: str, name: str, matching_cost: str, matching_algorithm: str, 51 | max_disparity: int, filter_radius: int, accx = None) -> str: 52 | # Export image to disk with an appropriate file name 53 | # @param[in] image: The image data that has to be exported as a numpy array 54 | # @param[in] directory: Sub-directory where the file should be saved 55 | # @param[in] name: Scenario name 56 | # @param[in] matching_cost: The matching cost used (e.g. SSD, SAD, NCC) 57 | # @param[in] matching_algorithm: The matching algorithm used (e.g. 
WTA, SGM) 58 | # @param[in] max_disparity: Maximum disparity 59 | # @param[in] filter_radius: Filter radius 60 | # @param[in] accx: accX measure for evaluation (if available) 61 | # @return: The resulting file name 62 | 63 | if directory is None: 64 | directory = "" 65 | elif not os.path.isdir(directory): 66 | os.mkdir(directory) 67 | 68 | if name is None: 69 | name = "" 70 | 71 | path = os.path.join(directory, name) 72 | 73 | file_name = str(path) + "_" + matching_cost + "_" + matching_algorithm + "_D" + IO._str_comma(max_disparity) + "_R" + IO._str_comma(filter_radius) 74 | 75 | if accx is not None: 76 | file_name += "_accX" + IO._str_comma(accx) 77 | 78 | file_name = file_name + ".jpg" 79 | imsave(file_name, img_as_ubyte(image), quality = 100) 80 | return file_name 81 | 82 | @staticmethod 83 | def _str_comma(number: float, number_of_decimals: int = 2) -> str: 84 | # Create a string from a number and replace all dots by commas 85 | # @param[in] number: A number that should be converted to a string 86 | # @param[in] number_of_decimals: Number of decimals to be kept 87 | # @return: A string of the number with the given number of decimals where all dots are replaced by commas 88 | 89 | return str(round(number, number_of_decimals)).replace('.',',') 90 | 91 | @staticmethod 92 | def normalise_image(image: np.ndarray, groundtruth_image: np.ndarray = None) -> np.ndarray: 93 | # Normalise image with the ground-truth or itself to floating point numbers in the interval 0..1 94 | # @param[in] image: Non-normalised image 95 | # @param[in] groundtruth_image: Ground-truth 96 | # @return: Image normalised with the ground truth or its maximum value 97 | 98 | normalised_image = image 99 | 100 | if groundtruth_image is not None: 101 | if (np.max(groundtruth_image) <= 0): 102 | raise ValueError("Maximum value in groundtruth image must be greater than 0.") 103 | normalised_image = image/np.max(groundtruth_image) 104 | 105 | if (np.max(image) <= 0): 106 | raise ValueError("Maximum value in image must be greater 
than 0.") 107 | 108 | return normalised_image/np.max(normalised_image) 109 | -------------------------------------------------------------------------------- /doc/Docker.md: -------------------------------------------------------------------------------- 1 | # Stereo matching 2 | 3 | Author: [Tobit Flatscher](https://github.com/2b-t) (January 2020) 4 | 5 | 6 | 7 | ## Docker 8 | 9 | ### 2.2 Run from Docker 10 | 11 | This code is shipped with a [Docker](https://www.docker.com/) container that allows the software to be run without having to install all the dependencies. For this one has to [set up Docker](https://docs.docker.com/get-docker/) (select your operating system and follow the steps) as well as [Docker Compose](https://docs.docker.com/compose/install/), ideally with `$ sudo pip3 install docker-compose`. 12 | 13 | Then browse the `docker` folder containing all the different Docker files, open a console and start the Docker container with 14 | 15 | ```bash 16 | $ sudo docker-compose up 17 | ``` 18 | 19 | and then - after the image has been built - open another terminal and connect to the Docker container 20 | 21 | ```bash 22 | $ sudo docker-compose exec stereo_matching_docker sh 23 | ``` 24 | 25 | Now you can work inside the Docker as if it were your own machine. Later it is discussed how one can use Visual Studio Code as an IDE without having to launch the Docker from the console. 26 | 27 | Advantages of Docker compared to an installation on the host system are discussed in more detail [here](https://hentsu.com/docker-containers-top-7-benefits/). 28 | 29 | When opening a Jupyter notebook from inside the container you might have to supply the following options: 30 | 31 | ```bash 32 | $ jupyter notebook --ip=127.0.0.1 --port=8888 --allow-root 33 | ``` 34 | 35 | #### 2.2.1 Graphic user interfaces inside the Docker 36 | 37 | Docker was actually not designed to be used with a graphic user interface. 
There are several workarounds for this, most of which mount relevant X11 folders from the host system into the Docker. In our case this is achieved by a corresponding Docker Compose file `docker-compose-gui.yml` that [extends](https://docs.docker.com/compose/extends/) the basic `docker-compose.yml` file. 38 | 39 | Before launching it one has to allow the user to access the X server from within the Docker with 40 | 41 | ```bash 42 | $ xhost +local:root 43 | ``` 44 | 45 | Then one can open the Docker by additionally supplying the command line argument `-f`: 46 | 47 | ```bash 48 | $ docker-compose -f docker-compose-gui.yml up 49 | ``` 50 | 51 | ##### 2.2.1.1 Hardware accelerated OpenGL with `nvidia-container-runtime` 52 | 53 | Another problem emerges when wanting to use hardware acceleration such as with OpenGL. In such a case one has to allow the Docker to access the host graphics card. This can be achieved with the [`nvidia-docker`](https://github.com/NVIDIA/nvidia-docker) or alternatively with the [`nvidia-container-runtime`](https://github.com/NVIDIA/nvidia-container-runtime). 54 | 55 | The latter was chosen for this Docker: The configuration files `docker-compose-gui-nvidia.yml` and `docker-compose-nvidia.yml` inside the `docker` folder contain Docker Compose configurations for accessing the hardware accelerators inside the Docker. The former is useful when running hardware-accelerated graphic user interfaces while the latter can be used to run CUDA inside the Docker. 56 | 57 | To set this up, start by launching `docker info` and check whether the field `Runtimes`, in addition to the default `runc`, also lists an `nvidia` runtime. If not please follow the [installation guide](https://github.com/NVIDIA/nvidia-container-runtime#installation) as well as the [engine setup](https://github.com/NVIDIA/nvidia-container-runtime#docker-engine-setup) (and then restart your computer). 
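The runtime check can be done in a single line, e.g. as follows (the exact formatting of the `Runtimes` line may vary between Docker versions):

```bash
# Filter the Docker daemon information for the registered runtimes;
# 'nvidia' has to appear next to the default 'runc' runtime
$ docker info | grep -i "runtimes"
```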
58 | 59 | Then you should be able to run the Docker Compose configuration with 60 | 61 | ```bash 62 | $ docker-compose -f docker-compose-gui-nvidia.yml up 63 | ``` 64 | 65 | To verify that the hardware acceleration is actually working you can check the output of `nvidia-smi`. If working correctly it should list the available hardware accelerators on your system. 66 | 67 | ```bash 68 | $ nvidia-smi 69 | ``` 70 | 71 | #### 2.2.2 Docker inside Visual Studio Code 72 | 73 | Additionally this repository comes with a Visual Studio Code project. The following sections will walk you through how this can be set up. 74 | 75 | ##### 2.2.2.1 Set-up 76 | 77 | If you do not have Visual Studio Code installed on your system then [install it](https://code.visualstudio.com/download). Then follow the Docker post-installation steps given [here](https://docs.docker.com/engine/install/linux-postinstall/) so that you can run Docker without `sudo`. Finally install the [Docker](https://marketplace.visualstudio.com/items?itemName=ms-azuretools.vscode-docker) and [Remote - Containers](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers) plugins inside Visual Studio Code and you should be ready to go. 78 | 79 | ##### 2.2.2.2 Open the project 80 | 81 | More information about Docker with Visual Studio Code can be found [here](https://code.visualstudio.com/docs/containers/overview). 
82 | 83 | ##### 2.2.2.3 Change the Docker Compose file 84 | 85 | The Docker Compose file can be changed inside `.devcontainer/devcontainer.json`: 86 | 87 | ```json 88 | { 89 | "name": "Stereo Matching Docker Compose", 90 | "dockerComposeFile": [ 91 | "../docker/docker-compose.yml" // Change Docker-Compose file here 92 | ], 93 | ``` 94 | -------------------------------------------------------------------------------- /src/main.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # Tobit Flatscher - github.com/2b-t (2022) 3 | 4 | # @file main.py 5 | # @brief Command line interface for stereo matching 6 | 7 | import argparse 8 | import matplotlib.pyplot as plt 9 | import numpy as np 10 | 11 | from matching_algorithm.matching_algorithm import MatchingAlgorithm 12 | from matching_algorithm.semi_global_matching import SemiGlobalMatching 13 | from matching_algorithm.winner_takes_it_all import WinnerTakesItAll 14 | 15 | from matching_cost.matching_cost import MatchingCost 16 | from matching_cost.normalised_cross_correlation import NormalisedCrossCorrelation 17 | from matching_cost.sum_of_absolute_differences import SumOfAbsoluteDifferences 18 | from matching_cost.sum_of_squared_differences import SumOfSquaredDifferences 19 | 20 | from stereo_matching import StereoMatching 21 | from utilities import AccX, IO 22 | 23 | 24 | def main(left_image_path: str, right_image_path: str, 25 | matching_algorithm_name: str, matching_cost_name: str, 26 | max_disparity: int, filter_radius: int, 27 | groundtruth_image_path: str, mask_image_path: str, accx_threshold: int, 28 | output_path: str = None, output_name: str = "unknown", is_plot: bool = True) -> None: 29 | # Imports images for stereo matching, performs stereo matching, plots the results and outputs them to a file 30 | # @param[in] left_image_path: Path to the image for the left eye 31 | # @param[in] right_image_path: Path to the image for the right eye 32 | # @param[in] 
matching_algorithm_name: Name of the matching algorithm 33 | # @param[in] matching_cost_name: Name of the matching cost type 34 | # @param[in] max_disparity: Maximum disparity to consider 35 | # @param[in] filter_radius: Filter radius to be considered for cost volume 36 | # @param[in] groundtruth_image_path: Path to the ground truth image 37 | # @param[in] mask_image_path: Path to the mask for excluding pixels from the AccX accuracy measure 38 | # @param[in] accx_threshold: Mismatch in disparity to accept for AccX accuracy measure 39 | # @param[in] output_path: Location of the output path, if None no output is generated 40 | # @param[in] output_name: Name of the scenario for pre-pending the output file 41 | # @param[in] is_plot: Flag for turning plot of results on and off 42 | 43 | # Load input images 44 | left_image = IO.import_image(left_image_path) 45 | right_image = IO.import_image(right_image_path) 46 | 47 | # Load ground truth images 48 | groundtruth_image = None 49 | mask_image = None 50 | try: 51 | groundtruth_image = IO.import_image(groundtruth_image_path) 52 | mask_image = IO.import_image(mask_image_path) 53 | except: 54 | pass 55 | 56 | # Plot input images 57 | if is_plot is True: 58 | plt.figure(figsize=(8,4)) 59 | plt.subplot(1,2,1), plt.imshow(left_image, cmap='gray'), plt.title('Left') 60 | plt.subplot(1,2,2), plt.imshow(right_image, cmap='gray'), plt.title('Right') 61 | plt.tight_layout() 62 | 63 | # Set-up algorithm 64 | matching_algorithm = None 65 | if matching_algorithm_name == "SGM": 66 | matching_algorithm = SemiGlobalMatching 67 | elif matching_algorithm_name == "WTA": 68 | matching_algorithm = WinnerTakesItAll 69 | else: 70 | raise ValueError("Matching algorithm '" + matching_algorithm_name + "' not recognised!") 71 | 72 | matching_cost = None 73 | if matching_cost_name == "NCC": 74 | matching_cost = NormalisedCrossCorrelation 75 | elif matching_cost_name == "SAD": 76 | matching_cost = SumOfAbsoluteDifferences 77 | elif matching_cost_name == 
"SSD": 78 | matching_cost = SumOfSquaredDifferences 79 | else: 80 | raise ValueError("Matching cost '" + matching_cost_name + "' not recognised!") 81 | 82 | # Perform stereo matching 83 | sm = StereoMatching(left_image, right_image, matching_cost, matching_algorithm, max_disparity, filter_radius) 84 | print("Performing stereo matching...") 85 | sm.compute() 86 | print("Stereo matching completed.") 87 | res_image = sm.result() 88 | 89 | # Compute accuracy 90 | try: 91 | accx = AccX.compute(res_image, groundtruth_image, mask_image, accx_threshold) 92 | print("AccX accuracy measure for threshold " + str(accx_threshold) + ": " + str(accx)) 93 | except Exception: 94 | accx = None 95 | 96 | # Plot result 97 | if is_plot is True: 98 | plt.figure() 99 | plt.imshow(res_image, cmap='gray') 100 | plt.show() 101 | 102 | # Output to file 103 | if output_path is not None: 104 | result_file_path = IO.export_image(IO.normalise_image(res_image, groundtruth_image), 105 | output_path, output_name, matching_cost_name, matching_algorithm_name, 106 | max_disparity, filter_radius, accx) 107 | print("Exported result to file '" + result_file_path + "'.") 108 | return 109 | 110 | 111 | if __name__ == "__main__": 112 | # Parse input arguments 113 | parser = argparse.ArgumentParser() 114 | parser.add_argument("-l", "--left", type=str, 115 | help="Path to left image") 116 | parser.add_argument("-r", "--right", type=str, 117 | help="Path to right image") 118 | parser.add_argument("-a", "--algorithm", type=str, choices=["SGM", "WTA"], 119 | help="Matching algorithm", default = "WTA") 120 | parser.add_argument("-c", "--cost", type=str, choices=["NCC", "SAD", "SSD"], 121 | help="Matching cost type", default = "SAD") 122 | parser.add_argument("-D", "--disparity", type=int, 123 | help="Maximum disparity", default = 60) 124 | parser.add_argument("-R", "--radius", type=int, 125 | help="Filter radius", default = 3) 126 | parser.add_argument("-o", "--output", type=str, 127 | help="Output directory, 
by default no output", default = None) 128 | parser.add_argument("-n", "--name", type=str, 129 | help="Output file name", default = "unknown") 130 | parser.add_argument("-p", "--no-plot", action='store_true', 131 | help="Flag for de-activating plotting") 132 | parser.add_argument("-g", "--groundtruth", type=str, 133 | help="Path to groundtruth image", default = None) 134 | parser.add_argument("-m", "--mask", type=str, 135 | help="Path to mask image for AccX accuracy measure", default = None) 136 | parser.add_argument("-X", "--accx", type=int, 137 | help="AccX accuracy measure threshold", default = 60) 138 | args = parser.parse_args() 139 | 140 | main(args.left, args.right, args.algorithm, args.cost, args.disparity, args.radius, 141 | args.groundtruth, args.mask, args.accx, 142 | args.output, args.name, not args.no_plot) 143 | -------------------------------------------------------------------------------- /test/test_utilities.py: -------------------------------------------------------------------------------- 1 | # Tobit Flatscher - github.com/2b-t (2022) 2 | 3 | # @file utilities_test.py 4 | # @brief Different testing routines for utility functions for accuracy calculation and file import and export 5 | 6 | import numpy as np 7 | from parameterized import parameterized 8 | from typing import Tuple 9 | import unittest 10 | 11 | from src.utilities import AccX, IO 12 | 13 | 14 | class TestAccX(unittest.TestCase): 15 | _shape = (10,20) 16 | _disparities = [ ["disparity = 1", 1], 17 | ["disparity = 2", 2], 18 | ["disparity = 3", 3] 19 | ] 20 | 21 | @parameterized.expand(_disparities) 22 | def test_same_image(self, name: str, threshold_disparity: int) -> None: 23 | # Parameterised unit test for testing if two identical images result in an accuracy measure of unity 24 | # @param[in] name: The name of the parameterised test 25 | # @param[in] threshold_disparity: The threshold disparity for the accuracy measure 26 | 27 | mag = threshold_disparity*10 28 | groundtruth_image 
= mag*np.ones(self._shape) 29 | prediction_image = mag*np.ones(groundtruth_image.shape) 30 | mask_image = np.ones(groundtruth_image.shape) 31 | accx = AccX.compute(prediction_image, groundtruth_image, mask_image, threshold_disparity) 32 | self.assertAlmostEqual(accx, 1.0, places=7) 33 | return 34 | 35 | @parameterized.expand(_disparities) 36 | def test_slightly_shifted_image(self, name: str, threshold_disparity: int) -> None: 37 | # Parameterised unit test for testing if an image and its slightly shifted counterpart result in an accuracy measure of unity 38 | # @param[in] name: The name of the parameterised test 39 | # @param[in] threshold_disparity: The threshold disparity for the accuracy measure 40 | 41 | mag = threshold_disparity*10 42 | groundtruth_image = mag*np.ones(self._shape) 43 | prediction_image = (mag+threshold_disparity-1)*np.ones(groundtruth_image.shape) 44 | mask_image = np.ones(groundtruth_image.shape) 45 | accx = AccX.compute(prediction_image, groundtruth_image, mask_image, threshold_disparity) 46 | self.assertAlmostEqual(accx, 1.0, places=7) 47 | return 48 | 49 | @parameterized.expand(_disparities) 50 | def test_no_mask(self, name: str, threshold_disparity: int) -> None: 51 | # Parameterised unit test for testing if two identical images with no given mask result in an accuracy measure of unity 52 | # @param[in] name: The name of the parameterised test 53 | # @param[in] threshold_disparity: The threshold disparity for the accuracy measure 54 | 55 | mag = threshold_disparity*10 56 | groundtruth_image = mag*np.ones(self._shape) 57 | prediction_image = mag*np.ones(groundtruth_image.shape) 58 | mask_image = None 59 | accx = AccX.compute(prediction_image, groundtruth_image, mask_image, threshold_disparity) 60 | self.assertAlmostEqual(accx, 1.0, places=7) 61 | return 62 | 63 | @parameterized.expand(_disparities) 64 | def test_inverse_image(self, name: str, threshold_disparity: int) -> None: 65 | # Parameterised unit test for testing if two inverse 
images result in an accuracy measure of zero 66 | # @param[in] name: The name of the parameterised test 67 | # @param[in] threshold_disparity: The threshold disparity for the accuracy measure 68 | 69 | mag = threshold_disparity*10 70 | groundtruth_image = mag*np.ones(self._shape) 71 | prediction_image = np.zeros(groundtruth_image.shape) 72 | mask_image = np.ones(groundtruth_image.shape) 73 | accx = AccX.compute(prediction_image, groundtruth_image, mask_image, threshold_disparity) 74 | self.assertAlmostEqual(accx, 0.0, places=7) 75 | return 76 | 77 | @parameterized.expand(_disparities) 78 | def test_significantly_shifted_image(self, name: str, threshold_disparity: int) -> None: 79 | # Parameterised unit test for testing if an image and its significantly shifted counterpart result in an accuracy measure of zero 80 | # @param[in] name: The name of the parameterised test 81 | # @param[in] threshold_disparity: The threshold disparity for the accuracy measure 82 | 83 | mag = threshold_disparity*10 84 | groundtruth_image = mag*np.ones(self._shape) 85 | prediction_image = (mag+threshold_disparity+1)*np.ones(groundtruth_image.shape) 86 | mask_image = np.ones(groundtruth_image.shape) 87 | accx = AccX.compute(prediction_image, groundtruth_image, mask_image, threshold_disparity) 88 | self.assertAlmostEqual(accx, 0.0, places=7) 89 | return 90 | 91 | @parameterized.expand(_disparities) 92 | def test_zero_mask(self, name: str, threshold_disparity: int) -> None: 93 | # Parameterised unit test for testing if two equal images with a mask of zero results in an accuracy measure of zero 94 | # @param[in] name: The name of the parameterised test 95 | # @param[in] threshold_disparity: The threshold disparity for the accuracy measure 96 | 97 | mag = threshold_disparity*10 98 | groundtruth_image = mag*np.ones(self._shape) 99 | prediction_image = groundtruth_image 100 | mask_image = np.zeros(groundtruth_image.shape) 101 | accx = AccX.compute(prediction_image, groundtruth_image, mask_image, 
threshold_disparity)
102 |         self.assertAlmostEqual(accx, 0.0, places=7)
103 |         return
104 |
105 |
106 | class TestIO(unittest.TestCase):
107 |     _resolutions = [["resolution = (10, 20)", (10, 20)],
108 |                     ["resolution = (30, 4)", (30, 4)],
109 |                     ["resolution = (65, 24)", (65, 24)]
110 |                    ]
111 |     def test_import_image(self) -> None:
112 |         # TODO(tobit): Implement
113 |
114 |         pass
115 |
116 |     def test_export_image(self) -> None:
117 |         # TODO(tobit): Implement
118 |
119 |         pass
120 |
121 |     def test_str_comma(self) -> None:
122 |         # Function for testing the conversion of numbers to comma-separated strings
123 |
124 |         self.assertEqual(IO._str_comma(10, 2), "10")
125 |         self.assertEqual(IO._str_comma(9.3, 2), "9,3")
126 |         self.assertEqual(IO._str_comma(1.234, 2), "1,23")
127 |         return
128 |
129 |     @parameterized.expand(_resolutions)
130 |     def test_normalise_positive_image_no_groundtruth(self, name: str, shape: Tuple[int, int]) -> None:
131 |         # Function for testing that normalising a positive image without a ground-truth results in a positive image
132 |         # @param[in] name: The name of the parameterised test
133 |         # @param[in] shape: The image resolution to be considered for the test
134 |
135 |         mag = 13
136 |         image = mag*np.ones(shape)
137 |         groundtruth_image = None
138 |         result = IO.normalise_image(image, groundtruth_image)
139 |         self.assertGreaterEqual(np.min(result), 0.0)
140 |         self.assertLessEqual(np.max(result), 1.0)
141 |         return
142 |
143 |     @parameterized.expand(_resolutions)
144 |     def test_normalise_positive_image_positive_groundtruth(self, name: str, shape: Tuple[int, int]) -> None:
145 |         # Function for testing that normalising a regular image with a regular ground-truth results in a positive image
146 |         # @param[in] name: The name of the parameterised test
147 |         # @param[in] shape: The image resolution to be considered for the test
148 |
149 |         mag = 13
150 |         image = mag*np.ones(shape)
151 |         groundtruth_image = 2*image
152 |         result = IO.normalise_image(image, groundtruth_image)
153 |         self.assertGreaterEqual(np.min(result), 0.0)
154 |         self.assertLessEqual(np.max(result), 1.0)
155 |         return
156 |
157 |     @parameterized.expand(_resolutions)
158 |     def test_normalise_negative_image_positive_groundtruth(self, name: str, shape: Tuple[int, int]) -> None:
159 |         # Function for testing that normalising a negative image results in a ValueError
160 |         # @param[in] name: The name of the parameterised test
161 |         # @param[in] shape: The image resolution to be considered for the test
162 |
163 |         mag = 13
164 |         groundtruth_image = mag*np.ones(shape)
165 |         image = -2*groundtruth_image
166 |         self.assertRaises(ValueError, IO.normalise_image, image, groundtruth_image)
167 |         return
168 |
169 |     @parameterized.expand(_resolutions)
170 |     def test_normalise_positive_image_negative_groundtruth(self, name: str, shape: Tuple[int, int]) -> None:
171 |         # Function for testing that normalising with a negative ground-truth results in a ValueError
172 |         # @param[in] name: The name of the parameterised test
173 |         # @param[in] shape: The image resolution to be considered for the test
174 |
175 |         mag = 13
176 |         image = mag*np.ones(shape)
177 |         groundtruth_image = -2*image
178 |         self.assertRaises(ValueError, IO.normalise_image, image, groundtruth_image)
179 |         return
180 |
181 |
182 | if __name__ == '__main__':
183 |     unittest.main()
--------------------------------------------------------------------------------
/doc/Theory.tex:
--------------------------------------------------------------------------------
1 | \documentclass{article}
2 | \usepackage[utf8]{inputenc}
3 | \usepackage[ngerman,english]{babel}
4 | \usepackage[T1]{fontenc}
5 | %\renewcommand{\familydefault}{\rmdefault}
6 |
7 | \addtolength{\oddsidemargin}{-0.75in}
8 | \addtolength{\evensidemargin}{-0.75in}
9 | \addtolength{\textwidth}{1.5in}
10 |
11 | \addtolength{\topmargin}{-.9in}
12 | \addtolength{\textheight}{1.5in}
13 |
14 | \usepackage{amsmath,amssymb}
15 | \usepackage{longtable}
16 | \usepackage{graphicx}
17 | \graphicspath{ {pictures/} }
18 | \usepackage{caption}
19 | \usepackage{subcaption}
20 | \usepackage{adjustbox}
21 | \usepackage{multirow}
22 |
23 | \usepackage{titlesec}
24 | \usepackage{anyfontsize}
25 |
26 | \usepackage{hyperref}
27 | \hypersetup{
28 |   colorlinks=true,
29 |   linkcolor=black,
30 |   filecolor=black,
31 |   urlcolor=blue,
32 | }\pagestyle{myheadings}
33 |
34 | \usepackage{scrpage2}
35 | \pagestyle{scrheadings}
36 | \clearscrheadfoot
37 |
38 | \urlstyle{same}
39 |
40 | \graphicspath{ {../data/} {../output/}}
41 |
42 | \usepackage[backend=bibtex,style=numeric]{biblatex}
43 | \addbibresource{literature.bib}
44 |
45 |
46 | \begin{document}
47 | \noindent
48 |
49 | \section*{Code documentation}
50 |
51 | In this code several methods for estimating depth from a stereo image pair (figure \ref{fig:Stereo}) are implemented. For this purpose a \textit{matching cost volume} is calculated by means of the sum of squared differences (SSD), the sum of absolute differences (SAD) or the normalised cross-correlation (NCC) and the most appropriate match is then chosen either by the simple \textit{winner-takes-it-all} approach (WTA) or by \textit{semi-global matching} (SGM). Before the matching the given images have to be converted to grayscale.
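The grayscale conversion mentioned above can be sketched in a few lines of NumPy. This is an illustrative sketch only: the function name `to_grayscale` and the BT.601 luma weights are assumptions, not necessarily what `src/utilities.py` actually does.

```python
import numpy as np

def to_grayscale(image: np.ndarray) -> np.ndarray:
    # Weighted sum of the colour channels (ITU-R BT.601 luma coefficients);
    # an already-2D image is passed through unchanged.
    if image.ndim == 2:
        return image.astype(np.float64)
    return image[..., :3] @ np.array([0.299, 0.587, 0.114])

left = np.random.rand(4, 5, 3)   # synthetic RGB image in [0, 1]
gray = to_grayscale(left)
print(gray.shape)  # (4, 5)
```

Since the weights sum to one, an input in $[0,1]$ stays in $[0,1]$ after conversion.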
52 |
53 | \begin{figure}[!htb]
54 |   \captionsetup[subfigure]{labelformat=empty}
55 |   \centering
56 |   \begin{adjustbox}{minipage=\linewidth,scale=0.95}
57 |   \begin{subfigure}{0.45\textwidth}
58 |     \centering
59 |     \includegraphics[width=0.9\linewidth]{cones_left.png}
60 |     \caption{a) left}
61 |   \end{subfigure}
62 |   \begin{subfigure}{0.45\textwidth}
63 |     \centering
64 |     \includegraphics[width=0.9\linewidth]{cones_right.png}
65 |     \caption{b) right}
66 |   \end{subfigure}%
67 |   \par \bigskip
68 |   \begin{subfigure}{0.45\textwidth}
69 |     \centering
70 |     \includegraphics[width=0.9\linewidth]{cones_gt.png}
71 |     \caption{c) ground-truth}
72 |   \end{subfigure}
73 |   \begin{subfigure}{0.45\textwidth}
74 |     \centering
75 |     \includegraphics[width=0.9\linewidth]{cones_mask.png}
76 |     \caption{d) mask}
77 |   \end{subfigure}%
78 |   \end{adjustbox}
79 |   \caption[Input]{Stereo images: left (a) and right (b), the corresponding ground-truth (c) and the mask (d) needed for the accX evaluation}
80 |   \label{fig:Stereo}
81 | \end{figure}
82 |
83 |
84 | \section{Local matching}
85 | The following error-measures and correlations will be used for evaluating a corresponding matching cost between two image patches $p$ and $q$ of equal size $W \times H$.
86 |
87 | \subsection{Sum of absolute differences}
88 | In the case of the sum of absolute differences the matching of two patches $p$ and $q$ is penalised depending on the sum of absolute differences of the two windows according to
89 | \begin{equation}
90 | SAD(p,q) = \sum\limits_{x=1}^W \sum\limits_{y=1}^H | p(x,y) - q(x,y) |
91 | \end{equation}
92 | This means very similar image patches lead to a low SAD while non-matching patches result in a high SAD.
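The SAD formula above translates directly into NumPy. A minimal sketch (the function name `sad` is illustrative, not the repository's API in `sum_of_absolute_differences.py`):

```python
import numpy as np

def sad(p: np.ndarray, q: np.ndarray) -> float:
    # Sum of absolute differences over two equally sized patches
    return float(np.sum(np.abs(p - q)))

p = np.array([[1.0, 2.0], [3.0, 4.0]])
q = np.array([[1.0, 2.5], [2.0, 4.0]])
print(sad(p, p))  # 0.0 (identical patches match perfectly)
print(sad(p, q))  # 1.5 (= |0| + |-0.5| + |1| + |0|)
```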
93 |
94 | \subsection{Sum of squared differences}
95 | In the case of the sum of squared differences the matching is penalised quadratically rather than linearly, making use of the squared difference
96 | \begin{equation}
97 | SSD(p,q) = \sum\limits_{x=1}^W \sum\limits_{y=1}^H ( p(x,y) - q(x,y) )^2
98 | \end{equation}
99 |
100 | \subsection{Normalised cross-correlation}
101 | In the case of the more sophisticated normalised cross-correlation the patches are normalised by subtracting the means to account for slight deviations in lighting between the two pictures
102 | \begin{equation}
103 | \overline{p} = \frac{1}{H \, W} \sum\limits_{x=1}^W \sum\limits_{y=1}^H p(x,y) \hspace{3cm} \overline{q} = \frac{1}{H \, W} \sum\limits_{x=1}^W \sum\limits_{y=1}^H q(x,y)
104 | \end{equation}
105 | and calculating a correlation measure for local matching according to
106 | \begin{equation}
107 | NCC(p,q) = \frac{\sum\limits_{x=1}^W \sum\limits_{y=1}^H (p(x,y) - \overline{p}) (q(x,y) - \overline{q})}{\sqrt{\left[ \sum\limits_{x=1}^W \sum\limits_{y=1}^H (p(x,y) - \overline{p})^2 \right] \cdot \left[ \sum\limits_{x=1}^W \sum\limits_{y=1}^H (q(x,y) - \overline{q})^2 \right] }}
108 | \end{equation}
109 | where in this case, contrary to SAD and SSD, a high similarity between the two patches is characterised by a high NCC. This means that for our cost volume we have to reverse the sign, multiplying the $NCC$ by $-1$.
110 |
111 | \section{Cost volume}
112 | We use these similarity measures to compute a cost volume $CV$ for a pre-defined range of disparities $D$
113 | \begin{equation}
114 | CV(x,y,d) = S( I_0(x,y), \, I_1(x - d,y) )
115 | \end{equation}
116 | where $d \in \mathcal{D}$ with $\mathcal{D} = \left\{ 0, ... \, , D-1 \right\}$ denotes a valid disparity and $S$ is any of the aforementioned error-measures.
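A minimal pixelwise sketch of such a cost volume, using the squared difference as $S$ and the winner-takes-it-all selection on top of it; the names, the disparity convention and the omitted window aggregation are illustrative assumptions, not the repository's implementation:

```python
import numpy as np

def cost_volume(left: np.ndarray, right: np.ndarray, max_disparity: int) -> np.ndarray:
    # CV(y, x, d) = (I0(y, x) - I1(y, x - d))^2; positions where x - d would be
    # negative are marked invalid with inf. A full implementation would also
    # aggregate the cost over a search window (W, H).
    height, width = left.shape
    cv = np.full((height, width, max_disparity), np.inf)
    for d in range(max_disparity):
        diff = left[:, d:] - right[:, :width - d]
        cv[:, d:, d] = diff ** 2
    return cv

# Winner-takes-it-all: per pixel, pick the disparity with the lowest cost
left = np.random.rand(6, 8)
right = np.roll(left, -2, axis=1)  # synthetic right image, true disparity 2
cv = cost_volume(left, right, 4)
disparity = np.argmin(cv, axis=2)
```

For this synthetic pair the recovered `disparity` equals 2 wherever the shift is valid, since the cost is exactly zero there.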
117 |
118 | This basically means that we take the left picture and translate the right picture, trying to overlap the objects in the two pictures taken from different views. The points at a certain depth have a certain disparity and thus the optimal shift can be used to determine the correct depth. In order to make the matching more robust we compare search windows $(W,H)$ rather than trying to match the points directly.
119 |
120 | \section{Matching algorithm}
121 |
122 | \subsection{Winner-takes-it-all solution}
123 | One fast way of then obtaining the best disparity for each image point is taking the point with the lowest cost along the disparity axis according to
124 | \begin{equation}
125 | \overline{d}(x,y) \in \arg \min_d CV(x,y,d)
126 | \end{equation}
127 | This, however, leads to noisy results as this approach does not penalise label changes at all.
128 |
129 | \subsection{Semi-global matching}
130 |
131 | In semi-global matching a different approach is taken: rather than choosing the best match for each pixel independently, an approximate global optimisation is performed. Each pixel with a corresponding unary cost given by the cost volume is assigned an additional pairwise cost that depends on whether the neighbouring pixels have a similar depth value or deviate significantly. This energy can be written as
132 | \begin{equation}
133 | \min_z \left[ \sum_{i \in \mathcal{V}} g_i (z_i) + \sum_{(i,j) \in \mathcal{E}} f_{i,j} (z_i, z_j) \right]
134 | \end{equation}
135 | where $\mathcal{V}$ are the image pixels and $\mathcal{E}$ the edges, the connections between two pixels. The $g_i$ are given by the cost volume and the pairwise cost $f_{i,j}$ defines a penalty for jumps between neighbouring pixels.
136 | \begin{equation}
137 | f_{i,j} (z_i, z_j) = \begin{cases}
138 | 0, \hspace{0.7cm} \text{if} \, z_i = z_j \\
139 | L_1, \hspace{0.5cm} \text{if} \, |z_i - z_j| = 1 \\
140 | L_2\phantom{,} \hspace{0.5cm} \text{else}
141 | \end{cases}
142 | \end{equation}
143 | This is done as follows: First, messages are calculated for all four directions, where the first message in each direction is initialised with $\vec{0}$.
144 | \begin{equation}
145 | m_{i+1}^a(t) = \min_{s \in \mathcal{D}} \left[ m_i^a(s) + f_{i, i+1} (s,t) + g_i(s) \right]
146 | \end{equation}
147 | This can be done for every direction by a combination of mirroring and transposing the cost volume. Then the beliefs are computed
148 | \begin{equation}
149 | b_i(s) = g_i(s) + \sum_{a \in \{ L,R,U,D \}} m_i^a(s)
150 | \end{equation}
151 | The correct disparity is then calculated from the beliefs as follows
152 | \begin{equation}
153 | \hat{d} (x,y) \in \arg \min_d b(x,y,d)
154 | \end{equation}
155 | The last formula is intentionally given with $\in$ as the solution might not be unique.
156 |
157 | \section{Evaluation: compare to ground-truth}
158 | The performance of the stereo workflow is evaluated by comparing it with a ground-truth disparity map, in this case using the $accX$ measure
159 | \begin{equation}
160 | accX(z,z^*) = \frac{1}{Z} \sum\limits_{x=1}^W \sum\limits_{y=1}^H m(x,y) \cdot \begin{cases}
161 | 1, \hspace{0.5cm} \text{if} \, |z(x,y) - z^*(x,y)| \leq X \\
162 | 0\phantom{,} \hspace{0.5cm} \text{else}
163 | \end{cases}
164 | \end{equation}
165 | This measure characterises errors less than or equal to $X$ disparities between the prediction $z$ and the ground-truth disparity map $z^*$, with a mask $m$ that contains $1$ for the $Z$ valid pixels and $0$ for the invalid pixels.
166 |
167 | The mask basically excludes pixels that should not be evaluated, e.g. because they are occluded in either of the two pictures. The average over the remaining pixels is then determined: all pixels that estimated the depth correctly (within the threshold $X$) contribute $1$, while all pixels that did not, contribute nothing. In this way $accX$ measures the fraction of pixels that were matched correctly relative to those that could possibly be matched. An $accX$ of $1$ would correspond to the ground truth.
168 |
169 | \end{document}
--------------------------------------------------------------------------------
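The $accX$ evaluation described at the end of `Theory.tex` can be sketched in a few lines of NumPy; the function `acc_x` and its signature are illustrative assumptions, not the repository's API:

```python
import numpy as np

def acc_x(disparity: np.ndarray, groundtruth: np.ndarray, mask: np.ndarray, x: float) -> float:
    # Fraction of valid (mask == 1) pixels whose disparity error is at most x
    valid = mask > 0
    correct = np.abs(disparity - groundtruth) <= x
    return float(np.sum(correct & valid) / np.sum(valid))

gt = np.array([[1.0, 2.0], [3.0, 4.0]])
pred = np.array([[1.0, 2.5], [0.0, 4.0]])
mask = np.array([[1, 1], [0, 1]])  # the grossly wrong pixel (1, 0) is masked out
print(acc_x(pred, gt, mask, 1.0))  # 1.0: all three valid pixels are within 1 disparity
```

With a tighter threshold of e.g. 0.4 the pixel with error 0.5 no longer counts, so the score drops to 2/3.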