├── src
│   ├── __init__.py
│   ├── matching_cost
│   │   ├── __init__.py
│   │   ├── matching_cost.py
│   │   ├── sum_of_squared_differences.py
│   │   ├── sum_of_absolute_differences.py
│   │   └── normalised_cross_correlation.py
│   ├── matching_algorithm
│   │   ├── __init__.py
│   │   ├── winner_takes_it_all.py
│   │   ├── matching_algorithm.py
│   │   └── semi_global_matching.py
│   ├── stereo_matching.py
│   ├── utilities.py
│   └── main.py
├── test
│   ├── __init__.py
│   └── test_utilities.py
├── .gitattributes
├── doc
│   ├── Theory.pdf
│   ├── Docker.md
│   └── Theory.tex
├── data
│   ├── cones_gt.png
│   ├── bowling_left.png
│   ├── cones_left.png
│   ├── cones_mask.png
│   ├── cones_right.png
│   ├── Adirondack_gt.png
│   ├── bowling_right.png
│   ├── AdirondackE_right.png
│   ├── Adirondack_left.png
│   ├── Adirondack_mask.png
│   └── Adirondack_right.png
├── output
│   ├── bowling_NCC_SGM_D30_R3.jpg
│   ├── bowling_NCC_WTA_D30_R3.jpg
│   ├── bowling_SAD_SGM_D30_R3.jpg
│   ├── bowling_SAD_WTA_D30_R3.jpg
│   ├── bowling_SSD_SGM_D30_R3.jpg
│   ├── bowling_SSD_WTA_D30_R3.jpg
│   ├── cones_NCC_SGM_D60_R3_accX0,95.jpg
│   ├── cones_NCC_WTA_D60_R3_accX0,91.jpg
│   ├── cones_SAD_SGM_D60_R3_accX0,91.jpg
│   ├── cones_SAD_WTA_D60_R3_accX0,86.jpg
│   ├── cones_SSD_SGM_D60_R3_accX0,95.jpg
│   ├── cones_SSD_WTA_D60_R3_accX0,88.jpg
│   ├── Adirondack_NCC_SGM_D70_R3_accX0,92.jpg
│   ├── Adirondack_NCC_WTA_D70_R3_accX0,82.jpg
│   ├── Adirondack_SAD_SGM_D70_R3_accX0,48.jpg
│   ├── Adirondack_SAD_WTA_D70_R3_accX0,44.jpg
│   ├── Adirondack_SSD_SGM_D70_R3_accX0,75.jpg
│   └── Adirondack_SSD_WTA_D70_R3_accX0,49.jpg
├── .gitignore
├── docker
│   ├── docker-compose-nvidia.yml
│   ├── docker-compose-gui-nvidia.yml
│   ├── docker-compose-gui.yml
│   ├── .dockerignore
│   ├── docker-compose.yml
│   └── Dockerfile
├── .github
│   └── workflows
│       ├── run-tests.yml
│       └── update-dockerhub.yml
├── .devcontainer
│   └── devcontainer.json
├── .vscode
│   └── tasks.json
├── License.md
└── ReadMe.md
/src/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /test/__init__.py:
-------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /src/matching_cost/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /src/matching_algorithm/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /.gitattributes: -------------------------------------------------------------------------------- 1 | src/main.ipynb linguist-documentation=true 2 | -------------------------------------------------------------------------------- /doc/Theory.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/doc/Theory.pdf -------------------------------------------------------------------------------- /data/cones_gt.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/data/cones_gt.png -------------------------------------------------------------------------------- /data/bowling_left.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/data/bowling_left.png -------------------------------------------------------------------------------- /data/cones_left.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/data/cones_left.png -------------------------------------------------------------------------------- /data/cones_mask.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/data/cones_mask.png -------------------------------------------------------------------------------- /data/cones_right.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/data/cones_right.png -------------------------------------------------------------------------------- /data/Adirondack_gt.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/data/Adirondack_gt.png -------------------------------------------------------------------------------- /data/bowling_right.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/data/bowling_right.png -------------------------------------------------------------------------------- /data/AdirondackE_right.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/data/AdirondackE_right.png -------------------------------------------------------------------------------- /data/Adirondack_left.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/data/Adirondack_left.png -------------------------------------------------------------------------------- /data/Adirondack_mask.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/data/Adirondack_mask.png -------------------------------------------------------------------------------- /data/Adirondack_right.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/data/Adirondack_right.png 
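The `*_gt.png` and `*_mask.png` files above are Middlebury-style ground-truth disparities and evaluation masks, and the output file names below encode an accuracy score (e.g. `accX0,95`). A masked accuracy of this kind can be sketched as follows — the function name and the threshold are illustrative assumptions, since the repository's actual metric presumably lives in `src/utilities.py`, which is not included in this dump:

```python
import numpy as np

def masked_accuracy(disparity: np.ndarray, ground_truth: np.ndarray,
                    mask: np.ndarray, threshold: float = 3.0) -> float:
    # Fraction of pixels whose disparity error is within `threshold`,
    # evaluated only where the mask marks valid (non-occluded) pixels
    valid = mask > 0
    error = np.abs(disparity[valid] - ground_truth[valid])
    return float(np.mean(error <= threshold))
```

With `disparity` taken from `StereoMatching.result()` and the corresponding `*_gt.png`/`*_mask.png` pair loaded as arrays, this yields a value in [0, 1] in the style of the scores embedded in the output file names.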
-------------------------------------------------------------------------------- /output/bowling_NCC_SGM_D30_R3.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/bowling_NCC_SGM_D30_R3.jpg -------------------------------------------------------------------------------- /output/bowling_NCC_WTA_D30_R3.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/bowling_NCC_WTA_D30_R3.jpg -------------------------------------------------------------------------------- /output/bowling_SAD_SGM_D30_R3.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/bowling_SAD_SGM_D30_R3.jpg -------------------------------------------------------------------------------- /output/bowling_SAD_WTA_D30_R3.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/bowling_SAD_WTA_D30_R3.jpg -------------------------------------------------------------------------------- /output/bowling_SSD_SGM_D30_R3.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/bowling_SSD_SGM_D30_R3.jpg -------------------------------------------------------------------------------- /output/bowling_SSD_WTA_D30_R3.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/bowling_SSD_WTA_D30_R3.jpg -------------------------------------------------------------------------------- /output/cones_NCC_SGM_D60_R3_accX0,95.jpg: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/cones_NCC_SGM_D60_R3_accX0,95.jpg -------------------------------------------------------------------------------- /output/cones_NCC_WTA_D60_R3_accX0,91.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/cones_NCC_WTA_D60_R3_accX0,91.jpg -------------------------------------------------------------------------------- /output/cones_SAD_SGM_D60_R3_accX0,91.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/cones_SAD_SGM_D60_R3_accX0,91.jpg -------------------------------------------------------------------------------- /output/cones_SAD_WTA_D60_R3_accX0,86.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/cones_SAD_WTA_D60_R3_accX0,86.jpg -------------------------------------------------------------------------------- /output/cones_SSD_SGM_D60_R3_accX0,95.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/cones_SSD_SGM_D60_R3_accX0,95.jpg -------------------------------------------------------------------------------- /output/cones_SSD_WTA_D60_R3_accX0,88.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/cones_SSD_WTA_D60_R3_accX0,88.jpg -------------------------------------------------------------------------------- /output/Adirondack_NCC_SGM_D70_R3_accX0,92.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/Adirondack_NCC_SGM_D70_R3_accX0,92.jpg 
-------------------------------------------------------------------------------- /output/Adirondack_NCC_WTA_D70_R3_accX0,82.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/Adirondack_NCC_WTA_D70_R3_accX0,82.jpg -------------------------------------------------------------------------------- /output/Adirondack_SAD_SGM_D70_R3_accX0,48.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/Adirondack_SAD_SGM_D70_R3_accX0,48.jpg -------------------------------------------------------------------------------- /output/Adirondack_SAD_WTA_D70_R3_accX0,44.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/Adirondack_SAD_WTA_D70_R3_accX0,44.jpg -------------------------------------------------------------------------------- /output/Adirondack_SSD_SGM_D70_R3_accX0,75.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/Adirondack_SSD_SGM_D70_R3_accX0,75.jpg -------------------------------------------------------------------------------- /output/Adirondack_SSD_WTA_D70_R3_accX0,49.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/2b-t/stereo-matching/HEAD/output/Adirondack_SSD_WTA_D70_R3_accX0,49.jpg -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | *.nbc 2 | *.nbi 3 | __pycache__/ 4 | *.py[cod] 5 | doc/**/*.aux 6 | doc/**/*.bbl 7 | doc/**/*.blg 8 | doc/**/*.log 9 | doc/**/*.out 10 | doc/**/*.run.xml 11 | doc/**/*.synctex.gz 12 | doc/**/*.bib 13 | src/.ipynb_checkpoints/** 14 | 15 | 
-------------------------------------------------------------------------------- /docker/docker-compose-nvidia.yml: -------------------------------------------------------------------------------- 1 | version: "3.9" 2 | services: 3 | stereo_matching_docker: 4 | extends: 5 | file: docker-compose.yml 6 | service: stereo_matching_docker 7 | environment: 8 | - NVIDIA_VISIBLE_DEVICES=all 9 | runtime: nvidia 10 | -------------------------------------------------------------------------------- /docker/docker-compose-gui-nvidia.yml: -------------------------------------------------------------------------------- 1 | version: "3.9" 2 | services: 3 | stereo_matching_docker: 4 | extends: 5 | file: docker-compose-gui.yml 6 | service: stereo_matching_docker 7 | environment: 8 | - NVIDIA_VISIBLE_DEVICES=all 9 | - NVIDIA_DRIVER_CAPABILITIES=all 10 | runtime: nvidia 11 | -------------------------------------------------------------------------------- /docker/docker-compose-gui.yml: -------------------------------------------------------------------------------- 1 | version: "3.9" 2 | services: 3 | stereo_matching_docker: 4 | extends: 5 | file: docker-compose.yml 6 | service: stereo_matching_docker 7 | environment: 8 | - DISPLAY=${DISPLAY} 9 | - QT_X11_NO_MITSHM=1 10 | volumes: 11 | - /tmp/.X11-unix:/tmp/.X11-unix:rw 12 | - /tmp/.docker.xauth:/tmp/.docker.xauth:rw 13 | -------------------------------------------------------------------------------- /.github/workflows/run-tests.yml: -------------------------------------------------------------------------------- 1 | name: Tests 2 | 3 | on: 4 | push 5 | 6 | jobs: 7 | unit-tests: 8 | runs-on: ubuntu-latest 9 | container: 10 | image: tobitflatscher/stereo-matching 11 | volumes: 12 | - ${{ github.workspace }}:/stereo_matching 13 | steps: 14 | - name: Checkout code 15 | uses: actions/checkout@v2 16 | - name: Run unittests in workspace 17 | run: python3 -m unittest discover 18 | 19 | 
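The workflow above simply runs `python3 -m unittest discover` inside the published image; the repository's actual tests live in `test/test_utilities.py`, which is not part of this dump. As an illustration of a test that discovery would pick up, here is a self-contained sanity check of the SSD cost idea on a synthetically shifted image pair — the class and helper names are invented for this sketch:

```python
import unittest

import numpy as np


class TestSsdCostVolume(unittest.TestCase):
    """For a purely horizontal shift the SSD cost volume should be
    minimal at the true disparity (single-pixel cost, radius 0)."""

    @staticmethod
    def ssd_cost_volume(left: np.ndarray, right: np.ndarray, max_disparity: int) -> np.ndarray:
        H, W = left.shape
        cost_volume = np.full((H, W, max_disparity), np.inf)
        for d in range(max_disparity):
            # Compare left[:, x] against right[:, x - d] where defined
            cost_volume[:, d:, d] = (left[:, d:] - right[:, :W - d]) ** 2
        return cost_volume

    def test_recovers_known_shift(self):
        rng = np.random.default_rng(0)
        right = rng.random((5, 30))
        true_disparity = 3
        left = np.roll(right, true_disparity, axis=1)  # left[:, x] == right[:, x - 3]
        cost_volume = self.ssd_cost_volume(left, right, max_disparity=8)
        disparity = np.argmin(cost_volume, axis=2)
        # Skip the wrap-around columns introduced by np.roll and the
        # columns where not all disparities are defined
        self.assertTrue(np.all(disparity[:, 8:] == true_disparity))
```

Dropped into `test/` as e.g. `test_ssd.py`, a case like this would be executed by the CI job above on every push.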
-------------------------------------------------------------------------------- /docker/.dockerignore: -------------------------------------------------------------------------------- 1 | **/.classpath 2 | **/.dockerignore 3 | **/.env 4 | **/.git 5 | **/.gitignore 6 | **/.project 7 | **/.settings 8 | **/.toolstarget 9 | **/.vs 10 | **/.vscode 11 | **/*.*proj.user 12 | **/*.dbmdl 13 | **/*.jfm 14 | **/bin 15 | **/charts 16 | **/docker-compose* 17 | **/compose* 18 | **/Dockerfile* 19 | **/node_modules 20 | **/npm-debug.log 21 | **/obj 22 | **/secrets.dev.yaml 23 | **/values.dev.yaml 24 | **/ReadMe.md 25 | -------------------------------------------------------------------------------- /.devcontainer/devcontainer.json: -------------------------------------------------------------------------------- 1 | { 2 | "name": "Stereo Matching Docker Compose", 3 | "dockerComposeFile": [ 4 | "../docker/docker-compose-gui.yml" // Alternatives: "../docker/docker-compose.yml", "../docker/docker-compose-gui-nvidia.yml", "../docker/docker-compose-nvidia.yml" 5 | ], 6 | "service": "stereo_matching_docker", 7 | "workspaceFolder": "/code/stereo_matching", 8 | "shutdownAction": "stopCompose", 9 | "extensions": [ 10 | ] 11 | } 12 | -------------------------------------------------------------------------------- /docker/docker-compose.yml: -------------------------------------------------------------------------------- 1 | version: "3.9" 2 | services: 3 | stereo_matching_docker: 4 | build: 5 | context: .
6 | dockerfile: Dockerfile 7 | #stdin_open: true # Docker run -i 8 | tty: true # Docker run -t 9 | privileged: true 10 | network_mode: "host" 11 | volumes: # Mount relevant folders into container 12 | - ../.vscode:/code/stereo_matching/.vscode # Necessary for using VS Code Tasks inside container 13 | - ../data:/code/stereo_matching/data 14 | - ../doc:/code/stereo_matching/doc 15 | - ../src:/code/stereo_matching/src 16 | - ../test:/code/stereo_matching/test 17 | 18 | -------------------------------------------------------------------------------- /.vscode/tasks.json: -------------------------------------------------------------------------------- 1 | { 2 | "version": "2.0.0", 3 | "tasks": [ 4 | { 5 | "label": "run", 6 | "detail": "Run Jupyter Notebook.", 7 | "type": "shell", 8 | "command": "jupyter notebook --ip=127.0.0.1 --port=8888 --allow-root", 9 | "group": { 10 | "kind": "build", 11 | "isDefault": true 12 | } 13 | }, 14 | { 15 | "label": "test", 16 | "detail": "Run all unit tests and show results.", 17 | "type": "shell", 18 | "command": "(cd ${workspaceFolder} && python3 -m unittest discover)", 19 | "group": { 20 | "kind": "test", 21 | "isDefault": true 22 | }, 23 | "problemMatcher": [] 24 | } 25 | ] 26 | } 27 | -------------------------------------------------------------------------------- /src/matching_algorithm/winner_takes_it_all.py: -------------------------------------------------------------------------------- 1 | # Tobit Flatscher - github.com/2b-t (2022) 2 | 3 | # @file winner_takes_it_all.py 4 | # @brief Winner-takes-it-all (WTA) stereo matching algorithm 5 | 6 | import abc 7 | import numpy as np 8 | 9 | from .matching_algorithm import MatchingAlgorithm 10 | 11 | 12 | class WinnerTakesItAll(MatchingAlgorithm): 13 | 14 | @staticmethod 15 | def match(cost_volume: np.ndarray) -> np.ndarray: 16 | # Function for selecting the best-matching pixels for the disparity image 17 | # @param[in] cost_volume: The three-dimensional cost volume to be searched for the
best matching pixel (H,W,D) 18 | # @return: The two-dimensional disparity image resulting from the best matching pixel inside the cost volume (H,W) 19 | 20 | return np.argmin(cost_volume, axis=2) -------------------------------------------------------------------------------- /.github/workflows/update-dockerhub.yml: -------------------------------------------------------------------------------- 1 | name: Dockerhub 2 | 3 | on: 4 | push: 5 | paths: 6 | - 'docker/Dockerfile' 7 | 8 | jobs: 9 | docker: 10 | runs-on: ubuntu-latest 11 | steps: 12 | - name: Set up QEMU for architectures 13 | uses: docker/setup-qemu-action@v1 14 | - name: Set up Docker Buildx 15 | uses: docker/setup-buildx-action@v1 16 | - name: Login to DockerHub 17 | uses: docker/login-action@v1 18 | with: 19 | username: ${{ secrets.DOCKERHUB_USERNAME }} 20 | password: ${{ secrets.DOCKERHUB_TOKEN }} 21 | - name: Build and push 22 | uses: docker/build-push-action@v2 23 | with: 24 | builder: ${{ steps.buildx.outputs.name }} 25 | file: ./docker/Dockerfile 26 | platforms: linux/amd64,linux/arm64 27 | push: true 28 | tags: tobitflatscher/stereo-matching:latest 29 | 30 | -------------------------------------------------------------------------------- /src/matching_algorithm/matching_algorithm.py: -------------------------------------------------------------------------------- 1 | # Tobit Flatscher - github.com/2b-t (2022) 2 | 3 | # @file matching_algorithm.py 4 | # @brief Base class for stereo matching algorithms 5 | 6 | import abc 7 | import numpy as np 8 | 9 | 10 | class MatchingAlgorithm(abc.ABC): 11 | # Base class for stereo matching algorithms which finds the best matching pixel 12 | 13 | @staticmethod 14 | @abc.abstractmethod 15 | def match(cost_volume: np.ndarray) -> np.ndarray: 16 | # Function for matching the best suiting pixels for the disparity image 17 | # @param[in] cost_volume: The three-dimensional cost volume to be searched for the best matching pixel (H,W,D) 18 | # @return: The two-dimensional 
disparity image resulting from the best matching pixel inside the cost volume (H,W) 19 | if cost_volume.ndim != 3: 20 | raise ValueError("Cost volume (" + str(cost_volume.shape) + ") must be three-dimensional!") 21 | pass -------------------------------------------------------------------------------- /docker/Dockerfile: -------------------------------------------------------------------------------- 1 | FROM ubuntu:20.04 2 | 3 | WORKDIR /code 4 | 5 | ARG DEBIAN_FRONTEND=noninteractive 6 | 7 | # General tools 8 | RUN apt-get update \ 9 | && apt-get install -y \ 10 | build-essential \ 11 | cmake \ 12 | git-all \ 13 | && rm -rf /var/lib/apt/lists/* 14 | 15 | # Python3 and libraries 16 | RUN apt-get update \ 17 | && apt-get install -y \ 18 | python3 \ 19 | python3-scipy \ 20 | python3-skimage \ 21 | python3-numpy \ 22 | python3-numba \ 23 | python3-notebook \ 24 | python3-matplotlib \ 25 | python3-parameterized \ 26 | && rm -rf /var/lib/apt/lists/* 27 | 28 | # For visualisation of Jupyter notebooks 29 | RUN apt-get update \ 30 | && apt-get install -y \ 31 | firefox \ 32 | && rm -rf /var/lib/apt/lists/* 33 | 34 | # For documentation only 35 | #RUN apt-get update \ 36 | # && apt-get install -y \ 37 | # texlive-full \ 38 | # texstudio \ 39 | # && rm -rf /var/lib/apt/lists/* 40 | 41 | ARG DEBIAN_FRONTEND=dialog 42 | 43 | -------------------------------------------------------------------------------- /src/matching_cost/matching_cost.py: -------------------------------------------------------------------------------- 1 | # Tobit Flatscher - github.com/2b-t (2022) 2 | 3 | # @file matching_cost.py 4 | # @brief Base class for stereo matching costs 5 | 6 | import abc 7 | import numpy as np 8 | 9 | 10 | class MatchingCost(abc.ABC): 11 | # Base class for stereo matching costs for calculating a cost volume 12 | 13 | @staticmethod 14 | @abc.abstractmethod 15 | def compute(left_image: np.ndarray, right_image: np.ndarray, max_disparity: int, filter_radius: int) -> np.ndarray: 16 | 
# Function for calculating the cost volume 17 | # @param[in] left_image: The left image to be used for stereo matching (H,W) 18 | # @param[in] right_image: The right image to be used for stereo matching (H,W) 19 | # @param[in] max_disparity: The maximum disparity to consider 20 | # @param[in] filter_radius: The filter radius to be considered for matching 21 | # @return: The cost volume computed according to the pre-defined matching cost (H,W,D) 22 | 23 | pass 24 | -------------------------------------------------------------------------------- /License.md: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2022 Tobit Flatscher 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE.
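The matching-cost implementations that follow (SSD, SAD, NCC) fill the (H,W,D) cost volume with explicit Numba-compiled loops. For comparison, the SSD cost admits a compact vectorized NumPy/SciPy formulation — a sketch only, not the repository's implementation, and with reflective rather than skipped border handling:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def ssd_cost_volume(left_image: np.ndarray, right_image: np.ndarray,
                    max_disparity: int, filter_radius: int) -> np.ndarray:
    # Aggregate per-pixel squared differences over a (2R+1)x(2R+1) window
    H, W = left_image.shape
    window = 2 * filter_radius + 1
    cost_volume = np.zeros((H, W, max_disparity))
    for d in range(max_disparity):
        shifted = np.zeros_like(right_image)
        shifted[:, d:] = right_image[:, :W - d]
        squared_diff = (left_image - shifted) ** 2
        # A box filter scaled by the window area equals the windowed sum
        cost_volume[:, :, d] = uniform_filter(squared_diff, size=window) * window ** 2
    return cost_volume
```

This agrees with the loop version wherever the window and the disparity-shifted window stay inside the image; the Numba loops below instead leave a `filter_radius`-wide border untouched.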
22 | -------------------------------------------------------------------------------- /src/matching_cost/sum_of_squared_differences.py: -------------------------------------------------------------------------------- 1 | # Tobit Flatscher - github.com/2b-t (2022) 2 | 3 | # @file sum_of_squared_differences.py 4 | # @brief Sum of squared differences (SSD) stereo matching cost 5 | 6 | from numba import jit 7 | import numpy as np 8 | 9 | from .matching_cost import MatchingCost 10 | 11 | 12 | class SumOfSquaredDifferences(MatchingCost): 13 | 14 | @staticmethod 15 | @jit(nopython = True, parallel = True, cache = True) 16 | def compute(left_image: np.ndarray, right_image: np.ndarray, max_disparity: int, filter_radius: int) -> np.ndarray: 17 | # Compute a cost volume with maximum disparity D considering a neighbourhood R with Sum of Squared Differences (SSD) 18 | # @param[in] left_image: The left image to be used for stereo matching (H,W) 19 | # @param[in] right_image: The right image to be used for stereo matching (H,W) 20 | # @param[in] max_disparity: The maximum disparity to consider 21 | # @param[in] filter_radius: The filter radius to be considered for matching 22 | # @return: The best matching pixel inside the cost volume according to the pre-defined criterion (H,W,D) 23 | 24 | (H,W) = left_image.shape 25 | cost_volume = np.zeros((H,W,max_disparity)) 26 | 27 | # Loop over internal image 28 | for y in range(filter_radius, H - filter_radius): 29 | for x in range(filter_radius, W - filter_radius): 30 | # Loop over window 31 | for v in range(-filter_radius, filter_radius + 1): 32 | for u in range(-filter_radius, filter_radius + 1): 33 | # Loop over all possible disparities 34 | for d in range(0, max_disparity): 35 | cost_volume[y,x,d] += (left_image[y+v, x+u] - right_image[y+v, x+u-d])**2 36 | 37 | return cost_volume -------------------------------------------------------------------------------- /src/matching_cost/sum_of_absolute_differences.py: 
-------------------------------------------------------------------------------- 1 | # Tobit Flatscher - github.com/2b-t (2022) 2 | 3 | # @file sum_of_absolute_differences.py 4 | # @brief Sum of absolute differences (SAD) stereo matching cost 5 | 6 | from numba import jit 7 | import numpy as np 8 | 9 | from .matching_cost import MatchingCost 10 | 11 | 12 | class SumOfAbsoluteDifferences(MatchingCost): 13 | 14 | @staticmethod 15 | @jit(nopython = True, parallel = True, cache = True) 16 | def compute(left_image: np.ndarray, right_image: np.ndarray, max_disparity: int, filter_radius: int) -> np.ndarray: 17 | # Compute a cost volume with maximum disparity D considering a neighbourhood R with Sum of Absolute Differences (SAD) 18 | # @param[in] left_image: The left image to be used for stereo matching (H,W) 19 | # @param[in] right_image: The right image to be used for stereo matching (H,W) 20 | # @param[in] max_disparity: The maximum disparity to consider 21 | # @param[in] filter_radius: The filter radius to be considered for matching 22 | # @return: The best matching pixel inside the cost volume according to the pre-defined criterion (H,W,D) 23 | 24 | (H,W) = left_image.shape 25 | cost_volume = np.zeros((H,W,max_disparity)) 26 | 27 | # Loop over internal image 28 | for y in range(filter_radius, H - filter_radius): 29 | for x in range(filter_radius, W - filter_radius): 30 | # Loop over window 31 | for v in range(-filter_radius, filter_radius + 1): 32 | for u in range(-filter_radius, filter_radius + 1): 33 | # Loop over all possible disparities 34 | for d in range(0, max_disparity): 35 | cost_volume[y,x,d] += np.absolute(left_image[y+v, x+u] - right_image[y+v, x+u-d]) 36 | 37 | return cost_volume -------------------------------------------------------------------------------- /src/matching_cost/normalised_cross_correlation.py: -------------------------------------------------------------------------------- 1 | # Tobit Flatscher - github.com/2b-t (2022) 2 | 3 | # @file 
normalised_cross_correlation.py 4 | # @brief Normalised cross correlation (NCC) stereo matching cost 5 | 6 | from numba import jit 7 | import numpy as np 8 | 9 | from .matching_cost import MatchingCost 10 | 11 | 12 | class NormalisedCrossCorrelation(MatchingCost): 13 | 14 | @staticmethod 15 | @jit(nopython = True, parallel = True, cache = True) 16 | def compute(left_image: np.ndarray, right_image: np.ndarray, max_disparity: int, filter_radius: int) -> np.ndarray: 17 | # Compute a cost volume with maximum disparity D considering a neighbourhood R with Normalized Cross Correlation (NCC) 18 | # @param[in] left_image: The left image to be used for stereo matching (H,W) 19 | # @param[in] right_image: The right image to be used for stereo matching (H,W) 20 | # @param[in] max_disparity: The maximum disparity to consider 21 | # @param[in] filter_radius: The filter radius to be considered for matching 22 | # @return: The best matching pixel inside the cost volume according to the pre-defined criterion (H,W,D) 23 | 24 | (H,W) = left_image.shape 25 | cost_volume = np.zeros((max_disparity,H,W)) 26 | 27 | # Loop over all possible disparities 28 | for d in range(0, max_disparity): 29 | # Loop over image 30 | for y in range(filter_radius, H - filter_radius): 31 | for x in range(filter_radius, W - filter_radius): 32 | l_mean = 0 33 | r_mean = 0 34 | n = 0 35 | 36 | # Loop over window 37 | for v in range(-filter_radius, filter_radius + 1): 38 | for u in range(-filter_radius, filter_radius + 1): 39 | # Calculate cumulative sum 40 | l_mean += left_image[y+v, x+u] 41 | r_mean += right_image[y+v, x+u-d] 42 | n += 1 43 | 44 | l_mean = l_mean/n 45 | r_mean = r_mean/n 46 | 47 | l_r = 0 48 | l_var = 0 49 | r_var = 0 50 | 51 | for v in range(-filter_radius, filter_radius + 1): 52 | for u in range(-filter_radius, filter_radius + 1): 53 | # Calculate terms 54 | l = left_image[y+v, x+u] - l_mean 55 | r = right_image[y+v, x+u-d] - r_mean 56 | 57 | l_r += l*r 58 | l_var += l**2 59 | r_var += 
r**2 60 | 61 | # Assemble terms 62 | cost_volume[d,y,x] = -l_r/np.sqrt(l_var*r_var) 63 | 64 | return np.transpose(cost_volume, (1, 2, 0)) -------------------------------------------------------------------------------- /src/stereo_matching.py: -------------------------------------------------------------------------------- 1 | # Tobit Flatscher - github.com/2b-t (2022) 2 | 3 | # @file stereo_matching.py 4 | # @brief Interface class for setting up stereo matching 5 | 6 | from enum import Enum 7 | import numpy as np 8 | 9 | from matching_algorithm.matching_algorithm import MatchingAlgorithm 10 | from matching_cost.matching_cost import MatchingCost 11 | 12 | 13 | class StereoMatching: 14 | # Recreate the disparity image from two images with a given maximum disparity to consider and given filter radius 15 | 16 | def __init__(self, left_image: np.ndarray, right_image: np.ndarray, 17 | matching_cost: MatchingCost, 18 | matching_algorithm: MatchingAlgorithm, 19 | max_disparity: int = 60, filter_radius: int = 3): 20 | # Class constructor 21 | # @param[in] left_image: The left stereo image (H,W) 22 | # @param[in] right_image: The right stereo image (H,W) 23 | # @param[in] matching_cost: The class implementing the matching cost 24 | # @param[in] matching_algorithm: The class implementing the matching algorithm 25 | # @param[in] max_disparity: The maximum disparity to consider 26 | # @param[in] filter_radius: The radius of the filter 27 | 28 | if (left_image.ndim != 2): 29 | raise ValueError("The left image has to be a grey-scale image with a single channel as its last dimension.") 30 | if (right_image.ndim != 2): 31 | raise ValueError("The right image has to be a grey-scale image with a single channel as its last dimension.") 32 | if (left_image.shape != right_image.shape): 33 | raise ValueError("Dimensions of left (" + str(left_image.shape) + ") and right image (" + str(right_image.shape) + ") do not match.") 34 | if (max_disparity <= 0): 35 | raise ValueError("Maximum disparity (" + 
str(max_disparity) + ") has to be greater than zero.") 36 | if (filter_radius <= 0): 37 | raise ValueError("Radius (" + str(filter_radius) + ") has to be greater than zero.") 38 | 39 | # Store the grey-scale images 40 | self._left_image = left_image 41 | self._right_image = right_image 42 | 43 | self._max_disparity = max_disparity 44 | self._filter_radius = filter_radius 45 | self._matching_cost = matching_cost 46 | self._matching_algorithm = matching_algorithm 47 | self._cost_volume = None 48 | self._result = None 49 | return 50 | 51 | def compute(self) -> None: 52 | # Compute the cost volume according to the given matching cost and match it with the given matching algorithm 53 | 54 | self._cost_volume = self._matching_cost.compute(self._left_image, self._right_image, self._max_disparity, self._filter_radius) 55 | self._result = self._matching_algorithm.match(self._cost_volume) 56 | return 57 | 58 | def result(self) -> np.ndarray: 59 | # Return the computed disparity image 60 | # @return: The generated result image or None if the image has not been generated yet 61 | 62 | return self._result 63 | -------------------------------------------------------------------------------- /src/matching_algorithm/semi_global_matching.py: -------------------------------------------------------------------------------- 1 | # Tobit Flatscher - github.com/2b-t (2022) 2 | 3 | # @file semi_global_matching.py 4 | # @brief Semi-global matching (SGM) stereo matching algorithm 5 | 6 | import abc 7 | from numba import jit 8 | import numpy as np 9 | from scipy.sparse import diags 10 | 11 | from .matching_algorithm import MatchingAlgorithm 12 | 13 | 14 | class SemiGlobalMatching(MatchingAlgorithm): 15 | 16 | @staticmethod 17 | def match(cost_volume: np.ndarray) -> np.ndarray: 18 | # Function for selecting the best-matching pixels for the disparity image 19 | # @param[in] cost_volume: The three-dimensional cost volume to be searched for the best matching pixel (H,W,D) 20 | # @return: The
two-dimensional disparity image resulting from the best matching pixel inside the cost volume (H,W) 21 | 22 | (_, _, max_disparity) = cost_volume.shape 23 | f = SemiGlobalMatching._get_f(max_disparity) 24 | return SemiGlobalMatching._compute_sgm(cost_volume, f) 25 | 26 | def _get_f(D: int, L1: float = 0.025, L2: float = 0.5) -> np.ndarray: 27 | # Get pairwise cost matrix for semi-global matching 28 | # @param[in] D: Maximum disparity, number of possible choices 29 | # @param[in] L1: Parameter for setting cost for jumps between two layers of depth 30 | # @param[in] L2: Cost for jumping more than one layer of depth 31 | # @return: Pairwise_costs of shape (D,D) 32 | 33 | return np.full((D, D), L2) + diags([L1 - L2, -L2, L1 - L2], [-1, 0, 1], (D, D)).toarray() 34 | 35 | # For some reason @jit(nopython = True, parallel = True, cache = True) does not work here! 36 | # See Issue #1: https://github.com/2b-t/stereo-matching/issues/1 37 | @staticmethod 38 | @jit 39 | def _compute_message(cost_volume: np.ndarray, f: np.ndarray) -> np.ndarray: 40 | # Compute the messages in one particular direction for semi-global matching 41 | # 42 | # @param[in] cost_volume: Cost volume of shape (H,W,D) 43 | # @param[in] f: Pairwise costs of shape (D,D) 44 | # @return: Messages for all H in positive direction of W with possible options D (H,W,D) 45 | 46 | (H,W,D) = cost_volume.shape 47 | mes = np.zeros((H,W,D)) 48 | # Loop over passive direction 49 | for y in range(0, H): 50 | # Loop over forward direction 51 | for x in range(0, W - 1): 52 | # Loop over all possible nodes 53 | for t in range(0, D): 54 | 55 | # Loop over all possible connections 56 | buffer = np.zeros(D) 57 | for s in range(0, D): 58 | # Input messages + unary cost + binary cost 59 | buffer[s] = mes[y,x,s] + cost_volume[y,x,s] + f[t,s] 60 | 61 | # Choose path of least effort 62 | mes[y, x+1, t] = np.min(buffer) 63 | 64 | return mes 65 | 66 | @staticmethod 67 | def _compute_sgm(cost_volume: np.ndarray, f: np.ndarray) -> 
np.ndarray: 68 | # Compute semi-global matching by message passing in four directions 69 | # @param[in] cost_volume: Cost volume of shape (H,W,D) 70 | # @param[in] f: Pairwise costs of shape (D,D) 71 | # @return: Pixel-wise disparity map of shape (H,W) 72 | 73 | # Compute the messages for every single spatial direction and collect them in a single message 74 | (H,W,D) = cost_volume.shape 75 | mes = np.zeros((H,W,D)) 76 | 77 | # Positive W 78 | mes += SemiGlobalMatching._compute_message(cost_volume, f) 79 | 80 | # Negative W 81 | mes_buffer = np.zeros((H,W,D)) 82 | mes_buffer = SemiGlobalMatching._compute_message(np.flip(cost_volume, axis=1), f) 83 | mes += np.flip(mes_buffer, axis=1) 84 | 85 | # Positive H 86 | mes_buffer = SemiGlobalMatching._compute_message(np.transpose(cost_volume, (1, 0, 2)), f) 87 | mes += np.transpose(mes_buffer, (1, 0, 2)) 88 | 89 | # Negative H 90 | mes_buffer = SemiGlobalMatching._compute_message(np.flip(np.transpose(cost_volume, (1, 0, 2)), axis=1), f) 91 | mes += np.transpose(np.flip(mes_buffer, axis=1), (1, 0, 2)) 92 | 93 | # Choose the best belief from all messages 94 | disp_map = np.zeros((H,W)) 95 | for y in range(0, H): 96 | for x in range(0, W): 97 | # Minimum argument of unary cost and messages 98 | disp_map[y,x] = np.argmin(cost_volume[y,x,:] + mes[y,x,:]) 99 | 100 | return disp_map 101 | -------------------------------------------------------------------------------- /ReadMe.md: -------------------------------------------------------------------------------- 1 | # Stereo matching 2 | 3 | Author: [Tobit Flatscher](https://github.com/2b-t) (January 2020) 4 | 5 | [![Dockerhub](https://github.com/2b-t/stereo-matching/actions/workflows/update-dockerhub.yml/badge.svg)](https://github.com/2b-t/stereo-matching/actions/workflows/update-dockerhub.yml) [![Tests](https://github.com/2b-t/stereo-matching/actions/workflows/run-tests.yml/badge.svg)](https://github.com/2b-t/stereo-matching/actions/workflows/run-tests.yml) [![Python 
3.8.10](https://img.shields.io/badge/Python-3.8-yellow.svg?style=flat&logo=python)](https://www.python.org/downloads/release/python-3810/) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) 6 | 7 | 8 | 9 | ## Overview 10 | 11 | Left image | Right image | Depth image 12 | :-------------------------:|:-------------------------:|--------------------------- 13 | ![Left image](data/Adirondack_left.png) | ![Right image](data/Adirondack_right.png) | ![Depth image](output/Adirondack_NCC_SGM_D70_R3_accX0,92.jpg) 14 | 15 | This small tool is a **manual implementation of simple stereo-matching** in Python 3. Two rectified images taken from different views are combined into a **depth image** by means of two **matching algorithms**: 16 | 17 | - a simple **winner-takes-it-all (WTA)** or 18 | - a more sophisticated **semi-global matching (SGM)** 19 | 20 | with several **matching costs**: 21 | 22 | - **Sum of Absolute Differences (SAD)**, 23 | - **Sum of Squared Differences (SSD)** or 24 | - **Normalized Cross-Correlation (NCC)**. 25 | 26 | The results are compared to a ground-truth using the accX accuracy measure, excluding occluded pixels with a mask. 27 | 28 | For the precise details of the involved formulas (matching cost, matching algorithms and accuracy measure) refer to [`doc/Theory.pdf`](./doc/Theory.pdf). 29 | 30 | The repository is structured as follows: 31 | 32 | ```bash 33 | . 
34 | ├── data/ # Directory for the input images (left and right eye) 35 | ├── doc/ # Further documentation, in particular the computational approach 36 | ├── docker/ # Contains a Dockerfile as well as Docker Compose configuration files 37 | ├── output/ # Contains the resulting depth-image output 38 | ├── src/ 39 | │ ├── main.ipynb # The Jupyter notebook that allows convenient access to the underlying Python functions 40 | │ └── stereo_matching.py # The Python 3 implementation of the core functions with SciPy, scikit-image, Numba, NumPy and Matplotlib 41 | ├── test/ # Contains parametrized unit tests for the implementations 42 | ├── .devcontainer/ # Contains configuration files for containers in Visual Studio Code 43 | └── .vscode/ # Contains configuration files for Visual Studio Code 44 | ``` 45 | 46 | 47 | 48 | ## 1. Download it 49 | Either download and copy this folder manually or directly **clone this repository** by typing 50 | ``` 51 | $ git clone https://github.com/2b-t/stereo-matching.git 52 | ``` 53 | 54 | 55 | ## 2. Launch it 56 | 57 | Now you have two options for launching the code. Either you can install all libraries on your system and launch the code there or you can use the Docker container located in [`docker/`](./docker/). 58 | 59 | ### 2.1 On your system 60 | 61 | For launching the code directly on your system make sure SciPy, Numba, NumPy and potentially also Jupyter are installed on your system. If they are not installed yet, install them - ideally with [Anaconda](https://www.anaconda.com/distribution/) - or use the supplied Docker as described below. 62 | 63 | #### 2.1.1 Jupyter notebook 64 | 65 | For debugging purposes it can be pretty helpful to launch the Jupyter notebook by typing 66 | 67 | ``` 68 | $ jupyter notebook 69 | ``` 70 | Browse and open the Jupyter notebook [`src/main.ipynb`](./src/main.ipynb) and run it by pressing the play-button. 
71 | 72 | #### 2.1.2 Command line interface 73 | 74 | Alternatively you can also edit the Python-file [`src/main.py`](./src/main.py) in your editor of choice (e.g. Visual Studio Code) and launch it from there or from the console. When launching it with `$ python3 main.py -h` it will tell you the available options that you can set. 75 | 76 | #### 2.1.3 Library 77 | 78 | Finally you can also use this package as a library. For this purpose have a look at [`src/main.py`](./src/main.py), [`src/main.ipynb`](./src/main.ipynb) as well as at the unit tests located in [`test/`](./test/) for a reference. 79 | 80 | ### 2.2 Run from Docker 81 | 82 | This is discussed in detail in the document [`doc/Docker.md`](./doc/Docker.md). 83 | -------------------------------------------------------------------------------- /src/utilities.py: -------------------------------------------------------------------------------- 1 | # Tobit Flatscher - github.com/2b-t (2022) 2 | 3 | # @file utilities.py 4 | # @brief Different utilities for AccX accuracy measure and file input and output 5 | 6 | import numpy as np 7 | import os 8 | 9 | from skimage import img_as_float, img_as_ubyte 10 | from skimage.io import imread, imsave 11 | from skimage.color import rgb2gray 12 | 13 | 14 | class AccX: 15 | # Class for the AccX accuracy measure 16 | 17 | @staticmethod 18 | def compute(prediction_image: np.ndarray, groundtruth_image: np.ndarray, mask_image: np.ndarray = None, threshold_disparity: int = 3) -> float: 19 | # Compute the accX accuracy measure [0..1] 20 | # @param[in] prediction_image: The stereo image as reconstructed by an algorithm 21 | # @param[in] groundtruth_image: The ground truth stereo image 22 | # @param[in] mask_image: The mask for excluding invalid pixels such as occluded areas 23 | # @param[in] threshold_disparity: Threshold disparity measure (X) 24 | # @return The accX measure of the reconstructed stereo image 25 | 26 | if (prediction_image.shape != groundtruth_image.shape): 27 | raise 
ValueError("Dimensions of guess (" + str(prediction_image.shape) + ") and groundtruth (" + str(groundtruth_image.shape) + ") do not match.") 28 | 29 | if (mask_image is None): 30 | mask_image = np.ones(prediction_image.shape) 31 | 32 | number_of_pixels = max(np.sum(mask_image), 1) # Catch error if no pixels selected 33 | 34 | weighted_image = mask_image*(np.absolute(prediction_image - groundtruth_image) <= threshold_disparity) 35 | return 1/number_of_pixels*np.sum(weighted_image) 36 | 37 | 38 | class IO: 39 | # Class for input output tools 40 | 41 | @staticmethod 42 | def import_image(file_name: str) -> np.ndarray: 43 | # Import image and convert it to a usable grey-scale image 44 | # @param[in] file_name: The file name of the file to be imported 45 | # @return The parsed image as a numpy array 46 | img = imread(file_name) 47 | return rgb2gray(img) 48 | 49 | @staticmethod 50 | def export_image(image: np.ndarray, directory: str, name: str, matching_cost: str, matching_algorithm: str, 51 | max_disparity: int, filter_radius: int, accx = None) -> str: 52 | # Export image to disk with an appropriate file name 53 | # @param[in] image: The image data that has to be exported as a numpy array 54 | # @param[in] directory: Sub-directory where the file should be saved 55 | # @param[in] name: Scenario name 56 | # @param[in] matching_cost: The matching cost used (e.g. SSD, SAD, NCC) 57 | # @param[in] matching_algorithm: The matching algorithm used (e.g. 
WTA, SGM) 58 | # @param[in] max_disparity: Maximum disparity 59 | # @param[in] filter_radius: Filter radius 60 | # @param[in] accx: accX measure for evaluation (if available) 61 | # @return: The resulting file name 62 | 63 | if directory is None: 64 | directory = "" 65 | elif not os.path.isdir(directory): 66 | os.mkdir(directory) 67 | 68 | if name is None: 69 | name = "" 70 | 71 | path = os.path.join(directory, name) 72 | 73 | file_name = str(path) + "_" + matching_cost + "_" + matching_algorithm + "_D" + IO._str_comma(max_disparity) + "_R" + IO._str_comma(filter_radius) 74 | 75 | if accx is not None: 76 | file_name += "_accX" + IO._str_comma(accx) 77 | 78 | file_name = file_name + ".jpg" 79 | imsave(file_name, img_as_ubyte(image), quality = 100) 80 | return file_name 81 | 82 | @staticmethod 83 | def _str_comma(number: float, number_of_decimals: int = 2) -> str: 84 | # Create a string from a number and replace all dots by commas 85 | # @param[in] number: A number that should be converted to a string 86 | # @param[in] number_of_decimals: Number of decimals to be kept 87 | # @return: A string of the number with the given number of decimals where all dots are replaced by commas 88 | 89 | return str(round(number, number_of_decimals)).replace('.',',') 90 | 91 | @staticmethod 92 | def normalise_image(image: np.ndarray, groundtruth_image: np.ndarray = None) -> np.ndarray: 93 | # Normalise image with the ground-truth or itself to floating point numbers in the interval 0..1 94 | # @param[in] image: Non-normalised image 95 | # @param[in] groundtruth_image: Ground-truth 96 | # @return: Image normalised with the ground truth or its maximum value 97 | 98 | normalised_image = image 99 | 100 | if groundtruth_image is not None: 101 | if (np.max(groundtruth_image) <= 0): 102 | raise ValueError("Maximum value in groundtruth image must be greater than 0.") 103 | normalised_image = image/np.max(groundtruth_image) 104 | 105 | if (np.max(image) <= 0): 106 | raise ValueError("Maximum value in image must be greater 
than 0.") 107 | 108 | return normalised_image/np.max(normalised_image) 109 | -------------------------------------------------------------------------------- /doc/Docker.md: -------------------------------------------------------------------------------- 1 | # Stereo matching 2 | 3 | Author: [Tobit Flatscher](https://github.com/2b-t) (January 2020) 4 | 5 | 6 | 7 | ## Docker 8 | 9 | ### 2.2 Run from Docker 10 | 11 | This code is shipped with a [Docker](https://www.docker.com/) container that allows the software to be run without having to install all the dependencies. For this one has to [set up Docker](https://docs.docker.com/get-docker/) (select your operating system and follow the steps) as well as [Docker Compose](https://docs.docker.com/compose/install/), ideally with `$ sudo pip3 install docker-compose`. 12 | 13 | Then browse the `docker` folder containing all the different Docker files, open a console and start the Docker container with 14 | 15 | ```bash 16 | $ sudo docker-compose up 17 | ``` 18 | 19 | and then - after the image has been built - open another terminal and connect to the Docker container 20 | 21 | ```bash 22 | $ sudo docker-compose exec stereo_matching_docker sh 23 | ``` 24 | 25 | Now you can work inside the Docker as if it were your own machine. Later it is discussed how one can use Visual Studio Code as an IDE without having to launch the Docker from the console. 26 | 27 | Advantages of Docker compared to an installation on the host system are discussed in more detail [here](https://hentsu.com/docker-containers-top-7-benefits/). 28 | 29 | When opening a Jupyter notebook from inside the container you might have to supply the following options: 30 | 31 | ```bash 32 | $ jupyter notebook --ip=127.0.0.1 --port=8888 --allow-root 33 | ``` 34 | 35 | #### 2.2.1 Graphic user interfaces inside the Docker 36 | 37 | Docker was actually not designed to be used with a graphic user interface. 
There are several workarounds for this, most of which mount relevant X11 folders from the host system into the Docker. In our case this is achieved by a corresponding Docker Compose file `docker-compose-gui.yml` that [extends](https://docs.docker.com/compose/extends/) the basic `docker-compose.yml` file. 38 | 39 | Before launching it one has to allow the user to access the X server from within the Docker with 40 | 41 | ```bash 42 | $ xhost +local:root 43 | ``` 44 | 45 | Then one can open the Docker by additionally supplying the command line argument `-f`: 46 | 47 | ```bash 48 | $ docker-compose -f docker-compose-gui.yml up 49 | ``` 50 | 51 | ##### 2.2.1.1 Hardware accelerated OpenGL with `nvidia-container-runtime` 52 | 53 | Another problem emerges when wanting to use hardware acceleration such as with OpenGL. In such a case one has to allow the Docker to access the host graphics card. This can be achieved with the [`nvidia-docker`](https://github.com/NVIDIA/nvidia-docker) or alternatively with the [`nvidia-container-runtime`](https://github.com/NVIDIA/nvidia-container-runtime). 54 | 55 | The latter was chosen for this Docker: The configuration files `docker-compose-gui-nvidia.yml` and `docker-compose-nvidia.yml` inside the `docker` folder contain Docker Compose configurations for accessing the hardware accelerators inside the Docker. The former is useful when running hardware-accelerated graphic user interfaces while the latter can be used to run CUDA inside the Docker. 56 | 57 | To set this up, start by launching `docker info` and check whether the field `Runtimes`, in addition to the default `runc`, also lists an `nvidia` runtime. If not please follow the [installation guide](https://github.com/NVIDIA/nvidia-container-runtime#installation) as well as the [engine setup](https://github.com/NVIDIA/nvidia-container-runtime#docker-engine-setup) (and then restart your computer). 
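The runtime check can be done in a single line, e.g. as follows (the exact formatting of the `Runtimes` line may vary between Docker versions):

```bash
# Filter the Docker daemon information for the registered runtimes;
# 'nvidia' has to appear next to the default 'runc' runtime
$ docker info | grep -i "runtimes"
```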
58 | 59 | Then you should be able to run the Docker Compose configuration with 60 | 61 | ```bash 62 | $ docker-compose -f docker-compose-gui-nvidia.yml up 63 | ``` 64 | 65 | To verify that the hardware acceleration is actually working you can check the output of `nvidia-smi`. If working correctly it should list the available hardware accelerators on your system. 66 | 67 | ```bash 68 | $ nvidia-smi 69 | ``` 70 | 71 | #### 2.2.2 Docker inside Visual Studio Code 72 | 73 | Additionally this repository comes with a Visual Studio Code project. The following sections will walk you through how this can be set up. 74 | 75 | ##### 2.2.2.1 Set-up 76 | 77 | If you do not have Visual Studio Code installed on your system then [install it](https://code.visualstudio.com/download). Then follow the Docker post-installation steps given [here](https://docs.docker.com/engine/install/linux-postinstall/) so that you can run Docker without `sudo`. Finally install the [Docker](https://marketplace.visualstudio.com/items?itemName=ms-azuretools.vscode-docker) and [Remote - Containers](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers) plugins inside Visual Studio Code and you should be ready to go. 78 | 79 | ##### 2.2.2.2 Open the project 80 | 81 | More information about Docker with Visual Studio Code can be found [here](https://code.visualstudio.com/docs/containers/overview). 
82 | 83 | ##### 2.2.2.3 Change the Docker Compose file 84 | 85 | The Docker Compose file can be changed inside `.devcontainer/devcontainer.json`: 86 | 87 | ```json 88 | { 89 | "name": "Stereo Matching Docker Compose", 90 | "dockerComposeFile": [ 91 | "../docker/docker-compose.yml" // Change Docker-Compose file here 92 | ], 93 | ``` 94 | -------------------------------------------------------------------------------- /src/main.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # Tobit Flatscher - github.com/2b-t (2022) 3 | 4 | # @file main.py 5 | # @brief Command line interface for stereo matching 6 | 7 | import argparse 8 | import matplotlib.pyplot as plt 9 | import numpy as np 10 | 11 | from matching_algorithm.matching_algorithm import MatchingAlgorithm 12 | from matching_algorithm.semi_global_matching import SemiGlobalMatching 13 | from matching_algorithm.winner_takes_it_all import WinnerTakesItAll 14 | 15 | from matching_cost.matching_cost import MatchingCost 16 | from matching_cost.normalised_cross_correlation import NormalisedCrossCorrelation 17 | from matching_cost.sum_of_absolute_differences import SumOfAbsoluteDifferences 18 | from matching_cost.sum_of_squared_differences import SumOfSquaredDifferences 19 | 20 | from stereo_matching import StereoMatching 21 | from utilities import AccX, IO 22 | 23 | 24 | def main(left_image_path: str, right_image_path: str, 25 | matching_algorithm_name: str, matching_cost_name: str, 26 | max_disparity: int, filter_radius: int, 27 | groundtruth_image_path: str, mask_image_path: str, accx_threshold: int, 28 | output_path: str = None, output_name: str = "unknown", is_plot: bool = True) -> None: 29 | # Imports images for stereo matching, performs stereo matching, plots the results and outputs them to a file 30 | # @param[in] left_image_path: Path to the image for the left eye 31 | # @param[in] right_image_path: Path to the image for the right eye 32 | # @param[in] 
matching_algorithm_name: Name of the matching algorithm 33 | # @param[in] matching_cost_name: Name of the matching cost type 34 | # @param[in] max_disparity: Maximum disparity to consider 35 | # @param[in] filter_radius: Filter radius to be considered for cost volume 36 | # @param[in] groundtruth_image_path: Path to the ground truth image 37 | # @param[in] mask_image_path: Path to the mask for excluding pixels from the AccX accuracy measure 38 | # @param[in] accx_threshold: Mismatch in disparity to accept for AccX accuracy measure 39 | # @param[in] output_path: Location of the output path, if None no output is generated 40 | # @param[in] output_name: Name of the scenario for pre-pending the output file 41 | # @param[in] is_plot: Flag for turning plot of results on and off 42 | 43 | # Load input images 44 | left_image = IO.import_image(left_image_path) 45 | right_image = IO.import_image(right_image_path) 46 | 47 | # Load ground truth images 48 | groundtruth_image = None 49 | mask_image = None 50 | try: 51 | groundtruth_image = IO.import_image(groundtruth_image_path) 52 | mask_image = IO.import_image(mask_image_path) 53 | except: 54 | pass 55 | 56 | # Plot input images 57 | if is_plot is True: 58 | plt.figure(figsize=(8,4)) 59 | plt.subplot(1,2,1), plt.imshow(left_image, cmap='gray'), plt.title('Left') 60 | plt.subplot(1,2,2), plt.imshow(right_image, cmap='gray'), plt.title('Right') 61 | plt.tight_layout() 62 | 63 | # Set-up algorithm 64 | matching_algorithm = None 65 | if matching_algorithm_name == "SGM": 66 | matching_algorithm = SemiGlobalMatching 67 | elif matching_algorithm_name == "WTA": 68 | matching_algorithm = WinnerTakesItAll 69 | else: 70 | raise ValueError("Matching algorithm '" + matching_algorithm_name + "' not recognised!") 71 | 72 | matching_cost = None 73 | if matching_cost_name == "NCC": 74 | matching_cost = NormalisedCrossCorrelation 75 | elif matching_cost_name == "SAD": 76 | matching_cost = SumOfAbsoluteDifferences 77 | elif matching_cost_name == 
"SSD": 78 | matching_cost = SumOfSquaredDifferences 79 | else: 80 | raise ValueError("Matching cost '" + matching_cost_name + "' not recognised!") 81 | 82 | # Perform stereo matching 83 | sm = StereoMatching(left_image, right_image, matching_cost, matching_algorithm, max_disparity, filter_radius) 84 | print("Performing stereo matching...") 85 | sm.compute() 86 | print("Stereo matching completed.") 87 | res_image = sm.result() 88 | 89 | # Compute accuracy 90 | try: 91 | accx = AccX.compute(res_image, groundtruth_image, mask_image, accx_threshold) 92 | print("AccX accuracy measure for threshold " + str(accx_threshold) + ": " + str(accx)) 93 | except Exception: 94 | accx = None 95 | 96 | # Plot result 97 | if is_plot is True: 98 | plt.figure() 99 | plt.imshow(res_image, cmap='gray') 100 | plt.show() 101 | 102 | # Output to file 103 | if output_path is not None: 104 | result_file_path = IO.export_image(IO.normalise_image(res_image, groundtruth_image), 105 | output_path, output_name, matching_cost_name, matching_algorithm_name, 106 | max_disparity, filter_radius, accx) 107 | print("Exported result to file '" + result_file_path + "'.") 108 | return 109 | 110 | 111 | if __name__ == "__main__": 112 | # Parse input arguments 113 | parser = argparse.ArgumentParser() 114 | parser.add_argument("-l", "--left", type=str, 115 | help="Path to left image") 116 | parser.add_argument("-r", "--right", type=str, 117 | help="Path to right image") 118 | parser.add_argument("-a", "--algorithm", type=str, choices=["SGM", "WTA"], 119 | help="Matching algorithm", default = "WTA") 120 | parser.add_argument("-c", "--cost", type=str, choices=["NCC", "SAD", "SSD"], 121 | help="Matching cost type", default = "SAD") 122 | parser.add_argument("-D", "--disparity", type=int, 123 | help="Maximum disparity", default = 60) 124 | parser.add_argument("-R", "--radius", type=int, 125 | help="Filter radius", default = 3) 126 | parser.add_argument("-o", "--output", type=str, 127 | help="Output directory, 
by default no output", default = None) 128 | parser.add_argument("-n", "--name", type=str, 129 | help="Output file name", default = "unknown") 130 | parser.add_argument("-p", "--no-plot", action='store_true', 131 | help="Flag for de-activating plotting") 132 | parser.add_argument("-g", "--groundtruth", type=str, 133 | help="Path to groundtruth image", default = None) 134 | parser.add_argument("-m", "--mask", type=str, 135 | help="Path to mask image for AccX accuracy measure", default = None) 136 | parser.add_argument("-X", "--accx", type=int, 137 | help="AccX accuracy measure threshold", default = 60) 138 | args = parser.parse_args() 139 | 140 | main(args.left, args.right, args.algorithm, args.cost, args.disparity, args.radius, 141 | args.groundtruth, args.mask, args.accx, 142 | args.output, args.name, not args.no_plot) 143 | -------------------------------------------------------------------------------- /test/test_utilities.py: -------------------------------------------------------------------------------- 1 | # Tobit Flatscher - github.com/2b-t (2022) 2 | 3 | # @file utilities_test.py 4 | # @brief Different testing routines for utility functions for accuracy calculation and file import and export 5 | 6 | import numpy as np 7 | from parameterized import parameterized 8 | from typing import Tuple 9 | import unittest 10 | 11 | from src.utilities import AccX, IO 12 | 13 | 14 | class TestAccX(unittest.TestCase): 15 | _shape = (10,20) 16 | _disparities = [ ["disparity = 1", 1], 17 | ["disparity = 2", 2], 18 | ["disparity = 3", 3] 19 | ] 20 | 21 | @parameterized.expand(_disparities) 22 | def test_same_image(self, name: str, threshold_disparity: int) -> None: 23 | # Parameterised unit test for testing if two identical images result in an accuracy measure of unity 24 | # @param[in] name: The name of the parameterised test 25 | # @param[in] threshold_disparity: The threshold disparity for the accuracy measure 26 | 27 | mag = threshold_disparity*10 28 | groundtruth_image 
= mag*np.ones(self._shape) 29 | prediction_image = mag*np.ones(groundtruth_image.shape) 30 | mask_image = np.ones(groundtruth_image.shape) 31 | accx = AccX.compute(prediction_image, groundtruth_image, mask_image, threshold_disparity) 32 | self.assertAlmostEqual(accx, 1.0, places=7) 33 | return 34 | 35 | @parameterized.expand(_disparities) 36 | def test_slightly_shifted_image(self, name: str, threshold_disparity: int) -> None: 37 | # Parameterised unit test for testing if an image and its slightly shifted counterpart result in an accuracy measure of unity 38 | # @param[in] name: The name of the parameterised test 39 | # @param[in] threshold_disparity: The threshold disparity for the accuracy measure 40 | 41 | mag = threshold_disparity*10 42 | groundtruth_image = mag*np.ones(self._shape) 43 | prediction_image = (mag+threshold_disparity-1)*np.ones(groundtruth_image.shape) 44 | mask_image = np.ones(groundtruth_image.shape) 45 | accx = AccX.compute(prediction_image, groundtruth_image, mask_image, threshold_disparity) 46 | self.assertAlmostEqual(accx, 1.0, places=7) 47 | return 48 | 49 | @parameterized.expand(_disparities) 50 | def test_no_mask(self, name: str, threshold_disparity: int) -> None: 51 | # Parameterised unit test for testing if two identical images with no given mask result in an accuracy measure of unity 52 | # @param[in] name: The name of the parameterised test 53 | # @param[in] threshold_disparity: The threshold disparity for the accuracy measure 54 | 55 | mag = threshold_disparity*10 56 | groundtruth_image = mag*np.ones(self._shape) 57 | prediction_image = mag*np.ones(groundtruth_image.shape) 58 | mask_image = None 59 | accx = AccX.compute(prediction_image, groundtruth_image, mask_image, threshold_disparity) 60 | self.assertAlmostEqual(accx, 1.0, places=7) 61 | return 62 | 63 | @parameterized.expand(_disparities) 64 | def test_inverse_image(self, name: str, threshold_disparity: int) -> None: 65 | # Parameterised unit test for testing if two inverse 
images result in an accuracy measure of zero 66 | # @param[in] name: The name of the parameterised test 67 | # @param[in] threshold_disparity: The threshold disparity for the accuracy measure 68 | 69 | mag = threshold_disparity*10 70 | groundtruth_image = mag*np.ones(self._shape) 71 | prediction_image = np.zeros(groundtruth_image.shape) 72 | mask_image = np.ones(groundtruth_image.shape) 73 | accx = AccX.compute(prediction_image, groundtruth_image, mask_image, threshold_disparity) 74 | self.assertAlmostEqual(accx, 0.0, places=7) 75 | return 76 | 77 | @parameterized.expand(_disparities) 78 | def test_significantly_shifted_image(self, name: str, threshold_disparity: int) -> None: 79 | # Parameterised unit test for testing if an image and its significantly shifted counterpart result in an accuracy measure of zero 80 | # @param[in] name: The name of the parameterised test 81 | # @param[in] threshold_disparity: The threshold disparity for the accuracy measure 82 | 83 | mag = threshold_disparity*10 84 | groundtruth_image = mag*np.ones(self._shape) 85 | prediction_image = (mag+threshold_disparity+1)*np.ones(groundtruth_image.shape) 86 | mask_image = np.ones(groundtruth_image.shape) 87 | accx = AccX.compute(prediction_image, groundtruth_image, mask_image, threshold_disparity) 88 | self.assertAlmostEqual(accx, 0.0, places=7) 89 | return 90 | 91 | @parameterized.expand(_disparities) 92 | def test_zero_mask(self, name: str, threshold_disparity: int) -> None: 93 | # Parameterised unit test for testing if two equal images with a mask of zero results in an accuracy measure of zero 94 | # @param[in] name: The name of the parameterised test 95 | # @param[in] threshold_disparity: The threshold disparity for the accuracy measure 96 | 97 | mag = threshold_disparity*10 98 | groundtruth_image = mag*np.ones(self._shape) 99 | prediction_image = groundtruth_image 100 | mask_image = np.zeros(groundtruth_image.shape) 101 | accx = AccX.compute(prediction_image, groundtruth_image, mask_image, 
threshold_disparity)
102 |         self.assertAlmostEqual(accx, 0.0, places=7)
103 |         return
104 |
105 |
106 | class TestIO(unittest.TestCase):
107 |     _resolutions = [["resolution = (10, 20)", (10, 20)],
108 |                     ["resolution = (30, 4)", (30, 4)],
109 |                     ["resolution = (65, 24)", (65, 24)]
110 |                    ]
111 |     def test_import_image(self) -> None:
112 |         # TODO(tobit): Implement
113 |
114 |         pass
115 |
116 |     def test_export_image(self) -> None:
117 |         # TODO(tobit): Implement
118 |
119 |         pass
120 |
121 |     def test_str_comma(self) -> None:
122 |         # Function for testing the conversion of numbers to comma-separated strings
123 |
124 |         self.assertEqual(IO._str_comma(10, 2), "10")
125 |         self.assertEqual(IO._str_comma(9.3, 2), "9,3")
126 |         self.assertEqual(IO._str_comma(1.234, 2), "1,23")
127 |         return
128 |
129 |     @parameterized.expand(_resolutions)
130 |     def test_normalise_positive_image_no_groundtruth(self, name: str, shape: Tuple[int, int]) -> None:
131 |         # Function for testing that normalising a positive image without a ground-truth results in a positive image
132 |         # @param[in] name: The name of the parameterised test
133 |         # @param[in] shape: The image resolution to be considered for the test
134 |
135 |         mag = 13
136 |         image = mag*np.ones(shape)
137 |         groundtruth_image = None
138 |         result = IO.normalise_image(image, groundtruth_image)
139 |         self.assertGreaterEqual(np.min(result), 0.0)
140 |         self.assertLessEqual(np.max(result), 1.0)
141 |         return
142 |
143 |     @parameterized.expand(_resolutions)
144 |     def test_normalise_positive_image_positive_groundtruth(self, name: str, shape: Tuple[int, int]) -> None:
145 |         # Function for testing that normalising a regular image with a regular ground-truth results in a positive image
146 |         # @param[in] name: The name of the parameterised test
147 |         # @param[in] shape: The image resolution to be considered for the test
148 |
149 |         mag = 13
150 |         image = mag*np.ones(shape)
151 |         groundtruth_image = 2*image
152 |         result = IO.normalise_image(image, groundtruth_image)
153 |         self.assertGreaterEqual(np.min(result), 0.0)
154 |         self.assertLessEqual(np.max(result), 1.0)
155 |         return
156 |
157 |     @parameterized.expand(_resolutions)
158 |     def test_normalise_negative_image_positive_groundtruth(self, name: str, shape: Tuple[int, int]) -> None:
159 |         # Function for testing that normalising a negative image results in a ValueError
160 |         # @param[in] name: The name of the parameterised test
161 |         # @param[in] shape: The image resolution to be considered for the test
162 |
163 |         mag = 13
164 |         groundtruth_image = mag*np.ones(shape)
165 |         image = -2*groundtruth_image
166 |         self.assertRaises(ValueError, IO.normalise_image, image, groundtruth_image)
167 |         return
168 |
169 |     @parameterized.expand(_resolutions)
170 |     def test_normalise_positive_image_negative_groundtruth(self, name: str, shape: Tuple[int, int]) -> None:
171 |         # Function for testing that normalising with a negative ground-truth results in a ValueError
172 |         # @param[in] name: The name of the parameterised test
173 |         # @param[in] shape: The image resolution to be considered for the test
174 |
175 |         mag = 13
176 |         image = mag*np.ones(shape)
177 |         groundtruth_image = -2*image
178 |         self.assertRaises(ValueError, IO.normalise_image, image, groundtruth_image)
179 |         return
180 |
181 |
182 | if __name__ == '__main__':
183 |     unittest.main()
--------------------------------------------------------------------------------
/doc/Theory.tex:
--------------------------------------------------------------------------------
1 | \documentclass{article}
2 | \usepackage[utf8]{inputenc}
3 | \usepackage[ngerman,english]{babel}
4 | \usepackage[T1]{fontenc}
5 | %\renewcommand{\familydefault}{\rmdefault}
6 |
7 | \addtolength{\oddsidemargin}{-0.75in}
8 | \addtolength{\evensidemargin}{-0.75in}
9 | \addtolength{\textwidth}{1.5in}
10 |
11 | \addtolength{\topmargin}{-.9in}
12 | \addtolength{\textheight}{1.5in}
13 |
14 | \usepackage{amsmath,amssymb}
15 | \usepackage{longtable}
16 | \usepackage{graphicx}
17 | \graphicspath{ {pictures/} }
18 | \usepackage{caption}
19 | \usepackage{subcaption}
20 | \usepackage{adjustbox}
21 | \usepackage{multirow}
22 |
23 | \usepackage{titlesec}
24 | \usepackage{anyfontsize}
25 |
26 | \usepackage{hyperref}
27 | \hypersetup{
28 |   colorlinks=true,
29 |   linkcolor=black,
30 |   filecolor=black,
31 |   urlcolor=blue,
32 | }\pagestyle{myheadings}
33 |
34 | \usepackage{scrpage2}
35 | \pagestyle{scrheadings}
36 | \clearscrheadfoot
37 |
38 | \urlstyle{same}
39 |
40 | \graphicspath{ {../data/} {../output/}}
41 |
42 | \usepackage[backend=bibtex,style=numeric]{biblatex}
43 | \addbibresource{literature.bib}
44 |
45 |
46 | \begin{document}
47 | \noindent
48 |
49 | \section*{Code documentation}
50 |
51 | In this code several methods for estimating depth from a stereo image pair (figure \ref{fig:Stereo}) are implemented. For this purpose a \textit{matching cost volume} is calculated by means of the sum of squared differences (SSD), the sum of absolute differences (SAD) or the normalised cross-correlation (NCC) and the most appropriate match is then chosen either by the simple \textit{winner-takes-it-all} approach (WTA) or by \textit{semi-global matching} (SGM). Before the matching the given images have to be converted to grayscale.
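The grayscale conversion mentioned above can be sketched in a few lines of NumPy. This is an illustrative sketch only: the function name `to_grayscale` and the BT.601 luma weights are assumptions, not necessarily what `src/utilities.py` actually does.

```python
import numpy as np

def to_grayscale(image: np.ndarray) -> np.ndarray:
    # Weighted sum of the colour channels (ITU-R BT.601 luma coefficients);
    # an already-2D image is passed through unchanged.
    if image.ndim == 2:
        return image.astype(np.float64)
    return image[..., :3] @ np.array([0.299, 0.587, 0.114])

left = np.random.rand(4, 5, 3)   # synthetic RGB image in [0, 1]
gray = to_grayscale(left)
print(gray.shape)  # (4, 5)
```

Since the weights sum to one, an input in $[0,1]$ stays in $[0,1]$ after conversion.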
52 |
53 | \begin{figure}[!htb]
54 |   \captionsetup[subfigure]{labelformat=empty}
55 |   \centering
56 |   \begin{adjustbox}{minipage=\linewidth,scale=0.95}
57 |   \begin{subfigure}{0.45\textwidth}
58 |     \centering
59 |     \includegraphics[width=0.9\linewidth]{cones_left.png}
60 |     \caption{a) left}
61 |   \end{subfigure}
62 |   \begin{subfigure}{0.45\textwidth}
63 |     \centering
64 |     \includegraphics[width=0.9\linewidth]{cones_right.png}
65 |     \caption{b) right}
66 |   \end{subfigure}%
67 |   \par \bigskip
68 |   \begin{subfigure}{0.45\textwidth}
69 |     \centering
70 |     \includegraphics[width=0.9\linewidth]{cones_gt.png}
71 |     \caption{c) ground-truth}
72 |   \end{subfigure}
73 |   \begin{subfigure}{0.45\textwidth}
74 |     \centering
75 |     \includegraphics[width=0.9\linewidth]{cones_mask.png}
76 |     \caption{d) mask}
77 |   \end{subfigure}%
78 |   \end{adjustbox}
79 |   \caption[Input]{Stereo images: left (a) and right (b), the corresponding ground-truth (c) and the mask (d) needed for the accX evaluation}
80 |   \label{fig:Stereo}
81 | \end{figure}
82 |
83 |
84 | \section{Local matching}
85 | The following error-measures and correlations will be used for evaluating a corresponding matching cost between two image patches $p$ and $q$ of equal size $W \times H$.
86 |
87 | \subsection{Sum of absolute differences}
88 | In the case of the sum of absolute differences the matching of two patches $p$ and $q$ is penalised depending on the sum of absolute differences of the two windows according to
89 | \begin{equation}
90 | SAD(p,q) = \sum\limits_{x=1}^W \sum\limits_{y=1}^H | p(x,y) - q(x,y) |
91 | \end{equation}
92 | This means very similar image patches lead to a low SAD while non-matching patches result in a high SAD.
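The SAD formula above translates directly into NumPy. A minimal sketch (the function name `sad` is illustrative, not the repository's API in `sum_of_absolute_differences.py`):

```python
import numpy as np

def sad(p: np.ndarray, q: np.ndarray) -> float:
    # Sum of absolute differences over two equally sized patches
    return float(np.sum(np.abs(p - q)))

p = np.array([[1.0, 2.0], [3.0, 4.0]])
q = np.array([[1.0, 2.5], [2.0, 4.0]])
print(sad(p, p))  # 0.0 (identical patches match perfectly)
print(sad(p, q))  # 1.5 (= |0| + |-0.5| + |1| + |0|)
```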
93 |
94 | \subsection{Sum of squared differences}
95 | In the case of the sum of squared differences the matching is penalised quadratically rather than linearly, making use of the squared difference
96 | \begin{equation}
97 | SSD(p,q) = \sum\limits_{x=1}^W \sum\limits_{y=1}^H ( p(x,y) - q(x,y) )^2
98 | \end{equation}
99 |
100 | \subsection{Normalised cross-correlation}
101 | In the case of the more sophisticated normalised cross-correlation the patches are normalised by subtracting the means to account for slight deviations in lighting between the two pictures
102 | \begin{equation}
103 | \overline{p} = \frac{1}{H \, W} \sum\limits_{x=1}^W \sum\limits_{y=1}^H p(x,y) \hspace{3cm} \overline{q} = \frac{1}{H \, W} \sum\limits_{x=1}^W \sum\limits_{y=1}^H q(x,y)
104 | \end{equation}
105 | and calculating a correlation measure for local matching according to
106 | \begin{equation}
107 | NCC(p,q) = \frac{\sum\limits_{x=1}^W \sum\limits_{y=1}^H (p(x,y) - \overline{p}) (q(x,y) - \overline{q})}{\sqrt{\left[ \sum\limits_{x=1}^W \sum\limits_{y=1}^H (p(x,y) - \overline{p})^2 \right] \cdot \left[ \sum\limits_{x=1}^W \sum\limits_{y=1}^H (q(x,y) - \overline{q})^2 \right] }}
108 | \end{equation}
109 | where in this case, contrary to SAD and SSD, a high similarity between the two patches is characterised by a high NCC. This means that for our cost volume we have to reverse the sign, multiplying the $NCC$ by $-1$.
110 |
111 | \section{Cost volume}
112 | We use these similarity measures to compute a cost volume $CV$ for a pre-defined range of disparities $D$
113 | \begin{equation}
114 | CV(x,y,d) = S( I_0(x,y), \, I_1(x - d,y) )
115 | \end{equation}
116 | where $d \in \mathcal{D}$ with $\mathcal{D} = \left\{ 0, ... \, , D-1 \right\}$ denotes a valid disparity and $S$ is any of the aforementioned error-measures.
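A minimal pixelwise sketch of such a cost volume, using the squared difference as $S$ and the winner-takes-it-all selection on top of it; the names, the disparity convention and the omitted window aggregation are illustrative assumptions, not the repository's implementation:

```python
import numpy as np

def cost_volume(left: np.ndarray, right: np.ndarray, max_disparity: int) -> np.ndarray:
    # CV(y, x, d) = (I0(y, x) - I1(y, x - d))^2; positions where x - d would be
    # negative are marked invalid with inf. A full implementation would also
    # aggregate the cost over a search window (W, H).
    height, width = left.shape
    cv = np.full((height, width, max_disparity), np.inf)
    for d in range(max_disparity):
        diff = left[:, d:] - right[:, :width - d]
        cv[:, d:, d] = diff ** 2
    return cv

# Winner-takes-it-all: per pixel, pick the disparity with the lowest cost
left = np.random.rand(6, 8)
right = np.roll(left, -2, axis=1)  # synthetic right image, true disparity 2
cv = cost_volume(left, right, 4)
disparity = np.argmin(cv, axis=2)
```

For this synthetic pair the recovered `disparity` equals 2 wherever the shift is valid, since the cost is exactly zero there.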
117 |
118 | This basically means that we take the left picture and translate the right picture, trying to overlap the objects in the two pictures taken from different views. The points at a certain depth have a certain disparity and thus the optimal shift can be used to determine the correct depth. In order to make the matching more robust we compare search windows $(W,H)$ rather than trying to match the points directly.
119 |
120 | \section{Matching algorithm}
121 |
122 | \subsection{Winner-takes-it-all solution}
123 | One fast way of then obtaining the best disparity for each image point is taking the point with the lowest cost along the disparity axis according to
124 | \begin{equation}
125 | \overline{d}(x,y) \in \arg \min_d CV(x,y,d)
126 | \end{equation}
127 | This, however, leads to noisy results as this approach does not penalise label changes at all.
128 |
129 | \subsection{Semi-global matching}
130 |
131 | In semi-global matching a different approach is taken: rather than choosing the best match for each pixel independently, an approximate global optimisation is performed. Each pixel with a corresponding unary cost given by the cost volume is assigned an additional pairwise cost that depends on whether the neighbouring pixels have a similar depth value or deviate significantly. This energy can be written as
132 | \begin{equation}
133 | \min_z \left[ \sum_{i \in \mathcal{V}} g_i (z_i) + \sum_{(i,j) \in \mathcal{E}} f_{i,j} (z_i, z_j) \right]
134 | \end{equation}
135 | where $\mathcal{V}$ are the image pixels and $\mathcal{E}$ the edges, the connections between two pixels. The $g_i$ are given by the cost volume and the pairwise cost $f_{i,j}$ defines a penalty for jumps between neighbouring pixels.
136 | \begin{equation}
137 | f_{i,j} (z_i, z_j) = \begin{cases}
138 | 0, \hspace{0.7cm} \text{if} \, z_i = z_j \\
139 | L_1, \hspace{0.5cm} \text{if} \, |z_i - z_j| = 1 \\
140 | L_2\phantom{,} \hspace{0.5cm} \text{else}
141 | \end{cases}
142 | \end{equation}
143 | This is done as follows: First, messages are calculated for all four directions, where the first message in each direction is initialised with $\vec{0}$.
144 | \begin{equation}
145 | m_{i+1}^a(t) = \min_{s \in \mathcal{D}} \left[ m_i^a(s) + f_{i, i+1} (s,t) + g_i(s) \right]
146 | \end{equation}
147 | This can be done for every direction by a combination of mirroring and transposing the cost volume. Then the beliefs are computed
148 | \begin{equation}
149 | b_i(s) = g_i(s) + \sum_{a \in \{ L,R,U,D \}} m_i^a(s)
150 | \end{equation}
151 | The correct disparity is then calculated from the beliefs as follows
152 | \begin{equation}
153 | \hat{d} (x,y) \in \arg \min_d b(x,y,d)
154 | \end{equation}
155 | The last formula is intentionally given with $\in$ as the solution might not be unique.
156 |
157 | \section{Evaluation: compare to ground-truth}
158 | The performance of the stereo workflow is evaluated by comparing it with a ground-truth disparity map, in this case using the $accX$ measure
159 | \begin{equation}
160 | accX(z,z^*) = \frac{1}{Z} \sum\limits_{x=1}^W \sum\limits_{y=1}^H m(x,y) \cdot \begin{cases}
161 | 1, \hspace{0.5cm} \text{if} \, |z(x,y) - z^*(x,y)| \leq X \\
162 | 0\phantom{,} \hspace{0.5cm} \text{else}
163 | \end{cases}
164 | \end{equation}
165 | This measure characterises errors less than or equal to $X$ disparities between the prediction $z$ and the ground-truth disparity map $z^*$, with a mask $m$ that contains $1$ for the $Z$ valid pixels and $0$ for the invalid pixels.
166 |
167 | The mask basically excludes pixels that should not be evaluated, e.g. because they are occluded in either of the two pictures. The average over the remaining pixels is then determined: all pixels that estimated the depth correctly (within the threshold $X$) contribute $1$, while all pixels that did not, contribute nothing. In this way $accX$ measures the fraction of pixels that were matched correctly relative to those that could possibly be matched. An $accX$ of $1$ would correspond to the ground truth.
168 |
169 | \end{document}
--------------------------------------------------------------------------------
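The $accX$ evaluation described at the end of `Theory.tex` can be sketched in a few lines of NumPy; the function `acc_x` and its signature are illustrative assumptions, not the repository's API:

```python
import numpy as np

def acc_x(disparity: np.ndarray, groundtruth: np.ndarray, mask: np.ndarray, x: float) -> float:
    # Fraction of valid (mask == 1) pixels whose disparity error is at most x
    valid = mask > 0
    correct = np.abs(disparity - groundtruth) <= x
    return float(np.sum(correct & valid) / np.sum(valid))

gt = np.array([[1.0, 2.0], [3.0, 4.0]])
pred = np.array([[1.0, 2.5], [0.0, 4.0]])
mask = np.array([[1, 1], [0, 1]])  # the grossly wrong pixel (1, 0) is masked out
print(acc_x(pred, gt, mask, 1.0))  # 1.0: all three valid pixels are within 1 disparity
```

With a tighter threshold of e.g. 0.4 the pixel with error 0.5 no longer counts, so the score drops to 2/3.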