├── .gitignore
├── LICENSE
├── Lecture - Computational Image Analysis - Jonas Hartmann.pdf
├── Lecture - Interesting Bonus Stuff - Jonas Hartmann.pdf
├── README.md
├── batch_processing_solution.py
├── example_data
├── example_cells_1.tif
├── example_cells_2.tif
└── not_a_valid_input_file.txt
├── image_analysis_tutorial.ipynb
├── image_analysis_tutorial_solutions.ipynb
└── ipynb_images
├── adaptive_bg_1D.png
├── distance_transform.png
├── fig_gen.py
├── gaussian_kernel_grid.png
├── uniform_filter_SE.png
└── watershed_illustration.png
/.gitignore:
--------------------------------------------------------------------------------
1 | __pycache__
2 | *.ipynb_checkpoints*
3 | example_cells_*
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2018 European Molecular Biology Laboratory, Jonas Hartmann, Karin Sasaki, Toby Hodges
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the Software), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, andor sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/Lecture - Computational Image Analysis - Jonas Hartmann.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WhoIsJack/python-bioimage-analysis-tutorial/1d2473994c0151d8b83f0385f007425ad4c7a055/Lecture - Computational Image Analysis - Jonas Hartmann.pdf
--------------------------------------------------------------------------------
/Lecture - Interesting Bonus Stuff - Jonas Hartmann.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WhoIsJack/python-bioimage-analysis-tutorial/1d2473994c0151d8b83f0385f007425ad4c7a055/Lecture - Interesting Bonus Stuff - Jonas Hartmann.pdf
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | Python BioImage Analysis Tutorial
2 | =================================
3 |
4 | *originally created in 2016*
5 | *updated and converted to a Jupyter notebook in 2017*
6 | *updated and converted to python 3 in 2018*
7 | *by Jonas Hartmann (Mayor group, University College London)*
8 |
9 | ----
10 |
11 | ⚠️ Note that some of the materials in this tuorial are slightly out of date by now (2023), but recent feedback has shown that the tutorial is still very useful and approachable for many learners. I would like to create an updated and slightly extended version at some point, but academic pressures limit my ability to work on this, so we'll have to see how it goes... 🤞
12 |
13 | (Latest maintenance update: 17.02.2023, python 3.9.16 (basic Anaconda distro), with thanks to @Koushouu!)
14 |
15 | -----
16 |
17 | ## Aims and Overview
18 |
19 | This tutorial teaches the basics of bio-image processing, segmentation and analysis in python. It integrates explanations and exercises in a (hopefully) self-explanatory fashion, enabling participants to build their own image analysis pipelines step by step.
20 |
21 | The tutorial uses single-cell segmentation of 2D confocal fluorescence microscopy images to illustrate key concepts from preprocessing to segmentation to (very basic) data analysis. It concludes with a small section on how to apply such a pipeline to multiple images at once (batch processing).
22 |
23 | Everything you need to know to get started can be found in the jupyter notebook `image_analysis_tutorial.ipynb`. To find out more about how to run these materials interactively, see the [Jupyter documentation](https://jupyter.readthedocs.io/en/latest/index.html).
24 |
25 | Note that this tutorial was part of a course aimed at people with basic knowledge of python. The course included introductory sessions/lectures on scientific python (in particular `numpy` and `matplotlib`) as well as on image analysis (see the slides in this repository). For those tackling this tutorial on their own, it is therefore recommended to first acquire basic scientific python knowledge elsewhere (e.g. at [python-course.eu](https://www.python-course.eu)).
26 |
27 | ## Content Overview
28 |
29 | - Lecture
30 | - Working with digital images
31 | - Images as arrays of numbers
32 | - Look-up tables (LUTs)
33 | - Dimensions
34 | - Bit-depth
35 | - Image analysis pipelines
36 | - Preprocessing: filters, kernels, convolution, background subtraction
37 | - Foreground detection: thresholding, morphological operations
38 | - Segmentation: labels, seeds, watershed
39 | - Postprocessing: object filtering
40 | - Making measurements
41 |
42 | - Tutorial
43 | - Importing Modules & Packages
44 | - Loading & Handling Image Data
45 | - Preprocessing
46 | - Manual Thresholding & Threshold Detection
47 | - Adaptive Thresholding
48 | - Improving Masks with Binary Morphology
49 | - Connected Components Labeling
50 | - Cell Segmentation by Seeding & Expansion
51 | - Postprocessing: Removing Cells at the Image Border
52 | - Identifying Cell Edges
53 | - Extracting Quantitative Measurements
54 | - Simple Analysis & Visualization
55 | - Writing Output to Files
56 | - Batch Processing
57 |
58 | ## Old Versions and Other Sources
59 |
60 | This was part of the EMBL Bio-IT/ALMF `Image Analysis with Python 2018` course (see the [EMBL Gitlab repo](https://git.embl.de/grp-bio-it/image-analysis-with-python-2018)).
61 |
62 | If you are looking for the python 2 version from 2017, see the `2017_legacy_python_version` branch or the corresponding [EMBL GitLab repo](https://git.embl.de/grp-bio-it/python-workshop-image-processing).
63 |
64 | The original 2016 materials can be found in Karin Sasaki's corresponding Github [repo](https://github.com/karinsasaki/python-workshop-image-processing).
65 |
66 | ## Acknowledgements
67 |
68 | The first version of this tutorial was created for the `EMBL Python Workshop - Image Processing` course organized by Karin Sasaki and Jonas Hartmann in 2016. Additional lecturers and TAs contributing to this course were Kota Miura, Volker Hilsenstein, Aliaksandr Halavatyi, Imre Gaspar, and Toby Hodges.
69 |
70 | The second installment (the `EMBL Bio-IT Image Processing Course`, 2017) was organized and taught by Jonas Hartmann and Toby Hodges.
71 |
72 | The third version of this tutorial was part of the `EMBL Bio-IT/ALMF Image Analysis with Python 2018` course, organized by Jonas Hartmann and Toby Hodges in collaboration with Tobias Rasse and Volker Hilsenstein. Additional organizational help came from Christian Tischer and Malvika Sharan.
73 |
74 | Many thanks to all the helpful collaborators and the interested students who were instrumental in making these courses a success.
75 |
76 | ## Feedback
77 |
78 | Feedback on this tutorial is always welcome! Please open an issue on GitHub or write to *jonas.hartmann(AT)ucl.ac.uk*.
79 |
--------------------------------------------------------------------------------
/batch_processing_solution.py:
--------------------------------------------------------------------------------
1 |
2 | # coding: utf-8
3 |
4 | # Image Analysis with Python - Solution for Batch Processing
5 |
6 | # The following is the script version of the tutorial's solution pipeline, where all the code
7 | # has been wrapped in a single function that can be called many times for many images.
8 | # Please refer to the jupyter notebooks ('image_analysis_tutorial[_solutions].ipynb') for
9 | # more information, including detailed comments on every step.
10 |
11 |
12 | ## Importing Modules & Packages
13 |
14 | import numpy as np
15 | import matplotlib.pyplot as plt
16 | import scipy.ndimage as ndi
17 |
18 |
19 | ## Defining the pipeline function
20 | def run_pipeline(dirpath, filename):
21 | """Run 2D single-cell segmentation pipeline optimized for membrane-labeled
22 | spinning-disk confocal images of membrane markers in zebrafish early embryos.
23 |
24 | Parameters
25 | ----------
26 | dirpath : string
27 | Path to the directory containing the input image.
28 | filename : string
29 | Name of the input file, including file ending (should be .tif).
30 |
31 | Returns
32 | -------
33 | clean_ws : 3D numpy array of same shape as input image
34 | The single-cell segmentation. Every cell is labeled with a unique
35 | integer ID. Background is 0.
36 | results : dict
37 | A number of measurements extracted from each cell. The dict keys
38 | name the type of measurement. The dict values are lists containing
39 | the measured values. The order of all lists is the same and relates
40 | to the segmentation IDs through the list in results['cell_id'].
41 | """
42 |
43 |
44 | ## Importing & Handling Image Data
45 |
46 | from os.path import join
47 | filepath = join(dirpath, filename)
48 |
49 | from skimage.io import imread
50 | img = imread(filepath)
51 |
52 |
53 | ## Preprocessing
54 |
55 | sigma = 3
56 | img_smooth = ndi.gaussian_filter(img, sigma)
57 |
58 |
59 | ## Adaptive Thresholding
60 |
61 | i = 31
62 | SE = (np.mgrid[:i,:i][0] - np.floor(i/2))**2 + (np.mgrid[:i,:i][1] - np.floor(i/2))**2 <= np.floor(i/2)**2
63 |
64 | from skimage.filters import rank
65 | bg = rank.mean(img_smooth, footprint=SE)
66 |
67 | mem = img_smooth > bg
68 |
69 |
70 | ## Improving Masks with Binary Morphology
71 |
72 | mem_holefilled = ~ndi.binary_fill_holes(~mem) # Short form
73 |
74 | i = 15
75 | SE = (np.mgrid[:i,:i][0] - np.floor(i/2))**2 + (np.mgrid[:i,:i][1] - np.floor(i/2))**2 <= np.floor(i/2)**2
76 |
77 | pad_size = i+1
78 | mem_padded = np.pad(mem_holefilled, pad_size, mode='reflect')
79 | mem_final = ndi.binary_closing(mem_padded, structure=SE)
80 | mem_final = mem_final[pad_size:-pad_size, pad_size:-pad_size]
81 |
82 |
83 | ## Cell Segmentation by Seeding & Expansion
84 |
85 | ### Seeding by Distance Transform
86 |
87 | dist_trans = ndi.distance_transform_edt(~mem_final)
88 | dist_trans_smooth = ndi.gaussian_filter(dist_trans, sigma=5)
89 |
90 | from skimage.feature import peak_local_max
91 | seed_coords = peak_local_max(dist_trans_smooth, min_distance=10)
92 | seeds = np.zeros_like(dist_trans_smooth, dtype=bool)
93 | seeds[tuple(seed_coords.T)] = True
94 |
95 | seeds_labeled = ndi.label(seeds)[0]
96 |
97 | ### Expansion by Watershed
98 |
99 | from skimage.segmentation import watershed
100 | ws = watershed(img_smooth, seeds_labeled)
101 |
102 |
103 | ## Postprocessing: Removing Cells at the Image Border
104 |
105 | border_mask = np.zeros(ws.shape, dtype=bool)
106 | border_mask = ndi.binary_dilation(border_mask, border_value=1)
107 |
108 | clean_ws = np.copy(ws)
109 |
110 | for cell_ID in np.unique(ws):
111 | cell_mask = ws==cell_ID
112 | cell_border_overlap = np.logical_and(cell_mask, border_mask)
113 | total_overlap_pixels = np.sum(cell_border_overlap)
114 | if total_overlap_pixels > 0:
115 | clean_ws[cell_mask] = 0
116 |
117 | for new_ID, cell_ID in enumerate(np.unique(clean_ws)[1:]):
118 | clean_ws[clean_ws==cell_ID] = new_ID+1
119 |
120 |
121 | ## Identifying Cell Edges
122 |
123 | edges = np.zeros_like(clean_ws)
124 |
125 | for cell_ID in np.unique(clean_ws)[1:]:
126 | cell_mask = clean_ws==cell_ID
127 | eroded_cell_mask = ndi.binary_erosion(cell_mask, iterations=1)
128 | edge_mask = np.logical_xor(cell_mask, eroded_cell_mask)
129 | edges[edge_mask] = cell_ID
130 |
131 |
132 | ## Extracting Quantitative Measurements
133 |
134 | results = {"cell_id" : [],
135 | "int_mean" : [],
136 | "int_mem_mean" : [],
137 | "cell_area" : [],
138 | "cell_edge" : []}
139 |
140 | for cell_id in np.unique(clean_ws)[1:]:
141 | cell_mask = clean_ws==cell_id
142 | edge_mask = edges==cell_id
143 | results["cell_id"].append(cell_id)
144 | results["int_mean"].append(np.mean(img[cell_mask]))
145 | results["int_mem_mean"].append(np.mean(img[edge_mask]))
146 | results["cell_area"].append(np.sum(cell_mask))
147 | results["cell_edge"].append(np.sum(edge_mask))
148 |
149 |
150 | ## Returning the results
151 | return clean_ws, results
152 |
153 |
--------------------------------------------------------------------------------
/example_data/example_cells_1.tif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WhoIsJack/python-bioimage-analysis-tutorial/1d2473994c0151d8b83f0385f007425ad4c7a055/example_data/example_cells_1.tif
--------------------------------------------------------------------------------
/example_data/example_cells_2.tif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WhoIsJack/python-bioimage-analysis-tutorial/1d2473994c0151d8b83f0385f007425ad4c7a055/example_data/example_cells_2.tif
--------------------------------------------------------------------------------
/example_data/not_a_valid_input_file.txt:
--------------------------------------------------------------------------------
1 |
2 | This file exists to illustrate a point regarding batch processing. Just leave it here.
--------------------------------------------------------------------------------
/image_analysis_tutorial.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Image Analysis with Python - Tutorial Pipeline\n",
8 | "\n",
9 | "*originally created in 2016*
\n",
10 | "*updated and converted to a Jupyter notebook in 2017*
\n",
11 | "*updated and converted to python 3 in 2018*
\n",
12 | "*by Jonas Hartmann (Gilmour group, EMBL Heidelberg)*
"
13 | ]
14 | },
15 | {
16 | "cell_type": "markdown",
17 | "metadata": {},
18 | "source": [
19 | "## Table of Contents\n",
20 | "\n",
21 | "1. [About this Tutorial](#about)\n",
22 | "2. [Setup](#setup)\n",
23 | "3. [Importing Modules & Packages](#import)\n",
24 | "4. [Loading & Handling Image Data](#load)\n",
25 | "5. [Preprocessing](#prepro)\n",
26 | "6. [Manual Thresholding & Threshold Detection](#thresh)\n",
27 | "7. [Adaptive Thresholding](#adaptive)\n",
28 | "8. [Improving Masks with Binary Morphology](#morpho)\n",
29 | "9. [Connected Components Labeling](#label)\n",
30 | "10. [Cell Segmentation by Seeding & Expansion](#seg)\n",
31 | "11. [Postprocessing: Removing Cells at the Image Border](#postpro)\n",
32 | "12. [Identifying Cell Edges](#edges)\n",
33 | "13. [Extracting Quantitative Measurements](#measure)\n",
34 | "14. [Simple Analysis & Visualization](#analysis)\n",
35 | "15. [Writing Output to Files](#write)\n",
36 | "16. [Batch Processing](#batch)"
37 | ]
38 | },
39 | {
40 | "cell_type": "markdown",
41 | "metadata": {},
42 | "source": [
43 | "## About this Tutorial \n",
44 | "\n",
45 | "*This tutorial aims to teach the basics of (bio-)image processing with python, in particular the analysis of fluorescence microscopy data. The tutorial is self-explanatory and follows a \"learning by doing\" philosophy. It consists of step by step instructions that guide students through the construction of a 2D single-cell segmentation pipeline.*\n",
46 | "\n",
47 | "\n",
48 | "#### Instructions\n",
49 | "\n",
50 | "- This notebook contains detailed instructions on how to program a pipeline that can segment cells in 2D fluorescence microscopy images of a developing tissue.\n",
51 | "\n",
52 | "\n",
53 | "- Simply go through the instructions step by step and try to implement each step as best as you can.\n",
54 | " - By doing so, you will learn about key concepts of (bio-)image processing with python!\n",
55 | "\n",
56 | "\n",
57 | "- If you are stuck...\n",
58 | " - ...first think some more about the problem and see if you can get yourself unstuck.\n",
59 | " - ...search the internet (in particular Stack Overflow) for a solution to your problem.\n",
60 | " - ...if you are working through this tutorial in class, ask one of the tutors for help.\n",
61 | " - ...if nothing else helps, you can have a look at the solutions (`image_analysis_tutorial_solutions.py`) for inspiration.\n",
62 | " - **Beware:** If you simply copy a solution, you will learn *nothing* in the process! Use the solutions only as an inspiration for implementing your own version and make sure you fully understand how the solution works!\n",
63 | " \n",
64 | " \n",
65 | "- Some exercises are labeled as 'BONUS' exercises. It is up to you whether you'd like to complete them. \n",
66 | " - If things are going slowly, you may want to skip them and come back to them later.\n",
67 | "\n",
68 | "\n",
69 | "#### Background \n",
70 | "\n",
71 | "The aim is to construct a pipeline for the identification and segmentation of cells in 2D confocal fluorescence microscopy images of a tissue with labeled membranes. This is among the most common tasks in bio-image analysis and is often essential for the extraction of useful quantitative information from microscopy data.\n",
72 | "\n",
73 | "The pipeline is constructed based on provided example images (see `example_data` directory), which are single-color spinning-disk confocal micrographs (objective: `40X 1.2NA W`) of cells in live zebrafish embryos in early development (~10h post fertilization), fluorescently labeled with a membrane-localized fusion protein (`mNeonGreen:Ggamma9`)."
74 | ]
75 | },
76 | {
77 | "cell_type": "markdown",
78 | "metadata": {},
79 | "source": [
80 | "## Setup \n",
81 | "\n",
82 | "#### Development Environment\n",
83 | "\n",
84 | "- Depending on your preferences, there are several alternatives for implementing this pipeline:\n",
85 | " 1. Implement it directly in this Jupyter notebook *[recommended]*.\n",
86 | " 2. Implement it in your favorite Integrated Development Environment (IDE) like PyCharm.\n",
87 | " 3. Old School: Implement it in a text editor (like Vim...) and run it on the terminal.\n",
88 | "\n",
89 | "\n",
90 | "#### Example Data\n",
91 | "\n",
92 | "Make sure that you have downloaded the example image data (`example_cells_1.tif` and `example_cells_2.tif`), which should be located in the directory `example_data`, which in turn should be in the same directory as this notebook.\n",
93 | "\n",
94 | "\n",
95 | "#### Python version\n",
96 | "\n",
97 | "- The solutions were last tested with **python 3.9**, but any version of python 3 should work without changes\n",
98 | "- To install python, jupyter notebook and the required modules, we recommend the **Anaconda Distribution**. The installer can be downloaded [here](https://www.anaconda.com/download/).\n",
99 | "\n",
100 | "\n",
101 | "#### Required modules\n",
102 | "\n",
103 | "- Make sure the following modules are installed before you get started:\n",
104 | " - numpy\n",
105 | " - scipy\n",
106 | " - matplotlib\n",
107 | " - scikit-image\n",
108 | " - ipywidgets\n",
109 | "- All required modules come pre-installed if you are using the **[Anaconda distribution](https://www.anaconda.com/download/)** of python."
110 | ]
111 | },
112 | {
113 | "cell_type": "markdown",
114 | "metadata": {},
115 | "source": [
116 | "*Everything ready? Yes? Then let's get started!*\n",
117 | "\n",
118 | "----"
119 | ]
120 | },
121 | {
122 | "cell_type": "markdown",
123 | "metadata": {},
124 | "source": [
125 | "## Importing Modules & Packages "
126 | ]
127 | },
128 | {
129 | "cell_type": "markdown",
130 | "metadata": {},
131 | "source": [
132 | "Let's start by importing the package NumPy, which enables the manipulation of numerical arrays:"
133 | ]
134 | },
135 | {
136 | "cell_type": "code",
137 | "execution_count": null,
138 | "metadata": {},
139 | "outputs": [],
140 | "source": [
141 | "import numpy as np"
142 | ]
143 | },
144 | {
145 | "cell_type": "markdown",
146 | "metadata": {},
147 | "source": [
148 | "*Important note: If you are not at all familiar with arrays and NumPy, we strongly recommend that you first complete an introductory tutorial on this topic before carrying on!*"
149 | ]
150 | },
151 | {
152 | "cell_type": "markdown",
153 | "metadata": {},
154 | "source": [
155 | "Recall that, once imported, we can use functions/modules from the package, for example to create an array:"
156 | ]
157 | },
158 | {
159 | "cell_type": "code",
160 | "execution_count": null,
161 | "metadata": {},
162 | "outputs": [],
163 | "source": [
164 | "a = np.array([1, 2, 3])\n",
165 | "\n",
166 | "print(a)\n",
167 | "print(type(a))"
168 | ]
169 | },
170 | {
171 | "cell_type": "markdown",
172 | "metadata": {},
173 | "source": [
174 | "Note that the package is imported under a variable name (here `np`). You can freely choose this name yourself. For example, it would be just as valid (but not as convenient) to write:\n",
175 | "\n",
176 | "```python\n",
177 | "import numpy as lovelyArrayTool\n",
178 | "a = lovelyArrayTool.array([1,2,3])\n",
179 | "```"
180 | ]
181 | },
182 | {
183 | "cell_type": "markdown",
184 | "metadata": {},
185 | "source": [
186 | "#### Exercise\n",
187 | "\n",
188 | "Using the import command as above, follow the instructions in the comments below to import two additional modules that we will be using frequently in this pipeline."
189 | ]
190 | },
191 | {
192 | "cell_type": "code",
193 | "execution_count": null,
194 | "metadata": {},
195 | "outputs": [],
196 | "source": [
197 | "# The plotting module matplotlib.pyplot as plt\n",
198 | "### YOUR CODE HERE!\n",
199 | "\n",
200 | "# The image processing module scipy.ndimage as ndi\n",
201 | "### YOUR CODE HERE!"
202 | ]
203 | },
204 | {
205 | "cell_type": "markdown",
206 | "metadata": {},
207 | "source": [
208 | "#### Side Note for Jupyter Notebook Users\n",
209 | "\n",
210 | "You can configure how the figures made by matplotlib are displayed.\n",
211 | "\n",
212 | "The most common options are the following:\n",
213 | "\n",
214 | "- **inline**: displays as static figure in code cell output\n",
215 | "- **notebook**: displays as interactive figure in code cell output \n",
216 | "- **qt**: displays as interactive figure in a separate window\n",
217 | "\n",
218 | "Feel free to test them out on one of the figures you will generate later on in the tutorial. The code cell below shows how to set the different options. Note that combinations of different options in the same notebook do not always work well, so it is best to decide for one and use it throughout. You may need to restart the kernel (`Kernel > Restart`) when you change from one option to another."
219 | ]
220 | },
221 | {
222 | "cell_type": "code",
223 | "execution_count": null,
224 | "metadata": {},
225 | "outputs": [],
226 | "source": [
227 | "# Set matplotlib backend\n",
228 | "%matplotlib inline \n",
229 | "#%matplotlib notebook \n",
230 | "#%matplotlib qt "
231 | ]
232 | },
233 | {
234 | "cell_type": "markdown",
235 | "metadata": {},
236 | "source": [
237 | "## Loading & Handling Image Data "
238 | ]
239 | },
240 | {
241 | "cell_type": "markdown",
242 | "metadata": {},
243 | "source": [
244 | "#### Background\n",
245 | "\n",
246 | "Images are essentially just numbers (representing intensity) in an ordered grid of pixels. Image processing is simply to carry out mathematical operations on these numbers.\n",
247 | "\n",
248 | "The ideal object for storing and manipulating ordered grids of numbers is the **array**. Many mathematical operations are well defined on arrays and can be computed quickly by vector-based computation.\n",
249 | "\n",
250 | "Arrays can have any number of dimensions (or \"axes\"). For example, a 2D array could represent the x and y axis of a grayscale image (xy), a 3D array could contain a z-stack (zyx), a 4D array could also have multiple channels for each image (czyx) and a 5D array could have time on top of that (tczyx)."
251 | ]
252 | },
253 | {
254 | "cell_type": "markdown",
255 | "metadata": {},
256 | "source": [
257 | "#### Exercise\n",
258 | "\n",
259 | "We will now proceed to load one of the example images and verify that we get what we expect. \n",
260 | "\n",
261 | "Note: Before starting, it always makes sense to have a quick look at the data in Fiji/ImageJ so you know what you are working with!\n",
262 | "\n",
263 | "Follow the instructions in the comments below."
264 | ]
265 | },
266 | {
267 | "cell_type": "code",
268 | "execution_count": null,
269 | "metadata": {},
270 | "outputs": [],
271 | "source": [
272 | "# (i) Specify the directory path and file name\n",
273 | "\n",
274 | "# Create a string variable with the name of the file you'd like to load (here: 'example_cells_1.tif').\n",
275 | "# Suggested name for the variable: filename\n",
276 | "# Note: Paths and filenames can contain slashes, empty spaces and other special symbols, which can cause \n",
277 | "# trouble for programming languages under certain circumstances. To circumvent such trouble, add \n",
278 | "# the letter r before your string definition to create a so-called 'raw string', which is not\n",
279 | "# affected by these problems (e.g. `my_raw_string = r\"some string with funny symbols: \\\\\\!/~***!\"`).\n",
280 | "### YOUR CODE HERE!\n",
281 | "\n",
282 | "# If the file is not in the current working directory, you must also have a way of specifying the path\n",
283 | "# to the directory where the file is stored. Most likely, your example images are stored in a directory\n",
284 | "# called 'example_data' in the same folder as this notebook. Note that you can use either the full path\n",
285 | "# - something like r\"/home/jack/data/python_image_analysis/example_data\"\n",
286 | "# or the relative path, starting from the current working directory\n",
287 | "# - here that would just be r\"example_data\"\n",
288 | "# Create a string variable with the path to the directory that contains the file you'd like to load.\n",
289 | "# Suggested name for the variable: dirpath\n",
290 | "### YOUR CODE HERE!"
291 | ]
292 | },
293 | {
294 | "cell_type": "code",
295 | "execution_count": null,
296 | "metadata": {},
297 | "outputs": [],
298 | "source": [
299 | "# (ii) Combine the directory path and file name into one variable, the file path\n",
300 | "\n",
301 | "# Import the function 'join' from the module 'os.path'\n",
302 | "# This function automatically takes care of the slashes that need to be added when combining two paths.\n",
303 | "### YOUR CODE HERE!\n",
304 | "\n",
305 | "# Use the 'join' function to combine the directory path with the file name and create a new variable.\n",
306 | "# Print the result to see that everything is correct (this is always a good idea!)\n",
307 | "# Suggested name for the variable: filepath\n",
308 | "### YOUR CODE HERE!"
309 | ]
310 | },
311 | {
312 | "cell_type": "code",
313 | "execution_count": null,
314 | "metadata": {},
315 | "outputs": [],
316 | "source": [
317 | "# (iii) Load the image\n",
318 | "\n",
319 | "# Import the function 'imread' from the module 'skimage.io'.\n",
320 | "# (Note: If this gives you an error, please refer to the note below!)\n",
321 | "### YOUR CODE HERE!\n",
322 | "\n",
323 | "# Load 'example_cells_1.tif' and store it in a variable.\n",
324 | "# Suggested name for the variable: img\n",
325 | "### YOUR CODE HERE!"
326 | ]
327 | },
328 | {
329 | "cell_type": "markdown",
330 | "metadata": {},
331 | "source": [
332 | "----\n",
333 | "\n",
334 | "*Important note for those who get an error when trying to import `imread` from `skimage.io`:*\n",
335 | "\n",
336 | "Some users have been experiencing problems with this module, even though the rest of skimage is installed correctly (running `import skimage` does not given an error). This may have something to do with operating system preferences. The easiest solution in this case is to install the module `tifffile` (with three`f`) and use the function `imread` from that module (it is identical to the `imread` function of `skimage.io` when reading `tif` files). \n",
337 | "\n",
338 | "The `tifffile` module does not come with the Anaconda distribution, so it's likely that you don't have it installed. To install it, save and exit Jupyter notebook, then go to a terminal and type `conda install -c conda-forge tifffile`. After the installation is complete, restart Jupyter notebook, come back here and import `imread` from `tifffile`. This should now hopefully work.\n",
339 | "\n",
340 | "----"
341 | ]
342 | },
343 | {
344 | "cell_type": "code",
345 | "execution_count": null,
346 | "metadata": {},
347 | "outputs": [],
348 | "source": [
349 | "# (iv) Check that everything is in order\n",
350 | "\n",
351 | "# Check that 'img' is a variable of type 'ndarray' - use Python's built-in function 'type'.\n",
352 | "### YOUR CODE HERE!\n",
353 | "\n",
354 | "# Print the shape of the array by looking at its 'shape' attribute. \n",
355 | "# Make sure you understand the output!\n",
356 | "### YOUR CODE HERE!\n",
357 | "\n",
358 | "# Check the datatype of the individual numbers in the array. You can use the array attribute 'dtype' to do so.\n",
359 | "# Make sure you understand the output!\n",
360 | "### YOUR CODE HERE!"
361 | ]
362 | },
363 | {
364 | "cell_type": "code",
365 | "execution_count": null,
366 | "metadata": {},
367 | "outputs": [],
368 | "source": [
369 | "# (v) Look at the image to confirm that everything worked as intended\n",
370 | "\n",
371 | "# To plot an array as an image, use pyplot's functions 'plt.imshow' followed by 'plt.show'. \n",
372 | "# Check the documentation for 'plt.imshow' and note the parameters that can be specified, such as colormap (cmap)\n",
373 | "# and interpolation. Since you are working with scientific data, interpolation is unwelcome, so you should set it \n",
374 | "# to \"none\". The most common cmap for grayscale images is naturally \"gray\".\n",
375 | "# You may also want to adjust the size of the figure. You can do this by preparing the figure canvas with\n",
376 | "# the function 'plt.figure' before calling 'plt.imshow'. The canvas size is adjusted using the keyword argument\n",
377 | "# 'figsize' when calling 'plt.figure'.\n",
378 | "### YOUR CODE HERE!"
379 | ]
380 | },
381 | {
382 | "cell_type": "markdown",
383 | "metadata": {},
384 | "source": [
385 | "## Preprocessing "
386 | ]
387 | },
388 | {
389 | "cell_type": "markdown",
390 | "metadata": {},
391 | "source": [
392 | "#### Background\n",
393 | "\n",
394 | "The goal of image preprocessing is to prepare or optimize the images to make further analysis easier. Usually, this boils down to increasing the signal-to-noise ratio by removing noise and background and by enhancing structures of interest.\n",
395 | "\n",
396 | "The specific preprocessing steps used in a pipeline depend on the type of sample, the microscopy technique used, the image quality, and the desired downstream analysis. \n",
397 | "\n",
398 | "The most common operations include:\n",
399 | "\n",
400 | "\n",
401 | "- Deconvolution\n",
402 | " - Image reconstruction based on information about the PSF of the microscope\n",
403 | " - These days deconvolution is often included with microscope software\n",
404 | " - *Our example images are not deconvolved, but will do just fine regardless*\n",
405 | "\n",
406 | "\n",
407 | "- Conversion to 8-bit images to save memory / computational time\n",
408 | " - *Our example images are already 8-bit*\n",
409 | "\n",
410 | "\n",
411 | "- Cropping of images to an interesting region\n",
412 | " - *The field of view in our example images is fine as it is*\n",
413 | "\n",
414 | "\n",
415 | "- Smoothing of technical noise\n",
416 | " - This is a very common step and usually helps to improve almost any type of downstream analysis\n",
417 | " - Commonly used filters are the `Gaussian filter` and the `median filter`\n",
418 | " - *Here we will be using a Gaussian filter.*\n",
419 | "\n",
420 | "\n",
421 | "- Corrections of technical artifacts\n",
422 | " - Common examples are uneven illumination and multi-channel bleed-through\n",
423 | " - *Here we will deal with uneven signal by adaptive/local thresholding*\n",
424 | "\n",
425 | "\n",
426 | "- Background subtraction\n",
427 | " - There are various ways of sutracting background signal from an image\n",
428 | " - Two different types are commonly distinguished:\n",
429 | " - `uniform background subtraction` treats all regions of the image the same\n",
430 | " - `adaptive or local background subtraction` automatically accounts for differences between regions of the image\n",
431 | " - *Here we will do something similar to adaptive background subtraction when we do adaptive thresholding*"
432 | ]
433 | },
434 | {
435 | "cell_type": "markdown",
436 | "metadata": {},
437 | "source": [
438 | "#### Gaussian Smoothing\n",
439 | "\n",
440 | "A Gaussian filter smoothens an image by convolving it with a Gaussian-shaped kernel. In the case of a 2D image, the Gaussian kernel is also 2D and will look something like this:\n",
441 | "\n",
442 | "
\n",
443 | "\n",
444 | "How much the image is smoothed by a Gaussian kernel is determined by the standard deviation of the Gaussian distribution, usually referred to as **sigma** ($\\sigma$). A higher $\\sigma$ means a broader distribution and thus more smoothing.\n",
445 | "\n",
446 | "**How to choose the correct value of $\\sigma$?**\n",
447 | "\n",
448 | "This depends a lot on your images, in particular on the pixel size. In general, the chosen $\\sigma$ should be large enough to blur out noise but small enough so the \"structures of interest\" do not get blurred too much. Usually, the best value for $\\sigma$ is simply found by trying out some different options and looking at the result. "
449 | ]
450 | },
451 | {
452 | "cell_type": "markdown",
453 | "metadata": {},
454 | "source": [
455 | "#### Exercise\n",
456 | "\n",
457 | "Perform Gaussian smoothing and visualize the result.\n",
458 | "\n",
459 | "Follow the instructions in the comments below."
460 | ]
461 | },
462 | {
463 | "cell_type": "code",
464 | "execution_count": null,
465 | "metadata": {},
466 | "outputs": [],
467 | "source": [
468 | "# (i) Create a variable for the smoothing factor sigma, which should be an integer value\n",
469 | "\n",
470 | "### YOUR CODE HERE!\n",
471 | "\n",
472 | "# After implementing the Gaussian smoothing function below, you can modify this variable \n",
473 | "# to find the ideal value of sigma."
474 | ]
475 | },
476 | {
477 | "cell_type": "code",
478 | "execution_count": null,
479 | "metadata": {},
480 | "outputs": [],
481 | "source": [
482 | "# (ii) Perform the smoothing on the image\n",
483 | "\n",
484 | "# To do so, use the Gaussian filter function 'ndi.gaussian_filter' from the \n",
485 | "# image processing module 'scipy.ndimage', which was imported at the start of the tutorial. \n",
486 | "# Check out the documentation of scipy to see how to use this function. \n",
487 | "# Allocate the output to a new variable.\n",
488 | "### YOUR CODE HERE!"
489 | ]
490 | },
491 | {
492 | "cell_type": "code",
493 | "execution_count": null,
494 | "metadata": {},
495 | "outputs": [],
496 | "source": [
497 | "# (iii) Visualize the result using 'plt.imshow'\n",
498 | "\n",
499 | "# Compare with the original image visualized above. \n",
500 | "# Does the output make sense? Is this what you expected? \n",
501 | "# Can you optimize sigma such that the image looks smooth without blurring the membranes too much?\n",
502 | "### YOUR CODE HERE!\n",
503 | "\n",
504 | "# To have a closer look at a specific region of the image, crop that region out and show it in a \n",
505 | "# separate plot. Remember that you can crop arrays by \"indexing\" or \"slicing\" them similar to lists.\n",
506 | "# Use such \"zoomed-in\" views throughout this tutorial to take a closer look at your intermediate \n",
507 | "# results when necessary.\n",
508 | "### YOUR CODE HERE!"
509 | ]
510 | },
511 | {
512 | "cell_type": "code",
513 | "execution_count": null,
514 | "metadata": {},
515 | "outputs": [],
516 | "source": [
517 | "# (iv) BONUS: Show the raw and smoothed images side by side using 'plt.subplots'\n",
518 | "\n",
519 | "### YOUR CODE HERE!"
520 | ]
521 | },
522 | {
523 | "cell_type": "markdown",
524 | "metadata": {},
525 | "source": [
526 | "## Manual Thresholding & Threshold Detection "
527 | ]
528 | },
529 | {
530 | "cell_type": "markdown",
531 | "metadata": {},
532 | "source": [
533 | "#### Background\n",
534 | "\n",
535 | "The easiest way to distinguish foreground objects (here: membranes) from the image background is to threshold the image, meaning all pixels with an intensity above a certain threshold are accepted as foreground, all others are set as background.\n",
536 | "\n",
537 | "To find the best threshold for a given image, one option is to simply try out different thresholds manually. Alternatively, one of many algorithms for automated 'threshold detection' can be used. These algorithms use information about the image (such as the histogram) to automatically find a suitable threshold value, often under the assumption that the background and foreground pixels in an image belong to two clearly distinct populations in terms of their intensity. \n",
538 | "\n",
539 | "There are many different algorithms for threshold detection and it is often hard to predict which one will produce the nicest and most robust result for a particular dataset. It therefore makes sense to try out a bunch of different options.\n",
540 | "\n",
541 | "For this pipeline, we will ultimately use a more advanced thresholding approach, which also accounts (to some extent) for variations in signal across the field of view: adaptive thresholding. \n",
542 | "\n",
543 | "But first, let's experiment a bit with threshold detection."
544 | ]
545 | },
546 | {
547 | "cell_type": "markdown",
548 | "metadata": {},
549 | "source": [
550 | "#### Exercise\n",
551 | "\n",
552 | "Try out manual thresholding and automated threshold detection.\n",
553 | "\n",
554 | "Follow the instructions in the comments below."
555 | ]
556 | },
557 | {
558 | "cell_type": "code",
559 | "execution_count": null,
560 | "metadata": {},
561 | "outputs": [],
562 | "source": [
563 | "# (i) Create a variable for a manually set threshold, which should be an integer\n",
564 | "\n",
565 | "# This can be changed later to find a suitable value.\n",
566 | "### YOUR CODE HERE!"
567 | ]
568 | },
569 | {
570 | "cell_type": "code",
571 | "execution_count": null,
572 | "metadata": {},
573 | "outputs": [],
574 | "source": [
575 | "# (ii) Perform thresholding on the smoothed image\n",
576 | "\n",
577 | "# Remember that you can use relational (Boolean) expressions such as 'smaller' (<), 'equal' (==)\n",
578 | "# or 'greater or equal' (>=) with numpy arrays - and you can directly assign the result to a new\n",
579 | "# variable.\n",
580 | "### YOUR CODE HERE!\n",
581 | "\n",
582 | "# Check the dtype of your thresholded image\n",
583 | "# You should see that the dtype is 'bool', which stands for 'Boolean' and means the array\n",
584 | "# is now simply filled with 'True' and 'False', where 'True' is the foreground (the regions\n",
585 | "# above the threshold) and 'False' is the background.\n",
586 | "### YOUR CODE HERE!"
587 | ]
588 | },
589 | {
590 | "cell_type": "code",
591 | "execution_count": null,
592 | "metadata": {},
593 | "outputs": [],
594 | "source": [
595 | "# (iii) Visualize the result\n",
596 | "\n",
597 | "### YOUR CODE HERE!"
598 | ]
599 | },
600 | {
601 | "cell_type": "code",
602 | "execution_count": null,
603 | "metadata": {},
604 | "outputs": [],
605 | "source": [
606 | "# (iv) Try out different thresholds to find the best one\n",
607 | "\n",
608 | "# If you are using jupyter notebook, you can adapt the code below to\n",
609 | "# interactively change the threshold and look for the best one. These\n",
610 | "# kinds of interactive functions are called 'widgets' and are very \n",
611 | "# useful in exploratory data analysis to create greatly simplified\n",
612 | "# 'User Interfaces' (UIs) on the fly.\n",
613 | "# As a BONUS exercise, try to understand or look up how the widget works\n",
614 | "# and play around with it a bit!\n",
615 | "# (Note: If this just displays a static image without a slider to adjust\n",
616 | "# the threshold or if it displays a text warning about activating\n",
617 | "# the 'widgetsnbextension', check out the note below!)\n",
618 | "\n",
619 | "# Prepare widget\n",
620 | "from ipywidgets import interact\n",
621 | "@interact(thresh=(10,250,10))\n",
622 | "def select_threshold(thresh=100):\n",
623 | " \n",
624 | " # Thresholding\n",
625 | " ### ADAPT THIS: Change 'img_smooth' into the variable you stored the smoothed image in!\n",
626 | " mem = img_smooth > thresh\n",
627 | " \n",
628 | " # Visualization\n",
629 | " plt.figure(figsize=(7,7))\n",
630 | " plt.imshow(mem, interpolation='none', cmap='gray')\n",
631 | " plt.show()"
632 | ]
633 | },
634 | {
635 | "cell_type": "markdown",
636 | "metadata": {},
637 | "source": [
638 | "----\n",
639 | "\n",
640 | "*Important note for those who get a static image (no slider) or a text warning:*\n",
641 | "\n",
642 | "For some users, it is necessary to specifically activate the widgets plugin for Jupyter notebook. To do so, save and exit Jupyter notebook, then go to a terminal (or Anaconda prompt) and write `jupyter nbextension enable --py --sys-prefix widgetsnbextension`. After this, you should be able to restart Jupyter notebook and the widget should display correctly. \n",
643 | "\n",
644 | "If it still doesn't work, you may instead have to type `jupyter nbextension enable --py widgetsnbextension` in the terminal (or Anaconda prompt). However, note that this implies that your installation of Conda/Jupyter is not optimally configured (see [this GitHub issue](https://github.com/jupyter-widgets/ipywidgets/issues/541) for more information, although this is not something you necessarily need to worry about in the context of this course).\n",
645 | "\n",
646 | "----"
647 | ]
648 | },
649 | {
650 | "cell_type": "code",
651 | "execution_count": null,
652 | "metadata": {},
653 | "outputs": [],
654 | "source": [
655 | "# (v) Perfom automated threshold detection with Otsu's method\n",
656 | "\n",
657 | "# The scikit-image module 'skimage.filters.thresholding' provides\n",
658 | "# several threshold detection algorithms. The most popular one \n",
659 | "# among them is Otsu's method. Using what you've learned so far,\n",
660 | "# import the 'threshold_otsu' function, use it to automatically \n",
661 | "# determine a threshold for the smoothed image, apply the threshold,\n",
662 | "# and visualize the result.\n",
663 | "### YOUR CODE HERE!"
664 | ]
665 | },
666 | {
667 | "cell_type": "code",
668 | "execution_count": null,
669 | "metadata": {},
670 | "outputs": [],
671 | "source": [
672 | "# (vi) BONUS: Did you notice the 'try_all_threshold' function?\n",
673 | "\n",
674 | "# That's convenient! Use it to automatically test the threshold detection\n",
675 | "# functions in 'skimage.filters.thresholding'. Don't forget to adjust the\n",
676 | "# 'figsize' parameter so the resulting images are clearly visible.\n",
677 | "### YOUR CODE HERE!"
678 | ]
679 | },
680 | {
681 | "cell_type": "markdown",
682 | "metadata": {},
683 | "source": [
684 | "## Adaptive Thresholding "
685 | ]
686 | },
687 | {
688 | "cell_type": "markdown",
689 | "metadata": {},
690 | "source": [
691 | "#### Background\n",
692 | "\n",
693 | "Simply applying a fixed intensity threshold does not always produce a foreground mask of sufficiently high quality, since background and foreground intensities often vary across the image. In our example image, for instance, the intensity drops at the image boundaries - a problem that cannot be resolved just by changing the threshold value.\n",
694 | "\n",
695 | "One way of addressing this issue is to use an *adaptive thresholding* algorithm, which adjusts the threshold locally in different regions of the image to account for varying intensities.\n",
696 | "\n",
697 | "Although `scikit-image` provides a function for adaptive thresholding (called `threshold_local`), we will here implement our own version, which is slightly different and will hopefully make the concept of adaptive thresholding very clear.\n",
698 | "\n",
699 | "Our approach to adaptive tresholding works in two steps:\n",
700 | "\n",
701 | "1. Generation of a \"background image\"\n",
702 | "\n",
703 | " This image should - across the entire image - always have higher intensities than the local background but lower intensities than the local foreground. This can be achieved by strong blurring/smoothing of the image, as illustrated in this 1D example:\n",
704 | "\n",
705 | " 
\n",
706 | " \n",
707 | "2. Thresholding of the original image with the background\n",
708 | "\n",
709 | " Instead of thresholding with a single value, every pixel in the image is thresholded with the corresponding pixel of the \"background image\"."
710 | ]
711 | },
712 | {
713 | "cell_type": "markdown",
714 | "metadata": {},
715 | "source": [
716 | "#### Exercise \n",
717 | "\n",
718 | "Implement the two steps of the adaptive background subtraction:\n",
719 | "\n",
720 | "1. Use a strong \"mean filter\" (aka \"uniform filter\") to create the background image. This simply assigns each pixel the average value of its local neighborhood. Just like the Gaussian blur, this can be done by convolution, but this time using a \"uniform kernel\" like this one:\n",
721 | "\n",
722 | "
\n",
723 | " \n",
724 | " To define which pixels should be considered as the local neighborhood of a given pixel, a `structuring element` (`SE`) is used. This is a small binary image where all pixels set to `1` will be considered as part of the neighborhood and all pixels set to `0` will not be considered. Here, we use a disc-shaped `SE`, as this reduces artifacts compared to a square `SE`.\n",
725 | " \n",
726 | " *Side note:* A strong Gaussian blur would also work to create the background mask. For the Gaussian blur, the analogy to the `SE` is the `sigma` value, which in a way also determines the size of the local neighborhood.
\n",
727 | "\n",
728 | "2. Use the background image for thresholding. In practical terms, this works in exactly the same way as thresholding with a single value, since numpy arrays will automatically perform element-wise (pixel-by-pixel) comparisons when compared to other arrays of the same shape by a relational (Boolean) expression.\n",
729 | "\n",
730 | "Follow the instructions in the comments below."
731 | ]
732 | },
733 | {
734 | "cell_type": "code",
735 | "execution_count": null,
736 | "metadata": {},
737 | "outputs": [],
738 | "source": [
739 | "# Step 1\n",
740 | "# ------\n",
741 | "\n",
742 | "# (i) Create a disk-shaped structuring element and asign it to a new variable.\n",
743 | "\n",
744 | "# Structuring elements are small binary images that indicate which pixels \n",
745 | "# should be considered as the 'neighborhood' of the central pixel. \n",
746 | "#\n",
747 | "# An example of a small disk-shaped SE would be this:\n",
748 | "# 0 0 1 0 0\n",
749 | "# 0 1 1 1 0\n",
750 | "# 1 1 1 1 1\n",
751 | "# 0 1 1 1 0\n",
752 | "# 0 0 1 0 0\n",
753 | "#\n",
754 | "# The expression below creates such structuring elements. \n",
755 | "# It is an elegant but complicated piece of code and at the moment it is not \n",
756 | "# necessary for you to understand it in detail. Use it to create structuring \n",
757 | "# elements of different sizes (by changing 'i') and find a way to visualize \n",
758 | "# the result (remember that the SE is just a small 'image').\n",
759 | "# \n",
760 | "# Try to answer the following questions: \n",
761 | "# - Is the resulting SE really circular? \n",
762 | "# - Could certain values of 'i' cause problems? If so, why?\n",
763 | "# - What value of 'i' should be used for the SE?\n",
764 | "# Note that, similar to the sigma in Gaussian smoothing, the size of the SE\n",
765 | "# is first estimated based on the images and by thinking about what would \n",
766 | "# make sense. Later it can be optimized by trial and error.\n",
767 | "\n",
768 | "# Create SE\n",
769 | "i = ???\n",
770 | "struct = (np.mgrid[:i,:i][0] - np.floor(i/2))**2 + (np.mgrid[:i,:i][1] - np.floor(i/2))**2 <= np.floor(i/2)**2\n",
771 | "\n",
772 | "# Visualize the result\n",
773 | "### YOUR CODE HERE!"
774 | ]
775 | },
776 | {
777 | "cell_type": "code",
778 | "execution_count": null,
779 | "metadata": {},
780 | "outputs": [],
781 | "source": [
782 | "# (ii) Create the background\n",
783 | "\n",
784 | "# Run a mean filter over the image using the disc SE and assign the output to a new variable.\n",
785 | "# Use the function 'skimage.filters.rank.mean'.\n",
786 | "### YOUR CODE HERE!"
787 | ]
788 | },
789 | {
790 | "cell_type": "code",
791 | "execution_count": null,
792 | "metadata": {},
793 | "outputs": [],
794 | "source": [
795 | "# (iii) Visualize the resulting background image. Does what you get make sense?\n",
796 | "\n",
797 | "### YOUR CODE HERE!"
798 | ]
799 | },
800 | {
801 | "cell_type": "code",
802 | "execution_count": null,
803 | "metadata": {},
804 | "outputs": [],
805 | "source": [
806 | "# Step 2\n",
807 | "# ------\n",
808 | "\n",
809 | "# (iv) Threshold the Gaussian-smoothed original image against the background image created in step 1 \n",
810 | "# using a relational expression\n",
811 | "\n",
812 | "### YOUR CODE HERE!"
813 | ]
814 | },
815 | {
816 | "cell_type": "code",
817 | "execution_count": null,
818 | "metadata": {},
819 | "outputs": [],
820 | "source": [
821 | "# (v) Visualize and understand the output. \n",
822 | "\n",
823 | "### YOUR CODE HERE!\n",
824 | "\n",
825 | "# What do you observe? \n",
826 | "# Are you happy with this result as a membrane segmentation? \n",
827 | "# Adapt the size of the circular SE to optimize the result!"
828 | ]
829 | },
830 | {
831 | "cell_type": "markdown",
832 | "metadata": {},
833 | "source": [
834 | "## Improving Masks with Binary Morphology "
835 | ]
836 | },
837 | {
838 | "cell_type": "markdown",
839 | "metadata": {},
840 | "source": [
841 | "#### Background\n",
842 | "\n",
843 | "Morphological operations such as `erosion`, `dilation`, `closing` and `opening` are common tools used to improve masks after they are generated by thresholding. They can be used to fill small holes, remove noise, increase or decrease the size of an object, or smoothen mask outlines.\n",
844 | "\n",
845 | "Most morphological operations are once again simple kernel functions that are applied at each pixel of the image based on their neighborhood as defined by a `structuring element` (`SE`). For example, `dilation` simply assigns to the central pixel the maximum pixel value within the neighborhood; it is a maximum filter. Conversely, `erosion` is a minimum filter. Additional options emerge from combining the two: `morphological closing`, for example, is a `dilation` followed by an `erosion`. This is used to fill in gaps and holes or smoothing mask outlines without significantly changing the mask's area. Finally, there are also some more complicated morphological operations, such as `hole filling`."
846 | ]
847 | },
848 | {
849 | "cell_type": "markdown",
850 | "metadata": {},
851 | "source": [
852 | "#### Exercise\n",
853 | "\n",
854 | "Improve the membrane segmentation from above with morphological operations.\n",
855 | "\n",
856 | "Specifically, use `binary hole filling` to get rid of the speckles of foreground pixels that litter the insides of the cells. Furthermore, try different other types of morphological filtering to see how they change the image and to see if you can improve the membrane mask even more, e.g. by filling in gaps.\n",
857 | "\n",
858 | "Follow the instructions in the comments below. Visualize all intermediate results of your work and remember to \"zoom in\" to get a closer look by slicing out and then plotting a subsection of the image array."
859 | ]
860 | },
861 | {
862 | "cell_type": "code",
863 | "execution_count": null,
864 | "metadata": {},
865 | "outputs": [],
866 | "source": [
867 | "# (i) Get rid of speckles using binary hole filling\n",
868 | "\n",
869 | "# Use the function 'ndi.binary_fill_holes' for this. Be sure to check the docs to\n",
870 | "# understand exactly what it does. For this to work as intended, you will have to \n",
871 | "# invert the mask, which you can do using the function `np.logical_not` or the\n",
872 | "# corresponding operator '~'. Again, be sure to understand why this has to be done\n",
873 | "# and don't forget to revert the result back.\n",
874 | "\n",
875 | "### YOUR CODE HERE!"
876 | ]
877 | },
878 | {
879 | "cell_type": "code",
880 | "execution_count": null,
881 | "metadata": {},
882 | "outputs": [],
883 | "source": [
884 | "# (ii) Try out other morphological operations to further improve the membrane mask\n",
885 | "\n",
886 | "# The various operations are available in the ndimage module, for example 'ndi.binary_closing'.\n",
887 | "# Play around and see how the different functions affect the mask. Can you optimize the mask, \n",
888 | "# for example by closing gaps?\n",
889 | "# Note that the default SE for these functions is a square. Feel free to create another disc-\n",
890 | "# shaped SE and see how that changes the outcome.\n",
891 | "# BONUS: If you pay close attention, you will notice that some of these operations introduce \n",
892 | "# artefacts at the image boundaries. Can you come up with a way of solving this? (Hint: 'np.pad')\n",
893 | "\n",
894 | "### YOUR CODE HERE!"
895 | ]
896 | },
897 | {
898 | "cell_type": "code",
899 | "execution_count": null,
900 | "metadata": {},
901 | "outputs": [],
902 | "source": [
903 | "# (iii) Visualize the final result\n",
904 | "\n",
905 | "### YOUR CODE HERE\n",
906 | "\n",
907 | "# At this point you should have a pretty neat membrane mask.\n",
908 | "# If you are not satisfied with the quality your membrane segmentation, you should go back \n",
909 | "# and fine-tune the size of the SE in the adaptive thresholding section and also optimize\n",
910 | "# the morphological cleaning operations.\n",
911 | "# Note that the quality of the membrane segmentation will have a significant impact on the \n",
912 | "# cell segmentation we will perform next."
913 | ]
914 | },
915 | {
916 | "cell_type": "markdown",
917 | "metadata": {},
918 | "source": [
919 | "## Connected Components Labeling "
920 | ]
921 | },
922 | {
923 | "cell_type": "markdown",
924 | "metadata": {},
925 | "source": [
926 | "#### Background\n",
927 | "\n",
928 | "Based on the membrane segmentation, we can get a preliminary segmentation of the cells in the image by considering each background region surrounded by membranes as a cell. This can already be good enough for many simple measurements.\n",
929 | "\n",
930 | "The only thing we still need to do in order to get there is to label each cell individually. Only if each separate cell has a unique number (an `ID`) assigned, values such as the mean intensity can be measured and analyzed at the single-cell level.\n",
931 | "\n",
932 | "The approach used to achieve this is called `connected components labeling`. It gives every connected group of foreground pixels a unique `ID` number."
933 | ]
934 | },
935 | {
936 | "cell_type": "markdown",
937 | "metadata": {},
938 | "source": [
939 | "#### Exercise\n",
940 | "\n",
941 | "Use your membrane segmentation for connected components labeling.\n",
942 | "\n",
943 | "Follow the instructions in the comments below."
944 | ]
945 | },
946 | {
947 | "cell_type": "code",
948 | "execution_count": null,
949 | "metadata": {},
950 | "outputs": [],
951 | "source": [
952 | "# (i) Label connected components\n",
953 | "\n",
954 | "# Use the function 'ndi.label' from the 'ndimage' module. \n",
955 | "# Note that this function labels foreground pixels (1s, not 0s), so you may need \n",
956 | "# to invert your membrane mask just as for hole filling above.\n",
957 | "# Also, note that 'ndi.label' returns another result in addition to the labeled \n",
958 | "# image. Read up on this in the function's documentation and make sure you don't\n",
959 | "# mix up the two outputs!\n",
960 | "\n",
961 | "### YOUR CODE HERE!"
962 | ]
963 | },
964 | {
965 | "cell_type": "code",
966 | "execution_count": null,
967 | "metadata": {},
968 | "outputs": [],
969 | "source": [
970 | "# (ii) Visualize the output\n",
971 | "\n",
972 | "# Here, it is no longer ideal to use a 'gray' colormap, since we want to visualize that each\n",
973 | "# cell has a unique ID. Play around with different colormaps (check the docs to see what\n",
974 | "# types of colormaps are available) and choose one that you are happy with.\n",
975 | "\n",
976 | "### YOUR CODE HERE!\n",
977 | "\n",
978 | "# Take a close look at the picture and note mistakes in the segmentation. Depending on the\n",
979 | "# quality of your membrane mask, there will most likely be some cells that are 'fused', meaning \n",
980 | "# two or more cells are labeled as the same cell; this is called \"under-segmentation\". \n",
981 | "# We will resolve this issue in the next step. Note that our downstream pipeline does not involve \n",
982 | "# any steps to resolve \"over-segmentation\" (i.e. a cell being wrongly split into multiple labeled\n",
983 | "# areas), so you should tune your membrane mask such that this is not a common problem."
984 | ]
985 | },
986 | {
987 | "cell_type": "markdown",
988 | "metadata": {},
989 | "source": [
990 | "## Cell Segmentation by Seeding & Expansion "
991 | ]
992 | },
993 | {
994 | "cell_type": "markdown",
995 | "metadata": {},
996 | "source": [
997 | "#### Background\n",
998 | "\n",
999 | "The segmentation we achieved by membrane masking and connected components labeling is a good start. We could for example use it to measure the fluorescence intensity in each cell's cytoplasm. However, we cannot use it to measure intensities at the membrane of the cells, nor can we use it to accurately measure features like cell shape or size.\n",
1000 | "\n",
1001 | "To improve this (and to resolve cases of under-segmentation), we can use a \"seeding & expansion\" strategy. Expansion algorithms such as the `watershed` start from a small `seed` and \"grow outward\" until they touch the boundaries of neighboring cells, which are themselves growing outward from neighboring seeds. Since the \"growth rate\" at the edge of the growing areas is dependent on image intensity (higher intensity means slower expansion), these expansion methods end up tracing the cells' outlines."
1002 | ]
1003 | },
1004 | {
1005 | "cell_type": "markdown",
1006 | "metadata": {},
1007 | "source": [
1008 | "### Seeding by Distance Transform"
1009 | ]
1010 | },
1011 | {
1012 | "cell_type": "markdown",
1013 | "metadata": {},
1014 | "source": [
1015 | "#### Background\n",
1016 | "\n",
1017 | "A `seed image` contains a few pixels at the center of each cell labeled by a unique `ID` number and surrounded by zeros. The expansion algorithm will start from these central pixels and grow outward until all zeros are overwritten by an `ID` label. In the case of `watershed` expansion, one can imagine the `seeds` as the sources from which water pours into the cells and starts filling them up.\n",
1018 | "\n",
1019 | "For multi-channel images that contain a nuclear label, it is common practice to mask the nuclei by thresholding and use an eroded version of the nuclei as seeds for cell segmentation. However, there are good alternative seeding approaches for cases where nuclei are not available or not nicely separable by thresholding.\n",
1020 | "\n",
1021 | "Here, we will use a `distance transform` for seeding. In a `distance transform`, each pixel in the foreground (here the cells) is assigned a value corresponding to its distance from the closest background pixel (here the membrane segmentation). In other words, we encode within the image how far each pixel of a cell is away from the membrane (see figure below). The pixels furthest away from the membrane will be at the center of the cells and will have the highest values. Using a function to detect `local maxima`, we will find these high-value peaks and use them as seeds for our segmentation.\n",
1022 | "\n",
1023 | "
\n",
1024 | "\n",
1025 | "One big advantage of this approach is that it will create two separate seeds even if two cells are connected by a hole in the membrane segmentation. Thus, under-segmentation artifacts will be reduced."
1026 | ]
1027 | },
1028 | {
1029 | "cell_type": "markdown",
1030 | "metadata": {},
1031 | "source": [
1032 | "#### Exercise \n",
1033 | "\n",
1034 | "Find seeds using the distance transform approach.\n",
1035 | "\n",
1036 | "This involves the following three steps:\n",
1037 | "\n",
1038 | "1. Run the distance transform on your membrane mask.\n",
1039 | "\n",
1040 | "2. Due to irregularities in the membrane shape, the distance transform may have some smaller local maxima in addition to those at the center of the cells. This will lead to additional seeds, which will lead to over-segmentation. To resolve this problem, smoothen the distance transform using Gaussian smoothing. \n",
1041 | "\n",
1042 | "3. Find the seeds by detecting local maxima. Optimize the seeding by changing the amount of smoothing done in step 2, aiming to have exactly one seed for each cell (although this may not be perfectly achievable).\n",
1043 | "\n",
1044 | "Follow the instructions in the comments below."
1045 | ]
1046 | },
1047 | {
1048 | "cell_type": "code",
1049 | "execution_count": null,
1050 | "metadata": {},
1051 | "outputs": [],
1052 | "source": [
1053 | "# (i) Run a distance transform on the membrane mask\n",
1054 | "\n",
1055 | "# Use the function 'ndi.distance_transform_edt'.\n",
1056 | "# You may need to invert your membrane mask so the distances are computed on\n",
1057 | "# the cells, not on the membranes.\n",
1058 | "### YOUR CODE HERE!"
1059 | ]
1060 | },
1061 | {
1062 | "cell_type": "code",
1063 | "execution_count": null,
1064 | "metadata": {},
1065 | "outputs": [],
1066 | "source": [
1067 | "# (ii) Visualize the output and understand what you are seeing.\n",
1068 | "\n",
1069 | "### YOUR CODE HERE!"
1070 | ]
1071 | },
1072 | {
1073 | "cell_type": "code",
1074 | "execution_count": null,
1075 | "metadata": {},
1076 | "outputs": [],
1077 | "source": [
1078 | "# (iii) Smoothen the distance transform\n",
1079 | "\n",
1080 | "# Use 'ndi.gaussian_filter' to do so.\n",
1081 | "# You will have to optimize your choice of 'sigma' based on the outcome below.\n",
1082 | "### YOUR CODE HERE!"
1083 | ]
1084 | },
1085 | {
1086 | "cell_type": "code",
1087 | "execution_count": null,
1088 | "metadata": {},
1089 | "outputs": [],
1090 | "source": [
1091 | "# (iv) Retrieve the local maxima (the 'peaks') from the distance transform\n",
1092 | "\n",
1093 | "# Use the function 'peak_local_max' from the module 'skimage.feature'. By default, this function will return the\n",
1094 | "# indices (aka coordinates) of the pixels where the local maxima are. However, we instead need a boolean mask of \n",
1095 | "# the same shape as the original image, where all the local maximum pixels are labeled as `1` and everything else \n",
1096 | "# as `0`. The documentation for `peak_local_max` shows how to convert the coordinates into a Boolean mask using\n",
1097 | "# numpy (see the last of the examples in the docs). This is a bit of a technical detail, though, so feel free to\n",
1098 | "# copy the conversion from the solutions.\n",
1099 | "\n",
1100 | "# Retrieve peak coordinates\n",
1101 | "### YOUR CODE HERE!\n",
1102 | "\n",
1103 | "# Convert coords to mask as per skimage documentation\n",
1104 | "### YOUR CODE HERE!"
1105 | ]
1106 | },
1107 | {
1108 | "cell_type": "code",
1109 | "execution_count": null,
1110 | "metadata": {},
1111 | "outputs": [],
1112 | "source": [
1113 | "# (v) Visualize the output as an overlay on the raw (or smoothed) image\n",
1114 | "\n",
1115 | "# If you just look at the local maxima image, it will simply look like a bunch of distributed dots.\n",
1116 | "# To get an idea if the seeds are well-placed, you will need to overlay these dots onto the original image.\n",
1117 | "\n",
1118 | "# To do this, it is important to first understand a key point about how the 'pyplot' module works: \n",
1119 | "# every plotting command is slapped on top of the previous plotting commands, until everything is ultimately \n",
1120 | "# shown when 'plt.show' is called. Hence, you can first plot the raw (or smoothed) input image and then\n",
1121 | "# plot the seeds on top of it before showing both with 'plt.show'.\n",
1122 | "\n",
1123 | "# As you can see if you try this, you will not get the desired result because the zero values in seed array\n",
1124 | "# are painted in black over the image you want in the background. To solve this problem, you need to mask \n",
1125 | "# these zero values before plotting the seeds. You can do this by creating an appropriately masked array\n",
1126 | "# using the function 'np.ma.array' with the keyword argument 'mask'. \n",
1127 | "# Check the docs or Stack Overflow to figure out how to do this.\n",
1128 | "\n",
1129 | "# BONUS: As an additional improvement for the visualization, use 'ndi.maximum_filter' to dilate the \n",
1130 | "# seeds a little bit, making them bigger and thus better visible.\n",
1131 | "\n",
1132 | "### YOUR CODE HERE!"
1133 | ]
1134 | },
1135 | {
1136 | "cell_type": "code",
1137 | "execution_count": null,
1138 | "metadata": {},
1139 | "outputs": [],
1140 | "source": [
1141 | "# (vi) Optimize the seeding\n",
1142 | "\n",
1143 | "# Ideally, there should be exactly one seed for each cell.\n",
1144 | "# If you are not satisfied with your seeding, go back to the smoothing step above and optimize 'sigma'\n",
1145 | "# to get rid of additional maxima. You can also try using the keyword argument 'min_distance' in \n",
1146 | "# 'peak_local_max' to solve cases where there are multiple small seeds at the center of a cell. Note \n",
1147 | "# that good seeding is essential for a good segmentation with an expansion algorithm. However, no \n",
1148 | "# segmentation is perfect, so it's okay if a few cells end up being oversegmented."
1149 | ]
1150 | },
1151 | {
1152 | "cell_type": "code",
1153 | "execution_count": null,
1154 | "metadata": {},
1155 | "outputs": [],
1156 | "source": [
1157 | "# (vii) Label the seeds\n",
1158 | "\n",
1159 | "# Use connected component labeling to give each cell seed a unique ID number.\n",
1160 | "### YOUR CODE HERE!\n",
1161 | "\n",
1162 | "# Visualize the final result (the labeled seeds) as an overlay on the raw (or smoothed) image\n",
1163 | "### YOUR CODE HERE!"
1164 | ]
1165 | },
1166 | {
1167 | "cell_type": "markdown",
1168 | "metadata": {},
1169 | "source": [
1170 | "### Expansion by Watershed"
1171 | ]
1172 | },
1173 | {
1174 | "cell_type": "markdown",
1175 | "metadata": {},
1176 | "source": [
1177 | "#### Background\n",
1178 | "\n",
1179 | "To achieve a cell segmentation, the `seeds` now need to be expanded outward until they follow the outline of the cell. The most commonly used expansion algorithm is the `watershed`.\n",
1180 | "\n",
1181 | "Imagine the intensity in the raw/smoothed image as a topographical height profile; high-intensity regions are peaks, low-intensity regions are valleys. In this representation, cells are deep valleys (with the seeds at the center), enclosed by mountains. As the name suggests, the `watershed` algorithm can be understood as the gradual filling of this landscape with water, starting from the seed. As the water level rises, the seed expands - until it finally reaches the 'crest' of the cell membrane 'mountain range'. Here, the water would flow over into the neighboring valley, but since that valley is itself filled up with water from the neighboring cell's seed, the two water surfaces touch and the expansion stops.\n",
1182 | "\n",
1183 | "
"
1184 | ]
1185 | },
1186 | {
1187 | "cell_type": "markdown",
1188 | "metadata": {},
1189 | "source": [
1190 | "#### Exercise\n",
1191 | "\n",
1192 | "Expand your seeds by means of a watershed expansion.\n",
1193 | "\n",
1194 | "Follow the instructions in the comments below."
1195 | ]
1196 | },
1197 | {
1198 | "cell_type": "code",
1199 | "execution_count": null,
1200 | "metadata": {},
1201 | "outputs": [],
1202 | "source": [
1203 | "# (i) Perform watershed\n",
1204 | "\n",
1205 | "# Use the function 'watershed' from the module 'skimage.segmentation'.\n",
1206 | "# Use the labeled cell seeds and the smoothed membrane image as input.\n",
1207 | "### YOUR CODE HERE!"
1208 | ]
1209 | },
1210 | {
1211 | "cell_type": "code",
1212 | "execution_count": null,
1213 | "metadata": {},
1214 | "outputs": [],
1215 | "source": [
1216 | "# (ii) Show the result as transparent overlay over the smoothed input image\n",
1217 | "\n",
1218 | "# Like the masked overlay of the seeds, this can be achieved by making two calls to 'imshow',\n",
1219 | "# one for the background image and one for the segmentation. Instead of masking away background,\n",
1220 | "# this time you simply make the segmentation image semi-transparent by adjusting the keyword \n",
1221 | "# argument 'alpha' of the 'imshow' function, which specifies opacity.\n",
1222 | "# Be sure to choose an appropriate colormap that allows you to distinguish the segmented cells\n",
1223 | "# even if cells with a very similar ID are next to each other (I would recommend 'prism').\n",
1224 | "### YOUR CODE HERE!"
1225 | ]
1226 | },
1227 | {
1228 | "cell_type": "markdown",
1229 | "metadata": {},
1230 | "source": [
1231 | "#### *A Note on Segmentation Quality*\n",
1232 | "\n",
1233 | "This concludes the segmentation of the cells in the example image. Depending on the quality you achieved in each step along the way, the final segmentation may be of greater or lesser quality (in terms of over-/under-segmentation errors).\n",
1234 | "\n",
1235 | "It should be noted that the segmentation will likely *never* be perfect, as there is usually a trade-off between over- and undersegmentation.\n",
1236 | "\n",
1237 | "This raises an important question: ***When should I stop trying to optimize my segmentation?***\n",
1238 | "\n",
1239 | "There is no absolute answer to this question but the best answer is probably this: ***When you can use it to address your biological questions!***\n",
1240 | "\n",
1241 | "*Importantly, this implies that you should already have relatively clear questions in mind when you are working on the segmentation!*"
1242 | ]
1243 | },
1244 | {
1245 | "cell_type": "markdown",
1246 | "metadata": {},
1247 | "source": [
1248 | "## Postprocessing: Removing Cells at the Image Border "
1249 | ]
1250 | },
1251 | {
1252 | "cell_type": "markdown",
1253 | "metadata": {},
1254 | "source": [
1255 | "#### Background\n",
1256 | "\n",
1257 | "Since segmentation is never perfect, it often makes sense to explicitely remove artifacts afterwards. For example, one could filter out objects that are too small, have a very strange shape, or very strange intensity values. \n",
1258 | "\n",
1259 | "**Warning:** Filtering out objects is equivalent to the *removal of outliers* in data analysis and *should only be done for good reason and with caution!*\n",
1260 | "\n",
1261 | "As an example of postprocessing, we will now filter out a particular group of problematic cells: those that are being cut off at the image border."
1262 | ]
1263 | },
1264 | {
1265 | "cell_type": "markdown",
1266 | "metadata": {},
1267 | "source": [
1268 | "#### Exercise \n",
1269 | "\n",
1270 | "Iterate through all the cells in your segmentation and remove those touching the image border.\n",
1271 | "\n",
1272 | "Follow the instructions in the comments below. Note that the instructions will get a little less specific from here on, so you need to figure out how to approach a problem yourself."
1273 | ]
1274 | },
1275 | {
1276 | "cell_type": "code",
1277 | "execution_count": null,
1278 | "metadata": {},
1279 | "outputs": [],
1280 | "source": [
1281 | "# (i) Create an image border mask\n",
1282 | "\n",
1283 | "# We need some way to check if a cell is at the border. For this, we generate a 'mask' of the image border,\n",
1284 | "# i.e. a Boolean array of the same size as the image where only the border pixels are set to `1` and all \n",
1285 | "# others to `0`, like this:\n",
1286 | "# 1 1 1 1 1\n",
1287 | "# 1 0 0 0 1\n",
1288 | "# 1 0 0 0 1\n",
1289 | "# 1 0 0 0 1\n",
1290 | "# 1 1 1 1 1\n",
1291 | "# There are multiple ways of generating this mask, for example by erosion or by array indexing.\n",
1292 | "# It is up to you to find a way to do it.\n",
1293 | "\n",
1294 | "### YOUR CODE HERE!"
1295 | ]
1296 | },
1297 | {
1298 | "cell_type": "code",
1299 | "execution_count": null,
1300 | "metadata": {},
1301 | "outputs": [],
1302 | "source": [
1303 | "# (ii) 'Delete' the cells at the border\n",
1304 | "\n",
1305 | "# When modifying a segmentation (in this case by deleting some cells), it makes sense\n",
1306 | "# to work on a copy of the array, not on the original. This avoids unexpected behaviors,\n",
1307 | "# especially within jupyter notebooks. Use the function 'np.copy' to copy an array.\n",
1308 | "### YOUR CODE HERE!\n",
1309 | "\n",
1310 | "# Iterate over the IDs of all the cells in the segmentation. Use a for-loop and the \n",
1311 | "# function 'np.unique' (remember that each cell in our segmentation is labeled with a \n",
1312 | "# different integer value).\n",
1313 | "### YOUR CODE HERE!\n",
1314 | "\n",
1315 | " # Create a mask that contains only the 'current' cell of the iteration\n",
1316 | " # Hint: Remember that the comparison of an array with some number (array==number)\n",
1317 | " # returns a Boolean mask of the pixels in 'array' whose value is 'number'.\n",
1318 | " ### YOUR CODE HERE!\n",
1319 | " \n",
1320 | " # Using the cell mask and the border mask from above, test if the cell has pixels touching \n",
1321 | " # the image border or not.\n",
1322 | " # Hint: 'np.logical_and'\n",
1323 | " ### YOUR CODE HERE!\n",
1324 | "\n",
1325 | " # If a cell touches the image boundary, delete it by setting its pixels in the segmentation to 0.\n",
1326 | " ### YOUR CODE HERE!\n"
1327 | ]
1328 | },
1329 | {
1330 | "cell_type": "code",
1331 | "execution_count": null,
1332 | "metadata": {},
1333 | "outputs": [],
1334 | "source": [
1335 | "# OPTIONAL: re-label the remaining cells to keep the numbering consistent from 1 to N (with 0 as background).\n",
1336 | "\n",
1337 | "### YOUR CODE HERE!"
1338 | ]
1339 | },
1340 | {
1341 | "cell_type": "code",
1342 | "execution_count": null,
1343 | "metadata": {},
1344 | "outputs": [],
1345 | "source": [
1346 | "# (iii) Visualize the result\n",
1347 | "\n",
1348 | "# Show the result as transparent overlay over the raw or smoothed image. \n",
1349 | "# Here you have to combine alpha (to make cells transparent) and 'np.ma.array'\n",
1350 | "# (to hide empty space where the border cells were deleted).\n",
1351 | "\n",
1352 | "### YOUR CODE HERE!"
1353 | ]
1354 | },
1355 | {
1356 | "cell_type": "markdown",
1357 | "metadata": {},
1358 | "source": [
1359 | "## Identifying Cell Edges "
1360 | ]
1361 | },
1362 | {
1363 | "cell_type": "markdown",
1364 | "metadata": {},
1365 | "source": [
1366 | "#### Background\n",
1367 | "\n",
1368 | "With the final segmentation in hand, we can now start to think about measurements and data analysis. However, to extract interesting measurements from our cells, the segmentation on its own is often not enough: additional masks that identify sub-regions for each cell allow more precise and more biologically relevant measurements.\n",
1369 | "\n",
1370 | "The most useful example of this is an additional mask that identifies only the edge pixels of each cell. This is useful for a number of purposes, including:\n",
1371 | "\n",
1372 | "- Edge intensity is a good measure of membrane intensity, which is often a desired readout.\n",
1373 | "- The intensity profile along the edge may contain information on cell polarity.\n",
1374 | "- The length of the edge (relative to the cell area) is an informative feature about the cell shape. \n",
1375 | "- Showing colored edges is a nice way of visualizing cell segmentations.\n",
1376 | "\n",
1377 | "There are many ways of identifying edge pixels in a fully labeled segmentation. Here, we will use a simple and relatively fast method based on erosion."
1378 | ]
1379 | },
1380 | {
1381 | "cell_type": "markdown",
1382 | "metadata": {},
1383 | "source": [
1384 | "#### Exercise \n",
1385 | "\n",
1386 | "Create a labeled mask of cell edges by following these steps:\n",
1387 | "\n",
1388 | "\n",
1389 | "- Create an array of the same size and data type as the segmentation but filled with only zeros\n",
1390 | " - This will be your final cell edge mask; you gradually add cell edges as you iterate over cells\n",
1391 | " \n",
1392 | "\n",
1393 | "- *For each cell...*\n",
1394 | " - Erode the cell's mask by 1 pixel\n",
1395 | " - Using the eroded mask and the original mask, create a new mask of only the cell's edge pixels\n",
1396 | " - Add the cell's edge pixels into the empty image generated above, labeling them with the cell's original ID number\n",
1397 | "\n",
1398 | "\n",
1399 | "Follow the instructions in the comments below."
1400 | ]
1401 | },
1402 | {
1403 | "cell_type": "code",
1404 | "execution_count": null,
1405 | "metadata": {},
1406 | "outputs": [],
1407 | "source": [
1408 | "# (i) Create an array of the same size and data type as the segmentation but filled with only zeros\n",
1409 | "\n",
1410 | "### YOUR CODE HERE!"
1411 | ]
1412 | },
1413 | {
1414 | "cell_type": "code",
1415 | "execution_count": null,
1416 | "metadata": {},
1417 | "outputs": [],
1418 | "source": [
1419 | "# (ii) Iterate over the cell IDs\n",
1420 | "### YOUR CODE HERE!\n",
1421 | "\n",
1422 | " # (iii) Erode the cell's mask by 1 pixel\n",
1423 | " # Hint: 'ndi.binary_erode'\n",
1424 | " ### YOUR CODE HERE!\n",
1425 | " \n",
1426 | " # (iv) Create the cell edge mask\n",
1427 | " # Hint: 'np.logical_xor'\n",
1428 | " ### YOUR CODE HERE!\n",
1429 | " \n",
1430 | " # (v) Add the cell edge mask to the empty array generated above, labeling it with the cell's ID\n",
1431 | " ### YOUR CODE HERE!"
1432 | ]
1433 | },
1434 | {
1435 | "cell_type": "code",
1436 | "execution_count": null,
1437 | "metadata": {},
1438 | "outputs": [],
1439 | "source": [
1440 | "# (vi) Visualize the result\n",
1441 | "\n",
1442 | "# Note: Because the lines are so thin (1pxl wide), they may not be displayed correctly in small figures.\n",
1443 | "# You can 'zoom in' by showing a sub-region of the image which is then rendered bigger. You can\n",
1444 | "# also go back to the edge identification code and make the edges multiple pixels wide (but keep \n",
1445 | "# in mind that this will have an effect on your quantification results!).\n",
1446 | "\n",
1447 | "### YOUR CODE HERE!"
1448 | ]
1449 | },
1450 | {
1451 | "cell_type": "markdown",
1452 | "metadata": {},
1453 | "source": [
1454 | "## Extracting Quantitative Measurements "
1455 | ]
1456 | },
1457 | {
1458 | "cell_type": "markdown",
1459 | "metadata": {},
1460 | "source": [
1461 | "#### Background\n",
1462 | "\n",
1463 | "The ultimate goal of image segmentation is of course the extraction of quantitative measurements, in this case on a single-cell level. Measures of interest can be based on intensity (in different channels) or on the size and shape of the cells.\n",
1464 | "\n",
1465 | "To exemplify how different properties of cells can be measured, we will extract the following:\n",
1466 | "\n",
1467 | "- Cell ID (so all other measurements can be traced back to the cell that was measured)\n",
1468 | "- Mean intensity of each cell\n",
1469 | "- Mean intensity at the membrane of each cell\n",
1470 | "- The cell area, i.e. the number of pixels that make up the cell\n",
1471 | "- The cell outline length, i.e. the number of pixels that make up the cell edge\n",
1472 | "\n",
1473 | "*Note: It makes sense to use smoothed/filtered/background-subtracted images for segmentation. When it comes to measurements, however, it's best to get back to the raw data!*"
1474 | ]
1475 | },
1476 | {
1477 | "cell_type": "markdown",
1478 | "metadata": {},
1479 | "source": [
1480 | "#### Exercise\n",
1481 | "\n",
1482 | "Extract the measurements listed above for each cell and collect them in a dictionary.\n",
1483 | "\n",
1484 | "Note: The ideal data structure for data like this is the `DataFrame` offered by the module `Pandas`. However, for the sake of simplicity, we will here stick with a dictionary of lists.\n",
1485 | "\n",
1486 | "Follow the instructions in the comments below."
1487 | ]
1488 | },
1489 | {
1490 | "cell_type": "code",
1491 | "execution_count": null,
1492 | "metadata": {},
1493 | "outputs": [],
1494 | "source": [
1495 | "# (i) Create a dictionary that contains a key-value pairing for each measurement\n",
1496 | "\n",
1497 | "# The keys should be strings describing the type of measurement (e.g. 'intensity_mean') and \n",
1498 | "# the values should be empty lists. These empty lists will be filled with the results of the\n",
1499 | "# measurements.\n",
1500 | "\n",
1501 | "### YOUR CODE HERE!"
1502 | ]
1503 | },
1504 | {
1505 | "cell_type": "code",
1506 | "execution_count": null,
1507 | "metadata": {},
1508 | "outputs": [],
1509 | "source": [
1510 | "# (ii) Record the measurements for each cell\n",
1511 | "\n",
1512 | "# Iterate over the segmented cells ('np.unique').\n",
1513 | "# Inside the loop, create a mask for the current cell and use it to extract the measurements listed above. \n",
1514 | "# Add them to the appropriate list in the dictionary using the 'append' method.\n",
1515 | "# Hint: Remember that you can get out all the values within a masked area by indexing the image \n",
1516 | "# with the mask. For example, 'np.mean(image[cell_mask])' will return the mean of all the \n",
1517 | "# intensity values of 'image' that are masked by 'cell_mask'!\n",
1518 | "\n",
1519 | "### YOUR CODE HERE!"
1520 | ]
1521 | },
1522 | {
1523 | "cell_type": "code",
1524 | "execution_count": null,
1525 | "metadata": {},
1526 | "outputs": [],
1527 | "source": [
1528 | "# (iii) Print the results and check that they make sense\n",
1529 | "\n",
1530 | "### YOUR CODE HERE!"
1531 | ]
1532 | },
1533 | {
1534 | "cell_type": "markdown",
1535 | "metadata": {},
1536 | "source": [
1537 | "## Simple Analysis & Visualisation "
1538 | ]
1539 | },
1540 | {
1541 | "cell_type": "markdown",
1542 | "metadata": {},
1543 | "source": [
1544 | "#### Background\n",
1545 | "\n",
1546 | "By extracting quantitative measurements from an image we cross over from 'image analysis' to 'data analysis'. \n",
1547 | "\n",
1548 | "This section briefly explains how to do basic data analysis and plotting, including boxplots, scatterplots and linear fits. It also showcases how to map data back onto the image, creating an \"image-based heatmap\"."
1549 | ]
1550 | },
1551 | {
1552 | "cell_type": "markdown",
1553 | "metadata": {},
1554 | "source": [
1555 | "#### Exercise\n",
1556 | "\n",
1557 | "Analyze and plot the extracted data in a variety of ways.\n",
1558 | "\n",
1559 | "Follow the instructions in the comments below."
1560 | ]
1561 | },
1562 | {
1563 | "cell_type": "code",
1564 | "execution_count": null,
1565 | "metadata": {},
1566 | "outputs": [],
1567 | "source": [
1568 | "# (i) Familiarize yourself with the data structure of the results dict and summarize the results\n",
1569 | "\n",
1570 | "# Recall that dictionaries are unordered; a dataset of interest is accessed through its key.\n",
1571 | "# In our case, the datasets inside the dict are lists of values, ordered in the same order\n",
1572 | "# as the cell IDs. \n",
1573 | "\n",
1574 | "# For each dataset in the results dict, print its name (the key) along with its mean, standard \n",
1575 | "# deviation, maximum, minimum, and median. The appropriate numpy methods (e.g. 'np.median') work\n",
1576 | "# with lists just as well as with arrays.\n",
1577 | "\n",
1578 | "### YOUR CODE HERE!"
1579 | ]
1580 | },
1581 | {
1582 | "cell_type": "code",
1583 | "execution_count": null,
1584 | "metadata": {},
1585 | "outputs": [],
1586 | "source": [
1587 | "# (ii) Create a box plot showing the mean cell and mean membrane intensities for both channels. \n",
1588 | "\n",
1589 | "# Use the function 'plt.boxplot'. Use the 'label' keyword of 'plt.boxplot' to label the x axis with \n",
1590 | "# the corresponding key names. Feel free to play around with the various options of the boxplot \n",
1591 | "# function to make your plot look nicer. Remember that you can first call 'plt.figure' to adjust \n",
1592 | "# settings such as the size of the plot.\n",
1593 | "\n",
1594 | "### YOUR CODE HERE!"
1595 | ]
1596 | },
1597 | {
1598 | "cell_type": "code",
1599 | "execution_count": null,
1600 | "metadata": {},
1601 | "outputs": [],
1602 | "source": [
1603 | "# (iii) Create a scatter plot of cell outline length over cell area\n",
1604 | "\n",
1605 | "# Use the function 'plt.scatter' for this. Be sure to properly label the \n",
1606 | "# plot using 'plt.xlabel' and 'plt.ylabel'.\n",
1607 | "### YOUR CODE HERE!\n",
1608 | "\n",
1609 | "# BONUS: Do you understand why you are seeing the pattern this produces? Can you\n",
1610 | "# generate a 'null model' curve that assumes all cells to be circular? What is\n",
1611 | "# the result? Do you notice something odd about it? What could be the reason for\n",
1612 | "# this and how could it be fixed?\n",
1613 | "### YOUR CODE HERE!"
1614 | ]
1615 | },
1616 | {
1617 | "cell_type": "code",
1618 | "execution_count": null,
1619 | "metadata": {},
1620 | "outputs": [],
1621 | "source": [
1622 | "# (iv) Perform a linear fit of membrane intensity over cell area\n",
1623 | "\n",
1624 | "# Use the function 'linregress' from the module 'scipy.stats'. Be sure to read the docs to\n",
1625 | "# understand the output of this function. Print the output.\n",
1626 | "\n",
1627 | "### YOUR CODE HERE!"
1628 | ]
1629 | },
1630 | {
1631 | "cell_type": "code",
1632 | "execution_count": null,
1633 | "metadata": {},
1634 | "outputs": [],
1635 | "source": [
1636 | "# (v) Think about the result\n",
1637 | "\n",
1638 | "# Note that the fit seems to return a highly significant p-value but a very low correlation \n",
1639 | "# coefficient (r-value). Based on prior knowledge, we would not expect a linear correlation of \n",
1640 | "# this sort to be present in our data. \n",
1641 | "#\n",
1642 | "# This should prompt several questions:\n",
1643 | "# 1) What does this p-value actually mean? Check the docs of 'linregress'!\n",
1644 | "# 2) Could there be artifacts in our segmentation that bias this analysis?\n",
1645 | "#\n",
1646 | "# In general, it's always good to be very careful when doing any kind of data analysis. Make sure you \n",
1647 | "# understand the functions you are using and always check for possible errors or sources of bias!"
1648 | ]
1649 | },
1650 | {
1651 | "cell_type": "code",
1652 | "execution_count": null,
1653 | "metadata": {},
1654 | "outputs": [],
1655 | "source": [
1656 | "# (vi) Overlay the linear fit onto a scatter plot\n",
1657 | "\n",
1658 | "# Recall that a linear function is defined by `y = slope * x + intercept`.\n",
1659 | "\n",
1660 | "# To define the line you'd like to plot, you need two values of x (the starting point and\n",
1661 | "# and the end point of the line). What values of x make sense? Can you get them automatically?\n",
1662 | "### YOUR CODE HERE!\n",
1663 | "\n",
1664 | "# When you have the x-values for the starting point and end point, get the corresponding y \n",
1665 | "# values from the fit through the equation above.\n",
1666 | "### YOUR CODE HERE!\n",
1667 | "\n",
1668 | "# Plot the line with 'plt.plot'. Adjust the line's properties so it is well visible.\n",
1669 | "# Note: Remember that you have to create the scatterplot before plotting the line so that\n",
1670 | "# the line will be placed on top of the scatterplot.\n",
1671 | "### YOUR CODE HERE!\n",
1672 | "\n",
1673 | "# Use 'plt.legend' to add information about the line to the plot.\n",
1674 | "### YOUR CODE HERE!\n",
1675 | "\n",
1676 | "# Label the plot and finally show it with 'plt.show'.\n",
1677 | "### YOUR CODE HERE!"
1678 | ]
1679 | },
1680 | {
1681 | "cell_type": "code",
1682 | "execution_count": null,
1683 | "metadata": {},
1684 | "outputs": [],
1685 | "source": [
1686 | "# (vii) Map the cell area back onto the image as a 'heatmap'\n",
1687 | "\n",
1688 | "# Scale the cell area data to 8bit so that it can be used as pixel intensity values.\n",
1689 | "# Hint: if the largest cell area should correspond to the value 255 in uint8, then \n",
1690 | "# the other cell areas correspond to 'cell_area * 255 / largest_cell_area'.\n",
1691 | "# Hint: To perform an operation on all cell area values at once, convert the list \n",
1692 | "# of cell areas to a numpy array.\n",
1693 | "### YOUR CODE HERE!\n",
1694 | "\n",
1695 | "# Initialize a new image array; all values should be zeros, the shape should be identical \n",
1696 | "# to the images we worked with before and the dtype should be uint8.\n",
1697 | "### YOUR CODE HERE!\n",
1698 | "\n",
1699 | "# Iterate over the segmented cells. In addition to the cell IDs, the for-loop should\n",
1700 | "# also include a simple counter (starting from 0) with which the area measurement can be \n",
1701 | "# accessed by indexing.\n",
1702 | "### YOUR CODE HERE!\n",
1703 | " \n",
1704 | " # Mask the current cell and assign the cell's (re-scaled) area value to the cell's pixels.\n",
1705 | " ### YOUR CODE HERE!\n",
1706 | "\n",
1707 | "# Visualize the result as a colored semi-transparent overlay over the raw/smoothed original input image.\n",
1708 | "# BONUS: See if you can exclude outliers to make the color mapping more informative!\n",
1709 | "### YOUR CODE HERE!"
1710 | ]
1711 | },
1712 | {
1713 | "cell_type": "markdown",
1714 | "metadata": {},
1715 | "source": [
1716 | "## Writing Output to Files "
1717 | ]
1718 | },
1719 | {
1720 | "cell_type": "markdown",
1721 | "metadata": {},
1722 | "source": [
1723 | "#### Background\n",
1724 | "\n",
1725 | "The final step of the pipeline shows how to write various outputs of the pipeline to files.\n",
1726 | "\n",
1727 | "Data can be saved to files in a human-readable format such as text files (e.g. to import into Excel), in a format readable for other programs such as tif-images (e.g. to view in Fiji) or in a language-specific file that makes it easy to reload the data into python in the future (e.g. for further analysis)."
1728 | ]
1729 | },
1730 | {
1731 | "cell_type": "markdown",
1732 | "metadata": {},
1733 | "source": [
1734 | "#### Exercise \n",
1735 | "\n",
1736 | "Write the generated data into a variety of different output files.\n",
1737 | "\n",
1738 | "Follow the instructions in the comments below."
1739 | ]
1740 | },
1741 | {
1742 | "cell_type": "code",
1743 | "execution_count": null,
1744 | "metadata": {},
1745 | "outputs": [],
1746 | "source": [
1747 | "# (i) Write one or more of the images you produced to a tif file\n",
1748 | "\n",
1749 | "# Use the function 'imsave' from the 'skimage.io' module. Make sure that the array you are \n",
1750 | "# writing is of integer type. If necessary, you can use the method 'astype' for conversions, \n",
1751 | "# e.g. 'some_array.astype(np.uint8)' or 'some_array.astype(np.uint16)'. Careful when \n",
1752 | "# converting a segmentation to uint8; if there are more than 255 cells, the 8bit format\n",
1753 | "# doesn't have sufficient bit-depth to represent all cell IDs!\n",
1754 | "#\n",
1755 | "# You can also try adding the segmentation to the original image, creating an image with\n",
1756 | "# two channels, one of them being the segmentation. \n",
1757 | "#\n",
1758 | "# After writing the file, load it into Fiji and check that everything worked as intended.\n",
1759 | "\n",
1760 | "### YOUR CODE HERE!"
1761 | ]
1762 | },
1763 | {
1764 | "cell_type": "code",
1765 | "execution_count": null,
1766 | "metadata": {},
1767 | "outputs": [],
1768 | "source": [
1769 | "# (ii) Write a figure to a png or pdf\n",
1770 | "\n",
1771 | "# Recreate the scatter plot from above (with or without the regression line), then save the figure\n",
1772 | "# as a png using 'plt.savefig'. Alternatively, you can also save it to a pdf, which will create a\n",
1773 | "# vector graphic that can be imported into programs like Adobe Illustrator.\n",
1774 | "\n",
1775 | "### YOUR CODE HERE!"
1776 | ]
1777 | },
1778 | {
1779 | "cell_type": "code",
1780 | "execution_count": null,
1781 | "metadata": {},
1782 | "outputs": [],
1783 | "source": [
1784 | "# (iii) Save the segmentation as a numpy file\n",
1785 | "\n",
1786 | "# Numpy files allow fast storage and reloading of numpy arrays. Use the function 'np.save'\n",
1787 | "# to save the array and reload it using 'np.load'.\n",
1788 | "\n",
1789 | "### YOUR CODE HERE!"
1790 | ]
1791 | },
1792 | {
1793 | "cell_type": "code",
1794 | "execution_count": null,
1795 | "metadata": {},
1796 | "outputs": [],
1797 | "source": [
1798 | "# (iv) Save the result dictionary as a pickle file\n",
1799 | "\n",
1800 | "# Pickling is a way of generating generic files from almost any python object, which can easily\n",
1801 | "# be reloaded into python at a later point in time.\n",
1802 | "# You will need to open an empty file object using 'open' in write-bytes mode ('wb'). It's best to \n",
1803 | "# do so using the 'with'-statement (context manager) to make sure that the file object will be \n",
1804 | "# closed automatically when you are done with it.\n",
1805 | "# Use the function 'pickle.dump' from the 'pickle' module to write the results to the file.\n",
1806 | "# Hint: Refer to the python documention for input and output to understand how file objects are\n",
1807 | "# handled in python in general.\n",
1808 | "\n",
1809 | "### YOUR CODE HERE!\n",
1810 | "\n",
1811 | "## Note: Pickled files can be re-loaded again as follows:\n",
1812 | "#with open('my_filename.pkl', 'rb') as infile:\n",
1813 | "# reloaded = pickle.load(infile)"
1814 | ]
1815 | },
1816 | {
1817 | "cell_type": "code",
1818 | "execution_count": null,
1819 | "metadata": {},
1820 | "outputs": [],
1821 | "source": [
1822 | "# (v) Write a tab-separated text file of the results dict\n",
1823 | "\n",
1824 | "# The most generic way of saving numeric results is a simple text file. It can be imported into \n",
1825 | "# pretty much any other program.\n",
1826 | "\n",
1827 | "# To write normal text files, open an empty file object in write mode ('w') using the 'with'-statement.\n",
1828 | "### YOUR CODE HERE!\n",
1829 | "\n",
1830 | " # Use the 'file_object.write(string)' method to write strings to the file, one line at a time,\n",
1831 | " # First, write the header of the data (the result dict keys), separated by tabs ('\\t'). \n",
1832 | " # It makes sense to first generate a complete string with all the headers and then write this \n",
1833 | " # string to the file as one line. Note that you will need to explicitly write 'newline' characters \n",
1834 | " # ('\\n') at the end of the line to switch to the next line.\n",
1835 | " # Hint: the string method 'join' is very useful here!\n",
1836 | " ### YOUR CODE HERE!\n",
1837 | "\n",
1838 | " # After writing the headers, iterate over all the cells and write the result data to the file line\n",
1839 | " # by line, by creating strings similar to the header string.\n",
1840 | " ### YOUR CODE HERE!\n",
1841 | "\n",
1842 | "# After writing the data, have a look at the output file in a text editor or in a spreadsheet\n",
1843 | "# program like Excel."
1844 | ]
1845 | },
1846 | {
1847 | "cell_type": "markdown",
1848 | "metadata": {},
1849 | "source": [
1850 | "## Batch Processing "
1851 | ]
1852 | },
1853 | {
1854 | "cell_type": "markdown",
1855 | "metadata": {},
1856 | "source": [
1857 | "#### Background\n",
1858 | "\n",
1859 | "In practice, we never work with just a single image, so we would like to make it possible to run our analysis pipeline for multiple images and then collect and analyze all the results. This final section of the tutorial shows how to do just that."
1860 | ]
1861 | },
1862 | {
1863 | "cell_type": "markdown",
1864 | "metadata": {},
1865 | "source": [
1866 | "#### Exercise\n",
1867 | "\n",
1868 | "To run a pipeline multiple times, it needs to be packaged into a function or - even better - as a separate module. Jupyter notebook is not well suited for this, so if you're working in a notebook, first extract your code to a `.py` file (see instructions below). If you are not working in a notebook, create a copy of your pipeline; we will modify this copy into a function that can then be called repeatedly for different images.\n",
1869 | "\n",
1870 | "To export a jupyter notebook as a `.py` file, use `File > Download as > Python (.py)`, then save the file. Open the resulting python script in a text editor or in an IDE like PyCharm. \n",
1871 | "\n",
1872 | "\n",
1873 | "#### Let's clean the script a bit:\n",
1874 | "\n",
1875 | "- Remove the line `%matplotlib [inline|notebook|qt]`. It is not valid python code outside of a Jupyter notebook.\n",
1876 | "\n",
1877 | "\n",
1878 | "- Go through the script and comment out everything related to plotting; when running a pipeline for dozens or hundreds of images, we usually do not want to generate tons of plots. Similarly, it can make sense to remove some print statments if you have many of them.\n",
1879 | "\n",
1880 | "\n",
1881 | "- Remove the sections `Manual Thresholding` and `Connected Components Labeling`; they are not used in the final segmentation.\n",
1882 | "\n",
1883 | "\n",
1884 | "- Remove the sections `Simple Analysis and Visualization` and `Writing Output to Files`; we will collect the output for each image when running the pipeline in a loop. That way, everything can be analyzed at once at the end. \n",
1885 | " - Note that, even though we skip it here, it is often very useful to store every input file's corresponding outputs in new files. When doing so, the output files should use the name of the input file modified with an additional suffix. For example, the results extracted when analyzing `img_1.tif` might best be stored as `img_1_results.pkl`.\n",
1886 | " - You can implement this approach for saving the segmentations and/or the result dicts as a *bonus* exercise!\n",
1887 | "\n",
1888 | "\n",
1889 | "- Feel free to delete some of the background information to make the script more concise.\n",
1890 | "\n",
1891 | "\n",
1892 | "#### Converting the pipeline to a function:\n",
1893 | "\n",
1894 | "Convert the entire pipeline into a function that accepts a directory and a filename as input, runs everything, and returns the final segmentation and the results dictionary. To do this, you must:\n",
1895 | "\n",
1896 | "- Add the function definition statement at the beginning of the script (after the imports)\n",
1897 | "- Replace the 'hard-coded' directory path and filename by variables that are accepted by the function\n",
1898 | "- Indent all the code\n",
1899 | "- Add a return statement at the end\n",
1900 | "\n",
1901 | "\n",
1902 | "#### Importing the function and running it for multiple input files:\n",
1903 | "\n",
1904 | "To actually run the pipeline function for multiple input files, we need to do the following:\n",
1905 | "\n",
1906 | "- Import the pipeline function from the `.py` file\n",
1907 | "- Iterate over all the filenames in a directory\n",
1908 | "- For each filename, call the pipeline function\n",
1909 | "- Collect the returned results\n",
1910 | "\n",
1911 | "Once you have converted your pipeline into a function as described above, you can import and run it according to the instructions below."
1912 | ]
1913 | },
1914 | {
1915 | "cell_type": "code",
1916 | "execution_count": null,
1917 | "metadata": {},
1918 | "outputs": [],
1919 | "source": [
1920 | "# (i) Test if your pipeline function actually works\n",
1921 | "\n",
1922 | "# Import your function using the normal python syntax for imports, like this:\n",
1923 | "# from your_module import your_function\n",
1924 | "# Run the function and visualize the resulting segmentation. Make sure everything works as intended.\n",
1925 | "\n",
1926 | "### YOUR CODE HERE!"
1927 | ]
1928 | },
1929 | {
1930 | "cell_type": "code",
1931 | "execution_count": null,
1932 | "metadata": {},
1933 | "outputs": [],
1934 | "source": [
1935 | "# (ii) Get all relevant filenames from the input directory\n",
1936 | "\n",
1937 | "# Use the function 'listdir' from the module 'os' to get a list of all the files\n",
1938 | "# in a directory. Find a way to filter out only the relevant input files, namely\n",
1939 | "# \"example_cells_1.tif\" and \"example_cells_2.tif\". Of course, one would usually\n",
1940 | "# do this for many more images, otherwise it's not worth the effort.\n",
1941 | "# Hint: Loop over the filenames and use if statements to decide which ones to \n",
1942 | "# keep and which ones to throw away.\n",
1943 | "\n",
1944 | "### YOUR CODE HERE!"
1945 | ]
1946 | },
1947 | {
1948 | "cell_type": "code",
1949 | "execution_count": null,
1950 | "metadata": {},
1951 | "outputs": [],
1952 | "source": [
1953 | "# (iii) Iterate over the input filenames and run the pipeline function\n",
1954 | "\n",
1955 | "# Be sure to collect the output of the pipeline function in a way that allows\n",
1956 | "# you to trace it back to the file it came from. You could for example use a\n",
1957 | "# dictionary with the filenames as keys.\n",
1958 | "\n",
1959 | "### YOUR CODE HERE!"
1960 | ]
1961 | },
1962 | {
1963 | "cell_type": "code",
1964 | "execution_count": null,
1965 | "metadata": {},
1966 | "outputs": [],
1967 | "source": [
1968 | "# (iv) Recreate one of the scatterplots from above but this time with all the cells\n",
1969 | "\n",
1970 | "# You can color-code the dots to indicate which file they came from. Don't forget to\n",
1971 | "# add a corresponding legend.\n",
1972 | "\n",
1973 | "### YOUR CODE HERE!"
1974 | ]
1975 | },
1976 | {
1977 | "cell_type": "markdown",
1978 | "metadata": {},
1979 | "source": [
1980 | "## *Congratulations! You have completed the tutorial!*\n",
1981 | "\n",
1982 | "**We hope you enjoyed the ride and learned a lot!**"
1983 | ]
1984 | },
1985 | {
1986 | "cell_type": "markdown",
1987 | "metadata": {},
1988 | "source": [
1989 | "### Concluding Remarks\n",
1990 | "\n",
1991 | "It's important to remember that the phrase ***\"Use it or loose it!\"*** fully applies for the skills taught in this tutorial.\n",
1992 | "\n",
1993 | "If you now just go back to the lab and don't touch python or image analysis for the next half year, most of the things you have learned here will be lost.\n",
1994 | "\n",
1995 | "So, what can you do?\n",
1996 | "\n",
1997 | "\n",
1998 | "- If possible, start applying what you have learned to your own work right away\n",
1999 | "\n",
2000 | "\n",
2001 | "- Even if your current work doesn't absolutely *need* coding / image analysis (which to be honest is hard to believe! ;p), you can still use it at least to make some nice plots!\n",
2002 | "\n",
2003 | "\n",
2004 | "- Another very good approach is to find yourself an interesting little side project you can play around with\n",
2005 | "\n",
2006 | "\n",
2007 | "- Of course, there is still much more to learn and the internet happens to be full of excellent tutorials!\n",
2008 | " - As a starting point, have a look at [Bio-IT's curated list of tutorials](https://bio-it.embl.de/coding-club/curated-tutorials/)"
2009 | ]
2010 | },
2011 | {
2012 | "cell_type": "markdown",
2013 | "metadata": {},
2014 | "source": [
2015 | "***We wish you the best of luck for all your coding endeavors!***"
2016 | ]
2017 | }
2018 | ],
2019 | "metadata": {
2020 | "kernelspec": {
2021 | "display_name": "Python 3 (ipykernel)",
2022 | "language": "python",
2023 | "name": "python3"
2024 | },
2025 | "language_info": {
2026 | "codemirror_mode": {
2027 | "name": "ipython",
2028 | "version": 3
2029 | },
2030 | "file_extension": ".py",
2031 | "mimetype": "text/x-python",
2032 | "name": "python",
2033 | "nbconvert_exporter": "python",
2034 | "pygments_lexer": "ipython3",
2035 | "version": "3.7.3"
2036 | }
2037 | },
2038 | "nbformat": 4,
2039 | "nbformat_minor": 2
2040 | }
2041 |
--------------------------------------------------------------------------------
/image_analysis_tutorial_solutions.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Image Analysis with Python - Tutorial Pipeline Solutions\n",
8 | "\n",
9 | "*originally created in 2016*
\n",
10 | "*updated and converted to a Jupyter notebook in 2017*
\n",
11 | "*updated and converted to python 3 in 2018*
\n",
12 | "*by Jonas Hartmann (Gilmour group, EMBL Heidelberg)*
"
13 | ]
14 | },
15 | {
16 | "cell_type": "markdown",
17 | "metadata": {},
18 | "source": [
19 | "## Table of Contents\n",
20 | "\n",
21 | "1. [About this Tutorial](#about)\n",
22 | "2. [Importing Modules & Packages](#import)\n",
23 | "3. [Loading & Handling Image Data](#load)\n",
24 | "4. [Preprocessing](#prepro)\n",
25 | "5. [Manual Thresholding & Threshold Detection](#thresh)\n",
26 | "6. [Adaptive Thresholding](#adaptive)\n",
27 | "7. [Improving Masks with Binary Morphology](#morpho)\n",
28 | "8. [Connected Components Labeling](#label)\n",
29 | "9. [Cell Segmentation by Seeding & Expansion](#seg)\n",
30 | "10. [Postprocessing: Removing Cells at the Image Border](#postpro)\n",
31 | "11. [Identifying Cell Edges](#edges)\n",
32 | "12. [Extracting Quantitative Measurements](#measure)\n",
33 | "13. [Simple Analysis & Visualization](#analysis)\n",
34 | "14. [Writing Output to Files](#write)\n",
35 | "15. [Batch Processing](#batch)"
36 | ]
37 | },
38 | {
39 | "cell_type": "markdown",
40 | "metadata": {},
41 | "source": [
42 | "## About this Tutorial - Notes on the Solutions \n",
43 | "\n",
44 | "These are the solutions to the tutorial pipeline (see `image_analysis_tutorial.py`). *For the best learning experience, it is highly recommended that you first try to implement a solution yourself!* Only come here if you are totally stuck or if you have a working solution and would like to double-check it.\n",
45 | "\n",
46 | "Note that there are multiple ways of implementing any particular step in the pipeline, so if your solution is different from the solution here, it is not necessarily wrong. However, *some solutions are better than others* because they are...\n",
47 | "\n",
48 | "- **...more readable:**\n",
49 | " - When reading the code, it is obvious and clear what the code is doing\n",
50 | " - The code is clearly commented to help others (and your future self) understand it\n",
51 | " \n",
52 | " \n",
53 | "- **...more general:**\n",
54 | " - The code still works if there are minor changes to the data (e.g. size of the image being processed)\n",
55 | " - The code can easily be transformed into a solution for a similar problem\n",
56 | " \n",
57 | " \n",
58 | "- **...more computationally efficient:**\n",
59 | " - No unnecessary copies of large datasets are made (memory efficiency)\n",
60 | " - Faster algorithms are used, e.g. array operations instead of loops (CPU efficiency)\n",
61 | " - No unnecessary loading and writing of data (io efficiency)\n",
62 | "\n",
63 | "\n",
64 | "It is up to you to decide if your solution is better, equally good, or not as good as the solution presented here. Either way, we hope you can learn something by looking at our solutions. **:)**"
65 | ]
66 | },
67 | {
68 | "cell_type": "markdown",
69 | "metadata": {},
70 | "source": [
71 | "## Importing Modules & Packages "
72 | ]
73 | },
74 | {
75 | "cell_type": "code",
76 | "execution_count": null,
77 | "metadata": {},
78 | "outputs": [],
79 | "source": [
80 | "# The numerical array package numpy as np\n",
81 | "import numpy as np"
82 | ]
83 | },
84 | {
85 | "cell_type": "markdown",
86 | "metadata": {},
87 | "source": [
88 | "#### Exercise Solution \n",
89 | "\n",
90 | "Using the import command as above, follow the instructions in the comments below to import two additional modules that we will be using frequently in this pipeline."
91 | ]
92 | },
93 | {
94 | "cell_type": "code",
95 | "execution_count": null,
96 | "metadata": {},
97 | "outputs": [],
98 | "source": [
99 | "# The plotting module matplotlib.pyplot as plt\n",
100 | "import matplotlib.pyplot as plt\n",
101 | "\n",
102 | "# The image processing package scipy.ndimage as ndi\n",
103 | "import scipy.ndimage as ndi"
104 | ]
105 | },
106 | {
107 | "cell_type": "markdown",
108 | "metadata": {},
109 | "source": [
110 | "#### Side Note for Jupyter Notebook Users\n",
111 | "\n",
112 | "You can configure how the figures made by matplotlib are displayed."
113 | ]
114 | },
115 | {
116 | "cell_type": "code",
117 | "execution_count": null,
118 | "metadata": {},
119 | "outputs": [],
120 | "source": [
121 | "# Set matplotlib backend\n",
122 | "%matplotlib inline\n",
123 | "#%matplotlib notebook \n",
124 | "#%matplotlib qt "
125 | ]
126 | },
127 | {
128 | "cell_type": "markdown",
129 | "metadata": {},
130 | "source": [
131 | "## Loading & Handling Image Data "
132 | ]
133 | },
134 | {
135 | "cell_type": "markdown",
136 | "metadata": {},
137 | "source": [
138 | "#### Exercise Solution\n",
139 | "\n",
140 | "We will now proceed to load one of the example images and verify that we get what we expect."
141 | ]
142 | },
143 | {
144 | "cell_type": "code",
145 | "execution_count": null,
146 | "metadata": {},
147 | "outputs": [],
148 | "source": [
149 | "# (i) Specify the directory path and file name\n",
150 | "\n",
151 | "# Create a string variable with the name of the file you'd like to load (here: 'example_cells_1.tif').\n",
152 | "# Suggested name for the variable: filename\n",
153 | "# Note: Paths and filenames can contain slashes, empty spaces and other special symbols, which can cause \n",
154 | "# trouble for programming languages under certain circumstances. To circumvent such trouble, add \n",
155 | "# the letter r before your string definition to create a so-called 'raw string', which is not\n",
156 | "# affected by these problems (e.g. `my_raw_string = r\"some string with funny symbols: \\\\\\!/~***!\"`).\n",
157 | "filename = r'example_cells_1.tif'\n",
158 | "\n",
159 | "# If the file is not in the current working directory, you must also have a way of specifying the path\n",
160 | "# to the directory where the file is stored. Most likely, your example images are stored in a directory\n",
161 | "# called 'example_data' in the same folder as this notebook. Note that you can use either the full path\n",
162 | "# - something like r\"/home/jack/data/python_image_analysis/example_data\"\n",
163 | "# or the relative path, starting from the current working directory\n",
164 | "# - here that would just be r\"example_data\"\n",
165 | "# Create a string variable with the path to the directory that contains the file you'd like to load.\n",
166 | "# Suggested name for the variable: dirpath\n",
167 | "dirpath = r'example_data' # Relative path\n",
168 | "#dirpath = r'/home/jack/data/python_image_analysis/example_data/example_cells_1.tif' # Absolute path "
169 | ]
170 | },
171 | {
172 | "cell_type": "code",
173 | "execution_count": null,
174 | "metadata": {},
175 | "outputs": [],
176 | "source": [
177 | "# (ii) Combine the directory path and file name into one variable, the file path\n",
178 | "\n",
179 | "# Import the function 'join' from the module 'os.path'\n",
180 | "# This function automatically takes care of the slashes that need to be added when combining two paths.\n",
181 | "from os.path import join\n",
182 | "\n",
183 | "# Use the 'join' function to combine the directory path with the file name and create a new variable.\n",
184 | "# Print the result to see that everything is correct (this is always a good idea!)\n",
185 | "# Suggested name for the variable: filepath\n",
186 | "filepath = join(dirpath, filename)\n",
187 | "print(filepath)"
188 | ]
189 | },
190 | {
191 | "cell_type": "code",
192 | "execution_count": null,
193 | "metadata": {},
194 | "outputs": [],
195 | "source": [
196 | "# (iii) Load the image\n",
197 | "\n",
198 | "# Import the function 'imread' from the module 'skimage.io'.\n",
199 | "from skimage.io import imread\n",
200 | "\n",
201 | "# Load 'example_cells_1.tif' and store it in a variable.\n",
202 | "# Suggested name for the variable: img\n",
203 | "img = imread(filepath)"
204 | ]
205 | },
206 | {
207 | "cell_type": "code",
208 | "execution_count": null,
209 | "metadata": {},
210 | "outputs": [],
211 | "source": [
212 | "# (iv) Check that everything is in order\n",
213 | "\n",
214 | "# Check that 'img' is a variable of type 'ndarray' - use Python's built-in function 'type'.\n",
215 | "print(\"Loaded array is of type:\", type(img))\n",
216 | "\n",
217 | "# Print the shape of the array using the numpy-function 'shape'. \n",
218 | "# Make sure you understand the output!\n",
219 | "print(\"Loaded array has shape:\", img.shape)\n",
220 | "\n",
221 | "# Check the datatype of the individual numbers in the array. You can use the array attribute 'dtype' to do so.\n",
222 | "# Make sure you understand the output!\n",
223 | "print(\"Loaded values are of type:\", img.dtype)\n",
224 | "\n",
225 | "# SOLUTION NOTE: The dtype should be 'uint8', because these are unsigned 8-bit integer images.\n",
226 | "# This means that the intensity values range from 0 to 255 in steps of 1."
227 | ]
228 | },
229 | {
230 | "cell_type": "code",
231 | "execution_count": null,
232 | "metadata": {},
233 | "outputs": [],
234 | "source": [
235 | "# (v) Look at the image to confirm that everything worked as intended\n",
236 | "\n",
237 | "# To plot the array as an image, use pyplot's functions 'plt.imshow' followed by 'plt.show'. \n",
238 | "# Check the documentation for 'plt.imshow' and note the parameters that can be specified, such as colormap (cmap)\n",
239 | "# and interpolation. Since you are working with scientific data, interpolation is unwelcome, so you should set it \n",
240 | "# to \"none\". The most common cmap for grayscale images is naturally \"gray\".\n",
241 | "# You may also want to adjust the size of the figure. You can do this by preparing the figure canvas with\n",
242 | "# the function 'plt.figure' before calling 'plt.imshow'. The canvas size is adjusted using the keyword argument\n",
243 | "# 'figsize' when calling 'plt.figure'.\n",
244 | "plt.figure(figsize=(7,7))\n",
245 | "plt.imshow(img, interpolation='none', cmap='gray')\n",
246 | "plt.show()"
247 | ]
248 | },
249 | {
250 | "cell_type": "markdown",
251 | "metadata": {},
252 | "source": [
253 | "## Preprocessing "
254 | ]
255 | },
256 | {
257 | "cell_type": "markdown",
258 | "metadata": {},
259 | "source": [
260 | "#### Exercise Solution\n",
261 | "\n",
262 | "Perform Gaussian smoothing and visualize the result."
263 | ]
264 | },
265 | {
266 | "cell_type": "code",
267 | "execution_count": null,
268 | "metadata": {},
269 | "outputs": [],
270 | "source": [
271 | "# (i) Create a variable for the smoothing factor sigma, which should be an integer value\n",
272 | "\n",
273 | "sigma = 3\n",
274 | "\n",
275 | "# After implementing the Gaussian smoothing function below, you can modify this variable \n",
276 | "# to find the ideal value of sigma."
277 | ]
278 | },
279 | {
280 | "cell_type": "code",
281 | "execution_count": null,
282 | "metadata": {},
283 | "outputs": [],
284 | "source": [
285 | "# (ii) Perform the smoothing on the image\n",
286 | "\n",
287 | "# To do so, use the Gaussian filter function 'ndi.gaussian_filter' from the \n",
288 | "# image processing module 'scipy.ndimage', which was imported at the start of the tutorial. \n",
289 | "# Check out the documentation of scipy to see how to use this function. \n",
290 | "# Allocate the output to a new variable.\n",
291 | "img_smooth = ndi.gaussian_filter(img, sigma)"
292 | ]
293 | },
294 | {
295 | "cell_type": "code",
296 | "execution_count": null,
297 | "metadata": {},
298 | "outputs": [],
299 | "source": [
300 | "# (iii) Visualize the result using 'plt.imshow'\n",
301 | "\n",
302 | "# Compare with the original image visualized above. \n",
303 | "# Does the output make sense? Is this what you expected? \n",
304 | "# Can you optimize sigma such that the image looks smooth without blurring the membranes too much?\n",
305 | "plt.figure(figsize=(7,7))\n",
306 | "plt.imshow(img_smooth, interpolation='none', cmap='gray')\n",
307 | "plt.show()\n",
308 | "\n",
309 | "# To have a closer look at a specific region of the image, crop that region out and show it in a \n",
310 | "# separate plot. Remember that you can crop arrays by \"indexing\" or \"slicing\" them similar to lists.\n",
311 | "# Use such \"zoomed-in\" views throughout this tutorial to take a closer look at your intermediate \n",
312 | "# results when necessary.\n",
313 | "plt.figure(figsize=(6,6))\n",
314 | "plt.imshow(img_smooth[400:600, 200:400], interpolation='none', cmap='gray')\n",
315 | "plt.show()"
316 | ]
317 | },
318 | {
319 | "cell_type": "code",
320 | "execution_count": null,
321 | "metadata": {},
322 | "outputs": [],
323 | "source": [
324 | "# (iv) BONUS: Show the raw and smoothed images side by side using 'plt.subplots'\n",
325 | "\n",
326 | "fig, ax = plt.subplots(1, 2, figsize=(10,7))\n",
327 | "ax[0].imshow(img, interpolation='none', cmap='gray')\n",
328 | "ax[1].imshow(img_smooth, interpolation='none', cmap='gray')\n",
329 | "ax[0].set_title('Raw Image')\n",
330 | "ax[1].set_title('Smoothed Image')\n",
331 | "plt.show()"
332 | ]
333 | },
334 | {
335 | "cell_type": "markdown",
336 | "metadata": {},
337 | "source": [
338 | "## Manual Thresholding & Threshold Detection "
339 | ]
340 | },
341 | {
342 | "cell_type": "markdown",
343 | "metadata": {},
344 | "source": [
345 | "#### Exercise Solution\n",
346 | "\n",
347 | "Try out manual thresholding and automated threshold detection."
348 | ]
349 | },
350 | {
351 | "cell_type": "code",
352 | "execution_count": null,
353 | "metadata": {},
354 | "outputs": [],
355 | "source": [
356 | "# (i) Create a variable for a manually set threshold, which should be an integer\n",
357 | "\n",
358 | "# This can be changed later to find a suitable value.\n",
359 | "thresh = 70"
360 | ]
361 | },
362 | {
363 | "cell_type": "code",
364 | "execution_count": null,
365 | "metadata": {},
366 | "outputs": [],
367 | "source": [
368 | "# (ii) Perform thresholding on the smoothed image\n",
369 | "\n",
370 | "# Remember that you can use relational (Boolean) expressions such as 'smaller' (<), 'equal' (==)\n",
371 | "# or 'greater or equal' (>=) with numpy arrays - and you can directly assign the result to a new\n",
372 | "# variable.\n",
373 | "mem = img_smooth > thresh\n",
374 | "\n",
375 | "# Check the dtype of your thresholded image\n",
376 | "# You should see that the dtype is 'bool', which stands for 'Boolean' and means the array\n",
377 | "# is now simply filled with 'True' and 'False', where 'True' is the foreground (the regions\n",
378 | "# above the threshold) and 'False' is the background.\n",
379 | "print(mem.dtype)"
380 | ]
381 | },
382 | {
383 | "cell_type": "code",
384 | "execution_count": null,
385 | "metadata": {},
386 | "outputs": [],
387 | "source": [
388 | "# (iii) Visualize the result\n",
389 | "\n",
390 | "fig, ax = plt.subplots(1, 2, figsize=(10,7))\n",
391 | "ax[0].imshow(img_smooth, interpolation='none', cmap='gray')\n",
392 | "ax[1].imshow(mem, interpolation='none', cmap='gray')\n",
393 | "ax[0].set_title('Smoothed Image')\n",
394 | "ax[1].set_title('Thresholded Membranes')\n",
395 | "plt.show()"
396 | ]
397 | },
398 | {
399 | "cell_type": "code",
400 | "execution_count": null,
401 | "metadata": {},
402 | "outputs": [],
403 | "source": [
404 | "# (iv) Try out different thresholds to find the best one\n",
405 | "\n",
406 | "# If you are using jupyter notebook, you can adapt the code below to\n",
407 | "# interactively change the threshold and look for the best one. These\n",
408 | "# kinds of interactive functions are called 'widgets' and are very \n",
409 | "# useful in exploratory data analysis to create greatly simplified\n",
410 | "# 'User Interfaces' (UIs) on the fly.\n",
411 | "# As a bonus exercise, try to understand or look up how the widget works\n",
412 | "# and play around with it a bit!\n",
413 | "\n",
414 | "# Prepare widget\n",
415 | "from ipywidgets import interact\n",
416 | "@interact(thresh=(10,250,10))\n",
417 | "def select_threshold(thresh=100):\n",
418 | " \n",
419 | " # Thresholding\n",
420 | " mem = img_smooth > thresh\n",
421 | " \n",
422 | " # Visualization\n",
423 | " plt.figure(figsize=(7,7))\n",
424 | " plt.imshow(mem, interpolation='none', cmap='gray')\n",
425 | " plt.show()"
426 | ]
427 | },
428 | {
429 | "cell_type": "code",
430 | "execution_count": null,
431 | "metadata": {},
432 | "outputs": [],
433 | "source": [
434 | "# (v) Perfom automated threshold detection with Otsu's method\n",
435 | "\n",
436 | "# The scikit-image module 'skimage.filters.thresholding' provides\n",
437 | "# several threshold detection algorithms. The most popular one \n",
438 | "# among them is Otsu's method. Using what you've learned so far,\n",
439 | "# import the 'threshold_otsu' function, use it to automatically \n",
440 | "# determine a threshold for the smoothed image, apply the threshold,\n",
441 | "# and visualize the result.\n",
442 | "\n",
443 | "# Import\n",
444 | "from skimage.filters.thresholding import threshold_otsu\n",
445 | "\n",
446 | "# Calculate and apply threshold\n",
447 | "thresh = threshold_otsu(img_smooth)\n",
448 | "mem = img_smooth > thresh\n",
449 | " \n",
450 | "# Visualization\n",
451 | "plt.figure(figsize=(7,7))\n",
452 | "plt.imshow(mem, interpolation='none', cmap='gray')\n",
453 | "plt.show()"
454 | ]
455 | },
456 | {
457 | "cell_type": "code",
458 | "execution_count": null,
459 | "metadata": {},
460 | "outputs": [],
461 | "source": [
462 | "# (vi) BONUS: Did you notice the 'try_all_threshold' function?\n",
463 | "\n",
464 | "# That's convenient! Use it to automatically test the threshold detection\n",
465 | "# functions in 'skimage.filters.thresholding'. Don't forget to adjust the\n",
466 | "# 'figsize' parameter so the resulting images are clearly visible.\n",
467 | "from skimage.filters.thresholding import try_all_threshold\n",
468 | "fig = try_all_threshold(img_smooth, figsize=(10,10), verbose=False)"
469 | ]
470 | },
471 | {
472 | "cell_type": "markdown",
473 | "metadata": {},
474 | "source": [
475 | "## Adaptive Thresholding "
476 | ]
477 | },
478 | {
479 | "cell_type": "markdown",
480 | "metadata": {},
481 | "source": [
482 | "#### Exercise Solution\n",
483 | "\n",
484 | "Implement the two steps of adaptive background subtraction:\n",
485 | "\n",
486 | "1. Use a strong \"mean filter\" with a circular `SE` to create the background image.\n",
487 | "\n",
488 | "2. Use the background image for thresholding."
489 | ]
490 | },
491 | {
492 | "cell_type": "code",
493 | "execution_count": null,
494 | "metadata": {},
495 | "outputs": [],
496 | "source": [
497 | "# Step 1\n",
498 | "# ------\n",
499 | "\n",
500 | "# The expression below creates circular structuring elements. \n",
501 | "# It is an elegant but complicated piece of code and at the moment it is not \n",
502 | "# necessary for you to understand it in detail. Use it to create structuring \n",
503 | "# elements of different sizes (by changing 'i') and find a way to visualize \n",
504 | "# the result (remember that the SE is just a small 'image').\n",
505 | "#\n",
506 | "# Try to answer the following questions: \n",
507 | "#\n",
508 | "# - Is the resulting SE really circular? \n",
509 | "# >>> It is close enough for large i, not so much for small i.\n",
510 | "#\n",
511 | "# - Can certain values of 'i' cause problems? If so, why? \n",
512 | "# >>> Even values create a slight asymmetry! It is best to use only odd values!\n",
513 | "#\n",
514 | "# - What value of 'i' should be used for the SE?\n",
515 | "# >>> My first guess was i=30, which is about 3x the membrane diameter. \n",
516 | "# I tried out some other values but ultimately stuck with this.\n",
517 | "\n",
518 | "# Create SE\n",
519 | "i = 31\n",
520 | "SE = (np.mgrid[:i,:i][0] - np.floor(i/2))**2 + (np.mgrid[:i,:i][1] - np.floor(i/2))**2 <= np.floor(i/2)**2\n",
521 | "\n",
522 | "# Visualize the result\n",
523 | "plt.imshow(SE, interpolation='none')\n",
524 | "plt.show()"
525 | ]
526 | },
527 | {
528 | "cell_type": "code",
529 | "execution_count": null,
530 | "metadata": {},
531 | "outputs": [],
532 | "source": [
533 | "# (ii) Create the background\n",
534 | "\n",
535 | "# Run a mean filter over the image using the disc SE and assign the output to a new variable.\n",
536 | "# Use the function 'skimage.filters.rank.mean'.\n",
537 | "from skimage.filters import rank \n",
538 | "bg = rank.mean(img_smooth, footprint=SE)"
539 | ]
540 | },
541 | {
542 | "cell_type": "code",
543 | "execution_count": null,
544 | "metadata": {},
545 | "outputs": [],
546 | "source": [
547 | "# (iii) Visualize the resulting background image. Does what you get make sense?\n",
548 | "\n",
549 | "plt.figure(figsize=(7,7))\n",
550 | "plt.imshow(bg, interpolation='none', cmap='gray')\n",
551 | "plt.show()"
552 | ]
553 | },
554 | {
555 | "cell_type": "code",
556 | "execution_count": null,
557 | "metadata": {},
558 | "outputs": [],
559 | "source": [
560 | "# Step 2\n",
561 | "# ------\n",
562 | "\n",
563 | "# (iv) Threshold the Gaussian-smoothed original image against the background image created in step 1 \n",
564 | "# using a relational expression\n",
565 | "\n",
566 | "mem = img_smooth > bg"
567 | ]
568 | },
569 | {
570 | "cell_type": "code",
571 | "execution_count": null,
572 | "metadata": {},
573 | "outputs": [],
574 | "source": [
575 | "# (v) Visualize and understand the output. \n",
576 | "\n",
577 | "plt.figure(figsize=(7,7))\n",
578 | "plt.imshow(mem, interpolation='none', cmap='gray')\n",
579 | "plt.show()\n",
580 | "\n",
581 | "# Are you happy with this result as a membrane segmentation? \n",
582 | "# ->> It's pretty good, except for the speckles that litter the inside of the cells!"
583 | ]
584 | },
585 | {
586 | "cell_type": "markdown",
587 | "metadata": {},
588 | "source": [
589 | "## Improving Masks with Binary Morphology "
590 | ]
591 | },
592 | {
593 | "cell_type": "markdown",
594 | "metadata": {},
595 | "source": [
596 | "#### Exercise Solution\n",
597 | "\n",
598 | "Improve the membrane segmentation from above with morphological operations."
599 | ]
600 | },
601 | {
602 | "cell_type": "code",
603 | "execution_count": null,
604 | "metadata": {},
605 | "outputs": [],
606 | "source": [
607 | "# (i) Get rid of speckles using binary hole filling\n",
608 | "\n",
609 | "# Use the function 'ndi.binary_fill_holes' for this. Be sure to check the docs to\n",
610 | "# understand exactly what it does. For this to work as intended, you will have to \n",
611 | "# invert the mask, which you can do using the function `np.logical_not` or the\n",
612 | "# corresponding operator '~'. Again, be sure to understand why this has to be done\n",
613 | "# and don't forget to revert the result back.\n",
614 | "\n",
615 | "#mem_holefilled = np.logical_not(ndi.binary_fill_holes(np.logical_not(mem))) # Long form\n",
616 | "mem_holefilled = ~ndi.binary_fill_holes(~mem) # Short form"
617 | ]
618 | },
619 | {
620 | "cell_type": "code",
621 | "execution_count": null,
622 | "metadata": {},
623 | "outputs": [],
624 | "source": [
625 | "# (ii) Try out other morphological operations to further improve the membrane mask\n",
626 | "\n",
627 | "# The various operations are available in the ndimage module, for example 'ndi.binary_closing'.\n",
628 | "# Play around and see how the different functions affect the mask. Can you optimize the mask, \n",
629 | "# for example by closing gaps?\n",
630 | "# Note that the default SE for these functions is a square. Feel free to create another disc-\n",
631 | "# shaped SE and see how that changes the outcome.\n",
632 | "# BONUS: If you pay close attention, you will notice that some of these operations introduce \n",
633 | "# artifacts at the image boundaries. Can you come up with a way of solving this?\n",
634 | "\n",
635 | "# New circular SE of appropriate size (determined by trial and error)\n",
636 | "i = 15\n",
637 | "SE = (np.mgrid[:i,:i][0] - np.floor(i/2))**2 + (np.mgrid[:i,:i][1] - np.floor(i/2))**2 <= np.floor(i/2)**2\n",
638 | "\n",
639 | "# One solution to the boundary artefact issue is padding with the reflection.\n",
640 | "# 'Padding' refers to the extension of the image at the boundaries, in this case using\n",
641 | "# a 'reflection' of the pixel values next to the boundary. If morphological operations \n",
642 | "# are done on the padded image, the boundary artefacts will occur in the padded region\n",
643 | "# outside the original image, which can simply be cropped out again at the end.\n",
644 | "pad_size = i+1\n",
645 | "mem_padded = np.pad(mem_holefilled, pad_size, mode='reflect')\n",
646 | "\n",
647 | "# Binary closing works well to round off the membranes and close gaps\n",
648 | "mem_final = ndi.binary_closing(mem_padded, structure=SE)\n",
649 | "\n",
650 | "# This slicing operation crops the padded image back to the original size\n",
651 | "mem_final = mem_final[pad_size:-pad_size, pad_size:-pad_size]"
652 | ]
653 | },
654 | {
655 | "cell_type": "code",
656 | "execution_count": null,
657 | "metadata": {},
658 | "outputs": [],
659 | "source": [
660 | "# (iii) Visualize the final result\n",
661 | "\n",
662 | "fig, ax = plt.subplots(1, 2, figsize=(10,7))\n",
663 | "ax[0].imshow(img_smooth, interpolation='none', cmap='gray')\n",
664 | "ax[1].imshow(mem_final, interpolation='none', cmap='gray')\n",
665 | "ax[0].set_title('Smoothed Image')\n",
666 | "ax[1].set_title('Final Membrane Mask')\n",
667 | "plt.show()"
668 | ]
669 | },
670 | {
671 | "cell_type": "markdown",
672 | "metadata": {},
673 | "source": [
674 | "## Connected Components Labeling "
675 | ]
676 | },
677 | {
678 | "cell_type": "markdown",
679 | "metadata": {},
680 | "source": [
681 | "#### Exercise Solution\n",
682 | "\n",
683 | "Use your membrane segmentation for connected components labeling."
684 | ]
685 | },
686 | {
687 | "cell_type": "code",
688 | "execution_count": null,
689 | "metadata": {},
690 | "outputs": [],
691 | "source": [
692 | "# (i) Label connected components\n",
693 | "\n",
694 | "# Use the function 'ndi.label' from the 'ndimage' module. \n",
695 | "# Note that this function labels foreground pixels (1s, not 0s), so you may need \n",
696 | "# to invert your membrane mask just as for hole filling above.\n",
697 | "# Also, note that 'ndi.label' returns another result in addition to the labeled \n",
698 | "# image. Read up on this in the function's documentation and make sure you don't\n",
699 | "# mix up the two outputs!\n",
700 | "\n",
701 | "cell_labels, _ = ndi.label(~mem_final)\n",
702 | "\n",
703 | "# Solution note: For functions with multiple outputs (here the labeled image on\n",
704 | "# the one hand and the number of detected objects on the other), it is a python\n",
705 | "# convention to unpack those outputs that will not be used in the remainder of \n",
706 | "# code into the variable '_' (underscore). This makes it clear to those reading\n",
707 | "# the code that the function returns multiple things but some of them are not\n",
708 | "# important in this case."
709 | ]
710 | },
711 | {
712 | "cell_type": "code",
713 | "execution_count": null,
714 | "metadata": {},
715 | "outputs": [],
716 | "source": [
717 | "# (ii) Visualize the output\n",
718 | "\n",
719 | "# Here, it is no longer ideal to use a 'gray' colormap, since we want to visualize that each\n",
720 | "# cell has a unique ID. Play around with different colormaps (check the docs to see what\n",
721 | "# types of colormaps are available) and choose one that you are happy with.\n",
722 | "\n",
723 | "plt.figure(figsize=(7,7))\n",
724 | "plt.imshow(cell_labels, interpolation='none', cmap='inferno')\n",
725 | "plt.show()"
726 | ]
727 | },
728 | {
729 | "cell_type": "markdown",
730 | "metadata": {},
731 | "source": [
732 | "## Cell Segmentation by Seeding & Expansion "
733 | ]
734 | },
735 | {
736 | "cell_type": "markdown",
737 | "metadata": {},
738 | "source": [
739 | "#### Exercise Solution\n",
740 | "\n",
741 | "Find seeds using the distance transform approach."
742 | ]
743 | },
744 | {
745 | "cell_type": "code",
746 | "execution_count": null,
747 | "metadata": {},
748 | "outputs": [],
749 | "source": [
750 | "# (i) Run a distance transform on the membrane mask\n",
751 | "\n",
752 | "# Use the function 'ndi.distance_transform_edt'.\n",
753 | "# You may need to invert your membrane mask so the distances are computed on\n",
754 | "# the cells, not on the membranes.\n",
755 | "dist_trans = ndi.distance_transform_edt(~mem_final)"
756 | ]
757 | },
758 | {
759 | "cell_type": "code",
760 | "execution_count": null,
761 | "metadata": {},
762 | "outputs": [],
763 | "source": [
764 | "# (ii) Visualize the output and understand what you are seeing.\n",
765 | "\n",
766 | "plt.figure(figsize=(7,7))\n",
767 | "plt.imshow(dist_trans, interpolation='none', cmap='viridis')\n",
768 | "plt.show()"
769 | ]
770 | },
771 | {
772 | "cell_type": "code",
773 | "execution_count": null,
774 | "metadata": {},
775 | "outputs": [],
776 | "source": [
777 | "# (iii) Smoothen the distance transform\n",
778 | "\n",
779 | "# Use 'ndi.gaussian_filter' to do so.\n",
780 | "# You will have to optimize your choice of 'sigma' based on the outcome below.\n",
781 | "\n",
782 | "# Applying the filter\n",
783 | "dist_trans_smooth = ndi.gaussian_filter(dist_trans, sigma=5)\n",
784 | "\n",
785 | "# Visualizing\n",
786 | "plt.figure(figsize=(7,7))\n",
787 | "plt.imshow(dist_trans_smooth, interpolation='none', cmap='viridis')\n",
788 | "plt.show()"
789 | ]
790 | },
791 | {
792 | "cell_type": "code",
793 | "execution_count": null,
794 | "metadata": {},
795 | "outputs": [],
796 | "source": [
797 | "# (iv) Retrieve the local maxima (the 'peaks') from the distance transform\n",
798 | "\n",
799 | "# Use the function 'peak_local_max' from the module 'skimage.feature'. By default, this function will return the\n",
800 | "# indices (aka coordinates) of the pixels where the local maxima are. However, we instead need a boolean mask of \n",
801 | "# the same shape as the original image, where all the local maximum pixels are labeled as `1` and everything else \n",
802 | "# as `0`. The documentation for `peak_local_max` shows how to convert the coordinates into a Boolean mask using\n",
803 | "# numpy (see the last of the examples in the docs). This is a bit of a technical detail, though, so feel free to\n",
804 | "# copy the conversion from the solutions.\n",
805 | "\n",
806 | "# Retrieve peak coordinates\n",
807 | "from skimage.feature import peak_local_max\n",
808 | "seed_coords = peak_local_max(dist_trans_smooth, min_distance=10)\n",
809 | "\n",
810 | "# Convert coords to mask as per skimage documentation\n",
811 | "seeds = np.zeros_like(dist_trans_smooth, dtype=bool)\n",
812 | "seeds[tuple(seed_coords.T)] = True"
813 | ]
814 | },
815 | {
816 | "cell_type": "code",
817 | "execution_count": null,
818 | "metadata": {},
819 | "outputs": [],
820 | "source": [
821 | "# (v) Visualize the output as an overlay on the raw (or smoothed) image\n",
822 | "\n",
823 | "# If you just look at the local maxima image, it will simply look like a bunch of distributed dots.\n",
824 | "# To get an idea if the seeds are well-placed, you will need to overlay these dots onto the original image.\n",
825 | "\n",
826 | "# To do this, it is important to first understand a key point about how the 'pyplot' module works: \n",
827 | "# every plotting command is slapped on top of the previous plotting commands, until everything is ultimately \n",
828 | "# shown when 'plt.show' is called. Hence, you can first plot the raw (or smoothed) input image and then\n",
829 | "# plot the seeds on top of it before showing both with 'plt.show'.\n",
830 | "\n",
831 | "# As you can see if you try this, you will not get the desired result because the zero values in seed array\n",
832 | "# are painted in black over the image you want in the background. To solve this problem, you need to mask \n",
833 | "# these zero values before plotting the seeds. You can do this by creating an appropriately masked array\n",
834 | "# using the function 'np.ma.array' with the keyword argument 'mask'. \n",
835 | "# Check the docs or Stack Overflow to figure out how to do this.\n",
836 | "\n",
837 | "# BONUS: As an additional improvement to the visualization, use 'ndi.maximum_filter' to dilate the \n",
838 | "# seeds a little bit, making them bigger and thus better visible.\n",
839 | "\n",
840 | "# Dilate seeds\n",
841 | "seeds_dil = ndi.maximum_filter(seeds, size=10)\n",
842 | "\n",
843 | "# Create plot\n",
844 | "plt.figure(figsize=(7,7))\n",
845 | "plt.imshow(img_smooth, interpolation='none', cmap='gray')\n",
846 | "plt.imshow(np.ma.array(seeds_dil, mask=seeds_dil==0), interpolation='none', cmap='autumn')\n",
847 | "plt.show()"
848 | ]
849 | },
850 | {
851 | "cell_type": "code",
852 | "execution_count": null,
853 | "metadata": {},
854 | "outputs": [],
855 | "source": [
856 | "# (vii) Label the seeds\n",
857 | "\n",
858 | "# Use connected component labeling to give each cell seed a unique ID number.\n",
859 | "seeds_labeled = ndi.label(seeds)[0]\n",
860 | "\n",
861 | "# Visualize the final result (the labeled seeds) as an overlay on the raw (or smoothed) image\n",
862 | "seeds_labeled_dil = ndi.maximum_filter(seeds_labeled, size=10) # Expand a bit for visualization\n",
863 | "plt.figure(figsize=(10,10))\n",
864 | "plt.imshow(img_smooth, interpolation='none', cmap='gray')\n",
865 | "plt.imshow(np.ma.array(seeds_labeled_dil, mask=seeds_labeled_dil==0), interpolation='none', cmap='prism')\n",
866 | "plt.show()"
867 | ]
868 | },
869 | {
870 | "cell_type": "markdown",
871 | "metadata": {},
872 | "source": [
873 | "### Expansion by Watershed"
874 | ]
875 | },
876 | {
877 | "cell_type": "markdown",
878 | "metadata": {},
879 | "source": [
880 | "#### Exercise Solution\n",
881 | "\n",
882 | "Expand your seeds by means of a watershed expansion."
883 | ]
884 | },
885 | {
886 | "cell_type": "code",
887 | "execution_count": null,
888 | "metadata": {},
889 | "outputs": [],
890 | "source": [
891 | "# (i) Perform watershed\n",
892 | "\n",
893 | "# Use the function 'watershed' from the module 'skimage.segmentation'.\n",
894 | "# Use the labeled cell seeds and the smoothed membrane image as input.\n",
895 | "from skimage.segmentation import watershed\n",
896 | "ws = watershed(img_smooth, seeds_labeled)"
897 | ]
898 | },
899 | {
900 | "cell_type": "code",
901 | "execution_count": null,
902 | "metadata": {},
903 | "outputs": [],
904 | "source": [
905 | "# (ii) Show the result as transparent overlay over the smoothed input image\n",
906 | "\n",
907 | "# Like the masked overlay of the seeds, this can be achieved by making two calls to 'imshow',\n",
908 | "# one for the background image and one for the segmentation. Instead of masking away background,\n",
909 | "# this time you simply make the segmentation image semi-transparent by adjusting the keyword \n",
910 | "# argument 'alpha' of the 'imshow' function, which specifies opacity.\n",
911 | "# Be sure to choose an appropriate colormap that allows you to distinguish the segmented cells\n",
912 | "# even if cells with a very similar ID are next to each other (I would recommend 'prism').\n",
913 | "plt.figure(figsize=(10,10))\n",
914 | "plt.imshow(img_smooth, interpolation='none', cmap='gray')\n",
915 | "plt.imshow(ws, interpolation='none', cmap='prism', alpha=0.4)\n",
916 | "plt.show()"
917 | ]
918 | },
919 | {
920 | "cell_type": "markdown",
921 | "metadata": {},
922 | "source": [
923 | "#### *A Note on Segmentation Quality*\n",
924 | "\n",
925 | "This concludes the segmentation of the cells in the example image. Depending on the quality you achieved in each step along the way, the final segmentation may be of greater or lesser quality (in terms of over-/under-segmentation errors).\n",
926 | "\n",
927 | "It should be noted that the segmentation will likely *never* be perfect, as there is usually a trade-off between over- and undersegmentation.\n",
928 | "\n",
929 | "This raises an important question: ***When should I stop trying to optimize my segmentation?***\n",
930 | "\n",
931 | "There is no absolute answer to this question but the best answer is probably this: ***When you can use it to address your biological questions!***\n",
932 | "\n",
933 | "*Importantly, this implies that you should already have relatively clear questions in mind when you are working on the segmentation!*"
934 | ]
935 | },
936 | {
937 | "cell_type": "markdown",
938 | "metadata": {},
939 | "source": [
940 | "## Postprocessing: Removing Cells at the Image Border "
941 | ]
942 | },
943 | {
944 | "cell_type": "markdown",
945 | "metadata": {},
946 | "source": [
947 | "#### Exercise Solution \n",
948 | "\n",
949 | "Iterate through all the cells in your segmentation and remove those touching the image border."
950 | ]
951 | },
952 | {
953 | "cell_type": "code",
954 | "execution_count": null,
955 | "metadata": {},
956 | "outputs": [],
957 | "source": [
958 | "# (i) Create an image border mask\n",
959 | "\n",
960 | "border_mask = np.zeros(ws.shape, dtype=bool)\n",
961 | "border_mask = ndi.binary_dilation(border_mask, border_value=1)"
962 | ]
963 | },
964 | {
965 | "cell_type": "code",
966 | "execution_count": null,
967 | "metadata": {},
968 | "outputs": [],
969 | "source": [
970 | "# (ii) 'Delete' the cells at the border\n",
971 | "\n",
972 | "# When modifying a segmentation (in this case by deleting some cells), it makes sense\n",
973 | "# to work on a copy of the array, not on the original. This avoids unexpected behaviors,\n",
974 | "# especially within jupyter notebooks. Use the function 'np.copy' to copy an array.\n",
975 | "clean_ws = np.copy(ws)\n",
976 | "\n",
977 | "# Iterate over the IDs of all the cells in the segmentation. Use a for-loop and the \n",
978 | "# function 'np.unique' (remember that each cell in our segmentation is labeled with a \n",
979 | "# different integer value).\n",
980 | "for cell_ID in np.unique(ws):\n",
981 | "\n",
982 | " # Create a mask that contains only the 'current' cell of the iteration\n",
983 | " # Hint: Remember that the comparison of an array with some number (array==number)\n",
984 | " # returns a Boolean mask of the pixels in 'array' whose value is 'number'.\n",
985 | " cell_mask = ws==cell_ID \n",
986 | " \n",
987 | " # Using the cell mask and the border mask from above, test if the cell has pixels touching \n",
988 | " # the image border or not.\n",
989 | " # Hint: 'np.logical_and'\n",
990 | " cell_border_overlap = np.logical_and(cell_mask, border_mask) # Overlap of cell mask and boundary mask\n",
991 | " total_overlap_pixels = np.sum(cell_border_overlap) # Sum overlapping pixels\n",
992 | "\n",
993 | " # If a cell touches the image boundary, delete it by setting its pixels in the segmentation to 0.\n",
994 | " if total_overlap_pixels > 0: \n",
995 | " clean_ws[cell_mask] = 0"
996 | ]
997 | },
998 | {
999 | "cell_type": "code",
1000 | "execution_count": null,
1001 | "metadata": {},
1002 | "outputs": [],
1003 | "source": [
1004 | "# OPTIONAL: re-label the remaining cells to keep the numbering consistent from 1 to N (with 0 as background).\n",
1005 | "\n",
1006 | "for new_ID, cell_ID in enumerate(np.unique(clean_ws)[1:]): # The [1:] excludes 0 from the list (background)!\n",
1007 | " clean_ws[clean_ws==cell_ID] = new_ID+1 # The same here for the +1"
1008 | ]
1009 | },
1010 | {
1011 | "cell_type": "code",
1012 | "execution_count": null,
1013 | "metadata": {},
1014 | "outputs": [],
1015 | "source": [
1016 | "# (iii) Visualize the result\n",
1017 | "\n",
1018 | "# Show the result as transparent overlay over the raw or smoothed image. \n",
1019 | "# Here you have to combine alpha (to make cells transparent) and 'np.ma.array'\n",
1020 | "# (to hide empty space where the border cells were deleted).\n",
1021 | "\n",
1022 | "plt.figure(figsize=(7,7))\n",
1023 | "plt.imshow(img_smooth, interpolation='none', cmap='gray')\n",
1024 | "plt.imshow(np.ma.array(clean_ws, mask=clean_ws==0), interpolation='none', cmap='prism', alpha=0.4)\n",
1025 | "plt.show()"
1026 | ]
1027 | },
1028 | {
1029 | "cell_type": "markdown",
1030 | "metadata": {},
1031 | "source": [
1032 | "## Identifying Cell Edges "
1033 | ]
1034 | },
1035 | {
1036 | "cell_type": "markdown",
1037 | "metadata": {},
1038 | "source": [
1039 | "#### Exercise Solution\n",
1040 | "\n",
1041 | "Create a labeled mask of cell edges."
1042 | ]
1043 | },
1044 | {
1045 | "cell_type": "code",
1046 | "execution_count": null,
1047 | "metadata": {},
1048 | "outputs": [],
1049 | "source": [
1050 | "# (i) Create an array of the same size and data type as the segmentation but filled with only zeros\n",
1051 | "\n",
1052 | "edges = np.zeros_like(clean_ws)"
1053 | ]
1054 | },
1055 | {
1056 | "cell_type": "code",
1057 | "execution_count": null,
1058 | "metadata": {},
1059 | "outputs": [],
1060 | "source": [
1061 | "# (ii) Iterate over the cell IDs\n",
1062 | "for cell_ID in np.unique(clean_ws)[1:]:\n",
1063 | "\n",
1064 | " # (iii) Erode the cell's mask by 1 pixel\n",
1065 | " # Hint: 'ndi.binary_erode'\n",
1066 | " cell_mask = clean_ws==cell_ID\n",
1067 | " eroded_cell_mask = ndi.binary_erosion(cell_mask, iterations=1) # Increase iterations to make boundary wider!\n",
1068 | " \n",
1069 | " # (iv) Create the cell edge mask\n",
1070 | " # Hint: 'np.logical_xor'\n",
1071 | " edge_mask = np.logical_xor(cell_mask, eroded_cell_mask)\n",
1072 | " \n",
1073 | " # (v) Add the cell edge mask to the empty array generated above, labeling it with the cell's ID\n",
1074 | " edges[edge_mask] = cell_ID"
1075 | ]
1076 | },
1077 | {
1078 | "cell_type": "code",
1079 | "execution_count": null,
1080 | "metadata": {},
1081 | "outputs": [],
1082 | "source": [
1083 | "# (vi) Visualize the result\n",
1084 | "\n",
1085 | "# Note: Because the lines are so thin (1pxl wide), they may not be displayed correctly in small figures.\n",
1086 | "# You can 'zoom in' by showing a sub-region of the image which is then rendered bigger. You can\n",
1087 | "# also go back to the edge identification code and make the edges multiple pixels wide (but keep \n",
1088 | "# in mind that this will have an effect on your quantification results!).\n",
1089 | "\n",
1090 | "plt.figure(figsize=(7,7))\n",
1091 | "plt.imshow(np.zeros_like(edges)[300:500, 300:500], cmap='gray', vmin=0, vmax=1) # Simple black background\n",
1092 | "plt.imshow(np.ma.array(edges, mask=edges==0)[300:500, 300:500], interpolation='none', cmap='prism')\n",
1093 | "plt.show()"
1094 | ]
1095 | },
1096 | {
1097 | "cell_type": "markdown",
1098 | "metadata": {},
1099 | "source": [
1100 | "## Extracting Quantitative Measurements "
1101 | ]
1102 | },
1103 | {
1104 | "cell_type": "markdown",
1105 | "metadata": {},
1106 | "source": [
1107 | "#### Exercise Solution\n",
1108 | "\n",
1109 | "Extract the measurements listed above for each cell and collect them in a dictionary.\n",
1110 | "\n",
1111 | "Note: The ideal data structure for data like this is the `DataFrame` offered by the module `Pandas`. However, for the sake of simplicty, we will here stick with a dictionary of lists."
1112 | ]
1113 | },
1114 | {
1115 | "cell_type": "code",
1116 | "execution_count": null,
1117 | "metadata": {},
1118 | "outputs": [],
1119 | "source": [
1120 | "# (i) Create a dictionary that contains a key-value pairing for each measurement\n",
1121 | "\n",
1122 | "# The keys should be strings describing the type of measurement (e.g. 'intensity_mean') and \n",
1123 | "# the values should be empty lists. These empty lists will be filled with the results of the\n",
1124 | "# measurements\n",
1125 | "\n",
1126 | "results = {\"cell_id\" : [],\n",
1127 | " \"int_mean\" : [],\n",
1128 | " \"int_mem_mean\" : [],\n",
1129 | " \"cell_area\" : [],\n",
1130 | " \"cell_edge\" : []}\n",
1131 | "\n",
1132 | "# Solution note: the spacing between the strings and colons doesn't matter for the code's\n",
1133 | "# execution. It is used solely to make the code more readable!"
1134 | ]
1135 | },
1136 | {
1137 | "cell_type": "code",
1138 | "execution_count": null,
1139 | "metadata": {},
1140 | "outputs": [],
1141 | "source": [
1142 | "# (ii) Record the measurements for each cell\n",
1143 | "\n",
1144 | "# Iterate over the segmented cells ('np.unique').\n",
1145 | "# Inside the loop, create a mask for the current cell and use it to extract the measurements listed above. \n",
1146 | "# Add them to the appropriate list in the dictionary using the 'append' method.\n",
1147 | "# Hint: Remember that you can get out all the values within a masked area by indexing the image \n",
1148 | "# with the mask. For example, 'np.mean(image[cell_mask])' will return the mean of all the \n",
1149 | "# intensity values of 'image' that are masked by 'cell_mask'!\n",
1150 | "\n",
1151 | "# Iterate over cell IDs\n",
1152 | "for cell_id in np.unique(clean_ws)[1:]:\n",
1153 | "\n",
1154 | " # Mask the current cell and cell edge\n",
1155 | " cell_mask = clean_ws==cell_id\n",
1156 | " edge_mask = edges==cell_id\n",
1157 | " \n",
1158 | " # Get the measurements\n",
1159 | " results[\"cell_id\"].append(cell_id)\n",
1160 | " results[\"int_mean\"].append(np.mean(img[cell_mask]))\n",
1161 | " results[\"int_mem_mean\"].append(np.mean(img[edge_mask]))\n",
1162 | " results[\"cell_area\"].append(np.sum(cell_mask))\n",
1163 | " results[\"cell_edge\"].append(np.sum(edge_mask))"
1164 | ]
1165 | },
1166 | {
1167 | "cell_type": "code",
1168 | "execution_count": null,
1169 | "metadata": {},
1170 | "outputs": [],
1171 | "source": [
1172 | "# (iii) Print the results and check that they make sense\n",
1173 | "\n",
1174 | "for key in results.keys(): \n",
1175 | " print(key + \":\", results[key][:5], '\\n')"
1176 | ]
1177 | },
1178 | {
1179 | "cell_type": "markdown",
1180 | "metadata": {},
1181 | "source": [
1182 | "## Simple Analysis & Visualisation "
1183 | ]
1184 | },
1185 | {
1186 | "cell_type": "markdown",
1187 | "metadata": {},
1188 | "source": [
1189 | "#### Exercise Solution\n",
1190 | "\n",
1191 | "Analyze and plot the extracted data in a variety of ways."
1192 | ]
1193 | },
1194 | {
1195 | "cell_type": "code",
1196 | "execution_count": null,
1197 | "metadata": {},
1198 | "outputs": [],
1199 | "source": [
1200 | "# (i) Familiarize yourself with the data structure of the results dict and summarize the results\n",
1201 | "\n",
1202 | "# Recall that dictionaries are unordered; a dataset of interest is accessed through its key.\n",
1203 | "# In our case, the datasets inside the dict are lists of values, ordered in the same order\n",
1204 | "# as the cell IDs. \n",
1205 | "\n",
1206 | "# For each dataset in the results dict, print its name (the key) along with its mean, standard \n",
1207 | "# deviation, maximum, minimum, and median. The appropriate numpy methods (e.g. 'np.median') work\n",
1208 | "# with lists just as well as with arrays.\n",
1209 | "\n",
1210 | "# Custom function for nice printing of summary statistics.\n",
1211 | "# Note the use of format strings for nice number padding.\n",
1212 | "def print_summary(data):\n",
1213 | " print( \" Mean: {:7.2f}\".format(np.mean(data)) )\n",
1214 | " print( \" Stdev: {:7.2f}\".format(np.std(data)) )\n",
1215 | " print( \" Max: {:7.2f}\".format(np.max(data)) )\n",
1216 | " print( \" Min: {:7.2f}\".format(np.min(data)) )\n",
1217 | " print( \" Median: {:7.2f}\".format(np.median(data)) )\n",
1218 | "\n",
1219 | "# Calling the custom function for each dataset\n",
1220 | "for key in results.keys():\n",
1221 | " print( '\\n'+key )\n",
1222 | " print_summary(results[key])\n",
1223 | " \n",
1224 | "# There are also pre-made functions to get summary statistics,\n",
1225 | "# for example 'scipy.stats.describe'.\n",
1226 | "from scipy.stats import describe\n",
1227 | "stat_summary = describe(results['int_mean'])\n",
1228 | "\n",
1229 | "print( '\\nscipy.stats.describe of int_mean' )\n",
1230 | "for key in stat_summary._asdict().keys():\n",
1231 | " print( ' ', key+': ', stat_summary._asdict()[key] )"
1232 | ]
1233 | },
1234 | {
1235 | "cell_type": "code",
1236 | "execution_count": null,
1237 | "metadata": {},
1238 | "outputs": [],
1239 | "source": [
1240 | "# (ii) Create a box plot showing the mean cell and mean membrane intensities for both channels. \n",
1241 | "\n",
1242 | "# Use the function 'plt.boxplot'. Use the 'label' keyword of 'plt.boxplot' to label the x axis with \n",
1243 | "# the corresponding key names. Feel free to play around with the various options of the boxplot \n",
1244 | "# function to make your plot look nicer. Remember that you can first call 'plt.figure' to adjust \n",
1245 | "# settings such as the size of the plot.\n",
1246 | "\n",
1247 | "plt.figure(figsize=(3,6))\n",
1248 | "plt.boxplot([results['int_mean'], results['int_mem_mean']], \n",
1249 | " labels=['int_mean', 'int_mem_mean'],\n",
1250 | " widths=0.6, notch=True)\n",
1251 | "plt.show()"
1252 | ]
1253 | },
1254 | {
1255 | "cell_type": "code",
1256 | "execution_count": null,
1257 | "metadata": {},
1258 | "outputs": [],
1259 | "source": [
1260 | "# (iii) Create a scatter plot of cell outline length over cell area\n",
1261 | "\n",
1262 | "# Use the function 'plt.scatter' for this. Be sure to properly label the \n",
1263 | "# plot using 'plt.xlabel' and 'plt.ylabel'.\n",
1264 | "\n",
1265 | "plt.figure(figsize=(8,5))\n",
1266 | "plt.scatter(results[\"cell_area\"], results[\"cell_edge\"],\n",
1267 | " edgecolor='k', s=30, alpha=0.5)\n",
1268 | "plt.xlabel(\"cell area [pxl^2]\")\n",
1269 | "plt.ylabel(\"cell edge length [pxl]\")\n",
1270 | "\n",
1271 | "# BONUS: Do you understand why you are seeing the pattern this produces? \n",
1272 | "# ->> The curve reflects how circumference scales with area!\n",
1273 | "\n",
1274 | "# Can you generate a 'null model' curve that assumes all cells to be circular? \n",
1275 | "cell_area_range = np.linspace(min(results[\"cell_area\"]), max(results[\"cell_area\"]), num=100)\n",
1276 | "circle_circumference = 2 * np.pi * np.sqrt( cell_area_range / np.pi )\n",
1277 | "plt.plot(cell_area_range, circle_circumference, color='r', alpha=0.8)\n",
1278 | "plt.legend(['circles', 'data'], loc=2, fontsize=10)\n",
1279 | "plt.show()\n",
1280 | "\n",
1281 | "# What is the result? Do you notice something odd about it? What could be the reason for\n",
1282 | "# this and how could it be fixed?\n",
1283 | "# ->> In general, the cells don't deviate all that much from the circular case.\n",
1284 | "# ->> Strangely, some cells have a smaller outline than the circumference of a circle\n",
1285 | "# of equivalent area. This is mathematically impossible.\n",
1286 | "# ->> A possible reason could be that the measures are taken in pixels, which leads\n",
1287 | "# to a so-called discretization error. It could be fixed by \"meshing\" the cell\n",
1288 | "# outline and interpolating a more accurate measurement of circumference."
1289 | ]
1290 | },
1291 | {
1292 | "cell_type": "code",
1293 | "execution_count": null,
1294 | "metadata": {},
1295 | "outputs": [],
1296 | "source": [
1297 | "# (iv) Perform a linear fit of membrane intensity over cell area\n",
1298 | "\n",
1299 | "# Use the function 'linregress' from the module 'scipy.stats'. Be sure to read the docs to\n",
1300 | "# understand the output of this function. Print the output.\n",
1301 | "\n",
1302 | "# Compute linear fit\n",
1303 | "from scipy.stats import linregress\n",
1304 | "linfit = linregress(results[\"cell_area\"], results[\"int_mem_mean\"])\n",
1305 | "\n",
1306 | "# Print all the results\n",
1307 | "linprops = ['slope','interc','rval','pval','stderr']\n",
1308 | "for index,prop in enumerate(linprops):\n",
1309 | " print( prop, '\\t', '{:4.2e}'.format(linfit[index]) )"
1310 | ]
1311 | },
1312 | {
1313 | "cell_type": "code",
1314 | "execution_count": null,
1315 | "metadata": {},
1316 | "outputs": [],
1317 | "source": [
1318 | "# (v) Think about the result\n",
1319 | "\n",
1320 | "# Note that the fit seems to return a highly significant p-value but a very low correlation \n",
1321 | "# coefficient (r-value). Based on prior knowledge, we would not expect a linear correlation of \n",
1322 | "# this sort to be present in our data. \n",
1323 | "#\n",
1324 | "# This should prompt several questions:\n",
1325 | "# 1) What does this p-value actually mean? Check the docs of 'linregress'!\n",
1326 | "# ->> This p-value only means that, given a linear fit through this data, the slope of the\n",
1327 | "# fit is very unlikely to be zero. However, it does not make a statement on whether or\n",
1328 | "# not it makes sense to use a linear fit in the first place. Looking at the scatterplot\n",
1329 | "# below or at the correlation coefficient r, it is clear that a linear fit on this data\n",
1330 | "# is not meaningful.\n",
1331 | "# ->> Note also: With single-cell approaches, we quickly get to a large number of data points. \n",
1332 | "# This makes hypothesis testing in general less useful, as p-values tend to become very\n",
1333 | "# small even if the null hypothesis holds. It makes sense to instead report effect sizes.\n",
1334 | "# This is a tricky topic but well worth reading up on.\n",
1335 | "#\n",
1336 | "# 2) Could there be artifacts in our segmentation that bias this analysis?\n",
1337 | "# ->> Oversegmentation is an important source of bias here. If a cell is oversegmented,\n",
1338 | "# it will be considered as two or three cells. These will naturally have a lower\n",
1339 | "# cell area and will naturally have a lower membrane intensity because some of their\n",
1340 | "# edges are actually not on membranes. In other words, they will fall into the bottom\n",
1341 | "# left of the plot, distorting the data.\n",
1342 | "#\n",
1343 | "# In general, it's always good to be very careful when doing any kind of data analysis. Make sure you \n",
1344 | "# understand the functions you are using and always check for possible errors or sources of bias!"
1345 | ]
1346 | },
1347 | {
1348 | "cell_type": "code",
1349 | "execution_count": null,
1350 | "metadata": {},
1351 | "outputs": [],
1352 | "source": [
1353 | "# (vi) Overlay the linear fit onto a scatter plot\n",
1354 | "\n",
1355 | "# Recall that a linear function is defined by `y = slope * x + intercept`.\n",
1356 | "\n",
1357 | "# To define the line you'd like to plot, you need two values of x (the starting point and\n",
1358 | "# and the end point of the line). What values of x make sense? Can you get them automatically?\n",
1359 | "# ->> The max and min values in the data are a good choice.\n",
1360 | "x_vals = [min(results[\"cell_area\"]), max(results[\"cell_area\"])]\n",
1361 | "\n",
1362 | "# When you have the x-values for the starting point and end point, get the corresponding y \n",
1363 | "# values from the fit through the equation above.\n",
1364 | "y_vals = [linfit[0] * x_vals[0] + linfit[1], linfit[0] * x_vals[1] + linfit[1]]\n",
1365 | "\n",
1366 | "# Plot the line with 'plt.plot'. Adjust the line's properties so it is well visible.\n",
1367 | "# Note: Remember that you have to create the scatterplot before plotting the line so that\n",
1368 | "# the line will be placed on top of the scatterplot.\n",
1369 | "plt.figure(figsize=(8,5))\n",
1370 | "plt.scatter(results[\"cell_area\"], results[\"int_mem_mean\"], \n",
1371 | " edgecolor='k', s=30, alpha=0.5)\n",
1372 | "plt.plot(x_vals, y_vals, color='red', lw=2, alpha=0.8)\n",
1373 | "\n",
1374 | "# Use 'plt.legend' to add information about the line to the plot.\n",
1375 | "plt.legend([\"linear fit, Rsq={:4.2e}\".format(linfit[2]**2.0)], frameon=False, loc=4)\n",
1376 | "\n",
1377 | "# Label the plot and finally show it with 'plt.show'.\n",
1378 | "plt.xlabel(\"cell area [pxl]\")\n",
1379 | "plt.ylabel(\"Mean membrane intensity [a.u.]\")\n",
1380 | "plt.title(\"Scatterplot with linear fit\")\n",
1381 | "plt.show()"
1382 | ]
1383 | },
1384 | {
1385 | "cell_type": "code",
1386 | "execution_count": null,
1387 | "metadata": {},
1388 | "outputs": [],
1389 | "source": [
1390 | "# (vii) Map the cell area back onto the image as a 'heatmap'\n",
1391 | "\n",
1392 | "# Scale the cell area data to 8bit so that it can be used as pixel intensity values.\n",
1393 | "# Hint: if the largest cell area should correspond to the value 255 in uint8, then \n",
1394 | "# the other cell areas correspond to 'cell_area * 255 / largest_cell_area'.\n",
1395 | "# Hint: To perform an operation on all cell area values at once, convert the list \n",
1396 | "# of cell areas to a numpy array.\n",
1397 | "areas_8bit = np.array(results[\"cell_area\"]) / max(results[\"cell_area\"]) * 255\n",
1398 | "\n",
1399 | "# Initialize a new image array; all values should be zeros, the shape should be identical \n",
1400 | "# to the images we worked with before and the dtype should be uint8.\n",
1401 | "area_map = np.zeros_like(clean_ws, dtype=np.uint8) \n",
1402 | "\n",
1403 | "# Iterate over the segmented cells. In addition to the cell IDs, the for-loop should\n",
1404 | "# also include a simple counter (starting from 0) with which the area measurement can be \n",
1405 | "# accessed by indexing.\n",
1406 | "for index, cell_id in enumerate(results[\"cell_id\"]):\n",
1407 | " \n",
1408 | " # Mask the current cell and assign the cell's (re-scaled) area value to the cell's pixels.\n",
1409 | " area_map[clean_ws==cell_id] = areas_8bit[index]\n",
1410 | "\n",
1411 | "# Visualize the result as a colored semi-transparent overlay over the raw/smoothed original input image.\n",
1412 | "# BONUS: See if you can exclude outliers to make the color mapping more informative!\n",
1413 | "\n",
1414 | "# Mask of outliers (the largest and smallest 5% of all cells)\n",
1415 | "outlier_mask = np.logical_or(area_map > np.percentile(areas_8bit, 95),\n",
1416 | " area_map < np.percentile(areas_8bit, 5))\n",
1417 | "\n",
1418 | "# Mask of all regions to leave blank (outliers + image boundary cells)\n",
1419 | "full_mask = np.logical_or(area_map==0, outlier_mask)\n",
1420 | "\n",
1421 | "# Create the plot\n",
1422 | "plt.figure(figsize=(10,10))\n",
1423 | "plt.imshow(img_smooth, interpolation='none', cmap='gray')\n",
1424 | "plt.imshow(np.ma.array(area_map, mask=full_mask),\n",
1425 | " interpolation='none', cmap='viridis', alpha=0.6)\n",
1426 | "plt.show()"
1427 | ]
1428 | },
1429 | {
1430 | "cell_type": "markdown",
1431 | "metadata": {},
1432 | "source": [
1433 | "## Writing Output to Files "
1434 | ]
1435 | },
1436 | {
1437 | "cell_type": "markdown",
1438 | "metadata": {},
1439 | "source": [
1440 | "#### Exercise Solution\n",
1441 | "\n",
1442 | "Write the generated data into a variety of different output files."
1443 | ]
1444 | },
1445 | {
1446 | "cell_type": "code",
1447 | "execution_count": null,
1448 | "metadata": {},
1449 | "outputs": [],
1450 | "source": [
1451 | "# (i) Write one or more of the images you produced to a tif file\n",
1452 | "\n",
1453 | "# Use the function 'imsave' from the 'skimage.io' module. Make sure that the array you are \n",
1454 | "# writing is of integer type. If necessary, you can use the method 'astype' for conversions, \n",
1455 | "# e.g. 'some_array.astype(np.uint8)' or 'some_array.astype(np.uint16)'. Careful when \n",
1456 | "# converting a segmentation to uint8; if there are more than 255 cells, the 8bit format\n",
1457 | "# doesn't have sufficient bit-depth to represent all cell IDs!\n",
1458 | "#\n",
1459 | "# You can also try adding the segmentation to the original image, creating an image with\n",
1460 | "# two channels, one of them being the segmentation. \n",
1461 | "#\n",
1462 | "# After writing the file, load it into Fiji and check that everything worked as intended.\n",
1463 | "\n",
1464 | "from skimage.io import imsave\n",
1465 | "imsave(\"example_cells_1_edges.tif\", edges.astype(np.uint16))"
1466 | ]
1467 | },
1468 | {
1469 | "cell_type": "code",
1470 | "execution_count": null,
1471 | "metadata": {},
1472 | "outputs": [],
1473 | "source": [
1474 | "# (ii) Write a figure to a png or pdf\n",
1475 | "\n",
1476 | "# Recreate the scatter plot from above (with or without the regression line), then save the figure\n",
1477 | "# as a png using 'plt.savefig'. Alternatively, you can also save it to a pdf, which will create a\n",
1478 | "# vector graphic that can be imported into programs like Adobe Illustrator.\n",
1479 | "\n",
1480 | "# Create plot (but don't show)\n",
1481 | "plt.scatter(results[\"cell_area\"], results[\"int_mem_mean\"], \n",
1482 | " edgecolor='k', s=30, alpha=0.5)\n",
1483 | "plt.plot(x_vals, y_vals, color='red', lw=2, alpha=0.8)\n",
1484 | "plt.legend([\"linear fit, Rsq={:4.2e}\".format(linfit[2]**2.0)], frameon=False, loc=4)\n",
1485 | "plt.xlabel(\"cell area [pxl]\")\n",
1486 | "plt.ylabel(\"Mean membrane intensity [a.u.]\")\n",
1487 | "plt.title(\"Scatterplot with linear fit\")\n",
1488 | "\n",
1489 | "# Save as png and pdf\n",
1490 | "plt.savefig('example_cells_1_scatterFit.png')\n",
1491 | "plt.savefig('example_cells_1_scatterFit.pdf')\n",
1492 | "plt.clf() # Clear the figure buffer"
1493 | ]
1494 | },
1495 | {
1496 | "cell_type": "code",
1497 | "execution_count": null,
1498 | "metadata": {},
1499 | "outputs": [],
1500 | "source": [
1501 | "# (iii) Save the segmentation as a numpy file\n",
1502 | "\n",
1503 | "# Numpy files allow fast storage and reloading of numpy arrays. Use the function 'np.save'\n",
1504 | "# to save the array and reload it using 'np.load'.\n",
1505 | "\n",
1506 | "np.save(\"example_cells_1_seg\", clean_ws) # Save\n",
1507 | "seg = np.load(\"example_cells_1_seg.npy\") # Load\n",
1508 | "print(clean_ws.shape, seg.shape)"
1509 | ]
1510 | },
1511 | {
1512 | "cell_type": "code",
1513 | "execution_count": null,
1514 | "metadata": {},
1515 | "outputs": [],
1516 | "source": [
1517 | "# (iv) Save the result dictionary as a pickle file\n",
1518 | "\n",
1519 | "# Pickling is a way of generating generic files from almost any python object, which can easily\n",
1520 | "# be reloaded into python at a later point in time.\n",
1521 | "# You will need to open an empty file object using 'open' in write mode ('w'). It's best to do \n",
1522 | "# so using the 'with'-statement (context manager) to make sure that the file object will be closed\n",
1523 | "# automatically when you are done with it.\n",
1524 | "# Use the function 'pickle.dump' from the 'pickle' module to write the results to the file.\n",
1525 | "# Hint: Refer to the python documention for input and output to understand how file objects are\n",
1526 | "# handled in python in general.\n",
1527 | "\n",
1528 | "import pickle\n",
1529 | "with open('example_cells_1_results.pkl','wb') as outfile:\n",
1530 | " pickle.dump(results, outfile)\n",
1531 | "\n",
1532 | "# Note: Pickled files can be re-loaded again as follows:\n",
1533 | "with open('example_cells_1_results.pkl', 'rb') as infile:\n",
1534 | " results_reloaded = pickle.load(infile)\n",
1535 | " print(results_reloaded.keys())"
1536 | ]
1537 | },
1538 | {
1539 | "cell_type": "code",
1540 | "execution_count": null,
1541 | "metadata": {},
1542 | "outputs": [],
1543 | "source": [
1544 | "# (v) Write a tab-separated text file of the results dict\n",
1545 | "\n",
1546 | "# The most generic way of saving numeric results is a simple text file. It can be imported into \n",
1547 | "# pretty much any other program.\n",
1548 | "\n",
1549 | "# To write normal text files, open an empty file object in write mode ('w') using the 'with'-statement.\n",
1550 | "with open('example_cells_1_results.txt','w') as outfile:\n",
1551 | "\n",
1552 | " # Use the 'file_object.write(string)' method to write strings to the file, one line at a time,\n",
1553 | " # First, write the header of the data (the result dict keys), separated by tabs ('\\t'). \n",
1554 | " # It makes sense to first generate a complete string with all the headers and then write this \n",
1555 | " # string to the file as one line. Note that you will need to explicitly write 'newline' characters \n",
1556 | " # ('\\n') at the end of the line to switch to the next line.\n",
1557 | " # Hint: the string method 'join' is very useful here!\n",
1558 | " header_string = '\\t'.join(results.keys()) + '\\n'\n",
1559 | " outfile.write(header_string)\n",
1560 | "\n",
1561 | " # After writing the headers, iterate over all the cells and write the result data to the file line\n",
1562 | " # by line, by creating strings similar to the header string.\n",
1563 | " for index in range(len(results['cell_id'])):\n",
1564 | " data_string = '\\t'.join([str(results[key][index]) for key in results.keys()]) + '\\n'\n",
1565 | " outfile.write(data_string)\n",
1566 | "\n",
1567 | "# After writing the data, have a look at the output file in a text editor or in a spreadsheet\n",
1568 | "# program like Excel."
1569 | ]
1570 | },
1571 | {
1572 | "cell_type": "markdown",
1573 | "metadata": {},
1574 | "source": [
1575 | "## Batch Processing "
1576 | ]
1577 | },
1578 | {
1579 | "cell_type": "markdown",
1580 | "metadata": {},
1581 | "source": [
1582 | "#### Exercise Solution\n",
1583 | "\n",
1584 | "Convert the pipeline to a single function in a python script.\n",
1585 | "\n",
1586 | "Import that function and write the code necessary to iterate over all input files in a folder, apply the function to each file, and collect the results."
1587 | ]
1588 | },
1589 | {
1590 | "cell_type": "markdown",
1591 | "metadata": {},
1592 | "source": [
1593 | "#### Solution Note\n",
1594 | "\n",
1595 | "The converted version of the pipeline can be found in `batch_processing_solution.py`. \n",
1596 | "\n",
1597 | "Note that the most of the exercise comments have been removed so it doesn't look too cluttered. However, this level of clean-up is probably a bit extreme; it is generally recommended to retain at least basic comments on the purpose of each code block. \n",
1598 | "\n",
1599 | "Also note that a doc string was added to the function definition (the string designated with three `\"\"\"` at the start and end, directly under the function definition). This is a very useful reference, since it is automatically recognized as a help message by Jupyter notebook (and other IDEs), so you can easily double-check what an imported function does, for example by typing `run_pipeline?` in a code cell."
1600 | ]
1601 | },
1602 | {
1603 | "cell_type": "code",
1604 | "execution_count": null,
1605 | "metadata": {},
1606 | "outputs": [],
1607 | "source": [
1608 | "# (i) Test if your pipeline function actually works\n",
1609 | "\n",
1610 | "# Import your function using the normal python syntax for imports, like this:\n",
1611 | "# from your_module import your_function\n",
1612 | "# Run the function and visualize the resulting segmentation. Make sure everything works as intended.\n",
1613 | "\n",
1614 | "from batch_processing_solution import run_pipeline\n",
1615 | "pip_seg, pip_results = run_pipeline(r\"example_data\", r'example_cells_1.tif')\n",
1616 | "\n",
1617 | "plt.figure(figsize=(7,7))\n",
1618 | "plt.imshow(np.zeros_like(pip_seg), interpolation='none', cmap='gray', vmax=1) # Simple black background\n",
1619 | "plt.imshow(np.ma.array(pip_seg, mask=pip_seg==0), interpolation='none', cmap='prism')\n",
1620 | "plt.show()"
1621 | ]
1622 | },
1623 | {
1624 | "cell_type": "code",
1625 | "execution_count": null,
1626 | "metadata": {},
1627 | "outputs": [],
1628 | "source": [
1629 | "# (ii) Get all relevant filenames from the input directory\n",
1630 | "\n",
1631 | "# Use the function 'listdir' from the module 'os' to get a list of all the files\n",
1632 | "# in a directory. Find a way to filter out only the relevant input files, namely\n",
1633 | "# \"example_cells_1.tif\" and \"example_cells_2.tif\". Of course, one would usually\n",
1634 | "# do this for many more images, otherwise it's not worth the effort.\n",
1635 | "# Hint: Loop over the filenames and use if statements to decide which ones to \n",
1636 | "# keep and which ones to throw away.\n",
1637 | "\n",
1638 | "# Get all files\n",
1639 | "dirpath = r\"example_data\"\n",
1640 | "from os import listdir\n",
1641 | "filelist = listdir(dirpath)\n",
1642 | "\n",
1643 | "# Filter for target files: simple option\n",
1644 | "# Note that this will use ALL files with a .tif ending, which in some circumstances\n",
1645 | "# may include files that are not supposed to be used!\n",
1646 | "target_files = []\n",
1647 | "for fname in filelist:\n",
1648 | " if fname.endswith('.tif'):\n",
1649 | " target_files.append(fname)\n",
1650 | "print(target_files)\n",
1651 | "\n",
1652 | "# Filter for target files: advanced option using regex and a list comprehension\n",
1653 | "import re\n",
1654 | "target_pattern = re.compile(\"^example_cells_\\d+\\.tif$\")\n",
1655 | "target_files = [fname for fname in filelist if target_pattern.match(fname)]\n",
1656 | "print(target_files)"
1657 | ]
1658 | },
1659 | {
1660 | "cell_type": "code",
1661 | "execution_count": null,
1662 | "metadata": {},
1663 | "outputs": [],
1664 | "source": [
1665 | "# (iii) Iterate over the relevant input filenames and run the pipeline function\n",
1666 | "\n",
1667 | "# Be sure to collect the output of the pipeline function in a way that allows\n",
1668 | "# you to trace it back to the file it came from. You could for example use a\n",
1669 | "# dictionary with the filenames as keys.\n",
1670 | "\n",
1671 | "# Initialize empty dictionaries\n",
1672 | "all_seg = {}\n",
1673 | "all_results = {}\n",
1674 | "\n",
1675 | "# Iterate over files and run pipeline for each\n",
1676 | "for fname in target_files:\n",
1677 | " pip_seg, pip_results = run_pipeline(dirpath, fname)\n",
1678 | " all_seg[fname] = pip_seg\n",
1679 | " all_results[fname] = pip_results"
1680 | ]
1681 | },
1682 | {
1683 | "cell_type": "code",
1684 | "execution_count": null,
1685 | "metadata": {},
1686 | "outputs": [],
1687 | "source": [
1688 | "# (iv) Recreate one of the scatterplots from above, but this time with all the cells\n",
1689 | "\n",
1690 | "# You can color-code the dots to indicate which file they came from. Don't forget to\n",
1691 | "# add a corresponding legend.\n",
1692 | "\n",
1693 | "plt.figure(figsize=(8,5))\n",
1694 | "\n",
1695 | "colors = ['blue','red']\n",
1696 | "for key, color in zip(sorted(all_results.keys()), colors):\n",
1697 | " plt.scatter(all_results[key][\"cell_area\"], all_results[key][\"cell_edge\"],\n",
1698 | " edgecolor='k', c=color, s=30, alpha=0.5, label=key)\n",
1699 | " \n",
1700 | "plt.legend()\n",
1701 | "plt.xlabel(\"cell area [pxl^2]\")\n",
1702 | "plt.ylabel(\"cell edge length [pxl]\")\n",
1703 | "plt.show()"
1704 | ]
1705 | },
1706 | {
1707 | "cell_type": "markdown",
1708 | "metadata": {},
1709 | "source": [
1710 | "## *This is the end of the tutorial!*\n",
1711 | "\n",
1712 | "**We hope you enjoyed the ride and learned a lot!**"
1713 | ]
1714 | },
1715 | {
1716 | "cell_type": "markdown",
1717 | "metadata": {},
1718 | "source": [
1719 | "### Concluding Remarks\n",
1720 | "\n",
1721 | "It's important to remember that the phrase ***\"Use it or loose it!\"*** fully applies for the skills taught in this tutorial.\n",
1722 | "\n",
1723 | "If you now just go back to the lab and don't touch python or image analysis for the next half year, most of the things you have learned here will be lost.\n",
1724 | "\n",
1725 | "So, what can you do?\n",
1726 | "\n",
1727 | "\n",
1728 | "- If possible, start applying what you have learned to your own work right away\n",
1729 | "\n",
1730 | "\n",
1731 | "- Even if your current work doesn't absolutely *need* coding / image analysis (which to be honest is hard to believe! ;p), you can still use it at least to make some nice plots!\n",
1732 | "\n",
1733 | "\n",
1734 | "- Another very good approach is to find yourself an interesting little side project you can play around with\n",
1735 | "\n",
1736 | "\n",
1737 | "- Of course, there is still much more to learn and the internet happens to be full of excellent tutorials!\n",
1738 | " - As a starting point, have a look at [Bio-IT's curated list of tutorials](https://bio-it.embl.de/coding-club/curated-tutorials/)"
1739 | ]
1740 | },
1741 | {
1742 | "cell_type": "markdown",
1743 | "metadata": {},
1744 | "source": [
1745 | "***We wish you the best of luck for all your coding endeavors!***"
1746 | ]
1747 | }
1748 | ],
1749 | "metadata": {
1750 | "kernelspec": {
1751 | "display_name": "Python [conda env:py39_tutorial_test]",
1752 | "language": "python",
1753 | "name": "conda-env-py39_tutorial_test-py"
1754 | },
1755 | "language_info": {
1756 | "codemirror_mode": {
1757 | "name": "ipython",
1758 | "version": 3
1759 | },
1760 | "file_extension": ".py",
1761 | "mimetype": "text/x-python",
1762 | "name": "python",
1763 | "nbconvert_exporter": "python",
1764 | "pygments_lexer": "ipython3",
1765 | "version": "3.9.16"
1766 | }
1767 | },
1768 | "nbformat": 4,
1769 | "nbformat_minor": 2
1770 | }
1771 |
--------------------------------------------------------------------------------
/ipynb_images/adaptive_bg_1D.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WhoIsJack/python-bioimage-analysis-tutorial/1d2473994c0151d8b83f0385f007425ad4c7a055/ipynb_images/adaptive_bg_1D.png
--------------------------------------------------------------------------------
/ipynb_images/distance_transform.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WhoIsJack/python-bioimage-analysis-tutorial/1d2473994c0151d8b83f0385f007425ad4c7a055/ipynb_images/distance_transform.png
--------------------------------------------------------------------------------
/ipynb_images/fig_gen.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | """
3 | Created on Mon May 01 00:30:59 2017
4 |
5 | @author: Jonas Hartmann @ Gilmour group @ EMBL Heidelberg
6 |
7 | @descript: Quick & dirty script to generate illustrations for the python image
8 | analysis course's tutorial pipeline.
9 | """
10 |
11 | # IMPORTS
12 |
13 | import numpy as np
14 | import scipy.ndimage as ndi
15 | import matplotlib.pyplot as plt
16 |
17 |
18 | # GAUSSIAN KERNEL GRID
19 |
20 | # Create the Gaussian kernel
21 | a = np.zeros((11,11),dtype=np.uint8)
22 | a[5,5] = 255
23 | a = ndi.gaussian_filter(a,2)
24 |
25 | # Generate figure
26 | pig,ax = plt.subplots()
27 | ax.matshow(a,cmap='Blues',vmax=12)
28 |
29 | # Add the labels
30 | for (i, j), z in np.ndenumerate(a*10):
31 | ax.text(j, i, z, ha='center', va='center')
32 |
33 | # Cosmetics, saving and showing
34 | plt.axis('off')
35 | #plt.savefig('gaussian_kernel_grid.png')
36 | #plt.show()
37 |
38 | # Clean
39 | plt.clf()
40 | plt.close()
41 |
42 |
43 | # 1D ADAPTIVE THRESHOLDING
44 |
45 | # Create 1D 'membrane' data
46 | a = np.zeros(100) + 10
47 | a[35] = 1000
48 | a[65] = 1000
49 | a = ndi.gaussian_filter(a,2)
50 |
51 | # Create adaptive background
52 | b = ndi.uniform_filter(a,size=20)
53 |
54 | # Plot stuff
55 | plt.plot(a)
56 | plt.plot(b,c='r')
57 | plt.ylim([-10,270])
58 |
59 | # Label, save and show
60 | plt.legend(['Raw 1D Membrane Signal','Adaptive Background'])
61 | plt.xlabel('space [pixels]')
62 | plt.ylabel('intensity [a.u.]')
63 | #plt.savefig('adaptive_bg_1D.png')
64 | #plt.show()
65 |
66 | # Clean
67 | plt.clf()
68 | plt.close()
69 |
70 |
71 | # UNIFORM KERNEL WITH STRUCTURING ELEMENT
72 |
73 | # Create data
74 | i = 11
75 | a = (np.mgrid[:i+2,:i+2][0] - np.floor(i/2) - 1)**2 + (np.mgrid[:i+2,:i+2][1] - np.floor(i/2) - 1)**2 <= np.floor(i/2)**2
76 |
77 | # Generate figure
78 | pig,ax = plt.subplots()
79 | ax.matshow(a,cmap='Blues',vmax=2)
80 |
81 | # Add the labels
82 | for (i, j), z in np.ndenumerate(a*1):
83 | ax.text(j, i, z, ha='center', va='center')
84 |
85 | # Cosmetics, saving and showing
86 | plt.axis('off')
87 | #plt.savefig('uniform_filter_SE.png')
88 | #plt.show()
89 |
90 | # Clean
91 | plt.clf()
92 | plt.close()
93 |
94 |
95 | # DISTANCE TRANSFORM
96 |
97 | # Create data
98 | a = np.zeros((16,28),dtype=np.uint8)
99 | a[6:10,6:10] = 255
100 | a[6:10,18:22] = 255
101 | a = ndi.gaussian_filter(a,3)
102 | a = a >= 9
103 |
104 | # Distance transform
105 | b = ndi.distance_transform_edt(a)
106 |
107 | # Generate figure
108 | pig,ax = plt.subplots(1,2)
109 | ax[0].matshow(a,cmap='Blues',vmax=2)
110 | ax[1].matshow(b,cmap='Blues')
111 |
112 | # Add the labels
113 | for (i, j), z in np.ndenumerate(a.astype(np.uint8)):
114 | ax[0].text(j, i, z, ha='center', va='center')
115 | for (i, j), z in np.ndenumerate(b.astype(np.uint8)):
116 | ax[1].text(j, i, z, ha='center', va='center')
117 |
118 | # Cosmetics
119 | ax[0].axis('off')
120 | ax[1].axis('off')
121 | ax[0].set_title('Boolean Mask')
122 | ax[1].set_title('Distance Transform')
123 | plt.tight_layout()
124 |
125 | ## Saving
126 | #manager = plt.get_current_fig_manager()
127 | #manager.window.showMaximized()
128 | #plt.savefig('distance_transform.png')
129 |
130 | # Showing
131 | #plt.show()
132 |
133 | # Clean
134 | plt.clf()
135 | plt.close()
136 |
137 |
138 | # WATERSHED 1D ILLUSTRATION
139 |
140 | # Create 1D 'membrane' data
141 | a = np.zeros(150,dtype=np.uint8)
142 | a[[25,50,90,125]] = [150,180,80,120]
143 | a = ndi.gaussian_filter(a,2)
144 | a = (a.astype(np.float) / float(a.max()) * 200.0).astype(np.uint8) + 10
145 |
146 | # Create seed data
147 | b = (np.array([10,38,70,110,140]),np.array([10,10,10,10,10]))
148 |
149 | # Three watershed steps
150 | w1 = np.ones_like(a) + 70
151 | w2 = np.ones_like(a) + 140
152 | w3 = np.ones_like(a) + 240
153 |
154 | # Plotting function
155 | def plot_stuff(ax,l1=None,l2=None):
156 |
157 | # Plot intensity
158 | ax.plot(a,lw=2,label=l1,color='#6AADD5')
159 | ax.fill_between(np.arange(a.shape[0]),a,color='#F7FBFF')
160 |
161 | # Plot seeds
162 | ax.scatter(b[0],b[1],label=l2,color='#C71B11',zorder=3,s=30,marker='D')
163 |
164 | # Cosmetics
165 | ax.set_ylim([0,255])
166 | ax.set_xlim([0,149])
167 |
168 | # Done
169 | return ax
170 |
171 | # Make the figure
172 | pig,ax = plt.subplots(2,2,sharex=True,sharey=True)
173 |
174 | # Add plot before watershed
175 | ax[0,0] = plot_stuff(ax[0,0],l1='membrane signal',l2='seeds')
176 | ax[0,0].legend()
177 | ax[0,0].set_title("watershed level 0",fontsize=14)
178 |
179 | # Add plot with watershed at 70
180 | ax[0,1].fill_between(np.arange(w1.shape[0]),w1,color='#0B559F',
181 | label='watershed')
182 | ax[0,1] = plot_stuff(ax[0,1])
183 | ax[0,1].legend()
184 | ax[0,1].set_title("watershed level 70",fontsize=14)
185 |
186 | # Add plot with watershed at 140
187 | ax[1,0].fill_between(np.arange(w2.shape[0]),w2,color='#0B559F')
188 | ax[1,0] = plot_stuff(ax[1,0])
189 | ax[1,0].vlines(90,0,255,lw=2,color='#C71B11',zorder=3)
190 | ax[1,0].set_title("watershed level 140",fontsize=14)
191 |
192 | # Add plot with watershed at 240
193 | ax[1,1].fill_between(np.arange(w3.shape[0]),w3,color='#0B559F')
194 | ax[1,1] = plot_stuff(ax[1,1])
195 | ax[1,1].vlines([25,50,90,125],0,255,lw=2,color='#C71B11',zorder=3,
196 | label='final cell boundaries')
197 | ax[1,1].legend()
198 | ax[1,1].set_title("watershed level 240",fontsize=14)
199 |
200 | # General labels
201 | pig.text(0.5, 0.02, 'space [pixels]', ha='center', fontsize=14)
202 | pig.text(0.04, 0.5, 'intensity [a.u.]', va='center', rotation='vertical',
203 | fontsize=14)
204 |
205 | ## Saving
206 | #manager = plt.get_current_fig_manager()
207 | #manager.window.showMaximized()
208 | #plt.savefig('watershed_illustration.png')
209 |
210 | # Tighten layout and show
211 | plt.tight_layout()
212 | #plt.show()
213 |
214 | # Clean
215 | plt.clf()
216 | plt.close()
217 |
218 |
219 |
220 |
--------------------------------------------------------------------------------
/ipynb_images/gaussian_kernel_grid.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WhoIsJack/python-bioimage-analysis-tutorial/1d2473994c0151d8b83f0385f007425ad4c7a055/ipynb_images/gaussian_kernel_grid.png
--------------------------------------------------------------------------------
/ipynb_images/uniform_filter_SE.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WhoIsJack/python-bioimage-analysis-tutorial/1d2473994c0151d8b83f0385f007425ad4c7a055/ipynb_images/uniform_filter_SE.png
--------------------------------------------------------------------------------
/ipynb_images/watershed_illustration.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WhoIsJack/python-bioimage-analysis-tutorial/1d2473994c0151d8b83f0385f007425ad4c7a055/ipynb_images/watershed_illustration.png
--------------------------------------------------------------------------------