├── README.md
├── main_tutorial
    ├── .ipynb_checkpoints
    │   └── pipeline-checkpoint.ipynb
    ├── README.md
    ├── example_cells_1.tif
    ├── example_cells_2.tif
    ├── tutorial_pipeline.py
    ├── tutorial_pipeline_batch.py
    ├── tutorial_pipeline_batch_partial_solutions.py
    ├── tutorial_pipeline_batch_solutions.py
    ├── tutorial_pipeline_partial_solutions.py
    └── tutorial_pipeline_solutions.py
├── optional_advanced_content
    ├── Multiprocessing
    │   ├── batch_multiprocessing.py
    │   ├── example_cells_1.tif
    │   ├── example_cells_2.tif
    │   └── example_multiprocessing.py
    ├── Vectorization
    │   ├── example_cells_1.tif
    │   ├── example_cells_1_segmented.npy
    │   └── example_vectorization.py
    ├── cluster_computation
    │   ├── README.md
    │   ├── batch_cluster.py
    │   ├── example_cells_1.tif
    │   ├── example_cells_2.tif
    │   └── example_cluster.py
    └── data_analysis
    │   ├── example_cells_1.tif
    │   ├── example_cells_1_green.npy
    │   ├── example_cells_1_segmented.npy
    │   ├── example_data_analysis.py
    │   └── example_data_analysis_solutions.py
└── pre_tutorial
    ├── .ipynb_checkpoints
        ├── Short tutorial on functions-checkpoint.ipynb
        ├── arrays-and-numpy-checkpoint.ipynb
        ├── pre-tutorial-checkpoint.ipynb
        ├── pre_tutorial-checkpoint.ipynb
        └── tutorial-on-functions-checkpoint.ipynb
    ├── README.md
    ├── arrays-and-numpy.ipynb
    ├── ext_nuc_AP2_beta_subunit.tif
    ├── figA.jpeg
    ├── figC.png
    ├── figD.png
    ├── figE.png
    ├── figF.png
    ├── module_example.py
    ├── nuclei.png
    ├── nuclei.txt
    ├── pre-tutorial.ipynb
    ├── randimg.txt
    ├── results.txt
    └── tutorial-on-functions.ipynb


/README.md:
--------------------------------------------------------------------------------
  1 | Python Workshop - Image Processing
  2 | ===================================
  3 | 
  4 | **Please note that a new and improved version of the materials for this course is available [here](https://github.com/WhoIsJack/python-workshop-image-processing)!**
  5 | 
  6 | 
  7 | ## Course Aims and Overview
  8 | 
  9 | This course teaches the basics of bio-image processing, segmentation and analysis in python. It is based on tutorials that integrate explanations and exercises, enabling participants to build their own image analysis pipeline step by step.
 10 | 
 11 | The `main_tutorial` uses single-cell segmentation of a confocal fluorescence microscopy image to illustrate key concepts from preprocessing to segmentation to data analysis. It includes a tutorial on how to apply such a pipeline to multiple images at once (batch processing).
 12 | 
 13 | The main tutorial is complemented by the `pre-tutorial`, which reviews some basic python concepts using as an example a rat fibroblast image, and by the `optional_advanced_content`, which features further examples and tutorials on the topics of vectorization, multiprocessing, cluster computation and advanced data analysis.
 14 | 
 15 | This course is aimed at people with basic to intermediate knowledge of python and basic knowledge of microscopy. For people with basic knowledge of image processing, the tutorials can be followed without attending the lectures.
 16 | 
 17 | 
 18 | ## Instructions on following this course
 19 | 
 20 | - If you have only very basic knowledge of python or if you are feeling a little rusty, you should start with the `pre-tutorial`, which includes three tutorials: one on numpy arrays, one on python functions and one on the basics of interacting with image data in Python. If you are more experienced, you may want to skim or skip the pre-tutorial.
 21 | Note: The pre-tutorial is organized as an iPython notebook.
 22 | 
 23 | - In the `main_tutorial`, it is recommended to follow the `tutorial_pipeline` first. By following the exercises, you should be able to implement your own segmentation pipeline. If you run into trouble, you can use the provided solutions as inspiration - however, it is *highly* recommended to spend a lot of time figuring things out yourself, as this is an important part of any programming exercise. If you are having a lot of trouble, you may want to use the `partial_solutions`, which give you some help yet still demand that you think about it yourself. After completing the segmentation pipeline, you can follow the `tutorial_pipeline_batch` to learn how to run your program for several images and collect all the results.
 24 | Note: The main tutorial is organized simply as comments in an empty script. It is up to you to fill in the appropriate code.
 25 | 
 26 | - Finally, the `advanced_content` contains an introductory example to three important techniques for making making your scripts faster and operating on large datasets, namely *vectorization*, *multiprocessing* and *cluster processing*. The `data_analysis` tutorial (currently in *BETA*!) is an introduction to piping segmentation results into more advanced statistical data analysis, including *feature extraction*, *PCA*, *clustering* and *graph-based analysis*.
 27 | 
 28 | 
 29 | ## Concepts discussed in course lectures
 30 | 
 31 | 1. **Basic Python (KS)**
 32 | 	* Importing packages and modules
 33 | 	* Reading files
 34 | 	* Data and variable types
 35 | 	* Importing data
 36 | 	* Storing data in variables
 37 | 	* Defining and using functions
 38 | 	* Arrays, indexing, slicing
 39 | 	* Control flow
 40 | 	* Plotting images
 41 | 	* Debugging by printing
 42 | 	* Output formatting and writing files
 43 | 	* Using the documentation
 44 | 
 45 | 
 46 | 2. **Basics of BioImage Processing (KM)**
 47 | 	* Images as numbers
 48 | 		* Bit/colour depth
 49 | 		* Colour maps and look up tables 
 50 | 	* Definition of Bio-image Analysis
 51 | 		* Image Analysis definition for signal processing science 
 52 | 		* Image Analysis definition for biology
 53 | 		* Algorithms and Workflows
 54 | 		* Typical workflows in biology
 55 | 	* Convolution and Filtering
 56 | 		* Why do we do filtering?
 57 | 		* Convolution in 1D, 2D and 3D 
 58 | 	* Pre-segmentation filtering
 59 | 		* De-noising
 60 | 		* Smoothing
 61 | 		* Unsharp mask
 62 | 	* Post-segmentation filtering
 63 | 		* Tuning segmented structures
 64 | 		* Mathematical morphology, erosion, dilation
 65 | 			* Distance map 
 66 | 			* Watershed
 67 | 
 68 | 3. **Introduction to the Tutorial Pipeline (JH)**
 69 | 	* Automated Single-Cell Segmentation
 70 | 		* Why? (Advantages of single-cell approaches)
 71 | 		* How? (Standard segmentation pipeline build)
 72 | 			* Preprocessing (smoothing, background subtraction)
 73 | 			* Presegmentation (thresholding, seed detection)
 74 | 			* Segmentation (seed expansion; e.g. watershed)
 75 | 			* Postprocessing (removing artefacts, refining segmentation)
 76 | 			* Quantification and analysis
 77 | 		* What? (for the main tutorial: 2D spinning disc confocal fluorescence microscopy images of Zebrafish embryonic cells)
 78 | 		* Who? (YOU!)
 79 | 
 80 | 3. **Advanced material**
 81 | 	* CellProfiler to automate image analysis workflows and python plugin module **(VH)**
 82 | 	* Code Optimisation (vectorisation, multiprocessing, cluster processing) & advanced data analysis **(JH)**
 83 | 
 84 | 		
 85 | ## Instructors
 86 | 
 87 | - Karin Sasaki
 88 |     - EMBL Centre for Biological Modelling
 89 |     - Organiser of course, practical materials preparation, tutor, TA
 90 | - Jonas Hartmann
 91 |     - Gilmour Lab, CBB, EMBL
 92 |     - Pipeline developer, practical materials preparation, tutor, TA
 93 | - Kora Miura
 94 |     - EMBL Centre for Molecular and Cellular Imaging
 95 |     - Tutor
 96 | - Volker Hilsenstein
 97 |     - Scientific officer at the ALMF
 98 |     - Tutor, TA (image processing)
 99 | - Toby Hodges
100 |     - Bio-IT, EMBL
101 |     - TA (python)
102 | - Aliaksandr Halavatyi
103 |     - Postdoc at the Pepperkik group
104 |     - TA (programming/image processing)
105 | - Imre Gaspar
106 |     - Staff scientists at the Ephrussi group
107 |     - TA (programming/image processing)
108 | 
109 | 
110 | ## Feedback 
111 | 
112 | We welcome any feedback on this course! 
113 | 
114 | Feel free to contact us at *karin.sasaki@embl.de* or *jonas.hartmann@embl.de*.
115 | 


--------------------------------------------------------------------------------
/main_tutorial/.ipynb_checkpoints/pipeline-checkpoint.ipynb:
--------------------------------------------------------------------------------
 1 | {
 2 |  "cells": [
 3 |   {
 4 |    "cell_type": "markdown",
 5 |    "metadata": {},
 6 |    "source": [
 7 |     "## SECTION 0 - SET UP"
 8 |    ]
 9 |   },
10 |   {
11 |    "cell_type": "markdown",
12 |    "metadata": {},
13 |    "source": [
14 |     "1. Make sure that all your python and image files are in the same directory.\n",
15 |     "\n",
16 |     "2. Remember that you can develop this pipeline using a) a simple text editor and running it on the terminal, b) using Spyder or c) a Jupyter notebook. For all of them, you need to navigate to the directory where all your files are saved.\n",
17 |     "    \n",
18 |     "    * On the terminal, type cd dir_path, replacing dir_path for the path of the directory\n",
19 |     "    * On Spyder and Jupyter nootebook it can be done interactively.\n"
20 |    ]
21 |   },
22 |   {
23 |    "cell_type": "code",
24 |    "execution_count": null,
25 |    "metadata": {
26 |     "collapsed": true
27 |    },
28 |    "outputs": [],
29 |    "source": []
30 |   }
31 |  ],
32 |  "metadata": {
33 |   "kernelspec": {
34 |    "display_name": "Python 2",
35 |    "language": "python",
36 |    "name": "python2"
37 |   },
38 |   "language_info": {
39 |    "codemirror_mode": {
40 |     "name": "ipython",
41 |     "version": 2
42 |    },
43 |    "file_extension": ".py",
44 |    "mimetype": "text/x-python",
45 |    "name": "python",
46 |    "nbconvert_exporter": "python",
47 |    "pygments_lexer": "ipython2",
48 |    "version": "2.7.11"
49 |   }
50 |  },
51 |  "nbformat": 4,
52 |  "nbformat_minor": 0
53 | }
54 | 


--------------------------------------------------------------------------------
/main_tutorial/README.md:
--------------------------------------------------------------------------------
 1 | ## README on Main Tutorial
 2 | 
 3 | ### DESCRIPTION
 4 | This is a tutorial to exemplify fundamental concepts of automated image processing and segmentation, using python.
 5 | 
 6 | This course assumes a basic knowledge of the Python Programming Language. For those at EMBL, this means that you have participated in *a* beginners course for programming, preferably a Python course.
 7 | 
 8 | 
 9 | ### TASK
10 | Segmentation of 2D spinning-disc confocal fluorescence microscopy images of a membrane marker in Zebrafish early embryonic cells.
11 | 
12 | 
13 | ### REQUIREMENTS
14 | - Python 2.7 (we recommend the Anaconda distribution, which includes most of the required modules)
15 | - Modules: NumPy, SciPy, scikit-image, tifffile
16 | - A text/code editor 
17 | 
18 | 
19 | ### HOW TO FOLLOW THIS TUTORIAL
20 | - Files you should have:
21 | 	- `tutorial_pipeline.py`; the tutorial Python script
22 | 	- `tutorial_pipeline_batch.py`; the tutorial introducing batch processing
23 | 	- The corresponding solutions and if desired partial solutions
24 | 	- `example_cells_1.tif` and `example_cells_2.tif`; these images are dual-color spinning-disc confocal confocal micrographs (40x) of two membrane-localised proteins during zebrafish early embryonic  development (~10hpf). They have 2 colors and are 930 by 780 pixels.
25 | 
26 | - The tutorial is self explanatory, indicating the steps to be taken next, what you should aim at achieving after carrying out those steps and the commands that could be used (there is not a one correct answer; these are suggestions).
27 | 
28 | - With this exercise we want to encourage you to become a more independent programmer, so if there is a command you don't quite know how to use, make sure you read the documentation. To do so, most of the time it's enough to google for "python" and the name of the module or function you want to use.
29 | 
30 | - If you are following this tutorial in class, if you have any questions, raise your hand and someone will come to help you. Otherwise, feel free to send your query to one of these two email addresses:
31 | *jonas.hartmann@embl.de*
32 | *karin.sasaki@embl.de*
33 | 
34 | ### IMAGE PROCESSING CONCEPTS DISCUSSED IN THIS TUTORIAL
35 | - Loading and visualising images
36 | - Images are arrays of numbers; they can be indexed, sliced, etc...
37 | - Images contain 3 types of information: Intensity, Shape, Size (a good segmentation pipeline uses them all)
38 | - Preprocessing: smoothing, background substraction
39 | - Segmentation: adaptive thresholding, distance transformation, detection of maxima, watershed
40 | - Filtering: Discarding undesired objects, e.g. cells at the image boundaries
41 | - Analysis: Extracting measurements from segmentation
42 | - Saving output (segmentation, data, graphs)
43 | - Automation: How to apply a pipeline to all files in a directory (batch processing)
44 | 
45 | ### PROGRAMMING CONCEPTS DISCUSSED/TRAINED IN THIS TUTORIAL
46 | - Python scripts, functions
47 | - Common variable types: numpy array, dictionaries
48 | - Control flow
49 | - Modules, packages, importing modules and packages and using them
50 | - Importing data
51 | - Using the documentation
52 | - Arrays and manipulation (dimensions, indexing, slicing, arithmetic)
53 | - Visualising images
54 | - Debugging by printing relevant information and plotting images at appropriate stages
55 | - Exporting data and writing files
56 | - Good practice
57 | 
58 | 


--------------------------------------------------------------------------------
/main_tutorial/example_cells_1.tif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/karinsasaki/python-workshop-image-processing/a6de0424bb184e55a769801bbedec1c8ce02dd5a/main_tutorial/example_cells_1.tif


--------------------------------------------------------------------------------
/main_tutorial/example_cells_2.tif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/karinsasaki/python-workshop-image-processing/a6de0424bb184e55a769801bbedec1c8ce02dd5a/main_tutorial/example_cells_2.tif


--------------------------------------------------------------------------------
/main_tutorial/tutorial_pipeline_batch.py:
--------------------------------------------------------------------------------
  1 | # -*- coding: utf-8 -*-
  2 | """
  3 | Created on Tue Dec 22 00:12:38 2015
  4 | 
  5 | @author:    Created by Jonas Hartmann @ Gilmour Group @ EMBL Heidelberg
  6 |             Edited by Karin Sasaki @ CBM @ EMBL Heidelberg
  7 |             
  8 | @descript:  This is the batch version of 'tutorial_pipeline.py', which is an  
  9 |             example pipeline for the segmentation of 2D confocal fluorescence 
 10 |             microscopy images of a membrane marker in confluent epithel-like 
 11 |             cells. This batch version serves to illustrate how such a pipeline
 12 |             can be run automatically on multiple images that are saved in the
 13 |             current directory.
 14 |             
 15 |             The pipeline is optimized to run with the provided example images,
 16 |             which are dual-color spinning-disc confocal micrographs (40x) of
 17 |             two membrane-localized proteins during zebrafish early embryonic
 18 |             development (~10hpf).
 19 | 
 20 | @requires:  Python 2.7
 21 |             NumPy 1.9, SciPy 0.15
 22 |             scikit-image 0.11.2, tifffile 0.3.1
 23 | """
 24 | 
 25 | #%%
 26 | #------------------------------------------------------------------------------
 27 | #  SECTION 0 - SET UP
 28 | 
 29 | # 1. (Re)check that the segmentation pipeline works!
 30 | # You should already have the final pipeline that segments and quantifies one of the example images. Make sure it is working by running it in one go from start to finish. Check that you do not get errors and that output is what you expect.
 31 | 
 32 | # 2. Check that you have the right data!
 33 | # We provide two images ('example_cells_1.tif' and 'example_cells_2.tif') to test the batch version of the pipeline. Make sure you have them both ready in the working directory.
 34 | 
 35 | # 3. Deal with Python 2.7 legacy
 36 | from __future__ import division
 37 | 
 38 | # 4. EXERCISE: Import modules required by the pipeline
 39 |     # Array manipulation package numpy as np
 40 |     # Plotting package matplotlib.pyplot as plt
 41 |     # Image processing package scipy.ndimage as ndi
 42 | 
 43 | 
 44 | #%%
 45 | #------------------------------------------------------------------------------
 46 | # SECTION 1 - PACKAGE PIPELINE INTO A FUNCTION
 47 | 
 48 | # The goal of this script is to repeatedly run the segmentation algorithm you
 49 | # programmed in tutorial_pipeline.py. The easiest way of packaging code to run
 50 | # it multiple times is to make it into a function.
 51 | 
 52 | # EXERCISE
 53 | # Define a function that...
 54 | #     ...takes one argument as input: a filename as a string
 55 | #     ...returns two outputs: the final segmentation and the quantified data
 56 | #     ...reports that it is finished with the current file just before returning the result.
 57 | 
 58 | # To do this, you need to copy the pipeline you developed, up to section 8, and paste it inside the function. Since the pipeline should run without any supervision by the user, remove (or comment out) any instances where an image would be shown. You can also exclude section 4. Make sure everything is set up such that the function can be called and the entire pipeline will run with the filename that is passed to the function.
 59 | 
 60 | # Recall that to define a new function the syntax is 
 61 | # def function_name(input arguments): 
 62 | #   """function documentation string"""
 63 | #   function procedure
 64 | #   return [expression]
 65 | 
 66 | 
 67 | #%%   
 68 | #------------------------------------------------------------------------------
 69 | # SECTION 2 - EXECUTION SCRIPT
 70 | 
 71 | # Now that the pipeline function is defined, we can run it for each image file in a directory and collect the results as they are returned.
 72 | 
 73 | 
 74 | #------------------------
 75 | # Part A - Get the current working directory
 76 | #------------------------
 77 | 
 78 | # Define a variable 'input_dir' with the path to the current working directory, where the images should be saved. In principle, you can also specify any other path where you store your images.
 79 |     
 80 |     # (i) Import the function 'getcwd' from the module 'os'
 81 | 
 82 |     # (ii) Get the name of the current working directory with 'getcwd'
 83 | 
 84 | 
 85 | #------------------------
 86 | # Part B - Generate a list of image filenames
 87 | #------------------------
 88 | 
 89 | # (i) Make a list variable containing the names of all the files in the directory, using the function 'listdir' from the module 'os'. (Suggested variable name: 'filelist')
 90 | 
 91 | # (ii) From the above list, collect the filenames of only those files that are tifs and allocate them to a new list variable named 'tiflist'. Here, it is useful to use a for-loop to loop over all the names in 'filelist' and to use an if-statement combined with slicing (indexing) to check if the current string ends with the characters '.tif'.
 92 | 
 93 | # (iii) Double-check that you have the right files in tiflist. You can either print the number of files in the list, or print all the names in the list.
 94 | 
 95 | 
 96 | #------------------------
 97 | # Part C - Loop over the tiflist, run the pipeline for each filename and collect the results
 98 | #------------------------
 99 | 
100 | 
101 | # (i) Initialise two empty lists, 'all_results' and 'all_segmentations', where you will collect the quantifications and the segmented images, respectively, for each file.
102 | 
103 | # (ii) Write a for-loop that goes through every file in the tiflist. Within this for loop, you should:
104 |          
105 |          # Run the pipeline function and allocate the output to new variables; remember that this pipeline returns two arguments, so you need two output variables.
106 | 
107 |          # Add the output to the variables 'all_results' and 'all_segmentations', respectively. You can use the '.append' method to add them to the lists.
108 | 
109 | # (iii) Check your understanding:
110 |     # Try to think about the complete data structure of 'all_results' and 'all_segmentations'. What type of variable are they? What type of variable to they contain? What data is contained within these variables? You can try printing things to fully understand the data structure.
111 | 
112 | # (iv) [OPTIONAL] Exception handling
113 |     # It would be a good idea to make sure that not everything fails (i.e. the program stops and you loose all data) if there is an error in just one file. To avoid this, you can include a "try-except block" in your loop. To learn about handling exceptions (errors) in this way, visit http://www.tutorialspoint.com/python/python_exceptions.htm and https://wiki.python.org/moin/HandlingExceptions. Also, remember to include a warning message when the pipeline fails and to print the name of the file that caused the error, making a diagnosis possible. To do this properly, you should use the function 'warn' from the module 'warnings'. Finally, you may want to count how many times the pipeline runs correctly and print that number at the end, informing the user how many out of the total number of images were successfully segmented.
114 | 
115 | 
116 | #------------------------
117 | # Part D - Print a short summary  
118 | #------------------------
119 | 
120 | # Find out how many cells in total were detected, from all the images: 
121 | 
122 |     # (i) Initialise a counter 'num_cells' to 0
123 |     
124 |     # (ii) Use a for loop that goes through 'all_results';
125 |     
126 |             # For each entry, identify how many cells were segmented in the image (e.g. by getting the length of the "cell_id" entry in the result dict). Add this length to the counter.
127 |     
128 |     # (iii) Print a statement that reports the final count of cells detected, for all images segmented.
129 | 
130 | 
131 | #------------------------
132 | # Part E - Quick visualisation of results
133 | #------------------------
134 | 
135 | # (i) Plot a scatter plot for all data and save the image:
136 | 
137 |     # Loop through all_results and scatter plot 'cell_size' vs 'the red_membrane_mean'. Remember to use a for-loop and the function 'enumerate'.
138 |     
139 |     # Save the image to a png file using 'plt.savefig'. 
140 | 
141 | 
142 | # (ii) [OPTIONAL] You may want to give cells from different images different colors:
143 |     
144 |     # Use the module 'cm' (for colormaps) from 'plt' and choose a colormap, e.g. 'jet'. 
145 |     
146 |     # Create the colormap with the number of colors required for the different images (in this example just 2). You can use 'range' or 'np.linspace' to ensure that you will always have the correct number of colors required, irrespective of the number of images you run the pipeline on. This colormap needs to be defined before making the plots.
147 |     
148 |     # When generating the scatter plot, use the parameter 'color' to use a different color from your colormap for each image you iterate through. Using 'enumerate' for the iterations makes this easier. For more info on 'color' see the docs of 'plt.scatter'.
149 |     
150 | 
151 | #------------------------
152 | # Part F - Save all the segmentations as a "3D" tif
153 | #------------------------
154 | 
155 | # (i) Convert 'all_segmentations' to a 3D numpy array (instead of a list of 2D arrays)
156 | 
157 | # (ii) Save the result to a tif file using the 'imsave' function from the 'tifffile' module
158 | 
159 | # (iii) Have a look at the file in Fiji/ImageJ. The quality of segmentation across multiple images (that you did not use to optimize the pipeline) tells you how robust your pipeline is.
160 | 
161 | 
162 | #------------------------
163 | # Part G - Save the quantification data as a txt file 
164 | #------------------------
165 | 
166 | # Saving your data as tab- or comma-separated text files allows you to import it into other programs (excel, Prism, R, ...) for further analysis and visualization.
167 | 
168 | # (i) Open an empty file object using "with open" (as explained at the end of the pipeline tutorial). Specify the file format to '.txt' and the mode to write ('w').
169 | 
170 | # (ii) The headers of the data are the key names of the dict containing the result for each input image (i.e. 'cell_id', 'green_mean', etc.). Write them on the first line of the file, separated by tabs ('\t'). You need the methods 'string.join' and 'file.write'.
171 | 
172 | # (iii) Loop through each filename in 'tiflist' (using a for-loop and enumerate of 'tiflist'). For each filename...
173 | 
174 |     # ...write the filename itself. 
175 |          
176 |     # ...extract the corresponding results from 'all_results'.
177 |     
178 |     # ...iterate over all the cells (using a for-loop and 'enumerate' of 'resultsDict["cell_id"]') and...
179 |     
180 |         # ...write the data of the cell, separated by '\t'. 
181 |     
182 |         
183 | 
184 | #%%   
185 | #------------------------------------------------------------------------------
186 | # SECTION 4 -  RATIOMETRIC NORMALIZATION TO CONTROL CHANNEL
187 | 
188 | # To correct for technical variability it is often useful to have an internal control, e.g. some fluorophore that we expect to be the same between all analyzed conditions, and use it to normalize other measurements.
189 | 
190 | # For example, we can assume that our green channel is just a generic membrane marker, whereas the red channel is a labelled protein of interest. Thus, using the red/green ratio instead of the raw values from the red channel may yield a clearer result when comparing intensity measurements of the red protein of interest between different images.
191 | 
192 | #------------------------
193 | # Part A - Create the ratio
194 | #------------------------
195 | 
196 | # (i) Calculate the ratio of red membrane mean intensity to green membrane mean intensity for each cell in each image. Add the results to the 'result_dict' of each image with a new key, for example 'red_green_mem_ratio'.
197 | 
198 | 
199 | #------------------------
200 | # Part B - Make a scatter plot, this time with the ratio
201 | #------------------------
202 | 
203 | # (i) Recreate the scatterplot from Section 3, part E, but plotting the ratio over cell size rather than the red membrane mean intensity.
204 | 
205 | # (ii) Compare the two plots. Does the outcome match your expectations? Can you explain the newly 'generated' outliers?
206 | 
207 | # Note: Depending on the type of data and the question, normalizing with internal controls can be crucial to arrive at the correct conclusion. However, as you can see from the outliers here, a ratio is not always the ideal approach to normalization. When doing data analysis, you may want to spend some time thinking about how best to normalize your data. Testing different outcomes using 'synthetic' data (created using random number generators) can also help to confirm that your normalization (or your analysis in general) does not bias your results.
208 | 
209 | 
210 | #------------------------------------------------------------------------------
211 | #------------------------------------------------------------------------------
212 | # THIS IS THE END OF THE TUTORIAL.
213 | #------------------------------------------------------------------------------
214 | #------------------------------------------------------------------------------
215 | 


--------------------------------------------------------------------------------
/main_tutorial/tutorial_pipeline_batch_partial_solutions.py:
--------------------------------------------------------------------------------
  1 | # -*- coding: utf-8 -*-
  2 | """
  3 | Created on Tue Dec 22 00:12:38 2015
  4 | 
  5 | @author:    Created by Jonas Hartmann @ Gilmour Group @ EMBL Heidelberg
  6 |             Edited by Karin Sasaki @ CBM @ EMBL Heidelberg
  7 |             
  8 | @descript:  This is the batch version of 'tutorial_pipeline.py', which is an  
  9 |             example pipeline for the segmentation of 2D confocal fluorescence 
 10 |             microscopy images of a membrane marker in confluent epithel-like 
 11 |             cells. This batch version serves to illustrate how such a pipeline
 12 |             can be run automatically on multiple images that are saved in the
 13 |             current directory.
 14 |             
 15 |             The pipeline is optimized to run with the provided example images,
 16 |             which are dual-color spinning-disc confocal micrographs (40x) of
 17 |             two membrane-localized proteins during zebrafish early embryonic
 18 |             development (~10hpf).
 19 | 
 20 | @requires:  Python 2.7
 21 |             NumPy 1.9, SciPy 0.15
 22 |             scikit-image 0.11.2, tifffile 0.3.1
 23 | """
 24 | 
 25 | 
 26 | #%%
 27 | #------------------------------------------------------------------------------
 28 | #  SECTION 0 - SET UP
 29 | 
 30 | # 1. (Re)check that the segmentation pipeline works!
 31 | # You should already have the final pipeline that segments and quantifies one of the example images. Make sure it is working by running it in one go from start to finish. Check that you do not get errors and that output is what you expect.
 32 | 
 33 | # 2. Check that you have the right data!
 34 | # We provide two images ('example_cells_1.tif' and 'example_cells_2.tif') to test the batch version of the pipeline. Make sure you have them both ready in the working directory.
 35 | 
 36 | # 3. Deal with Python 2.7 legacy
 37 | from __future__ import division
 38 | 
 39 | # 4. EXERCISE: Import modules required by the pipeline
 40 | import --- as np                 # Array manipulation package
 41 | import --- as plt    # Plotting package
 42 | import --- as ndi        # Image processing package
 43 | 
 44 | 
 45 | #%%
 46 | #------------------------------------------------------------------------------
 47 | # SECTION 1 - PACKAGE PIPELINE INTO A FUNCTION
 48 | 
 49 | # The goal of this script is to repeatedly run the segmentation algorithm you
 50 | # programmed in tutorial_pipeline.py. The easiest way of packaging code to run
 51 | # it multiple times is to make it into a function.
 52 | 
 53 | # EXERCISE
 54 | # Define a function that...
 55 | #     ...takes one argument as input: a filename as a string
 56 | #     ...returns two outputs: the final segmentation and the quantified data
 57 | #     ...reports that it is finished with the current file just before returning the result.
 58 | 
 59 | # To do this, you need to copy the pipeline you developed, up to section 8, and paste it inside the function. Since the pipeline should run without any supervision by the user, remove (or comment out) any instances where an image would be shown. You can also exclude section 4. Make sure everything is set up such that the function can be called and the entire pipeline will run with the filename that is passed to the function.
 60 | 
 61 | # Recall that to define a new function the syntax is 
 62 | # def function_name(input arguments): 
 63 | #   """function documentation string"""
 64 | #   function procedure
 65 | #   return [expression]
 66 | 
 67 | --- pipeline(filename):
 68 |     
 69 |     # Report that the pipelineis being executed
 70 |     print "  Starting pipeline for", filename
 71 |     
 72 |     # Import tif file
 73 |     import skimage.io as io               # Image file manipulation module
 74 |     img = io.imread(filename)             # Importing multi-color tif file
 75 |     
 76 |     # Slicing: We only work on one channel for segmentation
 77 |     green = img[0,:,:]
 78 |     
 79 |     
 80 |     #------------------------------------------------------------------------------
 81 |     # PREPROCESSING AND SIMPLE CELL SEGMENTATION:
 82 |     # (I) SMOOTHING AND (II) ADAPTIVE THRESHOLDING
 83 |     
 84 |     # -------
 85 |     # Part I
 86 |     # -------
 87 |     
 88 |     # Gaussian smoothing
 89 |     sigma = 3                                                # Smoothing factor for Gaussian
 90 |     green_smooth = ndi.filters.gaussian_filter(green,sigma)  # Perform smoothing
 91 |     
 92 |     
 93 |     # -------
 94 |     # Part II
 95 |     # -------
 96 |     
 97 |     # Create an adaptive background
 98 |     struct = ((np.mgrid[:31,:31][0] - 15)**2 + (np.mgrid[:31,:31][1] - 15)**2) <= 15**2  # Create a disk-shaped structural element
 99 |     from skimage.filters import rank            # Import module containing mean filter function
100 |     bg = rank.mean(green_smooth, selem=struct)  # Run a mean filter over the image using the disc
101 |     
102 |     # Threshold using created background
103 |     green_mem = green_smooth >= bg
104 |     
105 |     # Clean by morphological hole filling
106 |     green_mem = ndi.binary_fill_holes(np.logical_not(green_mem))
107 |     
108 |     
109 |     #------------------------------------------------------------------------------
110 |     # IMPROVED CELL SEGMENTATION BY SEEDING AND EXPANSION: 
111 |     # (I) SEEDING BY DISTANCE TRANSFORM
112 |     # (II) EXPANSION BY WATERSHED
113 |     
114 |     # -------
115 |     # Part I
116 |     # -------
117 |     
118 |     # Distance transform on thresholded membranes
119 |     # Advantage of distance transform for seeding: It is quite robust to local 
120 |     # "holes" in the membranes.
121 |     green_dt= ndi.distance_transform_edt(green_mem)
122 |     
123 |     # Dilating (maximum filter) of distance transform improves results
124 |     green_dt = ndi.filters.maximum_filter(green_dt,size=10) 
125 |     
126 |     # Retrieve and label the local maxima
127 |     from skimage.feature import peak_local_max
128 |     green_max = peak_local_max(green_dt,indices=False,min_distance=10)  # Local maximum detection
129 |     green_max = ndi.label(green_max)[0]                                 # Labeling
130 |     
131 |     
132 |     # -------
133 |     # Part II
134 |     # -------
135 |     
136 |     # Get the watershed function and run it
137 |     from skimage.morphology import watershed
138 |     green_ws = watershed(green_smooth,green_max)
139 |     
140 |     
141 |     #------------------------------------------------------------------------------
142 |     # IDENTIFICATION OF CELL EDGES
143 |     
144 |     # Define the edge detection function
145 |     def edge_finder(footprint_values):
146 |         if (footprint_values == footprint_values[0]).all():
147 |             return 0
148 |         else:
149 |             return 1
150 |         
151 |     # Iterate the edge finder over the segmentation
152 |     green_edges = ndi.filters.generic_filter(green_ws,edge_finder,size=3)
153 | 
154 |     
155 |     #------------------------------------------------------------------------------
156 |     # POSTPROCESSING: REMOVING CELLS AT THE IMAGE BORDER    
157 |     
158 |     # Create a mask for the image boundary pixels
159 |     boundary_mask = np.ones_like(green_ws)   # Initialize with all ones
160 |     boundary_mask[1:-1,1:-1] = 0             # Set middle square to 0
161 |     
162 |     # Iterate over all cells in the segmentation
163 |     current_label = 1
164 |     for cell_id in np.unique(green_ws):
165 |         
166 |         # If the current cell touches the boundary, remove it
167 |         if np.sum((green_ws==cell_id)*boundary_mask) != 0:
168 |             green_ws[green_ws==cell_id] = 0
169 |             
170 |         # This is to keep the labeling continuous, which is cleaner
171 |         else:
172 |             green_ws[green_ws==cell_id] = current_label
173 |             current_label += 1
174 |     
175 | 
176 |     #------------------------------------------------------------------------------
177 |     # MEASUREMENTS: SINGLE-CELL AND MEMBRANE READOUTS    
178 |     
179 |     # Initialize a dict for results of choice
180 |     results = {"cell_id":[], "green_mean":[], "red_mean":[],"green_membrane_mean":[], 
181 |                "red_membrane_mean":[],"cell_size":[],"cell_outline":[]}
182 |     
183 |     # Iterate over segmented cells
184 |     for cell_id in np.unique(green_ws)[1:]:
185 |         
186 |         # Mask the pixels of the current cell
187 |         cell_mask = green_ws==cell_id  
188 |         edge_mask = np.logical_and(cell_mask,green_edges)
189 |         
190 |         # Get the current cell's values
191 |         # Note that the original raw data is used for quantification!
192 |         results["cell_id"].append(cell_id)
193 |         results["green_mean"].append(np.mean(img[0,:,:][cell_mask]))
194 |         results["red_mean"].append(np.mean(img[1,:,:][cell_mask]))    
195 |         results["green_membrane_mean"].append(np.mean(img[0,:,:][edge_mask]))    
196 |         results["red_membrane_mean"].append(np.mean(img[1,:,:][edge_mask]))    
197 |         results["cell_size"].append(np.sum(cell_mask))
198 |         results["cell_outline"].append(np.sum(edge_mask))
199 | 
200 | 
201 |     #------------------------------------------------------------------------------
202 |     # REPORT PROGRESS AND RETURN RESULTS
203 | 
204 |     print "  Completed pipeline for", filename    
205 | 
206 |     return green_ws, results 
207 | 
208 | 
209 | #%%   
210 | #------------------------------------------------------------------------------
211 | # SECTION 2 - EXECUTION SCRIPT
212 | 
213 | # Now that the pipeline function is defined, we can run it for each image file in a directory and collect the results as they are returned.
214 | 
215 | 
216 | #------------------------
217 | # Part A - Get the current working directory
218 | #------------------------
219 | 
220 | # Define a variable 'input_dir' with the path to the current working directory, where the images should be saved. In principle, you can also specify any other path where you store your images.
221 |     
222 |     # (i) Import the function 'getcwd' from the module 'os'
223 | --- os --- getcwd 
224 | 
225 |     # (ii) Get the name of the current working directory with 'getcwd'
226 | input_dir = ----   
227 | 
228 | 
229 | 
230 | #------------------------
231 | # Part B - Generate a list of image filenames
232 | #------------------------
233 | 
234 | # (i) Make a list variable containing the names of all the files in the directory, using the function 'listdir' from the module 'os'. (Suggested variable name: 'filelist')
235 | --- os --- listdir
236 | filelist = listdir(----)
237 | 
238 | 
239 | # (ii) From the above list, collect the filenames of only those files that are tifs and allocate them to a new list variable named 'tiflist'. Here, it is useful to use a for-loop to loop over all the names in 'filelist' and to use an if-statement combined with slicing (indexing) to check if the current string ends with the characters '.tif'.
240 | tiflist = []
241 | --- filename --- filelist:
242 |     --- filename[-4:]=='.tif':
243 |         tiflist.---(filename)
244 | 
245 | # (iii) Double-check that you have the right files in tiflist. You can either print the number of files in the list, or print all the names in the list.
246 | print "Found", ---(tiflist), "tif files in target directory"
247 | 
248 | 
249 | #------------------------
250 | # Part C - Loop over the tiflist, run the pipeline for each filename and collect the results
251 | #------------------------
252 | 
253 | # (i) Initialise two empty lists, 'all_results' and 'all_segmentations', where you will collect the quantifications and the segmented images, respectively, for each file.
254 | all_results = []
255 | all_segmentations = []
256 | 
257 | # (ii) Write a for-loop that goes through every file in the tiflist. Within this for loop, you should:
258 | print "Running batch processing"
259 | --- filename --- tiflist: # For every file in tiflist...
260 |          
261 |          # Run the pipeline function and allocate the output to new variables; remember that this pipeline returns two arguments, so you need two output variables.
262 |     seg,results = ---(---)  
263 | 
264 |          # Add the output to the variables 'all_results' and 'all_segmentations', respectively. You can use the '.append' method to add them to the lists.
265 |     all_results.---(results)       
266 |     all_segmentations.---(seg)
267 | 
268 | # (iii) Check your understanding:
269 |     # Try to think about the complete data structure of 'all_results' and 'all_segmentations'. What type of variable are they? What type of variable to they contain? What data is contained within these variables? You can try printing things to fully understand the data structure.
270 | 
271 | # (iv) [OPTIONAL] Exception handling
272 |     # It would be a good idea to make sure that not everything fails (i.e. the program stops and you loose all data) if there is an error in just one file. To avoid this, you can include a "try-except block" in your loop. To learn about handling exceptions (errors) in this way, visit http://www.tutorialspoint.com/python/python_exceptions.htm and https://wiki.python.org/moin/HandlingExceptions. Also, remember to include a warning message when the pipeline fails and to print the name of the file that caused the error, making a diagnosis possible. To do this properly, you should use the function 'warn' from the module 'warnings'. Finally, you may want to count how many times the pipeline runs correctly and print that number at the end, informing the user how many out of the total number of images were successfully segmented.
273 | # see below
274 | 
275 | 
276 | # you can use the for loop below instead of the one above
277 | #--- filename --- tiflist: # For every file in tiflist...
278 | #    
279 | #    # Exception handling so the program can move on if one image fails for some reason.
280 | #    try---  
281 | #    
282 | #        # Run the pipeline
283 | #        seg,results = ---(---)  
284 | #        
285 | #        # Add the results to our collection lists
286 | #        all_results.---(results)       
287 | #        all_segmentations.---(seg)
288 | #        
289 | #        # Update the success counter
290 | #        success_counter += 1
291 | #        
292 | #    # What to do if something goes wrong.
293 | #    except Exception:
294 | #        
295 | #        # Warn the user, then carry on with the next file
296 | #        from warnings import warn
297 | #        warn("There was an exception in " --- filename + "!!!")
298 | 
299 |     
300 | 
301 | #------------------------
302 | # Part D - Print a short summary  
303 | #------------------------
304 | 
305 | # Find out how many cells in total were detected, from all the images: 
306 | 
307 |     # (i) Initialise a counter 'num_cells' to 0
308 | num_cells = 0 
309 |     
310 |     # (ii) Use a for loop that goes through 'all_results';
311 | for --- in ---: # For each image...
312 |     
313 |             # For each entry, identify how many cells were segmented in the image (e.g. by getting the length of the "cell_id" entry in the result dict). Add this length to the counter.
314 |     num_cells = num_cells + len(---["cell_id"]) # ...add the number of cells detected.
315 |         
316 |     # (iii) Print a statement that reports the final count of cells detected, for all images segmented.
317 | print "Detected", ---, "cells in total"
318 | 
319 | 
320 | #------------------------
321 | # Part E - Quick visualisation of results
322 | #------------------------
323 | 
324 | # (i) Plot a scatter plot for all data and save the image:
325 | 
326 |     # Loop through all_results and scatter plot 'cell_size' vs 'the red_membrane_mean'. Remember to use a for-loop and the function 'enumerate'.
327 | for image_id,resultDict in ---(all_results):
328 |     # ...add the datapoints to the plot.
329 |     plt.---(resultDict["cell_size"],resultDict[---])
330 | 
331 | # Label axes
332 | plt.x---("cell size")
333 | plt.---label("red membrane mean")
334 |             
335 |     # Save the image to a png file using 'plt.savefig'. 
336 | plt.---('BATCH_all_cells_scatter.png', bbox_inches='tight')
337 | plt.show()
338 | 
339 | 
340 | # (ii) [OPTIONAL] You may want to give cells from different images different colors:
341 |     
342 |     # Use the module 'cm' (for colormaps) from 'plt' and choose a colormap, e.g. 'jet'. 
343 |     
344 |     # Create the colormap with the number of colors required for the different images (in this example just 2). You can use 'range' or 'np.linspace' to ensure that you will always have the correct number of colors required, irrespective of the number of images you run the pipeline on. This colormap needs to be defined before making the plots.
345 |     
346 |     # When generating the scatter plot, use the parameter 'color' to use a different color from your colormap for each image you iterate through. Using 'enumerate' for the iterations makes this easier. For more info on 'color' see the docs of 'plt.scatter'.
347 | 
348 | # Note: Use either the version below or the one above
349 | # Prepare colormap to color cells from each image differently
350 | colors = plt.---.jet(np.linspace(0,---,len(all_results)))
351 | 
352 | # For each analyzed image...
353 | for image_id,resultDict in ---(all_results):
354 |     
355 |     # ...add the datapoints to the plot.
356 |     plt.---(resultDict["cell_size"],resultDict[---],color=---[image_id])
357 | 
358 | # Label axes
359 | plt.x---("cell size")
360 | plt.---label("red membrane mean")
361 | 
362 | # Show or save result
363 | plt.---('BATCH_all_cells_scatter.png', bbox_inches='tight')
364 | plt.show()
365 | 
366 | 
367 | #------------------------
368 | # Part F - Save all the segmentations as a "3D" tif
369 | #------------------------
370 | 
371 | # (i) Convert 'all_segmentations' to a 3D numpy array (instead of a list of 2D arrays)
372 | all_segmentations = np.array(---)
373 | 
374 | # (ii) Save the result to a tif file using the 'imsave' function from the 'tifffile' module
375 | from tifffile import imsave
376 | imsave("BATCH_segmentations.tif",---,bigtiff=True)
377 | 
378 | # (iii) Have a look at the file in Fiji/ImageJ. The quality of segmentation across multiple images (that you did not use to optimize the pipeline) tells you how robust your pipeline is.
379 | 
380 | 
381 | #------------------------
382 | # Part G - Save the quantification data as a txt file 
383 | #------------------------
384 | 
385 | # Saving your data as tab- or comma-separated text files allows you to import it into other programs (excel, Prism, R, ...) for further analysis and visualization.
386 | 
387 | # (i) Open an empty file object using "with open" (as explained at the end of the pipeline tutorial). Specify the file format to '.txt' and the mode to write ('w').
388 | --- ---("BATCH_results.txt","w") --- txt_out:         
389 | 
390 | # (ii) The headers of the data are the key names of the dict containing the result for each input image (i.e. 'cell_id', 'green_mean', etc.). Write them on the first line of the file, separated by tabs ('\t'). You need the methods 'string.join' and 'file.write'.
391 |     txt_out.---(''.---(key+'\t' for key in results.keys()) + '\n')
392 |     
393 | # (iii) Loop through each filename in 'tiflist' (using a for-loop and enumerate of 'tiflist'). For each filename...
394 |     for image_id,filename in ---(tiflist):                       
395 |         
396 |     # ...write the filename itself. 
397 |         txt_out.---(--- + "\n")
398 |         
399 |     # ...extract the corresponding results from 'all_results'.
400 |         resultDict = all_results[---]
401 |         
402 |     # ...iterate over all the cells (using a for-loop and 'enumerate' of 'resultsDict["cell_id"]') and...
403 |         for index,value in ---(resultDict["cell_id"]):           
404 |             
405 |         # ...write the data of the cell, separated by '\t'. 
406 |             txt_out.---(''.---(str(resultDict[key][index])+'\t' for key in resultDict.keys()) + '\n')   
407 | 
408 | 
409 | 
410 | #%%   
411 | #------------------------------------------------------------------------------
412 | # SECTION 4 -  RATIOMETRIC NORMALIZATION TO CONTROL CHANNEL
413 | 
414 | # To correct for technical variability it is often useful to have an internal control, e.g. some fluorophore that we expect to be the same between all analyzed conditions, and use it to normalize other measurements.
415 | 
416 | # For example, we can assume that our green channel is just a generic membrane marker, whereas the red channel is a labelled protein of interest. Thus, using the red/green ratio instead of the raw values from the red channel may yield a clearer result when comparing intensity measurements of the red protein of interest between different images.
417 | 
418 | #------------------------
419 | # Part A - Create the ratio
420 | #------------------------
421 | 
422 | # (i) Calculate the ratio of red membrane mean intensity to green membrane mean intensity for each cell in each image. Add the results to the 'result_dict' of each image with a new key, for example 'red_green_mem_ratio'.
423 | 
424 | # For each image...
425 | for image_id,resultDict in ---(all_results):
426 |     
427 |     # Calculate red/green ratio and save it under a new key in result_dict. Done for each cell using list comprehension.
428 |     all_results[image_id]["red_green_mem_ratio"] = [resultDict["---"][i] --- resultDict["---"][i] --- i --- range(len(resultDict["cell_id"]))]
429 | 
430 | 
431 | #------------------------
432 | # Part B - Make a scatter plot, this time with the ratio
433 | #------------------------
434 | # (i) Recreate the scatterplot from Section 3, part E, but plotting the ratio over cell size rather than the red membrane mean intensity.
435 | # Prepare colormap to color the cells of each image differently
436 | colors = plt.cm.jet(np.linspace(0,1,len(all_results)))
437 | 
438 | # For each image...
439 | for ---,--- in ---(all_results):
440 |     
441 |     # ...add the data points to the scatter.
442 |     plt.---(resultDict["---"],resultDict["---"],color=colors[image_id])
443 |     
444 | # Label axes
445 | plt.xlabel("cell size")
446 | plt.ylabel("red membrane mean")
447 | 
448 | # Show or save result
449 | plt.---('BATCH_all_cells_ratio_scatter.png', bbox_inches='tight')  
450 | plt.---()
451 | 
452 | 
453 | # (ii) Compare the two plots. Does the outcome match your expectations? Can you explain the newly 'generated' outliers?
454 | 
455 | # Note: Depending on the type of data and the question, normalizing with internal controls can be crucial to arrive at the correct conclusion. However, as you can see from the outliers here, a ratio is not always the ideal approach to normalization. When doing data analysis, you may want to spend some time thinking about how best to normalize your data. Testing different outcomes using 'synthetic' data (created using random number generators) can also help to confirm that your normalization (or your analysis in general) does not bias your results.
456 | 
457 | 
458 | #------------------------------------------------------------------------------
459 | #------------------------------------------------------------------------------
460 | # THIS IS THE END OF THE TUTORIAL.
461 | #------------------------------------------------------------------------------
462 | #------------------------------------------------------------------------------
463 | 


--------------------------------------------------------------------------------
/main_tutorial/tutorial_pipeline_batch_solutions.py:
--------------------------------------------------------------------------------
  1 | # -*- coding: utf-8 -*-
  2 | """
  3 | Created on Tue Dec 22 00:12:38 2015
  4 | 
  5 | @author:    Created by Jonas Hartmann @ Gilmour Group @ EMBL Heidelberg
  6 |             Edited by Karin Sasaki @ CBM @ EMBL Heidelberg
  7 |             
  8 | @descript:  This is the batch version of 'tutorial_pipeline.py', which is an  
  9 |             example pipeline for the segmentation of 2D confocal fluorescence 
 10 |             microscopy images of a membrane marker in confluent epithel-like 
 11 |             cells. This batch version serves to illustrate how such a pipeline
 12 |             can be run automatically on multiple images that are saved in the
 13 |             current directory.
 14 |             
 15 |             The pipeline is optimized to run with the provided example images,
 16 |             which are dual-color spinning-disc confocal micrographs (40x) of
 17 |             two membrane-localized proteins during zebrafish early embryonic
 18 |             development (~10hpf).
 19 | 
 20 | @requires:  Python 2.7
 21 |             NumPy 1.9, SciPy 0.15
 22 |             scikit-image 0.11.2, tifffile 0.3.1
 23 | """
 24 | 
 25 | 
 26 | #%%
 27 | #------------------------------------------------------------------------------
 28 | #  SECTION 0 - SET UP
 29 | 
 30 | # 1. (Re)check that the segmentation pipeline works!
 31 | # You should already have the final pipeline that segments and quantifies one of the example images. Make sure it is working by running it in one go from start to finish. Check that you do not get errors and that output is what you expect.
 32 | 
 33 | # 2. Check that you have the right data!
 34 | # We provide two images ('example_cells_1.tif' and 'example_cells_2.tif') to test the batch version of the pipeline. Make sure you have them both ready in the working directory.
 35 | 
 36 | # 3. Deal with Python 2.7 legacy
 37 | from __future__ import division
 38 | 
 39 | # 4. EXERCISE: Import modules required by the pipeline
 40 | import numpy as np                 # Array manipulation package
 41 | import matplotlib.pyplot as plt    # Plotting package
 42 | import scipy.ndimage as ndi        # Image processing package
 43 | 
 44 | 
 45 | #%%
 46 | #------------------------------------------------------------------------------
 47 | # SECTION 1 - PACKAGE PIPELINE INTO A FUNCTION
 48 | 
 49 | # The goal of this script is to repeatedly run the segmentation algorithm you
 50 | # programmed in tutorial_pipeline.py. The easiest way of packaging code to run
 51 | # it multiple times is to make it into a function.
 52 | 
 53 | # EXERCISE
 54 | # Define a function that...
 55 | #     ...takes one argument as input: a filename as a string
 56 | #     ...returns two outputs: the final segmentation and the quantified data
 57 | #     ...reports that it is finished with the current file just before returning the result.
 58 | 
 59 | # To do this, you need to copy the pipeline you developed, up to section 8, and paste it inside the function. Since the pipeline should run without any supervision by the user, remove (or comment out) any instances where an image would be shown. You can also exclude section 4. Make sure everything is set up such that the function can be called and the entire pipeline will run with the filename that is passed to the function.
 60 | 
 61 | # Recall that to define a new function the syntax is 
 62 | # def function_name(input arguments): 
 63 | #   """function documentation string"""
 64 | #   function procedure
 65 | #   return [expression]
 66 | 
 67 | def pipeline(filename):
 68 |     
 69 |     # Report that the pipeline is being executed
 70 |     print "  Starting pipeline for", filename
 71 |     
 72 |     # Import tif file
 73 |     import skimage.io as io               # Image file manipulation module
 74 |     img = io.imread(filename)             # Importing multi-color tif file
 75 |     
 76 |     # Slicing: We only work on one channel for segmentation
 77 |     green = img[0,:,:]
 78 |     
 79 |     
 80 |     #------------------------------------------------------------------------------
 81 |     # PREPROCESSING AND SIMPLE CELL SEGMENTATION:
 82 |     # (I) SMOOTHING AND (II) ADAPTIVE THRESHOLDING
 83 |     
 84 |     # -------
 85 |     # Part I
 86 |     # -------
 87 |     
 88 |     # Gaussian smoothing
 89 |     sigma = 3                                                # Smoothing factor for Gaussian
 90 |     green_smooth = ndi.filters.gaussian_filter(green,sigma)  # Perform smoothing
 91 |     
 92 |     
 93 |     # -------
 94 |     # Part II
 95 |     # -------
 96 |     
 97 |     # Create an adaptive background
 98 |     struct = ((np.mgrid[:31,:31][0] - 15)**2 + (np.mgrid[:31,:31][1] - 15)**2) <= 15**2  # Create a disk-shaped structural element
 99 |     from skimage.filters import rank            # Import module containing mean filter function
100 |     bg = rank.mean(green_smooth, selem=struct)  # Run a mean filter over the image using the disc
101 |     
102 |     # Threshold using created background
103 |     green_mem = green_smooth >= bg
104 |     
105 |     # Clean by morphological hole filling
106 |     green_mem = ndi.binary_fill_holes(np.logical_not(green_mem))
107 |     
108 |     
109 |     #------------------------------------------------------------------------------
110 |     # IMPROVED CELL SEGMENTATION BY SEEDING AND EXPANSION: 
111 |     # (I) SEEDING BY DISTANCE TRANSFORM
112 |     # (II) EXPANSION BY WATERSHED
113 |     
114 |     # -------
115 |     # Part I
116 |     # -------
117 |     
118 |     # Distance transform on thresholded membranes
119 |     # Advantage of distance transform for seeding: It is quite robust to local 
120 |     # "holes" in the membranes.
121 |     green_dt= ndi.distance_transform_edt(green_mem)
122 |     
123 |     # Dilating (maximum filter) of distance transform improves results
124 |     green_dt = ndi.filters.maximum_filter(green_dt,size=10) 
125 |     
126 |     # Retrieve and label the local maxima
127 |     from skimage.feature import peak_local_max
128 |     green_max = peak_local_max(green_dt,indices=False,min_distance=10)  # Local maximum detection
129 |     green_max = ndi.label(green_max)[0]                                 # Labeling
130 |     
131 |     
132 |     # -------
133 |     # Part II
134 |     # -------
135 |     
136 |     # Get the watershed function and run it
137 |     from skimage.morphology import watershed
138 |     green_ws = watershed(green_smooth,green_max)
139 |     
140 |     
141 |     #------------------------------------------------------------------------------
142 |     # IDENTIFICATION OF CELL EDGES
143 |     
144 |     # Define the edge detection function
145 |     def edge_finder(footprint_values):
146 |         if (footprint_values == footprint_values[0]).all():
147 |             return 0
148 |         else:
149 |             return 1
150 |         
151 |     # Iterate the edge finder over the segmentation
152 |     green_edges = ndi.filters.generic_filter(green_ws,edge_finder,size=3)
153 | 
154 |     
155 |     #------------------------------------------------------------------------------
156 |     # POSTPROCESSING: REMOVING CELLS AT THE IMAGE BORDER    
157 |     
158 |     # Create a mask for the image boundary pixels
159 |     boundary_mask = np.ones_like(green_ws)   # Initialize with all ones
160 |     boundary_mask[1:-1,1:-1] = 0             # Set middle square to 0
161 |     
162 |     # Iterate over all cells in the segmentation
163 |     current_label = 1
164 |     for cell_id in np.unique(green_ws):
165 |         
166 |         # If the current cell touches the boundary, remove it
167 |         if np.sum((green_ws==cell_id)*boundary_mask) != 0:
168 |             green_ws[green_ws==cell_id] = 0
169 |             
170 |         # This is to keep the labeling continuous, which is cleaner
171 |         else:
172 |             green_ws[green_ws==cell_id] = current_label
173 |             current_label += 1
174 |     
175 | 
176 |     #------------------------------------------------------------------------------
177 |     # MEASUREMENTS: SINGLE-CELL AND MEMBRANE READOUTS    
178 |     
179 |     # Initialize a dict for results of choice
180 |     results = {"cell_id":[], "green_mean":[], "red_mean":[],"green_membrane_mean":[], 
181 |                "red_membrane_mean":[],"cell_size":[],"cell_outline":[]}
182 |     
183 |     # Iterate over segmented cells
184 |     for cell_id in np.unique(green_ws)[1:]:
185 |         
186 |         # Mask the pixels of the current cell
187 |         cell_mask = green_ws==cell_id  
188 |         edge_mask = np.logical_and(cell_mask,green_edges)
189 |         
190 |         # Get the current cell's values
191 |         # Note that the original raw data is used for quantification!
192 |         results["cell_id"].append(cell_id)
193 |         results["green_mean"].append(np.mean(img[0,:,:][cell_mask]))
194 |         results["red_mean"].append(np.mean(img[1,:,:][cell_mask]))    
195 |         results["green_membrane_mean"].append(np.mean(img[0,:,:][edge_mask]))    
196 |         results["red_membrane_mean"].append(np.mean(img[1,:,:][edge_mask]))    
197 |         results["cell_size"].append(np.sum(cell_mask))
198 |         results["cell_outline"].append(np.sum(edge_mask))
199 | 
200 | 
201 |     #------------------------------------------------------------------------------
202 |     # REPORT PROGRESS AND RETURN RESULTS
203 | 
204 |     print "  Completed pipeline for", filename    
205 | 
206 |     return green_ws, results 
207 | 
208 | 
209 | #%%   
210 | #------------------------------------------------------------------------------
211 | # SECTION 2 - EXECUTION SCRIPT
212 | 
213 | # Now that the pipeline function is defined, we can run it for each image file in a directory and collect the results as they are returned.
214 | 
215 | 
216 | #------------------------
217 | # Part A - Get the current working directory
218 | #------------------------
219 | 
220 | # Define a variable 'input_dir' with the path to the current working directory, where the images should be saved. In principle, you can also specify any other path where you store your images.
221 |     
222 | # Import 'getcwd' and retrieve working directory
223 | from os import getcwd 
224 | input_dir = getcwd()   
225 | 
226 | 
227 | #------------------------
228 | # Part B - Generate a list of image filenames
229 | #------------------------
230 | 
231 | # Make a list of files in the directory
232 | from os import listdir
233 | filelist = listdir(input_dir)
234 | 
235 | # Collect the file names only for files that are tifs 
236 | 
237 | # Note: This is an elegant solution using "list comprehension".
238 | tiflist = [filename for filename in filelist if filename[-4:]=='.tif'] 
239 | 
240 | # This is the more classical solution. It does exactly the same:
241 | tiflist = []
242 | for filename in filelist:
243 |     if filename[-4:]=='.tif':
244 |         tiflist.append(filename)
245 | 
246 | # Check that you have the right files in tiflist. 
247 | print "Found", len(tiflist), "tif files in target directory"
248 | 
249 | 
250 | #------------------------
251 | # Part C - Loop over the tiflist, run the pipeline for each filename and collect the results
252 | #------------------------
253 | 
254 | # Initialise 'all_results' and 'all_segmentations' to store output.
255 | all_results = []
256 | all_segmentations = []
257 | 
258 | # Initialise a counter to count how many times the pipeline is run succesfully
259 | success_counter = 0
260 | 
261 | # Run the actual batch processing
262 | print "Running batch processing"
263 | for filename in tiflist: # For every file in tiflist...
264 |     
265 |     # Exception handling so the program can move on if one image fails for some reason.
266 |     try:  
267 |     
268 |         # Run the pipeline
269 |         seg,results = pipeline(filename)  
270 |         
271 |         # Add the results to our collection lists
272 |         all_results.append(results)       
273 |         all_segmentations.append(seg)
274 |         
275 |         # Update the success counter
276 |         success_counter += 1
277 |         
278 |     # What to do if something goes wrong.
279 |     except Exception:
280 |         
281 |         # Warn the user, then carry on with the next file
282 |         from warnings import warn
283 |         warn("There was an exception in " + filename + "!!!")
284 | 
285 | 
286 | #------------------------
287 | # Part D - Print a short summary  
288 | #------------------------
289 | 
290 | # How many images were successfully analyzed?
291 | print "Successfully analyzed", success_counter, "of", len(tiflist), "images"
292 | 
293 | # How many cells were segmented in total.
294 | num_cells = 0 
295 | for resultDict in all_results: # For each image...
296 |     num_cells = num_cells + len(resultDict["cell_id"]) # ...add the number of cells detected.
297 | 
298 | # Print a statement that reports the final count of cells detected, for all images segmented.    
299 | print "Detected", num_cells, "cells in total"
300 | 
301 | 
302 | #------------------------
303 | # Part E - Quick visualisation of results
304 | #------------------------
305 | 
306 | # Scatter plot of red membrane mean intensity over cell size. 
307 | 
308 | # Prepare colormap to color cells from each image differently
309 | colors = plt.cm.jet(np.linspace(0,1,len(all_results)))
310 | 
311 | # For each analyzed image...
312 | for image_id,resultDict in enumerate(all_results):
313 |     
314 |     # ...add the datapoints to the plot.
315 |     plt.scatter(resultDict["cell_size"],resultDict["red_membrane_mean"],color=colors[image_id])
316 | 
317 | # Label axes
318 | plt.xlabel("cell size")
319 | plt.ylabel("red membrane mean")
320 | 
321 | # Show or save result
322 | plt.savefig('BATCH_all_cells_scatter.png', bbox_inches='tight')
323 | plt.show()
324 | 
325 | 
326 | #------------------------
327 | # Part F - Save all the segmentations as a "3D" tif
328 | #------------------------
329 | 
330 | # Convert 'all_segmentations' to a 3D numpy array
331 | all_segmentations = np.array(all_segmentations)
332 | 
333 | # Save the result to a tif file using the 'imsave' function from the 'tifffile' module
334 | from tifffile import imsave
335 | imsave("BATCH_segmentations.tif",all_segmentations,bigtiff=True)
336 | 
337 | 
338 | #------------------------
339 | # Part G - Save the quantification data as a txt file 
340 | #------------------------
341 | 
342 | # Open an empty file using the context manager ('with')
343 | with open("BATCH_results.txt","w") as txt_out:         
344 | 
345 |     # Write the headers (note the use of a "list comprehension"-style in-line for-loop)
346 |     txt_out.write(''.join(key+'\t' for key in results.keys()) + '\n')
347 |     
348 |     # For each analyzed image...
349 |     for image_id,filename in enumerate(tiflist):                       
350 |         
351 |         # ...write the filename
352 |         txt_out.write(filename + "\n")
353 |         
354 |         # ...extract the corresponding results                   
355 |         resultDict = all_results[image_id]
356 |         
357 |         # ...iterate over cells...
358 |         for index,value in enumerate(resultDict["cell_id"]):           
359 |             
360 |             # ...and write cell data (note the use of a "list comprehension"-style in-line for-loop)
361 |             txt_out.write(''.join(str(resultDict[key][index])+'\t' for key in resultDict.keys()) + '\n')   
362 | 
363 | 
364 | #%%   
365 | #------------------------------------------------------------------------------
366 | # SECTION 4 -  RATIOMETRIC NORMALIZATION TO CONTROL CHANNEL
367 | 
368 | # To correct for technical variability it is often useful to have an internal control, e.g. some fluorophore that we expect to be the same between all analyzed conditions, and use it to normalize other measurements.
369 | 
370 | # For example, we can assume that our green channel is just a generic membrane marker, whereas the red channel is a labelled protein of interest. Thus, using the red/green ratio instead of the raw values from the red channel may yield a clearer result when comparing intensity measurements of the red protein of interest between different images.
371 | 
372 | #------------------------
373 | # Part A - Create the ratio
374 | #------------------------
375 | 
376 | # For each image...
377 | for image_id,resultDict in enumerate(all_results):
378 |     
379 |     # Calculate red/green ratio and save it under a new key in result_dict. Done for each cell using list comprehension.
380 |     all_results[image_id]["red_green_mem_ratio"] = [resultDict["red_membrane_mean"][i] / resultDict["green_membrane_mean"][i] for i in range(len(resultDict["cell_id"]))]
381 | 
382 | 
383 | #------------------------
384 | # Part B - Make a scatter plot, this time with the ratio
385 | #------------------------
386 | 
387 | # Scatterplot of red/green ratio over cell size.
388 | 
389 | # Prepare colormap to color the cells of each image differently
390 | colors = plt.cm.jet(np.linspace(0,1,len(all_results)))
391 | 
392 | # For each image...
393 | for image_id,resultDict in enumerate(all_results):
394 |     
395 |     # ...add the data points to the scatter.
396 |     plt.scatter(resultDict["cell_size"],resultDict["red_green_mem_ratio"],color=colors[image_id])
397 |     
398 | # Label axes
399 | plt.xlabel("cell size")
400 | plt.ylabel("red membrane mean")
401 | 
402 | # Show or save result
403 | plt.savefig('BATCH_all_cells_ratio_scatter.png', bbox_inches='tight')  
404 | plt.show()
405 | 
406 | 
407 | #------------------------------------------------------------------------------
408 | #------------------------------------------------------------------------------
409 | # THIS IS THE END OF THE TUTORIAL.
410 | #------------------------------------------------------------------------------
411 | #------------------------------------------------------------------------------
412 | 


--------------------------------------------------------------------------------
/main_tutorial/tutorial_pipeline_solutions.py:
--------------------------------------------------------------------------------
  1 | # -*- coding: utf-8 -*-
  2 | """
  3 | Created on Tue Dec 22 00:12:38 2015
  4 | 
  5 | @author:    Jonas Hartmann @ Gilmour Group @ EMBL Heidelberg
  6 |             Edited by Karin Sasaki @ CBM @ EMBL Heidelberg
  7 |             
  8 | @descript:  This is an example pipeline for the segmentation of 2D confocal
  9 |             fluorescence microscopy images of a membrane marker in confluent
 10 |             epithel-like cells. It exemplifies many fundamental concepts of 
 11 |             automated image processing and segmentation. 
 12 |             
 13 |             The pipeline is optimized to run with the provided example images,
 14 |             which are dual-color spinning-disc confocal micrographs (40x) of
 15 |             two membrane-localized proteins during zebrafish early embryonic
 16 |             development (~10hpf).
 17 |             
 18 |             'tutorial_pipeline_batch.py' shows how the same pipeline could be
 19 |             adapted to run automatically on multiple images in a directory.
 20 |             
 21 | @requires:  Python 2.7
 22 |             NumPy 1.9, SciPy 0.15
 23 |             scikit-image 0.11.2, tifffile 0.3.1
 24 | """
 25 | 
 26 | #%%
 27 | #------------------------------------------------------------------------------
 28 | # SECTION 0 - SET UP
 29 | 
 30 | # 1. Remember that you can develop this pipeline using 
 31 | #    a) a simple text editor and running it on the terminal
 32 | #    b) the Spyder IDE or 
 33 | #    c) a Jupyter notebook.
 34 | 
 35 | # 2. Make sure that all your python and image files are in the same directory, then make that directory your working directory.
 36 | #    - On the terminal, type "cd dir_path", replacing dir_path for the path of the directory
 37 | #    - In Spyder and Jupyter nootebook it can be done interactively.
 38 | 
 39 | # 3. Python is continuously under development to get better and better. In some rare cases, these new improvements need to be specifically imported to be active. One such case is the division operation in Python 2.7, which has some undesired behavior for the division of integer numbers. We can easily fix this by importing the new and improved division function from Python 3. It makes sense to do this at the start of all Python 2.7 scripts.
 40 | from __future__ import division
 41 | 
 42 | # 4. This script consists of explanations and exercises that guide you to complete the pipeline. It is designed to give you a guided experience of what "real programming" is like. This is one of the reasons why the pre-tutorial is provided as a Jupyter Notebook, but this main tutorial is not; we, and our colleagues, mostly develop programs using a text editor and the terminal.  In that same spirit, if you already have access to the solutions, we recommend that you try to solve the tutorial alone, without looking at them. 
 43 | 
 44 | # 5. If you are not feeling comfortable with the exercises, there is a partially-solved version that you can also follow. 
 45 | 
 46 | 
 47 | #%%
 48 | #------------------------------------------------------------------------------
 49 | # SECTION 1 - IMPORT MODULES AND PACKAGES
 50 | 
 51 | from __future__ import division    # Python 2.7 legacy
 52 | import numpy as np                 # Array manipulation package
 53 | import matplotlib.pyplot as plt    # Plotting package
 54 | import scipy.ndimage as ndi        # Image processing package
 55 | 
 56 | 
 57 | #%%
 58 | #------------------------------------------------------------------------------
 59 | # SECTION 2 - IMPORT AND PREPARE DATA
 60 | 
 61 | # Image processing essentially means carrying out mathematical operations on images. For this purpose, it is useful to represent image data in orderly data structures called "arrays" for which many mathematical operations are well defined. Arrays are grids with rows and columns that are filled with numbers; in the case of image data, those numbers correspond to the pixel values of the image. Arrays can have any number of dimensions (or "axes"). For example, a 2D array could represent the x and y axis of a normal image, a 3D array could contain a z-stack (xyz), a 4D array could also have multiple channels for each image (xyzc) and a 5D array could have time on top of that (xyzct).
 62 | 
 63 | # EXERCISE
 64 | # We will now proceed to import the image data, verifying we get what we expect and specifying the data we will work with. Before you start, it makes sense to have a quick look at the data in Fiji/imagej so you know what you are working with.
 65 | 
 66 | # Specify filename
 67 | filename = "example_cells_1.tif"
 68 | 
 69 | # Import tif files
 70 | import skimage.io as io               # Image file manipulation module
 71 | img = io.imread(filename)             # Importing multi-color tif file
 72 | 
 73 | # Check that everything is in order
 74 | print type(img)     # Check that img is a variable of type ndarray
 75 | print img.dtype     # Check data type is 8uint
 76 | print "Loaded array has shape", img.shape  # Printing array shape; 2 colors, 930 by 780 pixels
 77 | 
 78 | # Show image
 79 | plt.imshow(img[0,:,:],interpolation='none',cmap='gray')   # Showing one of the channels (notice "interpolation='none'"!)
 80 | plt.show()
 81 | 
 82 | # Slicing: We only work on one channel for segmentation
 83 | green = img[0,:,:]
 84 | 
 85 | 
 86 | #%%
 87 | #------------------------------------------------------------------------------
 88 | # SECTION 3 - PREPROCESSING AND SIMPLE CELL SEGMENTATION:
 89 | #            (I) SMOOTHING AND (II) ADAPTIVE THRESHOLDING
 90 | 
 91 | # -------
 92 | # Part I
 93 | # -------
 94 | 
 95 | # Gaussian smoothing
 96 | sigma = 3                                                # Smoothing factor for Gaussian
 97 | green_smooth = ndi.filters.gaussian_filter(green,sigma)  # Perform smoothing
 98 | 
 99 | # visualise 
100 | plt.imshow(green_smooth,interpolation='none',cmap='gray')
101 | plt.show()
102 | 
103 | 
104 | # -------
105 | # Part II
106 | # -------
107 | 
108 | # Create an adaptive background
109 | struct = ((np.mgrid[:31,:31][0] - 15)**2 + (np.mgrid[:31,:31][1] - 15)**2) <= 15**2  # Create a disk-shaped structural element
110 | from skimage.filters import rank            # Import module containing mean filter function
111 | bg = rank.mean(green_smooth, selem=struct)  # Run a mean filter over the image using the disc
112 | 
113 | # Threshold using created background
114 | green_mem = green_smooth >= bg
115 | 
116 | # Clean by morphological hole filling
117 | green_mem = ndi.binary_fill_holes(np.logical_not(green_mem))
118 | 
119 | # Show the result
120 | plt.imshow(green_mem,interpolation='none',cmap='gray')
121 | plt.show()
122 | 
123 | 
124 | 
125 | #%%
126 | #------------------------------------------------------------------------------
127 | # SECTION 4 - CONNECTED COMPONENTS LABELING (OR: "WE COULD BE DONE NOW")
128 | 
129 | # If the data is clean and we just want a very quick cell or membrane segmentation, we could be done now. All we would still need to do is to label the individual cells - in other words, to give each separate "connected component" an individual number.
130 | 
131 | # Labeling connected components
132 | green_components = ndi.label(green_mem)[0] 
133 | plt.imshow(green_components,interpolation='none', cmap='gray')    
134 | plt.show() 
135 | 
136 | # The result you get here should look not to bad but will likely still have some problems. For example, some cells will be connected because there were small gaps between them in the membrane. Also, the membranes themselves are not partitioned to the individual cells, so we cannot make measurements of membrane intensities for each cell. These problems can be resolved by means of a "seeding-expansion" strategy, which we will implement below.
137 | 
138 | 
139 | 
140 | #%%
141 | #------------------------------------------------------------------------------
142 | # SECTION 5 - IMPROVED CELL SEGMENTATION BY SEEDING AND EXPANSION: 
143 | #             (I) SEEDING BY DISTANCE TRANSFORM
144 | #             (II) EXPANSION BY WATERSHED
145 | #
146 | # Part I - Seeding refers to the identification of 'seeds', a few pixels that can assigned to each particular cell with great certainty. If available, a channel showing the cell nuclei is often used for seeding. However, using the membrane segmentation we have developed above, we can also generate relatively reliable seeds without the need to image nuclei.
147 | # Part II - The generated seeds are expanded into regions of the image where the cell assignment is less clear-cut than in the seed region itself. The goal is to expand each seed exactly up to the borders of the corresponding cell, resulting in a full segmentation. The watershed technique is the most common algorithm for expansion.
148 | 
149 | # -------
150 | # Part I
151 | # -------
152 | 
153 | # Distance transform on thresholded membranes
154 | # Advantage of distance transform for seeding: It is quite robust to local 
155 | # "holes" in the membranes.
156 | green_dt= ndi.distance_transform_edt(green_mem)
157 | plt.imshow(green_dt,interpolation='none')
158 | plt.show()
159 | 
160 | # Dilating (maximum filter) of distance transform improves results
161 | green_dt = ndi.filters.maximum_filter(green_dt,size=10) 
162 | plt.imshow(green_dt,interpolation='none')
163 | plt.show()
164 | 
165 | # Retrieve and label the local maxima
166 | from skimage.feature import peak_local_max
167 | green_max = peak_local_max(green_dt,indices=False,min_distance=10)  # Local maximum detection
168 | green_max = ndi.label(green_max)[0]                                 # Labeling
169 | 
170 | # Show maxima as masked overlay
171 | plt.imshow(green_smooth,cmap='gray',interpolation='none')
172 | plt.imshow(np.ma.array(green_max,mask=green_max==0),interpolation='none') 
173 | plt.show()
174 | 
175 | 
176 | # -------
177 | # Part II
178 | # -------
179 | 
180 | # Watershedding is a relatively simple but powerful algorithm for expanding seeds. The image intensity is considered as a topographical map (with high  intensities being "mountains" and low intensities "valleys") and water is poured into the valleys, starting from each of the seeds. The water first labels the lowest intensity pixels around the seeds, then continues to fill up. The cell boundaries (the 'mountains') are where the "waterfronts" between different seeds ultimately touch and stop expanding.
181 | 
182 | # Get the watershed function and run it
183 | from skimage.morphology import watershed
184 | green_ws = watershed(green_smooth,green_max)
185 | 
186 | # Show result as transparent overlay
187 | # Note: For a better visualization, see "FINDING CELL EDGES" below!
188 | plt.imshow(green_smooth,cmap='gray',interpolation='none')
189 | plt.imshow(green_ws,interpolation='none',alpha=0.7) 
190 | plt.show()
191 | 
192 | # OBSERVATION
193 | # Note that the previously connected cells are now mostly separated and the membranes are partitioned to their respective cells. Depending on the quality of the seeding, however, there may now be some cases of oversegmentation (a single cell split into multiple segmentation objects). This is a typical example of the trade-off between specificity and sensitivity one always has to face in computational classification tasks. As an advanced task, you can try to think of ways to fuse the wrongly oversegmented cells back together.   
194 | 
195 | 
196 | #%%
197 | #------------------------------------------------------------------------------
198 | # SECTION 6 - IDENTIFICATION OF CELL EDGES
199 | 
200 | # Now that we have a full cell segmentation, we can retrieve the cell edges, that is the pixels bordering neighboring cells. This is useful for many purposes; in our case, for example, edge intensities are a good measure of membrane intensity, which may be a desired readout. The length of the edge (relative to cell size) is also an informative feature about the cell shape. Finally, showing colored edges is a nice way of visualizing cell segmentations.
201 | 
202 | # There are many ways of identifying edge pixels in a fully labeled segmentation. It can be done using erosion or dilation, for example, or it can be done in an extremely fast and fully vectorized way (for this, see "Vectorization" in the optional advanced content). Here, we use a slow but intuitive method that also serves to showcase the 'generic_filter' function in ndimage.
203 | 
204 | # 'ndi.filters.generic_filter' is a powerful way of quickly iterating any function over numpy arrays (including functions that use a structuring element). 'generic_filter' iterates a structure element over all the values in an array and passes the corresponding values to a user-defined function. The result returned by this function is then allocated to the pixel in the image that corresponds to the origin of the se. Check the documentation to find out more about the arguments for 'generic_filter'.
205 | 
206 | # Define the edge detection function
207 | def edge_finder(footprint_values):
208 |     if (footprint_values == footprint_values[0]).all():
209 |         return 0
210 |     else:
211 |         return 1
212 |     
213 | # Iterate the edge finder over the segmentation
214 | green_edges = ndi.filters.generic_filter(green_ws,edge_finder,size=3)
215 | 
216 | # Label the detected edges based on the underlying cells
217 | green_edges_labeled = green_edges * green_ws
218 | 
219 | # Show them as masked overlay
220 | plt.imshow(green_smooth,cmap='gray',interpolation='none')
221 | plt.imshow(np.ma.array(green_edges_labeled,mask=green_edges_labeled==0),interpolation='none') 
222 | plt.show()
223 | 
224 | 
225 | 
226 | 
227 | #%%
228 | #------------------------------------------------------------------------------
229 | # SECTION 7 - POSTPROCESSING: REMOVING CELLS AT THE IMAGE BORDER
230 | 
231 | # Segmentation is never perfect and it often makes sense to remove artefacts afterwards. For example, one could filter out objects that are too small, have a very strange shape, or very strange intensity values. Note that this is equivalent to the removal of outliers in data analysis and should only be done for good reason and with caution.
232 | 
233 | # As an example of postprocessing, we will now filter out a particular group of problematic cells: those that are cut off at the image border.
234 | 
235 | 
236 | # Create a mask for the image boundary pixels
237 | boundary_mask = np.ones_like(green_ws)   # Initialize with all ones
238 | boundary_mask[1:-1,1:-1] = 0             # Set middle square to 0
239 | 
240 | # Iterate over all cells in the segmentation
241 | current_label = 1
242 | for cell_id in np.unique(green_ws):
243 |     
244 |     # If the current cell touches the boundary, remove it
245 |     if np.sum((green_ws==cell_id)*boundary_mask) != 0:
246 |         green_ws[green_ws==cell_id] = 0
247 |         
248 |     # This is to keep the labeling continuous, which is cleaner
249 |     else:
250 |         green_ws[green_ws==cell_id] = current_label
251 |         current_label += 1
252 | 
253 | # Show result as transparent overlay
254 | plt.imshow(green_smooth,cmap='gray',interpolation='none')
255 | plt.imshow(np.ma.array(green_ws,mask=green_ws==0),interpolation='none',alpha=0.7) 
256 | plt.show()
257 | 
258 | 
259 | #%%
260 | #------------------------------------------------------------------------------
261 | # SECTION 8 - MEASUREMENTS: SINGLE-CELL AND MEMBRANE READOUTS
262 | 
263 | # Now that the cells and membranes in the image are segmented, we can quantify various readouts for every cell individually. Readouts can be based on the intensity in different channels in the original image or on the size and shape of the cells themselves.
264 | 
265 | # To exemplify how different properties of cells can be measured, we will quantify the following:
266 |     # Cell ID (so all other measurements can be traced back to the cell that was measured)
267 |     # Mean intensity of each cell, for each channel
268 |     # Mean intensity at the membrane of each cell, for each channel
269 |     # The cell size, in terms of the number of pixels that make up the cell
270 |     # The cell outline length, in terms of the number of pixels that make up the cell boundary
271 | 
272 | # We will use a dictionary to collect all the information in an orderly fashion.
273 | 
274 | 
275 | # Initialize a dict for results of choice
276 | results = {"cell_id":[], "green_mean":[], "red_mean":[],"green_membrane_mean":[], 
277 |            "red_membrane_mean":[],"cell_size":[],"cell_outline":[]}
278 | 
279 | # Iterate over segmented cells
280 | for cell_id in np.unique(green_ws)[1:]:
281 |     
282 |     # Mask the pixels of the current cell
283 |     cell_mask = green_ws==cell_id  
284 |     edge_mask = np.logical_and(cell_mask,green_edges)
285 |     
286 |     # Get the current cell's values
287 |     # Note that the original raw data is used for quantification!
288 |     results["cell_id"].append(cell_id)
289 |     results["green_mean"].append(np.mean(img[0,:,:][cell_mask]))
290 |     results["red_mean"].append(np.mean(img[1,:,:][cell_mask]))    
291 |     results["green_membrane_mean"].append(np.mean(img[0,:,:][edge_mask]))    
292 |     results["red_membrane_mean"].append(np.mean(img[1,:,:][edge_mask]))    
293 |     results["cell_size"].append(np.sum(cell_mask))
294 |     results["cell_outline"].append(np.sum(edge_mask))
295 | 
296 | 
297 | #%%    
298 | #------------------------------------------------------------------------------
299 | # SECTION 9 - SIMPLE ANALYSIS AND VISUALIZATION
300 | 
301 | # Now that you have collected the readouts to a dictionary you can analyse them in any way you wish. This section shows how to do basic plotting and analysis of the results, including mapping the data back onto the image (as a 'heatmap') and producing boxplots, scatterplots and a linear fit. A more in-depth example of how to couple image analysis into advanced data analysis can be found in 'data_analysis' in the 'optional_advanced_material' directory.
302 | 
303 | 
304 | # (i) Print out the results you want to see
305 | for key in results.keys():
306 |     print "\n" + key
307 |     print results[key]
308 | 
309 | 
310 | # (ii) Make box plots of the cell and membrane intensities, for both channels.
311 | plt.boxplot([results[key] for key in results.keys()][2:-1],labels=results.keys()[2:-1])
312 | plt.show()
313 | 
314 | 
315 | # (iii) 
316 | # Make a scatter plot to show if there is a dependancy of the membrane intensity (either channel) on the cell size (for example) and add the linear fit to scatter plot, to see the correlation.
317 | 
318 |     # Import the module stats from scipy       
319 | from scipy import stats                  
320 | 
321 |     # Linear fit of cell size vs membrane intensity
322 | linfit = stats.linregress(results["cell_size"],results["red_membrane_mean"]) 
323 |                 
324 |     # Make scatter plot. 
325 | plt.scatter(results["cell_size"],results["red_membrane_mean"])
326 | plt.xlabel("cell size")
327 | plt.ylabel("red_membrane_mean")
328 | 
329 |     # Define the equation of the line that fits the data, using an anonymous function
330 | fit = lambda x: linfit[0] * x + linfit[1]
331 | 
332 |     # Get the fitted values (for graph limits)
333 | ax = plt.gca()        
334 | x_lims = ax.get_xlim()
335 | fit_vals = map(fit,x_lims)
336 | 
337 |     # Plot the line 
338 | plt.gca().set_autoscale_on(False)         # Prevent the figure from rescaling when line is added, by using plt.gca().set_autoscale_on(False)
339 | plt.plot(x_lims,fit_vals,'r-',lw=2)
340 | plt.show()
341 | 
342 |     
343 | 
344 | # (iv) Print out results from stats analysis
345 | linnames = ["slope","intercept","r-value","p-value","stderr"]            # Names of esults of stats.linregress
346 | print "\nLinear fit of cell size to red membrane intensity"              # Header
347 | for index,value in enumerate(linfit):                                    # For each value...
348 |     print "  " + linnames[index] + "\t\t" + str(value)                   # ...print the result
349 | print "  r-squared\t\t" + str(linfit[2]**2)                              # Also print R-squared
350 | 
351 | 
352 | # (v) Map the cell size and cell membrane back onto the image.
353 | sizes_8bit = results["cell_size"] / max(results["cell_size"]) * 255  # Map to 8bit
354 | size_map = np.zeros_like(green_ws,dtype=np.uint8)                    # Initialize image
355 | for index,cell_id in enumerate(np.unique(green_ws)[1:]):             # Iterate over cells
356 |     size_map[green_ws==cell_id] = sizes_8bit[index]                  # Assign corresponding cell size to cell pixels
357 | 
358 | 
359 | plt.imshow(green_smooth,cmap='gray',interpolation='none')            # Set grayscale background image
360 | plt.imshow(np.ma.array(size_map,mask=size_map==0),interpolation='none',alpha=0.7)  # Colored overlay
361 | plt.show()
362 | 
363 | 
364 | # (vi)
365 | # Note that this seems to return a highly significant p-value but a very low
366 | # correlation coefficient (r-value). We also would not expect this correlation
367 | # to be present in our data. This should prompt several considerations:
368 | #   1) What does this p-value actually mean? See help(stats.linregress)
369 | #   2) Since we have not filtered properly for artefacts (e.g. "cells" of very
370 | #      small size), they might bias this particular fit.
371 | #   3) We're now working with a lot of datapoints. This can skew statistical
372 | #      analyses! To some extent, we can correct for this by multiple testing
373 | #      correction and by comparison with randomized datasets. Additionaly, a 
374 | #      closer look at Bayesian statistics is highly recommended for people 
375 | #      working with large datasets.
376 | 
377 | 
378 | #%%
379 | #------------------------------------------------------------------------------
380 | # SECTION 10 - MEASUREMENTS: WRITING OUTPUT
381 | 
382 | # There are several ways of presenting the output of a program. Data can be saved to files in a human-readable format (e.g. text files (e.g. to import into Excel), images, etc), or written to language-specific files for future use (i.e. instead of having to run the whole program again). Here you will learn some of this possibilities.
383 | 
384 | 
385 | # (i)
386 | # Write an image to a tif (could be opened e.g. in Fiji)
387 | 
388 | # Get file handling function
389 | from tifffile import imsave                                            
390 | 
391 | # Save array to tif
392 | imsave(filename+"_labeledEdges.tif",green_edges_labeled,bigtiff=True)  
393 | 
394 | 
395 | 
396 | # (ii)
397 | # Write a figure to a png or pdf
398 | 
399 | # Recreate scatter plot from above
400 | plt.scatter(results["cell_size"],results["red_membrane_mean"])  
401 | plt.xlabel("cell size")                                    
402 | plt.ylabel("red membrane mean")
403 | 
404 | # Save to png (rasterized)
405 | plt.savefig(filename+'_scatter.png', bbox_inches='tight')  
406 | 
407 | # Save to pdf (vectorized)
408 | plt.savefig(filename+'_scatter.pdf', bbox_inches='tight')  
409 | 
410 | 
411 | 
412 | # (iii)
413 | # Write a python file that can be reloaded in other Python programs
414 | import json
415 | with open('k_resultsDict.json', 'w') as fp:
416 |     json.dump(results, fp)
417 |     
418 | # This could be loaded again in this way:
419 | #with open(filename+'_resultsDict.json', 'r') as fp:
420 | #    results = json.load(fp)
421 | 
422 | 
423 | # (iv)
424 | # Write a text file of the numerical data gathered  (could be opened e.g. in Excel)
425 | with open(filename+"_output.txt","w") as txt_out:                                                # Open an empty file object (with context manager)
426 |     txt_out.write(''.join(key+'\t' for key in results.keys()) + '\n')                            # Write the headers
427 |     for index,value in enumerate(results["cell_id"]):                                            # Iterate over cells
428 |         txt_out.write(''.join(str(results[key][index])+'\t' for key in results.keys()) + '\n')   # Write cell data
429 |             
430 | 
431 | #------------------------------------------------------------------------------
432 | #------------------------------------------------------------------------------
433 | # THIS IS THE END OF THE TUTORIAL.
434 | #------------------------------------------------------------------------------
435 | #------------------------------------------------------------------------------
436 | 
437 | 
438 | 


--------------------------------------------------------------------------------
/optional_advanced_content/Multiprocessing/batch_multiprocessing.py:
--------------------------------------------------------------------------------
  1 | # -*- coding: utf-8 -*-
  2 | """
  3 | Created on Tue Dec 22 00:12:38 2015
  4 | 
  5 | @author:    Jonas Hartmann @ Gilmour Group @ EMBL Heidelberg
  6 | 
  7 | @descript:  A cleaned version of the batch segmentation pipeline, ready for 
  8 |             multiprocessed execution. For further information, please see
  9 |             example_multiprocessing.py.
 10 |             
 11 | @requires:  Python 2.7
 12 |             NumPy 1.9, SciPy 0.15, scikit-image 0.11.3
 13 | """
 14 | 
 15 | 
 16 | # IMPORT STUFF
 17 | from __future__ import division    # Python 2.7 legacy
 18 | import numpy as np                 # Array manipulation package
 19 | import scipy.ndimage as ndi        # Image processing package
 20 | 
 21 | 
 22 | #------------------------------------------------------------------------------
 23 | 
 24 | # PIPELINE FUNCTION
 25 | 
 26 | def pipeline(filename):
 27 | 
 28 |     
 29 |     #------------------------------------------------------------------------------
 30 |     
 31 |     # IMPORT AND SLICE DATA    
 32 |     
 33 |     # Check if input filename exists, else return an error
 34 |     from os.path import isfile
 35 |     if not isfile(filename):
 36 |         from warnings import warn
 37 |         warn("Could not find file" + filename + '.tif')
 38 |         return "ERROR"
 39 |     
 40 |     # Import tif files
 41 |     import skimage.io as io               # Image file manipulation module
 42 |     img = io.imread(filename)             # Importing multi-color tif file
 43 |     img = np.array(img)                   # Converting MultiImage object to numpy array
 44 |     
 45 |     # Slicing: We only work on one channel for segmentation
 46 |     green = img[0,:,:]
 47 |     
 48 |     
 49 |     #------------------------------------------------------------------------------
 50 |     
 51 |     # PREPROCESSING: SMOOTHING AND ADAPTIVE THRESHOLDING
 52 |     # It's standard to smoothen images to reduce technical noise - this improves
 53 |     # all subsequent image processing steps. Adaptive thresholding allows the
 54 |     # masking of foreground objects even if the background intensity varies across
 55 |     # the image.
 56 |     
 57 |     # Gaussian smoothing
 58 |     sigma = 3                                                # Smoothing factor for Gaussian
 59 |     green_smooth = ndi.filters.gaussian_filter(green,sigma)  # Perform smoothing
 60 |     
 61 |     # Create an adaptive background
 62 |     #struct = ndi.iterate_structure(ndi.generate_binary_structure(2,1),24)               # Create a diamond-shaped structural element
 63 |     struct = ((np.mgrid[:31,:31][0] - 15)**2 + (np.mgrid[:31,:31][1] - 15)**2) <= 15**2  # Create a disk-shaped structural element
 64 |     bg = ndi.filters.generic_filter(green_smooth,np.mean,footprint=struct)               # Run a mean filter over the image using the disc
 65 |     
 66 |     # Threshold using created background
 67 |     green_thresh = green_smooth >= bg
 68 |     
 69 |     # Clean by morphological hole filling
 70 |     green_thresh = ndi.binary_fill_holes(np.logical_not(green_thresh))
 71 |     
 72 |     # Show the result
 73 | #    plt.imshow(green_thresh,interpolation='none',cmap='gray')
 74 | #    plt.show()
 75 |     
 76 |     
 77 |     #------------------------------------------------------------------------------
 78 |     
 79 |     # (SIDE NOTE: WE COULD BE DONE NOW)
 80 |     # If the data is very clean and/or we just want a quick look, we could simply
 81 |     # label all connected pixels now and consider the result our segmentation.
 82 |     
 83 |     # Labeling connected components
 84 | #    green_label = ndi.label(green_thresh)[0]
 85 | #    plt.imshow(green_label,interpolation='none')
 86 | #    plt.show() 
 87 |     
 88 |     # However, to also partition the membranes to the cells, to generally improve  
 89 |     # the segmentatation (e.g. split cells that end up connected here) and to 
 90 |     # handle more complicated morphologies or to deal with lower quality data, 
 91 |     # this approach is not sufficient.
 92 |     
 93 |     
 94 |     #------------------------------------------------------------------------------
 95 |     
 96 |     # SEGMENTATION: SEEDING BY DISTANCE TRANSFORM
 97 |     # More advanced segmentation is usually a combination of seeding and expansion.
 98 |     # In seeding, we want to find a few pixels for each cell that we can assign to
 99 |     # said cell with great certainty. These 'seeds' are then expanded to partition
100 |     # regions of the image where cell affiliation is less clear-cut.
101 |     
102 |     # Distance transform on thresholded membranes
103 |     # Advantage of distance transform for seeding: It is quite robust to local 
104 |     # "holes" in the membranes.
105 |     green_dt= ndi.distance_transform_edt(green_thresh)
106 | #    plt.imshow(green_dt,interpolation='none')
107 | #    plt.show()
108 |     
109 |     # Dilating (maximum filter) of distance transform improves results
110 |     green_dt = ndi.filters.maximum_filter(green_dt,size=10) 
111 | #    plt.imshow(green_dt,interpolation='none')
112 | #    plt.show()
113 |     
114 |     # Retrieve and label the local maxima
115 |     from skimage.feature import peak_local_max
116 |     green_max = peak_local_max(green_dt,indices=False,min_distance=10)  # Local maximum detection
117 |     green_max = ndi.label(green_max)[0]                                 # Labeling
118 |     
119 |     # Show maxima as masked overlay
120 | #    plt.imshow(green_smooth,cmap='gray',interpolation='none')
121 | #    plt.imshow(np.ma.array(green_max,mask=green_max==0),interpolation='none') 
122 | #    plt.show()
123 |     
124 |     
125 |     #------------------------------------------------------------------------------
126 |     
127 |     # SEGMENTATION: EXPANSION BY WATERSHED
128 |     # Watershedding is a relatively simple but powerful algorithm for expanding
129 |     # seeds. The image intensity is considered as a topographical map (with high
130 |     # intensities being "mountains" and low intensities "valleys") and water is
131 |     # poured into the valleys from each of the seeds. The water first labels the
132 |     # lowest intensity pixels around the seeds, then continues to fill up. The cell
133 |     # boundaries are where the waterfronts between different seeds touch.
134 |     
135 |     # Get the watershed function and run it
136 |     from skimage.morphology import watershed
137 |     green_ws = watershed(green_smooth,green_max)
138 |     
139 |     # Show result as transparent overlay
140 |     # Note: For a better visualization, see "FINDING CELL EDGES" below!
141 | #    plt.imshow(green_smooth,cmap='gray',interpolation='none')
142 | #    plt.imshow(green_ws,interpolation='none',alpha=0.7) 
143 | #    plt.show()
144 |     
145 |     # Notice that the previously connected cells are now mostly separated and the
146 |     # membranes are partitioned to their respective cells. 
147 |     # ...however, we now see a few cases of oversegmentation!
148 |     # This is a typical example of the trade-offs one has to face in any 
149 |     # computational classification task. 
150 |     
151 |     
152 |     #------------------------------------------------------------------------------
153 |     
154 |     # POSTPROCESSING: REMOVING CELLS AT THE IMAGE BORDER
155 |     # Since segmentation is never perfect, it often makes sense to remove artefacts
156 |     # after the segmentation. For example, one could filter out cells that are too
157 |     # big, have a strange shape, or strange intensity values. Similarly, supervised 
158 |     # machine learning can be used to identify cells of interest based on a 
159 |     # combination of various features. Another example of cells that should be 
160 |     # removed are those at the image boundary.
161 |     
162 |     # Create a mask for the image boundary pixels
163 |     boundary_mask = np.ones_like(green_ws)   # Initialize with all ones
164 |     boundary_mask[1:-1,1:-1] = 0             # Set middle square to 0
165 |     
166 |     # Iterate over all cells in the segmentation
167 |     current_label = 1
168 |     for cell_id in np.unique(green_ws):
169 |         
170 |         # If the current cell touches the boundary, remove it
171 |         if np.sum((green_ws==cell_id)*boundary_mask) != 0:
172 |             green_ws[green_ws==cell_id] = 0
173 |             
174 |         # This is to keep the labeling continuous, which is cleaner
175 |         else:
176 |             green_ws[green_ws==cell_id] = current_label
177 |             current_label += 1
178 |     
179 |     # Show result as transparent overlay
180 | #    plt.imshow(green_smooth,cmap='gray',interpolation='none')
181 | #    plt.imshow(np.ma.array(green_ws,mask=green_ws==0),interpolation='none',alpha=0.7) 
182 | #    plt.show()
183 |     
184 |     
185 |     #------------------------------------------------------------------------------
186 |     
187 |     # MEASUREMENTS: FINDING CELL EDGES
188 |     # Finding cell edges is very useful for many purposes. In our example, edge
189 |     # intensities are a measure of membrane intensities, which may be a desired
190 |     # readout. The length of the edge (relative to cell size) is also a quite
191 |     # informative feature about the cell shape. Finally, showing colored edges is
192 |     # a nice way of visualizing segmentations.
193 |     
194 |     # How this works: The generic_filter function (see further below) iterates a 
195 |     # structure element (in this case a 3x3 square) over an image and passes all
196 |     # the values within that element to some arbitrary function (in this case 
197 |     # edge_finder). The edge_finder function checks if all these pixels are the 
198 |     # same; if they are, the current pixel is not at an edge (return 0), otherwise 
199 |     # it is (return 1). generic_filter takes the returned values and organizes them
200 |     # into an image again by setting the central pixel of each 3x3 square to the
201 |     # respective return value from edge_finder.
202 |     
203 |     # Define the edge detection function
204 |     def edge_finder(footprint_values):
205 |         if (footprint_values == footprint_values[0]).all():
206 |             return 0
207 |         else:
208 |             return 1
209 |         
210 |     # Iterate the edge finder over the segmentation
211 |     green_edges = ndi.filters.generic_filter(green_ws,edge_finder,size=3)
212 |     
213 |     # Label the detected edges based on the underlying cells
214 | #    green_edges_labeled = green_edges * green_ws
215 |     
216 |     # Show them as masked overlay
217 | #    plt.imshow(green_smooth,cmap='gray',interpolation='none')
218 | #    plt.imshow(np.ma.array(green_edges_labeled,mask=green_edges_labeled==0),interpolation='none') 
219 | #    plt.show()
220 |     
221 |     
222 |     #------------------------------------------------------------------------------
223 |     
224 |     # MEASUREMENTS: SINGLE-CELL READOUTS
225 |     # Now that the cells in the image are nicely segmented, we can quantify various
226 |     # readouts for every cell individually. Readouts can be based on the intensity
227 |     # in the original image, on intensities in other channels or on the size and
228 |     # shape of the cells themselves.
229 |     
230 |     # Initialize a dict for results of choice
231 |     results = {"cell_id":[], "green_mean":[], "red_mean":[],"green_mem_mean":[], 
232 |                "red_mem_mean":[],"cell_size":[],"cell_outline":[]}
233 |     
234 |     # Iterate over segmented cells
235 |     for cell_id in np.unique(green_ws)[1:]:
236 |         
237 |         # Mask the pixels of the current cell
238 |         cell_mask = green_ws==cell_id    
239 |         
240 |         # Get the current cell's values
241 |         # Note that the original raw data is used for quantification!
242 |         results["cell_id"].append(cell_id)
243 |         results["green_mean"].append(np.mean(img[0,:,:][cell_mask]))
244 |         results["red_mean"].append(np.mean(img[1,:,:][cell_mask]))    
245 |         results["green_mem_mean"].append(np.mean(img[0,:,:][np.logical_and(cell_mask,green_edges)]))    
246 |         results["red_mem_mean"].append(np.mean(img[1,:,:][np.logical_and(cell_mask,green_edges)]))    
247 |         results["cell_size"].append(np.sum(cell_mask))
248 |         results["cell_outline"].append(np.sum(np.logical_and(cell_mask,green_edges)))
249 |     
250 |     
251 |     #------------------------------------------------------------------------------
252 | 
253 |     # Return result as tuple (important in multiprocessing)
254 |     return (green_ws, results)
255 | 
256 |     
257 | #------------------------------------------------------------------------------
258 | 
259 | 
260 | 
261 |     


--------------------------------------------------------------------------------
/optional_advanced_content/Multiprocessing/example_cells_1.tif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/karinsasaki/python-workshop-image-processing/a6de0424bb184e55a769801bbedec1c8ce02dd5a/optional_advanced_content/Multiprocessing/example_cells_1.tif


--------------------------------------------------------------------------------
/optional_advanced_content/Multiprocessing/example_cells_2.tif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/karinsasaki/python-workshop-image-processing/a6de0424bb184e55a769801bbedec1c8ce02dd5a/optional_advanced_content/Multiprocessing/example_cells_2.tif


--------------------------------------------------------------------------------
/optional_advanced_content/Multiprocessing/example_multiprocessing.py:
--------------------------------------------------------------------------------
  1 | # -*- coding: utf-8 -*-
  2 | """
  3 | Created on Sat Mar 12 20:59:1 2016
  4 | 
  5 | @author:   Jonas Hartmann @ Gilmour Group @ EMBL Heidelberg
  6 | 
  7 | @descript: Multiprocessing is a simple way of increasing the speed of code if 
  8 |            it is impossible, insufficient or otherwise undesirable to do so by
  9 |            vectorization. Essentially, multiprocessing simply means running 
 10 |            different independent parts of a program (for example an function
 11 |            that is run again and again on different data) at the same time
 12 |            instead of sequentially in a loop. This is an example of using 
 13 |            multiprocessing to run the batch pipeline from the main tutorial on
 14 |            multiple different images at the same time, instead of processing 
 15 |            them one by one.
 16 |            
 17 |            In Python, multiprocessing is handled in the multiprocessing module. 
 18 |            The easiest way of using it is to initialize a pool of "worker" 
 19 |            processes, which are then available to run the functions passed to
 20 |            them (or "mapped onto them"). Although this is relatively easy to 
 21 |            do, multiprocessing has some quirks that need to be payed attention:
 22 | 
 23 |            1) Functions passed to worker processes can take at most one object
 24 |               as input and return at most one object as output. If multiple
 25 |               parameters need to be passed, they must be packaged into a single
 26 |               object first (and then unpacked at the beginning of the function).
 27 |               
 28 |            2) If functions write to files, print output or display graphs, 
 29 |               great care is advised during multiprocessing, as the different 
 30 |               subprocesses may try to do these things at the same time, which 
 31 |               may result in a garbled chaos or even a crash.
 32 |  
 33 |            3) Every worker process will start out by automatically trying to 
 34 |               set up the same "environment" as the main process. This 
 35 |               effectively means that each subprocess tries to execute the main
 36 |               script again at the start, which could obviously have catastrophic 
 37 |               consequences. To prevent this, the main script must be "protected". 
 38 |               This is done through the built-in variable __name__, which has the 
 39 |               value "__main__" if the script is called from the main process and 
 40 |               a different value if it's called by a worker process. This can be
 41 |               exploited to make sure that the main script is not completely 
 42 |               re-run by each subprocess (see the beginning of this script).   
 43 | 
 44 |            The following describes an example of how to run the batch pipeline 
 45 |            using N parallel processes. It requires batch_multiprocessing.py, 
 46 |            which is a "cleaned" version of the batch pipeline that accommodates 
 47 |            the three quirks mentioned above. Note that all code outside of the
 48 |            actual pipeline function has been deleted to avoid a similar problem 
 49 |            to (3) during the import of the function (Python executes all
 50 |            non-protected code blocks in a module when that module is imported!).
 51 | 
 52 |            Execution of the following example for 4 copies of the same image 
 53 |            takes ~73s on my machine (2 available cores). Running the 4 copies 
 54 |            without multiprocessing would take ~144s.
 55 | 
 56 | @requires: Python 2.7
 57 |            NumPy 1.9, SciPy 0.15, matplotlib 1.5.1, scikit-image 0.11.3
 58 | 
 59 | """
 60 | 
 61 | # IMPORT BASIC MODULES
 62 | 
 63 | from __future__ import division    # Python 2.7 legacy
 64 | import numpy as np                 # Array manipulation package
 65 | import matplotlib.pyplot as plt    # Plotting package
 66 | import scipy.ndimage as ndi        # Image processing package
 67 | 
 68 | 
 69 | #------------------------------------------------------------------------------
 70 |    
 71 | # PROTECTION OF THIS SCRIPT FOR MULTIPROCESSING
 72 | # When subprocesses are initialized, they will first try to run this main 
 73 | # script again (this is done to set up the environment/name space properly).
 74 | # Since we do not want the following to be run again and again, we have to 
 75 | # protect it. 
 76 | # The built-in variable __name__ is automatically set to "__main__" in the main
 77 | # process but has other values in the subprocesses, which means those processes
 78 | # will ignore the code block within the following if-statement:
 79 | 
 80 | if __name__ == '__main__':
 81 |     
 82 |     
 83 |     #--------------------------------------------------------------------------
 84 |     
 85 |     # PREPARATION
 86 |     
 87 |     # Begin timing
 88 |     from time import time
 89 |     before = time()    
 90 |     
 91 |     # Generate a list of image filenames (just as in main tutorial)
 92 |     from os import listdir, getcwd
 93 |     filelist = listdir(getcwd())
 94 |     tiflist = [fname for fname in filelist if fname[-4:]=='.tif']
 95 |     
 96 |     # Prepare for multiprocessing
 97 |     N = 4                         # Maximum number of processes used
 98 |     import multiprocessing.pool   # Import multiprocessing class
 99 |     currentPool = multiprocessing.Pool(processes=N)    # Create a pool of worker processes
100 |     from batch_multiprocessing import pipeline         # Import cleaned pipeline function
101 |     
102 |     
103 |     # EXECUTION
104 |     
105 |     # Here, the function pipeline is executed by the current pool of worker 
106 |     # processes for each parameter (filename) in the tiflist and the output is
107 |     # written into the output_list.
108 |     output_list = currentPool.map(pipeline,tiflist)
109 |     
110 |     # This is necessary clean-up to make sure that all worker subprocesses are
111 |     # properly terminated. It's more of a "safety" thing, since things can
112 |     # *really* go wrong in multiprocessing...
113 |     currentPool.close()
114 |     currentPool.join()
115 |     
116 |     # Reorganize the output into the same shape as in the batch tutorial
117 |     all_results = [output[1] for output in output_list if output != "ERROR"]
118 |     all_segmentations = [output[0] for output in output_list if output != "ERROR"]
119 |     
120 |     
121 |     # DOWNSTREAM HANDLING
122 |     
123 |     # End Timing    
124 |     after = time()
125 |     print after - before
126 |     
127 |     # See if it worked by printing the short summary  
128 |     print "\nSuccessfully analyzed", len(all_results), "of", len(tiflist), "images"
129 |     print "Detected", sum([len(resultDict["cell_id"]) for resultDict in all_results]), "cells in total"
130 |     
131 |     # See if it worked by showing the scatterplot
132 |     colors = plt.cm.jet(np.linspace(0,1,len(all_results))) # To give cells from different images different colors
133 |     for image_id,resultDict in enumerate(all_results):
134 |         plt.scatter(resultDict["cell_size"],resultDict["red_mem_mean"],color=colors[image_id])
135 |     plt.xlabel("cell size")
136 |     plt.ylabel("red_mem_mean")
137 |     plt.show()
138 |     
139 |     
140 |     #--------------------------------------------------------------------------
141 | 
142 | 
143 | 
144 | 


--------------------------------------------------------------------------------
/optional_advanced_content/Vectorization/example_cells_1.tif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/karinsasaki/python-workshop-image-processing/a6de0424bb184e55a769801bbedec1c8ce02dd5a/optional_advanced_content/Vectorization/example_cells_1.tif


--------------------------------------------------------------------------------
/optional_advanced_content/Vectorization/example_cells_1_segmented.npy:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/karinsasaki/python-workshop-image-processing/a6de0424bb184e55a769801bbedec1c8ce02dd5a/optional_advanced_content/Vectorization/example_cells_1_segmented.npy


--------------------------------------------------------------------------------
/optional_advanced_content/Vectorization/example_vectorization.py:
--------------------------------------------------------------------------------
  1 | # -*- coding: utf-8 -*-
  2 | """
  3 | Created on Sat Mar 12 16:28:01 2016
  4 | 
  5 | @author:    Jonas Hartmann @ Gilmour Group @ EMBL Heidelberg
  6 | 
  7 | @descript:  An example of how vectorization can speed up data analysis code.
  8 |             Vectorization refers to the removal of iterations in favor of array
  9 |             operations. These generally compute much faster since they are 
 10 |             implicitly parallel; if the same operation is done for all elements
 11 |             of an array independently - i.e. without a loop - multiple such 
 12 |             operations can be done in parallel (or at least in quick succession
 13 |             without any overhead) on the CPU.
 14 |             
 15 |             This example is based on the cell segmentation generated in the 
 16 |             main tutorial script, which is loaded from a npy file. The example
 17 |             shows a vectorized version of the edge pixel filter.
 18 | 
 19 | @moreInfo:  We previously used scipy.ndimage.generic_filter as a means of 
 20 |             iterating over our segmentation and detecting the edges of each 
 21 |             cell. Whilst generic_filter provides a means of fast array 
 22 |             iteration, it would be much faster to use a vectorized approach,
 23 |             i.e. one that does not rely on iteration at all.
 24 |             
 25 |             One way to find cell border pixels without iterating is to generate 
 26 |             all four possible "shifted-by-1" versions of the array. The edge 
 27 |             pixels are those that do not have the same value in one of the 
 28 |             shifted arrays as compared to the original. Since array comparison 
 29 |             does not require iteration, this approach is bound to be much 
 30 |             faster, especially for large arrays.
 31 |             
 32 |             However, there is a trade-off: generating 4 shifted copies of the 
 33 |             image array requires a lot of memory, which can be a problem for 
 34 |             big data. Such trade-offs between memory and speed are a common 
 35 |             concern in code optimization.
 36 |             
 37 |             Note that vectorization is actually quite easy for a lot of common
 38 |             operations in programs; it just takes a bit of thinking and often
 39 |             some knowledge of linear algebra. However, there are also cases 
 40 |             where the solution is not obvious or easily derived (this example
 41 |             is probably one such case - at least it was for me). In those
 42 |             cases, searching the internet for solutions to the problem (or a 
 43 |             similar problem) is usually worth a try.
 44 |             
 45 | @speed:     This version takes ~0.016s to run on my machine, versus ~5.318s for 
 46 |             the iteration-based implementation in the main tutorial.    
 47 | 
 48 | @requires:  Python 2.7
 49 |             NumPy 1.9, scikit-image 0.11.3, matplotlib 1.5.1
 50 | """
 51 | 
 52 | 
 53 | # PREPARATION
 54 | 
 55 | # Module imports
 56 | from __future__ import division    # Python 2.7 legacy
 57 | import numpy as np                 # Array manipulation package
 58 | import matplotlib.pyplot as plt    # Plotting package
 59 | 
 60 | # Data import (segmentation from main tutorial)
 61 | filename = 'example_cells_1' 
 62 | seg = np.load(filename+'_segmented.npy')
 63 | 
 64 | # Begin timing
 65 | from time import time
 66 | before = time()
 67 | 
 68 | 
 69 | ### EXECUTION
 70 | 
 71 | # Padding adds values around the original array (here just 1 line of pixels)
 72 | seg_pad = np.pad(seg,1,mode='reflect')
 73 | 
 74 | # This generates a list of shifted-by-1 arrays by slicing sub-blocks out of the
 75 | # padded original.
 76 | seg_shifts = [seg_pad[:-2,:-2],seg_pad[:-2,2:],seg_pad[2:,:-2],seg_pad[2:,2:]]
 77 | 
 78 | # Now it's just a matter of checking which pixels are different in a shifted
 79 | # array compared to the original
 80 | edges = np.zeros_like(seg)
 81 | for shift in seg_shifts:
 82 |     edges[shift!=seg] = 1
 83 |     
 84 | # Label the detected edges based on the underlying cells (as in the main tutorial)
 85 | edges = edges * seg
 86 | 
 87 | 
 88 | ### DOWNSTREAM HANDLING
 89 | 
 90 | # End timing
 91 | after = time()
 92 | print after - before
 93 | 
 94 | # Show result as masked overlay (as in the main tutorial)
 95 | import skimage.io as io
 96 | img = io.imread(filename+'.tif')
 97 | plt.imshow(img[0,:,:],cmap='gray',interpolation='none')
 98 | plt.imshow(np.ma.array(edges,mask=edges==0),interpolation='none') 
 99 | plt.show()
100 | 
101 | 
102 | 
103 |     


--------------------------------------------------------------------------------
/optional_advanced_content/cluster_computation/README.md:
--------------------------------------------------------------------------------
 1 | # Advanced Content
 2 | 
 3 | ### Code Optimization
 4 | 
 5 | Examples for how to speed up your code, relevant for anything that handles relatively large amounts of data (image analysis, data analysis, modeling, ...). There are scripts exemplifying three strategies:
 6 | 
 7 | - **Vectorization**
 8 | 
 9 | 	- Using the edge finder from the main script as an example, this demonstrates the drastic 		 	   increase in speed that can often be achieved if operations are vectorized.
10 | 	
11 | - **Multiprocessing**
12 | 
13 | 	- This shows how Python's multiprocessing module can be used to simultaneously run the batch 	  		pipeline from the main tutorial on several images.
14 | 	
15 | - **Cluster Processing**
16 | 
17 | 	- An example of how to use a Python script to run another script multiple times with different 	   input data. If run locally, this is very similar to multiprocessing, but with a bit of 	 		  knowledge about high-performance cluster computing (see appropriate courses), this approach can 		be used to handle job submission and result collection on a computer cluster.
18 | 
19 | It should be noted that one of the key aspects of code optimization is finding out *which part* of the code costs the most time and could be optimized for the greatest gain in speed. This is called *profiling* and there are a number of options for how to do it, both in the form of Python modules as well as built into IDEs like Spyder. Profiling is not discussed here, but as a very simple example the `time` module is used to test how long the different versions take to run.
20 | 
21 | 
22 | ### Advanced Data Analysis
23 | 
24 | This tutorial illustrates how single-cell segmentation results can be piped into advanced data analysis. This is intended as a starting point for people to get into advanced data analysis with python. In particular, it shows off three important modules (scikit-learn, scipy.cluster and networkx) and illustrates a number of key concepts and methods (feature extraction, standardization/normalization, PCA and tSNE, clustering, graph representation). As a little bonus at the end, the xkcd-style plotting feature of matplotlib is shown. ;)
25 | 
26 | *Important note: This tutorial is a **BETA** - it may contain bugs and other errors!*


--------------------------------------------------------------------------------
/optional_advanced_content/cluster_computation/batch_cluster.py:
--------------------------------------------------------------------------------
  1 | # -*- coding: utf-8 -*-
  2 | """
  3 | Created on Tue Dec 22 00:12:38 2015
  4 | 
  5 | @author:    Jonas Hartmann @ Gilmour Group @ EMBL Heidelberg
  6 | 
  7 | @descript:  A version of main tutorial segmentation pipeline optimized for
  8 |             cluster submission. The filename parameter for the pipeline
  9 |             function is retrieved from commandline input. See example_cluster.py
 10 |             for details.
 11 |             
 12 | @requires:  Python 2.7
 13 |             NumPy 1.9, SciPy 0.15, scikit-image 0.11.3
 14 | """
 15 | 
 16 | 
 17 | # IMPORT STUFF
 18 | from __future__ import division    # Python 2.7 legacy
 19 | import numpy as np                 # Array manipulation package
 20 | import scipy.ndimage as ndi        # Image processing package
 21 | 
 22 | 
 23 | #------------------------------------------------------------------------------
 24 | 
 25 | # GET FILENAME FROM COMMANDLINE
 26 | import sys
 27 | filename = sys.argv[1]
 28 | 
 29 |     
 30 | #------------------------------------------------------------------------------
 31 | 
 32 | # IMPORT AND SLICE DATA    
 33 | 
 34 | # Check if input filename exists, else create a file reporting the problem
 35 | # and then terminate.
 36 | from os.path import isfile
 37 | if not isfile(filename):
 38 |     error = "ERROR: Could not find file " + str(filename)
 39 |     import json
 40 |     with open(filename[:-4]+'_out.json', 'w') as fp:
 41 |         json.dump(error, fp)
 42 |     raise NameError("Could not find file " + str(filename))
 43 | 
 44 | # Import tif files
 45 | import skimage.io as io               # Image file manipulation module
 46 | img = io.imread(filename)             # Importing multi-color tif file
 47 | img = np.array(img)                   # Converting MultiImage object to numpy array
 48 | 
 49 | # Slicing: We only work on one channel for segmentation
 50 | green = img[0,:,:]
 51 | 
 52 | 
 53 | #------------------------------------------------------------------------------
 54 | 
 55 | # PREPROCESSING: SMOOTHING AND ADAPTIVE THRESHOLDING
 56 | # It's standard to smoothen images to reduce technical noise - this improves
 57 | # all subsequent image processing steps. Adaptive thresholding allows the
 58 | # masking of foreground objects even if the background intensity varies across
 59 | # the image.
 60 | 
 61 | # Gaussian smoothing
 62 | sigma = 3                                                # Smoothing factor for Gaussian
 63 | green_smooth = ndi.filters.gaussian_filter(green,sigma)  # Perform smoothing
 64 | 
 65 | # Create an adaptive background
 66 | #struct = ndi.iterate_structure(ndi.generate_binary_structure(2,1),24)               # Create a diamond-shaped structural element
 67 | struct = ((np.mgrid[:31,:31][0] - 15)**2 + (np.mgrid[:31,:31][1] - 15)**2) <= 15**2  # Create a disk-shaped structural element
 68 | bg = ndi.filters.generic_filter(green_smooth,np.mean,footprint=struct)               # Run a mean filter over the image using the disc
 69 | 
 70 | # Threshold using created background
 71 | green_thresh = green_smooth >= bg
 72 | 
 73 | # Clean by morphological hole filling
 74 | green_thresh = ndi.binary_fill_holes(np.logical_not(green_thresh))
 75 | 
 76 | # Show the result
 77 | #    plt.imshow(green_thresh,interpolation='none',cmap='gray')
 78 | #    plt.show()
 79 | 
 80 | 
 81 | #------------------------------------------------------------------------------
 82 | 
 83 | # (SIDE NOTE: WE COULD BE DONE NOW)
 84 | # If the data is very clean and/or we just want a quick look, we could simply
 85 | # label all connected pixels now and consider the result our segmentation.
 86 | 
 87 | # Labeling connected components
 88 | #    green_label = ndi.label(green_thresh)[0]
 89 | #    plt.imshow(green_label,interpolation='none')
 90 | #    plt.show() 
 91 | 
 92 | # However, to also partition the membranes to the cells, to generally improve  
 93 | # the segmentatation (e.g. split cells that end up connected here) and to 
 94 | # handle more complicated morphologies or to deal with lower quality data, 
 95 | # this approach is not sufficient.
 96 | 
 97 | 
 98 | #------------------------------------------------------------------------------
 99 | 
100 | # SEGMENTATION: SEEDING BY DISTANCE TRANSFORM
101 | # More advanced segmentation is usually a combination of seeding and expansion.
102 | # In seeding, we want to find a few pixels for each cell that we can assign to
103 | # said cell with great certainty. These 'seeds' are then expanded to partition
104 | # regions of the image where cell affiliation is less clear-cut.
105 | 
106 | # Distance transform on thresholded membranes
107 | # Advantage of distance transform for seeding: It is quite robust to local 
108 | # "holes" in the membranes.
109 | green_dt= ndi.distance_transform_edt(green_thresh)
110 | #    plt.imshow(green_dt,interpolation='none')
111 | #    plt.show()
112 | 
113 | # Dilating (maximum filter) of distance transform improves results
114 | green_dt = ndi.filters.maximum_filter(green_dt,size=10) 
115 | #    plt.imshow(green_dt,interpolation='none')
116 | #    plt.show()
117 | 
118 | # Retrieve and label the local maxima
119 | from skimage.feature import peak_local_max
120 | green_max = peak_local_max(green_dt,indices=False,min_distance=10)  # Local maximum detection
121 | green_max = ndi.label(green_max)[0]                                 # Labeling
122 | 
123 | # Show maxima as masked overlay
124 | #    plt.imshow(green_smooth,cmap='gray',interpolation='none')
125 | #    plt.imshow(np.ma.array(green_max,mask=green_max==0),interpolation='none') 
126 | #    plt.show()
127 | 
128 | 
129 | #------------------------------------------------------------------------------
130 | 
131 | # SEGMENTATION: EXPANSION BY WATERSHED
132 | # Watershedding is a relatively simple but powerful algorithm for expanding
133 | # seeds. The image intensity is considered as a topographical map (with high
134 | # intensities being "mountains" and low intensities "valleys") and water is
135 | # poured into the valleys from each of the seeds. The water first labels the
136 | # lowest intensity pixels around the seeds, then continues to fill up. The cell
137 | # boundaries are where the waterfronts between different seeds touch.
138 | 
139 | # Get the watershed function and run it
140 | from skimage.morphology import watershed
141 | green_ws = watershed(green_smooth,green_max)
142 | 
143 | # Show result as transparent overlay
144 | # Note: For a better visualization, see "FINDING CELL EDGES" below!
145 | #    plt.imshow(green_smooth,cmap='gray',interpolation='none')
146 | #    plt.imshow(green_ws,interpolation='none',alpha=0.7) 
147 | #    plt.show()
148 | 
149 | # Notice that the previously connected cells are now mostly separated and the
150 | # membranes are partitioned to their respective cells. 
151 | # ...however, we now see a few cases of oversegmentation!
152 | # This is a typical example of the trade-offs one has to face in any 
153 | # computational classification task. 
154 | 
155 | 
156 | #------------------------------------------------------------------------------
157 | 
158 | # POSTPROCESSING: REMOVING CELLS AT THE IMAGE BORDER
159 | # Since segmentation is never perfect, it often makes sense to remove artefacts
160 | # after the segmentation. For example, one could filter out cells that are too
161 | # big, have a strange shape, or strange intensity values. Similarly, supervised 
162 | # machine learning can be used to identify cells of interest based on a 
163 | # combination of various features. Another example of cells that should be 
164 | # removed are those at the image boundary.
165 | 
166 | # Create a mask for the image boundary pixels
167 | boundary_mask = np.ones_like(green_ws)   # Initialize with all ones
168 | boundary_mask[1:-1,1:-1] = 0             # Set middle square to 0
169 | 
170 | # Iterate over all cells in the segmentation
171 | current_label = 1
172 | for cell_id in np.unique(green_ws):
173 |     
174 |     # If the current cell touches the boundary, remove it
175 |     if np.sum((green_ws==cell_id)*boundary_mask) != 0:
176 |         green_ws[green_ws==cell_id] = 0
177 |         
178 |     # This is to keep the labeling continuous, which is cleaner
179 |     else:
180 |         green_ws[green_ws==cell_id] = current_label
181 |         current_label += 1
182 | 
183 | # Show result as transparent overlay
184 | #    plt.imshow(green_smooth,cmap='gray',interpolation='none')
185 | #    plt.imshow(np.ma.array(green_ws,mask=green_ws==0),interpolation='none',alpha=0.7) 
186 | #    plt.show()
187 | 
188 | 
189 | #------------------------------------------------------------------------------
190 | 
191 | # MEASUREMENTS: FINDING CELL EDGES
192 | # Finding cell edges is very useful for many purposes. In our example, edge
193 | # intensities are a measure of membrane intensities, which may be a desired
194 | # readout. The length of the edge (relative to cell size) is also a quite
195 | # informative feature about the cell shape. Finally, showing colored edges is
196 | # a nice way of visualizing segmentations.
197 | 
198 | # How this works: The generic_filter function (see further below) iterates a 
199 | # structure element (in this case a 3x3 square) over an image and passes all
200 | # the values within that element to some arbitrary function (in this case 
201 | # edge_finder). The edge_finder function checks if all these pixels are the 
202 | # same; if they are, the current pixel is not at an edge (return 0), otherwise 
203 | # it is (return 1). generic_filter takes the returned values and organizes them
204 | # into an image again by setting the central pixel of each 3x3 square to the
205 | # respective return value from edge_finder.
206 | 
207 | # Define the edge detection function
208 | def edge_finder(footprint_values):
209 |     if (footprint_values == footprint_values[0]).all():
210 |         return 0
211 |     else:
212 |         return 1
213 |     
214 | # Iterate the edge finder over the segmentation
215 | green_edges = ndi.filters.generic_filter(green_ws,edge_finder,size=3)
216 | 
217 | # Label the detected edges based on the underlying cells
218 | #    green_edges_labeled = green_edges * green_ws
219 | 
220 | # Show them as masked overlay
221 | #    plt.imshow(green_smooth,cmap='gray',interpolation='none')
222 | #    plt.imshow(np.ma.array(green_edges_labeled,mask=green_edges_labeled==0),interpolation='none') 
223 | #    plt.show()
224 | 
225 | 
226 | #------------------------------------------------------------------------------
227 | 
228 | # MEASUREMENTS: SINGLE-CELL READOUTS
229 | # Now that the cells in the image are nicely segmented, we can quantify various
230 | # readouts for every cell individually. Readouts can be based on the intensity
231 | # in the original image, on intensities in other channels or on the size and
232 | # shape of the cells themselves.
233 | 
234 | # Initialize a dict for results of choice
235 | results = {"cell_id":[], "green_mean":[], "red_mean":[],"green_mem_mean":[], 
236 |            "red_mem_mean":[],"cell_size":[],"cell_outline":[]}
237 | 
238 | # Iterate over segmented cells
239 | for cell_id in np.unique(green_ws)[1:]:
240 |     
241 |     # Mask the pixels of the current cell
242 |     cell_mask = green_ws==cell_id    
243 |     
244 |     # Get the current cell's values
245 |     # Note that the original raw data is used for quantification!
246 |     results["cell_id"].append(cell_id)
247 |     results["green_mean"].append(np.mean(img[0,:,:][cell_mask]))
248 |     results["red_mean"].append(np.mean(img[1,:,:][cell_mask]))    
249 |     results["green_mem_mean"].append(np.mean(img[0,:,:][np.logical_and(cell_mask,green_edges)]))    
250 |     results["red_mem_mean"].append(np.mean(img[1,:,:][np.logical_and(cell_mask,green_edges)]))    
251 |     results["cell_size"].append(np.sum(cell_mask))
252 |     results["cell_outline"].append(np.sum(np.logical_and(cell_mask,green_edges)))
253 | 
254 | 
255 | #------------------------------------------------------------------------------
256 | 
257 | # SAVE RESULTS
258 | 
259 | import json
260 | with open(filename[:-4]+'_out.json', 'w') as fp:
261 |     json.dump(results, fp)
262 | 
263 |     
264 | #------------------------------------------------------------------------------
265 | 
266 | 
267 | 
268 |     


--------------------------------------------------------------------------------
/optional_advanced_content/cluster_computation/example_cells_1.tif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/karinsasaki/python-workshop-image-processing/a6de0424bb184e55a769801bbedec1c8ce02dd5a/optional_advanced_content/cluster_computation/example_cells_1.tif


--------------------------------------------------------------------------------
/optional_advanced_content/cluster_computation/example_cells_2.tif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/karinsasaki/python-workshop-image-processing/a6de0424bb184e55a769801bbedec1c8ce02dd5a/optional_advanced_content/cluster_computation/example_cells_2.tif


--------------------------------------------------------------------------------
/optional_advanced_content/cluster_computation/example_cluster.py:
--------------------------------------------------------------------------------
  1 | # -*- coding: utf-8 -*-
  2 | """
  3 | Created on Tue Dec 22 00:12:38 2015
  4 | 
  5 | @author:    Jonas Hartmann @ Gilmour Group @ EMBL Heidelberg
  6 | 
  7 | @descript:  Sometimes a single computer just doesn't cut it; a computer cluster 
  8 |             is required. At EMBL, we have access to a high-performance
  9 |             computation (HPC) cluster with over 4000 CPUs, see:
 10 |                 
 11 |                 https://intranet.embl.de/it_services/services/computing/hpc_cluster/index.html
 12 | 
 13 |             The HPC cluster is used by submitting jobs from a (linux-based) 
 14 |             server, using a queuing system called "LSF". IT offers courses on
 15 |             how this is done, and since many people are using the cluster, it 
 16 |             is good to know what you are doing before trying it yourself, to 
 17 |             avoid causing problems for others.
 18 |             
 19 |             For those who already know about LSF (or plan to learn about it),
 20 |             this is an example of how cluster computation could be handled with
 21 |             Python, using the batch processing pipeline established in the main
 22 |             tutorial. However, instead of submitting to the cluster, this 
 23 |             script creates Python processes on the local machine, making it 
 24 |             more or less equivalent to multi-processing.
 25 |             
 26 |             In principle, cluster handling requires two things: job submission
 27 |             and result collection. Here, the analysis pipeline is submitted 
 28 |             with each image as a job and the resulting segmentations are
 29 |             collected when those jobs finish. Doing this on the clsuter would
 30 |             be slightly more complicated than doing it locally, but those who
 31 |             know about HPC/LSF should be able to figure it out.
 32 |             
 33 |             NOTE 1: This uses a "cleaned" version of the batch pipeline, which
 34 |             takes the input filename from a commandline argument and saves its
 35 |             output into a file.
 36 | 
 37 |             NOTE 2: This code is every so slightly dependent on the operating
 38 |             system used and on the paths of Python and the input files on the
 39 |             system. The current version is written for Windows and lines tagged 
 40 |             as #OS! have to be adjusted for linux (and in some cases also for
 41 |             Windows machines if the paths are different).
 42 | 
 43 | @requires:  Python 2.7
 44 |             NumPy 1.9, SciPy 0.15, scikit-image 0.11.3, matpliotlib 1.5.1
 45 | """
 46 | 
 47 | # IMPORT BASIC MODULES
 48 | 
 49 | from __future__ import division    # Python 2.7 legacy
 50 | import numpy as np                 # Array manipulation package
 51 | import matplotlib.pyplot as plt    # Plotting package
 52 | import json                        # Writing and reading python objects
 53 | 
 54 | 
 55 | # PREPARATION
 56 | 
 57 | # Generate a list of image filenames (just as in the batch tutorial)
 58 | from os import listdir, getcwd
 59 | filelist = listdir(getcwd())
 60 | tiflist = [fname for fname in filelist if fname[-4:]=='.tif']
 61 | 
 62 | 
 63 | # SUBMISSION
 64 | 
 65 | # For each filename, use the commandline to execute the batch pipeline script
 66 | # with that filename as an input.
 67 | from os import system    # Function to run commandline commands
 68 | print "Submitting jobs..."
 69 | for fname in tiflist:
 70 |     
 71 |     system('python batch_cluster.py "'+fname+'"') #OS!
 72 |     
 73 |     # For cluster submission, it would look something like this. However, note
 74 |     # that this is pseudo-code and would have to be adjusted at least slightly!
 75 |     #system("bsub -o out.txt -e error.txt 'python batch_cluster.py "+fname+"'")
 76 | 
 77 | 
 78 | # RESULT COLLECTION
 79 | 
 80 | all_results = [] # Initialize result list
 81 | all_done = []    # This is used to check which images have already been processed
 82 | errors = 0       # This is used to count errors
 83 | 
 84 | # A while-loop to keep looking until all the output files have been retrieved.
 85 | # Note that unexpected errors within the pipeline may cause this to become an 
 86 | # infinite loop; it would be better to implement this in a clean fashion that
 87 | # handles exceptions in the pipelines properly (or at least stops automatically
 88 | # after a certain amount of waiting time).
 89 | while len(all_done) != len(tiflist):
 90 |     
 91 |     # Wait for 30 sec
 92 |     print "Waiting for results..."
 93 |     from time import sleep
 94 |     sleep(30)
 95 |     
 96 |     # Check for output files
 97 |     filelist = listdir(getcwd())
 98 |     outlist = [fname for fname in filelist if '_out.json' in fname]
 99 |     all_done = all_done + outlist
100 |     
101 |     # For each available output file...
102 |     for fname in outlist:
103 |         
104 |         # Load the output data
105 |         with open(fname, 'r') as fp:
106 |             results = json.load(fp)
107 |             
108 |             # Make sure errors are caught...
109 |             if type(results) == str:
110 |                 errors += 1
111 |                 
112 |             # If all is well, add the result to all other results
113 |             else:
114 |                 all_results.append(results)
115 | 
116 |         # Then just delete the file; we don't need it anymore
117 |         system('del "'+fname+'"') #OS!
118 |     
119 |     # Report on progress, then wait for a minute, then try again
120 |     print "Retrieved", len(all_done), "result files of", len(tiflist), "with a total of", errors, "errors!"
121 | 
122 | 
123 | # DOWNSTREAM PROCESSING
124 |   
125 | # See if it worked by printing the short summary  
126 | print "\nSuccessfully analyzed", len(all_results), "of", len(tiflist), "images"
127 | print "Detected", sum([len(resultDict["cell_id"]) for resultDict in all_results]), "cells in total"
128 | 
129 | # See if it worked by showing the scatterplot
130 | colors = plt.cm.jet(np.linspace(0,1,len(all_results))) # To give cells from different images different colors
131 | for image_id,resultDict in enumerate(all_results):
132 |     plt.scatter(resultDict["cell_size"],resultDict["red_mem_mean"],color=colors[image_id])
133 | plt.xlabel("cell size")
134 | plt.ylabel("red_mem_mean")
135 | plt.show()   
136 |     
137 | 


--------------------------------------------------------------------------------
/optional_advanced_content/data_analysis/example_cells_1.tif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/karinsasaki/python-workshop-image-processing/a6de0424bb184e55a769801bbedec1c8ce02dd5a/optional_advanced_content/data_analysis/example_cells_1.tif


--------------------------------------------------------------------------------
/optional_advanced_content/data_analysis/example_cells_1_green.npy:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/karinsasaki/python-workshop-image-processing/a6de0424bb184e55a769801bbedec1c8ce02dd5a/optional_advanced_content/data_analysis/example_cells_1_green.npy


--------------------------------------------------------------------------------
/optional_advanced_content/data_analysis/example_cells_1_segmented.npy:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/karinsasaki/python-workshop-image-processing/a6de0424bb184e55a769801bbedec1c8ce02dd5a/optional_advanced_content/data_analysis/example_cells_1_segmented.npy


--------------------------------------------------------------------------------
/optional_advanced_content/data_analysis/example_data_analysis.py:
--------------------------------------------------------------------------------
  1 | # -*- coding: utf-8 -*-
  2 | """
  3 | Created on Sat Mar 12 16:28:01 2016
  4 | 
  5 | @author:    Jonas Hartmann @ Gilmour Group @ EMBL Heidelberg
  6 | 
  7 | @descript:  A crude introduction on how to pipeline single-cell segmentation 
  8 |             data into downstream analyses such as clustering. 
  9 |             
 10 |             For people with limited experience in data analysis, this script is
 11 |             intended as an inspiration and incentive to think about possible
 12 |             advanced analyses downstream of segmentation. Solving the exercises
 13 |             without help may be difficult, so it may be a good idea to have a 
 14 |             look at the solutions to get some idea of how the problems should
 15 |             be approached. However, once the principles are understood, it is
 16 |             an important part of the learning experience to build one's own 
 17 |             implementation.
 18 |             
 19 |             More experienced people can use this script as a starting point to 
 20 |             exploring the data analysis packages provided for Python. It also 
 21 |             illustrates that Python readily allows the construction of complete
 22 |             and consistent analysis pipelines, from image preprocessing to  
 23 |             feature extraction to clustering (and back). The exercises will be
 24 |             doable, and are intended as an incentive to think about the concepts,
 25 |             also in regards to ones own data.
 26 |             
 27 |             There are a number of machine learning, clustering and other data
 28 |             analysis packages for Python. As a starting point, I recommend you
 29 |             look into the following:
 30 |             
 31 |             - scikit-learn  (scikit-learn.org/stable/)
 32 |             - scipy.cluster (docs.scipy.org/doc/scipy/reference/cluster.html)
 33 |             - networkx      (networkx.github.io/)
 34 |             
 35 |             For people interested in Bayesian methods (not covered here), I
 36 |             recommend the PyMC package (pymc-devs.github.io/pymc/).
 37 | 
 38 | @WARNING:   This exercise and the associated solutions are a BETA! They have
 39 |             been implemented in a limited amount of time and have not been
 40 |             tested extensively. Furthermore, the example data used is rather
 41 |             uniform in regards to many conventional features and thus is not
 42 |             ideal to illustrate clustering. Nevertheless, the principles and
 43 |             packages introduced here should serve as a good inspiration or
 44 |             starting point for further study.
 45 | 
 46 | @requires:  Python 2.7
 47 |             NumPy 1.9, scikit-image 0.11.3, matplotlib 1.5.1
 48 |             SciPy 0.16.0, scikit-learn 0.15.2, networkx 1.9.1
 49 | """
 50 | 
 51 | 
 52 | #------------------------------------------------------------------------------
 53 | 
 54 | ### PREPARATION
 55 | 
 56 | ### Module imports
 57 | from __future__ import division    # Python 2.7 legacy
 58 | import numpy as np                 # Array manipulation package
 59 | import matplotlib.pyplot as plt    # Plotting package
 60 | import scipy.ndimage as ndi        # Multidimensional image operations
 61 | 
 62 | 
 63 | ### Importing image and segmentation data from main tutorial
 64 | 
 65 | # Note: Loading from .npy is faster, so do the following once:
 66 | #filename = 'example_cells_1'
 67 | #import skimage.io as IO
 68 | #img = IO.imread(filename+'.tif')[0,:,:]
 69 | #img = ndi.filters.gaussian_filter(img,3) # Smoothing 
 70 | #np.save(filename+'_green',img)
 71 | 
 72 | # Loading from npy
 73 | filename = 'example_cells_1'
 74 | img = np.load(filename+'_green.npy')
 75 | seg = np.load(filename+'_segmented.npy')
 76 | 
 77 | # Some frequently used variables
 78 | labels = np.unique(seg)[1:]  # Labels of cells in segmentation
 79 | N = len(labels)              # Number of cells in segmentation
 80 | 
 81 | 
 82 | #------------------------------------------------------------------------------
 83 | 
 84 | ### FEATURE EXTRACTION
 85 | # As discussed in the main tutorial, we can measure various quantities for each
 86 | # cell once the cells have been segmented. Any such quantity can be used as a
 87 | # feature to classify or cluster cells. Besides explicitly measured quantities,
 88 | # there are algorithms/packages that measure a whole bunch of features at once.
 89 | 
 90 | # All the extracted features together are called the 'feature space'. Each
 91 | # sample can be considered a point in a space that has as many dimensions as
 92 | # there are features. The feature space should be arranged as an array that
 93 | # has the shape (n_samples,n_features).
 94 | 
 95 | # EXERCISE 1: 
 96 | # Come up with at least 4 different features and measure them for each cell in  
 97 | # the segmentation of the main tutorial.
 98 | 
 99 | # Hint: For many measures of shape and spatial distribution, it is useful to 
100 | #       first calculate the centroid of the segmented object and then think of
101 | #       features relative to it.
102 | 
103 | # Hint: It can be advantageous to use measures that are largely independent of 
104 | #       or normalized for cell size, so this factor does not end up dominating 
105 | #       the features. Cells size itself can be a useful feature, though.
106 | 
107 | # Hint: Don't forget that we detected the membranes of each cell in the main
108 | #       script. Importing this data may be useful for the calculation of
109 | #       various features.
110 | 
111 | # Hint: Make sure you visualize your data!
112 | #       It can be very useful to have a look at what a feature looks like when 
113 | #       mapped to the actual image. This may already show interesting patterns, 
114 | #       or should at least confirm that the extracted value is consistent with 
115 | #       its rationale. For example, one could show a feature as a color-coded
116 | #       semi-transparent overlay over the actual image.
117 | #       Furthermore, box and scatter plots are great options for investigating 
118 | #       how the values of a feature are distributed and how features relate to
119 | #       each other..
120 | 
121 | 
122 | 
123 | 
124 | 
125 | 
126 | 
127 | # EXERCISE 2:
128 | # Find and use a feature extraction algorithm that returns a large feature set
129 | # for each cell. The features could for example be related to shape or texture.
130 | 
131 | 
132 | 
133 | 
134 | 
135 | 
136 | 
137 | 
138 | # Note: You can save and later reload the features spaces you've generated with
139 | # np.save and np.load, so you don't need to run the feature extraction each 
140 | # time you run the script.
141 | 
142 | 
143 | 
144 | #------------------------------------------------------------------------------
145 | 
146 | ### NORMALIZATION AND STANDARDIZATION
147 | # Many classification and clustering algorithms need features to be normalized
148 | # and/or standardized, otherwise the absolute size of the feature could affect 
149 | # the result (for example, you could get a different result if you use cell 
150 | # size in um or in pixels, because the absolute numbers are different).
151 | 
152 | # Normalization in this context generally means scaling your features to each
153 | # range from 0 to 1. Standardization means centering the features around zero
154 | # and scaling them to "unit variance" by dividing by their standard deviation.
155 | # A more elaborate version of this which often provides a good starting point
156 | # is a "whitening transform", which is implemented in the scipy.cluster module.
157 | 
158 | # It's worthwhile to read up on normalization/standardization so you avoid 
159 | # introducing errors/biases. For example, normalization of data with outliers
160 | # will compress the 'real' data into a very small range. Thus, outliers should
161 | # be removed before normalization/standardization.
162 | 
163 | # EXERCISE 3:
164 | # Find a way to remove outliers from your feature space.
165 | 
166 | 
167 | 
168 | 
169 | 
170 | 
171 | 
172 | 
173 | 
174 | # EXERCISE 4: 
175 | # Standardize, normalize and/or whiten your feature space as you deem fit,
176 | # either by transforming the data yourself or using a module function.
177 | 
178 | 
179 | 
180 | 
181 | 
182 | 
183 | 
184 | 
185 | # Note: Don't forget to visualize your data again and compare to the raw data!
186 | 
187 | 
188 | 
189 | #------------------------------------------------------------------------------
190 | 
191 | ### PRINCIPAL COMPONENT ANALYSIS (PCA)
192 | # The principal components of a feature space are the axes of greatest variance
193 | # of the data. By transforming our data to this "relevance-corrected" coordinate
194 | # system, we can achieve two things:
195 | # 1) Usually, most of the variance in a dataset falls onto just a few principal
196 | #    components, so we can ignore the other ones as irrelevant, thus reducing
197 | #    the number of features whilst maintaining all information. This is very
198 | #    useful to facilitate subsequent analyses.
199 | # 2) Just PCA on its own can yield nice results. For example, different cell
200 | #    populations that are not clearly separated by any single feature may 
201 | #    appear separated along a principal component. Furthermore, principal 
202 | #    components may correlate with other features of the data, which can be an
203 | #    interesting result on its own.
204 | 
205 | # EXERCISE 5:
206 | # Perform a PCA on your feature space and investigate the results.
207 | 
208 | # Hint: You may want to use the PCA implementation of scikit-learn
209 | #       Algorithms in sklearn are provided as "estimator objects". The general
210 | #       workflow for using them is to first instantiate the estimator object,
211 | #       passing general parameters, then to fit the estimator to your data and
212 | #       finally to extract various results from the fitted estimator.
213 | 
214 | 
215 | 
216 | 
217 | 
218 | 
219 | 
220 | 
221 | 
222 | 
223 | 
224 | #------------------------------------------------------------------------------
225 | 
226 | ### K-MEANS CLUSTERING
227 | # If you expect that you can split your population into distinct groups, an 
228 | # easy way of doing so in an unsupervised fashion is k-means clustering. 
229 | # K-means partitions samples into clusters based on their proximity to the 
230 | # cluster's mean.
231 | 
232 | # EXERCISE 6:
233 | # Perform k-means clustering on your data. To do so, you have to assume the
234 | # number of clusters. To begin with, just try it with 5 clusters. Try doing the
235 | # PCA for raw, normalized and PCA-transformed data to see the difference. Don't 
236 | # forget to visualize your result.
237 | 
238 | 
239 | 
240 | 
241 | 
242 | 
243 | 
244 | 
245 | 
246 | 
247 | 
248 | # ADDITIONAL EXERCISE:
249 | # Can you think of and implement a simple way of objectively choosing the 
250 | # number of clusters for k-means?
251 | 
252 | 
253 | 
254 | 
255 | 
256 | 
257 | 
258 | 
259 | 
260 | 
261 | 
262 | #------------------------------------------------------------------------------
263 | 
264 | ### tSNE ANALYSIS
265 | # Although PCA is great to reduce and visualize high-dimensional data, it only
266 | # works well on linear relationships and global trends. Therefore, alternative 
267 | # algorithms optimized for non-linear, local relationships have also been
268 | # created.
269 | 
270 | # These algorithms tend to be quite complicated and going into them is beyond 
271 | # the scope of this course. This example is intended as a taste of what is out
272 | # there and to show people who already know about these methods that they are
273 | # implemented in Python. Note that it can be risky to use these algorithms if 
274 | # you do not know what you are doing, so it may make sense to read up and/or to 
275 | # consult with an expert before you do this kind of analysis.
276 | 
277 | # This is not an exercise, just an example for you to study.
278 | # You can find the code in the solutions file.
279 | 
280 | 
281 | 
282 | #------------------------------------------------------------------------------
283 | 
284 | ### GRAPH-BASED ANALYSIS
285 | # Graphs are a universal way of mathematically describing relationships, be 
286 | # they based on similarity, interaction, or virtually anything else. Despite 
287 | # their power, graph-based analyses have so far not been used extensively on 
288 | # biological imaging data, but as microscopes and analysis algorithms improve,
289 | # they become increasingly feasible and will likely become very important in
290 | # the future.
291 | 
292 | # The networkx module provides various functions for importing and generating
293 | # graphs, for operating and analyzing graphs and for exporting and visualizing
294 | # graphs. The following example shows how a simple graph based on our feature  
295 | # space could be built and visualized. In doing so, it introduces the networkx 
296 | # Graph object, which is the core of the networkx module.
297 | 
298 | # This is not an exercise, just an example for you to study.
299 | # You can find the code in the solutions file.
300 | 
301 | 
302 | 
303 | #------------------------------------------------------------------------------
304 | 
305 | ### BONUS: XCKD-STYLE PLOTS
306 | 
307 | # CONGRATULATIONS! You made it to the very end of this debaucherous tutorial,
308 | # so you now get to see what is probably the most fantastic functionality in
309 | # matplotlib: plotting in the style of the xkcd webcomic!
310 | 
311 | # This is not an exercise, just an example for you to study.
312 | # You can find the code in the solutions file.
313 | 
314 | 
315 | 


--------------------------------------------------------------------------------
/pre_tutorial/.ipynb_checkpoints/Short tutorial on functions-checkpoint.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
  7 |     "# This is a short tutorial on defining and using functions/modules\n",
  8 |     "\n",
  9 |     "References:\n",
 10 |     "\n",
 11 |     "http://www.tutorialspoint.com/python/ \n",
 12 |     "\n",
 13 |     "https://github.com/tobyhodges/ITPP\n",
 14 |     "\n",
 15 |     "https://github.com/cmci/HTManalysisCourse/blob/master/CentreCourseProtocol.md#workflow-python-primer\n",
 16 |     "\n",
 17 |     "http://cmci.embl.de/documents/ijcourses"
 18 |    ]
 19 |   },
 20 |   {
 21 |    "cell_type": "markdown",
 22 |    "metadata": {},
 23 |    "source": [
 24 |     "## Defining a Function\n",
 25 |     "\n",
 26 |     "You can define functions to provide the required functionality. Here are simple rules to define a function in Python.\n",
 27 |     "\n",
 28 |     "Function blocks begin with the keyword def followed by the function name and parentheses ( ( ) ).\n",
 29 |     "\n",
 30 |     "Any input parameters or arguments should be placed within these parentheses. You can also define parameters inside these parentheses.\n",
 31 |     "\n",
 32 |     "The first statement of a function can be an optional statement - the documentation string of the function or docstring.\n",
 33 |     "\n",
 34 |     "The code block within every function starts with a colon (:) and is indented.\n",
 35 |     "\n",
 36 |     "The statement return [expression] exits a function, optionally passing back an expression to the caller. A return statement with no arguments is the same as return None."
 37 |    ]
 38 |   },
 39 |   {
 40 |    "cell_type": "markdown",
 41 |    "metadata": {},
 42 |    "source": [
 43 |     "## Syntax\n",
 44 |     "\n",
 45 |     "```def functionname( parameters ):\n",
 46 |     "   \n",
 47 |     "    \"function_docstring\"\n",
 48 |     "    function_suite\n",
 49 |     "    return [expression]```\n",
 50 |     "   \n",
 51 |     "By default, parameters have a positional behavior and you need to inform them in the same order that they were defined."
 52 |    ]
 53 |   },
 54 |   {
 55 |    "cell_type": "markdown",
 56 |    "metadata": {},
 57 |    "source": [
 58 |     "### Example\n",
 59 |     "\n",
 60 |     "The following function takes a string as input parameter and prints it on standard screen."
 61 |    ]
 62 |   },
 63 |   {
 64 |    "cell_type": "code",
 65 |    "execution_count": 2,
 66 |    "metadata": {
 67 |     "collapsed": true
 68 |    },
 69 |    "outputs": [],
 70 |    "source": [
 71 |     "def printme( str ):\n",
 72 |     "   \"This prints a passed string into this function\"\n",
 73 |     "   print str\n",
 74 |     "   return"
 75 |    ]
 76 |   },
 77 |   {
 78 |    "cell_type": "markdown",
 79 |    "metadata": {},
 80 |    "source": [
 81 |     "### Example\n",
 82 |     "\n",
 83 |     "The following function takes two numbers and returns and prints their sum."
 84 |    ]
 85 |   },
 86 |   {
 87 |    "cell_type": "code",
 88 |    "execution_count": 3,
 89 |    "metadata": {
 90 |     "collapsed": true
 91 |    },
 92 |    "outputs": [],
 93 |    "source": [
 94 |     "def addme( a, b ):\n",
 95 |     "   \"This adds passed arguments.\"\n",
 96 |     "   print a+b\n",
 97 |     "   return a+b"
 98 |    ]
 99 |   },
100 |   {
101 |    "cell_type": "markdown",
102 |    "metadata": {},
103 |    "source": [
104 |     "## Calling a Function\n",
105 |     "\n",
106 |     "Defining a function only gives it a name, specifies the parameters that are to be included in the function and structures the blocks of code.\n",
107 |     "\n",
108 |     "Once the basic structure of a function is finalized, you can execute it by calling it from another function or directly from the Python prompt. Following is the example to call printme() function −  "
109 |    ]
110 |   },
111 |   {
112 |    "cell_type": "code",
113 |    "execution_count": 7,
114 |    "metadata": {
115 |     "collapsed": false
116 |    },
117 |    "outputs": [
118 |     {
119 |      "name": "stdout",
120 |      "output_type": "stream",
121 |      "text": [
122 |       "I'm first call to user defined function!\n",
123 |       "Again second call to the same function\n",
124 |       "3\n",
125 |       "3\n"
126 |      ]
127 |     }
128 |    ],
129 |    "source": [
130 |     "printme(\"I'm first call to user defined function!\")\n",
131 |     "printme(\"Again second call to the same function\") \n",
132 |     "print addme(1,2)"
133 |    ]
134 |   },
135 |   {
136 |    "cell_type": "markdown",
137 |    "metadata": {},
138 |    "source": [
139 |     "## Function Arguments\n",
140 |     "\n",
141 |     "You can call a function by using the following types of formal arguments:\n",
142 |     "\n",
143 |     "### Required arguments\n",
144 |     "\n",
145 |     "Required arguments are the arguments passed to a function in correct positional order. Here, the number of arguments in the function call should match exactly with the function definition.\n",
146 |     "\n",
147 |     "To call the function printme(), you definitely need to pass one argument, otherwise it gives a syntax error.\n",
148 |     "\n",
149 |     "```Traceback (most recent call last):\n",
150 |     "File \"test.py\", line 11, in <module>\n",
151 |     "    printme();\n",
152 |     "TypeError: printme() takes exactly 1 argument (0 given)```\n",
153 |     "\n",
154 |     "Similarly, for addme() you need to pass two arguments."
155 |    ]
156 |   },
157 |   {
158 |    "cell_type": "markdown",
159 |    "metadata": {},
160 |    "source": [
161 |     "### Keyword arguments\n",
162 |     "\n",
163 |     "Keyword arguments are related to the function calls. When you use keyword arguments in a function call, the caller identifies the arguments by the parameter name.\n",
164 |     "\n",
165 |     "This allows you to skip arguments or place them out of order because the Python interpreter is able to use the keywords provided to match the values with parameters. \n",
166 |     "\n",
167 |     "For example, the following code:"
168 |    ]
169 |   },
170 |   {
171 |    "cell_type": "code",
172 |    "execution_count": 8,
173 |    "metadata": {
174 |     "collapsed": false
175 |    },
176 |    "outputs": [
177 |     {
178 |      "name": "stdout",
179 |      "output_type": "stream",
180 |      "text": [
181 |       "My string\n"
182 |      ]
183 |     }
184 |    ],
185 |    "source": [
186 |     "printme( str = \"My string\")"
187 |    ]
188 |   },
189 |   {
190 |    "cell_type": "markdown",
191 |    "metadata": {},
192 |    "source": [
193 |     "The following example gives more clear picture. Note that the order of parameters does not matter."
194 |    ]
195 |   },
196 |   {
197 |    "cell_type": "code",
198 |    "execution_count": 9,
199 |    "metadata": {
200 |     "collapsed": false
201 |    },
202 |    "outputs": [
203 |     {
204 |      "name": "stdout",
205 |      "output_type": "stream",
206 |      "text": [
207 |       "Name:  miki\n",
208 |       "Age:  50\n"
209 |      ]
210 |     }
211 |    ],
212 |    "source": [
213 |     "# Function definition is here\n",
214 |     "def printinfo( name, age ):\n",
215 |     "   \"This prints a passed info into this function\"\n",
216 |     "   print \"Name: \", name\n",
217 |     "   print \"Age: \", age\n",
218 |     "   return\n",
219 |     "\n",
220 |     "# Now you can call printinfo function\n",
221 |     "printinfo( age=50, name=\"miki\" )"
222 |    ]
223 |   },
224 |   {
225 |    "cell_type": "markdown",
226 |    "metadata": {},
227 |    "source": [
228 |     "### Default arguments\n",
229 |     "\n",
230 |     "A default argument is an argument that assumes a default value if a value is not provided in the function call for that argument. The following example gives an idea on default arguments, it prints default age if it is not passed −"
231 |    ]
232 |   },
233 |   {
234 |    "cell_type": "code",
235 |    "execution_count": 10,
236 |    "metadata": {
237 |     "collapsed": false
238 |    },
239 |    "outputs": [
240 |     {
241 |      "name": "stdout",
242 |      "output_type": "stream",
243 |      "text": [
244 |       "Name:  miki\n",
245 |       "Age  50\n",
246 |       "Name:  miki\n",
247 |       "Age  35\n"
248 |      ]
249 |     }
250 |    ],
251 |    "source": [
252 |     "# Function definition is here\n",
253 |     "def printinfo( name, age = 35 ):\n",
254 |     "   \"This prints a passed info into this function\"\n",
255 |     "   print \"Name: \", name\n",
256 |     "   print \"Age \", age\n",
257 |     "   return;\n",
258 |     "\n",
259 |     "# Now you can call printinfo function\n",
260 |     "printinfo( age=50, name=\"miki\" )\n",
261 |     "printinfo( name=\"miki\" )"
262 |    ]
263 |   },
264 |   {
265 |    "cell_type": "markdown",
266 |    "metadata": {},
267 |    "source": [
268 |     "### Variable-length arguments\n",
269 |     "\n",
270 |     "You may need to process a function for more arguments than you specified while defining the function. These arguments are called variable-length arguments and are not named in the function definition, unlike required and default arguments.\n",
271 |     "\n",
272 |     "Syntax for a function with non-keyword variable arguments is this −\n",
273 |     "\n",
274 |     "```def functionname([formal_args,] *var_args_tuple ):\n",
275 |     "   \"function_docstring\"\n",
276 |     "   function_suite\n",
277 |     "   return [expression]```\n",
278 |     "\n",
279 |     "An asterisk (*) is placed before the variable name that holds the values of all nonkeyword variable arguments. This tuple remains empty if no additional arguments are specified during the function call. Following is a simple example −"
280 |    ]
281 |   },
282 |   {
283 |    "cell_type": "code",
284 |    "execution_count": 11,
285 |    "metadata": {
286 |     "collapsed": false
287 |    },
288 |    "outputs": [
289 |     {
290 |      "name": "stdout",
291 |      "output_type": "stream",
292 |      "text": [
293 |       "Output is: \n",
294 |       "10\n",
295 |       "Output is: \n",
296 |       "70\n",
297 |       "60\n",
298 |       "50\n"
299 |      ]
300 |     }
301 |    ],
302 |    "source": [
303 |     "# Function definition is here\n",
304 |     "def printinfo( arg1, *vartuple ):\n",
305 |     "   \"This prints a variable passed arguments\"\n",
306 |     "   print \"Output is: \"\n",
307 |     "   print arg1\n",
308 |     "   for var in vartuple:\n",
309 |     "      print var\n",
310 |     "   return;\n",
311 |     "\n",
312 |     "# Now you can call printinfo function\n",
313 |     "printinfo( 10 )\n",
314 |     "printinfo( 70, 60, 50 )"
315 |    ]
316 |   },
317 |   {
318 |    "cell_type": "markdown",
319 |    "metadata": {},
320 |    "source": [
321 |     "### The Anonymous Functions\n",
322 |     "\n",
323 |     "These functions are called anonymous because they are not declared in the standard manner by using the def keyword. You can use the lambda keyword to create small anonymous functions.\n",
324 |     "\n",
325 |     "Lambda forms can take any number of arguments but return just one value in the form of an expression. They cannot contain commands or multiple expressions.\n",
326 |     "\n",
327 |     "An anonymous function cannot be a direct call to print because lambda requires an expression\n",
328 |     "\n",
329 |     "Lambda functions have their own local namespace and cannot access variables other than those in their parameter list and those in the global namespace.\n",
330 |     "\n",
331 |     "Although it appears that lambda's are a one-line version of a function, they are not equivalent to inline statements in C or C++, whose purpose is by passing function stack allocation during invocation for performance reasons."
332 |    ]
333 |   },
334 |   {
335 |    "cell_type": "markdown",
336 |    "metadata": {},
337 |    "source": [
338 |     "#### Syntax\n",
339 |     "\n",
340 |     "The syntax of lambda functions contains only a single statement, which is as follows:\n",
341 |     "\n",
342 |     "lambda [arg1 [,arg2,.....argn]]:expression\n",
343 |     "\n",
344 |     "Following is the example to show how lambda form of function works:"
345 |    ]
346 |   },
347 |   {
348 |    "cell_type": "code",
349 |    "execution_count": 12,
350 |    "metadata": {
351 |     "collapsed": false
352 |    },
353 |    "outputs": [
354 |     {
355 |      "name": "stdout",
356 |      "output_type": "stream",
357 |      "text": [
358 |       "Value of total :  30\n",
359 |       "Value of total :  40\n"
360 |      ]
361 |     }
362 |    ],
363 |    "source": [
364 |     "# Function definition is here\n",
365 |     "sum = lambda arg1, arg2: arg1 + arg2;\n",
366 |     "\n",
367 |     "# Now you can call sum as a function\n",
368 |     "print \"Value of total : \", sum( 10, 20 )\n",
369 |     "print \"Value of total : \", sum( 20, 20 )"
370 |    ]
371 |   },
372 |   {
373 |    "cell_type": "markdown",
374 |    "metadata": {},
375 |    "source": [
376 |     "### The return Statement\n",
377 |     "\n",
378 |     "The statement return [expression] exits a function, optionally passing back an expression to the caller. A return statement with no arguments is the same as return None.\n",
379 |     "\n",
380 |     "All the above examples are not returning any value. You can return a value from a function as follows:"
381 |    ]
382 |   },
383 |   {
384 |    "cell_type": "code",
385 |    "execution_count": 13,
386 |    "metadata": {
387 |     "collapsed": false
388 |    },
389 |    "outputs": [
390 |     {
391 |      "name": "stdout",
392 |      "output_type": "stream",
393 |      "text": [
394 |       "-10\n"
395 |      ]
396 |     }
397 |    ],
398 |    "source": [
399 |     "# Function definition is here\n",
400 |     "def substractme( arg1, arg2 ):\n",
401 |     "   # Substracts the second parameter from the first and returns the result.\"\n",
402 |     "   total = arg1 - arg2\n",
403 |     "   return total;\n",
404 |     "\n",
405 |     "# Now you can call sum function\n",
406 |     "total = substractme( 10, 20 );\n",
407 |     "print total "
408 |    ]
409 |   },
410 |   {
411 |    "cell_type": "code",
412 |    "execution_count": 14,
413 |    "metadata": {
414 |     "collapsed": false
415 |    },
416 |    "outputs": [
417 |     {
418 |      "name": "stdout",
419 |      "output_type": "stream",
420 |      "text": [
421 |       "3\n",
422 |       "-1\n",
423 |       "2\n",
424 |       "3\n",
425 |       "2\n",
426 |       "-1\n"
427 |      ]
428 |     }
429 |    ],
430 |    "source": [
431 |     "# The retuned arguments are also order-specific. So if you have a function:\n",
432 |     "def arithmetic( a, b ):\n",
433 |     "    sumab = a+b\n",
434 |     "    substractab = a-b\n",
435 |     "    multiplyab = a*b\n",
436 |     "    return sumab, substractab, multiplyab\n",
437 |     "    \n",
438 |     "# This\n",
439 |     "c, d, e = arithmetic(1,2)\n",
440 |     "print c\n",
441 |     "print d\n",
442 |     "print e\n",
443 |     "\n",
444 |     "# does not alocate the same values to the varialbes c, d, e as this\n",
445 |     "c, e, d = arithmetic(1,2)\n",
446 |     "print c\n",
447 |     "print d\n",
448 |     "print e"
449 |    ]
450 |   },
451 |   {
452 |    "cell_type": "markdown",
453 |    "metadata": {},
454 |    "source": [
455 |     "### Scope of Variables\n",
456 |     "\n",
457 |     "All variables in a program may not be accessible at all locations in that program. This depends on where you have declared a variable.\n",
458 |     "\n",
459 |     "The scope of a variable determines the portion of the program where you can access a particular identifier. There are two basic scopes of variables in Python:\n",
460 |     "\n",
461 |     "Global variables\n",
462 |     "\n",
463 |     "Local variables\n",
464 |     "\n",
465 |     "#### Global vs. Local variables\n",
466 |     "\n",
467 |     "Variables that are defined inside a function body have a local scope, and those defined outside have a global scope.\n",
468 |     "\n",
469 |     "This means that local variables can be accessed only inside the function in which they are declared, whereas global variables can be accessed throughout the program body by all functions. When you call a function, the variables declared inside it are brought into scope. Following is a simple example −"
470 |    ]
471 |   },
472 |   {
473 |    "cell_type": "code",
474 |    "execution_count": 15,
475 |    "metadata": {
476 |     "collapsed": true
477 |    },
478 |    "outputs": [],
479 |    "source": [
480 |     "total = 0; # This is global variable."
481 |    ]
482 |   },
483 |   {
484 |    "cell_type": "code",
485 |    "execution_count": 17,
486 |    "metadata": {
487 |     "collapsed": false
488 |    },
489 |    "outputs": [
490 |     {
491 |      "name": "stdout",
492 |      "output_type": "stream",
493 |      "text": [
494 |       "Inside the function (local) total:  -10\n",
495 |       "Outside the function (global) total :  0\n"
496 |      ]
497 |     }
498 |    ],
499 |    "source": [
500 |     "# Function definition is here\n",
501 |     "def substractme( arg1, arg2 ):\n",
502 |     "   # Substratcs the second parameter from the first and return them.\"\n",
503 |     "   total = arg1 - arg2; # Here total is a local variable.\n",
504 |     "   print \"Inside the function (local) total: \", total \n",
505 |     "   return total;\n",
506 |     "\n",
507 |     "# Now you can call sum function\n",
508 |     "substractme( 10, 20 );\n",
509 |     "print \"Outside the function (global) total : \", total "
510 |    ]
511 |   },
512 |   {
513 |    "cell_type": "markdown",
514 |    "metadata": {},
515 |    "source": [
516 |     "### Python Modules\n",
517 |     "\n",
518 |     "A module allows you to logically organise your Python code. Grouping related code into a module makes the code easier to understand and use. A module is a Python object with arbitrarily named attributes that you can bind and reference.\n",
519 |     "\n",
520 |     "Simply, a module is a file consisting of Python code. A module can define functions, classes and variables. A module can also include runnable code.\n",
521 |     "\n",
522 |     "For example, in a new file, copy and past the following, making sure you undertand the code. Name the file module_example.py:\n",
523 |     "\n",
524 |     "```def print_func( par ):\n",
525 |     "       print \"Hello : \", par\n",
526 |     "       return```\n",
527 |     "\n",
528 |     "In this example, the module we are intersted in ix modelu_example.py."
529 |    ]
530 |   },
531 |   {
532 |    "cell_type": "markdown",
533 |    "metadata": {},
534 |    "source": [
535 |     "### The import Statement\n",
536 |     "\n",
537 |     "You can use any Python source file as a module by executing an import statement in some other Python source file. The import has the following syntax:\n",
538 |     "\n",
539 |     "```import module1[, module2[,... moduleN]```\n",
540 |     "\n",
541 |     "When the interpreter encounters an import statement, it imports the module if the module is present in the search path. A search path is a list of directories that the interpreter searches before importing a module. For example, to import the module module_example.py, you need to put the following command at the top of the script −"
542 |    ]
543 |   },
544 |   {
545 |    "cell_type": "code",
546 |    "execution_count": 18,
547 |    "metadata": {
548 |     "collapsed": false
549 |    },
550 |    "outputs": [
551 |     {
552 |      "name": "stdout",
553 |      "output_type": "stream",
554 |      "text": [
555 |       "Hello :  Zara\n"
556 |      ]
557 |     }
558 |    ],
559 |    "source": [
560 |     "# Import module support\n",
561 |     "import module_example as modex\n",
562 |     "\n",
563 |     "# Now you can call defined functions of the module, as follows\n",
564 |     "modex.print_func(\"Zara\")"
565 |    ]
566 |   },
567 |   {
568 |    "cell_type": "markdown",
569 |    "metadata": {},
570 |    "source": [
571 |     "A module is loaded only once, regardless of the number of times it is imported. This prevents the module execution from happening over and over again if multiple imports occur."
572 |    ]
573 |   },
574 |   {
575 |    "cell_type": "markdown",
576 |    "metadata": {},
577 |    "source": [
578 |     "### The from...import Statement and The from...import * Statement:\n",
579 |     "\n",
580 |     "Python's from statement lets you import specific attributes from a module into the current namespace. The from...import has the following syntax −\n",
581 |     "\n",
582 |     "#from modname import name1[, name2[, ... nameN]]\n",
583 |     "\n",
584 |     "For example, to import the function fibonacci from the module fib, use the following statement:"
585 |    ]
586 |   },
587 |   {
588 |    "cell_type": "code",
589 |    "execution_count": 20,
590 |    "metadata": {
591 |     "collapsed": true
592 |    },
593 |    "outputs": [],
594 |    "source": [
595 |     "from scipy import sum as s"
596 |    ]
597 |   },
598 |   {
599 |    "cell_type": "markdown",
600 |    "metadata": {},
601 |    "source": [
602 |     "This statement does not import the entire module/package scipy into the current namespace; it just introduces the item sum from the module scipy into the global symbol table of the importing module, and renames it to s.\n",
603 |     "\n",
604 |     "It is also possible to import all names from a module into the current namespace by using the following import statement:"
605 |    ]
606 |   },
607 |   {
608 |    "cell_type": "code",
609 |    "execution_count": 21,
610 |    "metadata": {
611 |     "collapsed": false,
612 |     "scrolled": true
613 |    },
614 |    "outputs": [
615 |     {
616 |      "ename": "ImportError",
617 |      "evalue": "No module named modname",
618 |      "output_type": "error",
619 |      "traceback": [
620 |       "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
621 |       "\u001b[0;31mImportError\u001b[0m                               Traceback (most recent call last)",
622 |       "\u001b[0;32m<ipython-input-21-0edd57cefca1>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0;32mfrom\u001b[0m \u001b[0mmodname\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0;34m*\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
623 |       "\u001b[0;31mImportError\u001b[0m: No module named modname"
624 |      ]
625 |     }
626 |    ],
627 |    "source": [
628 |     "from modname import *"
629 |    ]
630 |   },
631 |   {
632 |    "cell_type": "code",
633 |    "execution_count": null,
634 |    "metadata": {
635 |     "collapsed": true
636 |    },
637 |    "outputs": [],
638 |    "source": []
639 |   }
640 |  ],
641 |  "metadata": {
642 |   "kernelspec": {
643 |    "display_name": "Python 2",
644 |    "language": "python",
645 |    "name": "python2"
646 |   },
647 |   "language_info": {
648 |    "codemirror_mode": {
649 |     "name": "ipython",
650 |     "version": 2
651 |    },
652 |    "file_extension": ".py",
653 |    "mimetype": "text/x-python",
654 |    "name": "python",
655 |    "nbconvert_exporter": "python",
656 |    "pygments_lexer": "ipython2",
657 |    "version": "2.7.11"
658 |   }
659 |  },
660 |  "nbformat": 4,
661 |  "nbformat_minor": 0
662 | }
663 | 


--------------------------------------------------------------------------------
/pre_tutorial/.ipynb_checkpoints/pre_tutorial-checkpoint.ipynb:
--------------------------------------------------------------------------------
 1 | {
 2 |  "cells": [
 3 |   {
 4 |    "cell_type": "markdown",
 5 |    "metadata": {},
 6 |    "source": [
 7 |     "# Reminder of importing modules"
 8 |    ]
 9 |   },
10 |   {
11 |    "cell_type": "markdown",
12 |    "metadata": {},
13 |    "source": [
14 |     "Modules in Python are simply Python files with the .py extension, which  implement a set of functions. The purpose of writting modules is to group  related code together, to make it easier to understand and use, sort of as a  'black box'.\n",
15 |     "\n",
16 |     "Packages are name spaces that contain multiple packages and modules themselves. You can think of them as 'directories'.\n",
17 |     "\n",
18 |     "To be able to use the functions in a particular package or module, you need  to 'import' that module or package. To do so, you need to use the import command. For example, to import the package numpy, which enables the manipulation of arrays, you would do:"
19 |    ]
20 |   },
21 |   {
22 |    "cell_type": "code",
23 |    "execution_count": null,
24 |    "metadata": {
25 |     "collapsed": true
26 |    },
27 |    "outputs": [],
28 |    "source": [
29 |     "import numpy as np                 # Array manipulation package\n",
30 |     "import matplotlib.pyplot as plt    # Plotting package\n",
31 |     "import skimage.io as io            # Image file manipulation module"
32 |    ]
33 |   }
34 |  ],
35 |  "metadata": {
36 |   "kernelspec": {
37 |    "display_name": "Python 2",
38 |    "language": "python",
39 |    "name": "python2"
40 |   },
41 |   "language_info": {
42 |    "codemirror_mode": {
43 |     "name": "ipython",
44 |     "version": 2
45 |    },
46 |    "file_extension": ".py",
47 |    "mimetype": "text/x-python",
48 |    "name": "python",
49 |    "nbconvert_exporter": "python",
50 |    "pygments_lexer": "ipython2",
51 |    "version": "2.7.11"
52 |   }
53 |  },
54 |  "nbformat": 4,
55 |  "nbformat_minor": 0
56 | }
57 | 


--------------------------------------------------------------------------------
/pre_tutorial/.ipynb_checkpoints/tutorial-on-functions-checkpoint.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
  7 |     "# Python functions and modules"
  8 |    ]
  9 |   },
 10 |   {
 11 |    "cell_type": "markdown",
 12 |    "metadata": {},
 13 |    "source": [
 14 |     "## 1. Defining a Function\n",
 15 |     "\n",
 16 |     "You can define functions to provide the required functionality. Here are simple rules to define a function in Python and the syntax:\n",
 17 |     "\n",
 18 |     "- Function blocks begin with the keyword `def` followed by the function name, parentheses ( ( ) ) and a colon (:).\n",
 19 |     "- Any input parameters or arguments should be placed within the parentheses. You can also define parameters inside these parentheses.\n",
 20 |     "- Next is the first statement of the function, which can be an optional statement - the documentation string of the function or docstring.\n",
 21 |     "- The code block is next. It needs to be indented.\n",
 22 |     "- The statement `return [expression]` exits a function and, optionally, passes back an expression to the caller. A return statement with no arguments is the same as `return None`.\n",
 23 |     "\n",
 24 |     "```\n",
 25 |     "def functionname( parameters ):\n",
 26 |     "    \"function_docstring\"\n",
 27 |     "    function_suite\n",
 28 |     "    return [expression]\n",
 29 |     "```"
 30 |    ]
 31 |   },
 32 |   {
 33 |    "cell_type": "markdown",
 34 |    "metadata": {},
 35 |    "source": [
 36 |     "#### Example\n",
 37 |     "\n",
 38 |     "The following function takes a string as input parameter and prints it on standard screen."
 39 |    ]
 40 |   },
 41 |   {
 42 |    "cell_type": "code",
 43 |    "execution_count": 10,
 44 |    "metadata": {
 45 |     "collapsed": true
 46 |    },
 47 |    "outputs": [],
 48 |    "source": [
 49 |     "def printme( str ):\n",
 50 |     "   \"This prints a passed string into this function\"\n",
 51 |     "   print str\n",
 52 |     "   return"
 53 |    ]
 54 |   },
 55 |   {
 56 |    "cell_type": "markdown",
 57 |    "metadata": {},
 58 |    "source": [
 59 |     "#### Example \n",
 60 |     "\n",
 61 |     "The following function takes two numbers and returns and prints their sum."
 62 |    ]
 63 |   },
 64 |   {
 65 |    "cell_type": "code",
 66 |    "execution_count": 11,
 67 |    "metadata": {
 68 |     "collapsed": true
 69 |    },
 70 |    "outputs": [],
 71 |    "source": [
 72 |     "def addme( a, b ):\n",
 73 |     "   \"This adds passed arguments.\"\n",
 74 |     "   return a+b"
 75 |    ]
 76 |   },
 77 |   {
 78 |    "cell_type": "markdown",
 79 |    "metadata": {},
 80 |    "source": [
 81 |     "## 2. Calling a Function\n",
 82 |     "\n",
 83 |     "Defining a function only gives it a name, specifies the parameters that are to be included in the function and structures the blocks of code.\n",
 84 |     "\n",
 85 |     "Once the basic structure of a function is finalized, you can execute it by calling it directly from the Python prompt. Note that by default, parameters have a positional behavior and, if there is more than one, you need to input them in the same order that they were defined. "
 86 |    ]
 87 |   },
 88 |   {
 89 |    "cell_type": "markdown",
 90 |    "metadata": {},
 91 |    "source": [
 92 |     "#### Example"
 93 |    ]
 94 |   },
 95 |   {
 96 |    "cell_type": "code",
 97 |    "execution_count": 12,
 98 |    "metadata": {
 99 |     "collapsed": false
100 |    },
101 |    "outputs": [
102 |     {
103 |      "name": "stdout",
104 |      "output_type": "stream",
105 |      "text": [
106 |       "I'm first call to user defined function!\n",
107 |       "Again second call to the same function\n",
108 |       "1 plus 2 is 3\n"
109 |      ]
110 |     }
111 |    ],
112 |    "source": [
113 |     "printme(\"I'm first call to user defined function!\")\n",
114 |     "printme(\"Again second call to the same function\") \n",
115 |     "print '1 plus 2 is', addme(1,2)"
116 |    ]
117 |   },
118 |   {
119 |    "cell_type": "markdown",
120 |    "metadata": {},
121 |    "source": [
122 |     "## 3. Function Arguments\n",
123 |     "\n",
124 |     "Functions have different types of arguments:\n",
125 |     "\n",
126 |     "#### Required arguments\n",
127 |     "\n",
128 |     "Required arguments are the arguments passed to a function in correct positional order. Here, the number of arguments in the function call should match exactly with the function definition.\n",
129 |     "\n",
130 |     "To call the function `printme()`, you definitely need to pass one argument, otherwise it gives a syntax error.\n",
131 |     "\n",
132 |     "Similarly, for `addme()` you need to pass two arguments."
133 |    ]
134 |   },
135 |   {
136 |    "cell_type": "markdown",
137 |    "metadata": {},
138 |    "source": [
139 |     "#### Keyword arguments\n",
140 |     "\n",
141 |     "Keyword arguments are related to the function calls. When you use keyword arguments in a function call, the caller identifies the arguments by the parameter name.\n",
142 |     "\n",
143 |     "This allows you to skip arguments or place them out of order because the Python interpreter is able to use the keywords provided to match the values with parameters. \n",
144 |     "\n",
145 |     "For example, the following code:"
146 |    ]
147 |   },
148 |   {
149 |    "cell_type": "code",
150 |    "execution_count": 13,
151 |    "metadata": {
152 |     "collapsed": false
153 |    },
154 |    "outputs": [
155 |     {
156 |      "name": "stdout",
157 |      "output_type": "stream",
158 |      "text": [
159 |       "My string\n"
160 |      ]
161 |     }
162 |    ],
163 |    "source": [
164 |     "printme( str = \"My string\")"
165 |    ]
166 |   },
167 |   {
168 |    "cell_type": "markdown",
169 |    "metadata": {},
170 |    "source": [
171 |     "The following example gives a more clear picture. Note that when keyword arguments are used, the order of parameters does not matter."
172 |    ]
173 |   },
174 |   {
175 |    "cell_type": "code",
176 |    "execution_count": 14,
177 |    "metadata": {
178 |     "collapsed": false
179 |    },
180 |    "outputs": [
181 |     {
182 |      "name": "stdout",
183 |      "output_type": "stream",
184 |      "text": [
185 |       "Name:  miki\n",
186 |       "Age:  50\n"
187 |      ]
188 |     }
189 |    ],
190 |    "source": [
191 |     "# Function definition is here\n",
192 |     "def printinfo( name, age ):\n",
193 |     "   \"This prints a passed info into this function\"\n",
194 |     "   print \"Name: \", name\n",
195 |     "   print \"Age: \", age\n",
196 |     "   return\n",
197 |     "\n",
198 |     "# Now you can call printinfo function\n",
199 |     "printinfo( age=50, name=\"miki\" )"
200 |    ]
201 |   },
202 |   {
203 |    "cell_type": "markdown",
204 |    "metadata": {},
205 |    "source": [
206 |     "#### Default arguments\n",
207 |     "\n",
208 |     "A default argument is an argument that assumes a default value if a value is not provided in the function call for that argument. The following example gives an idea on default arguments, it prints default age if it is not passed −"
209 |    ]
210 |   },
211 |   {
212 |    "cell_type": "code",
213 |    "execution_count": 15,
214 |    "metadata": {
215 |     "collapsed": false
216 |    },
217 |    "outputs": [
218 |     {
219 |      "name": "stdout",
220 |      "output_type": "stream",
221 |      "text": [
222 |       "Name:  miki\n",
223 |       "Age  50\n",
224 |       "Name:  miki\n",
225 |       "Age  35\n"
226 |      ]
227 |     }
228 |    ],
229 |    "source": [
230 |     "# Function definition is here\n",
231 |     "def printinfo( name, age = 35 ):\n",
232 |     "   \"This prints a passed info into this function\"\n",
233 |     "   print \"Name: \", name\n",
234 |     "   print \"Age \", age\n",
235 |     "   return;\n",
236 |     "\n",
237 |     "# Now you can call printinfo function\n",
238 |     "printinfo( age=50, name=\"miki\" )\n",
239 |     "printinfo( name=\"miki\" )"
240 |    ]
241 |   },
242 |   {
243 |    "cell_type": "markdown",
244 |    "metadata": {},
245 |    "source": [
246 |     "#### Variable-length arguments\n",
247 |     "\n",
248 |     "You may need to process a function for more arguments than you specified while defining the function. These arguments are called variable-length arguments and are not named in the function definition, unlike required and default arguments.\n",
249 |     "\n",
250 |     "Syntax for a function with non-keyword variable arguments is this −\n",
251 |     "\n",
252 |     "```\n",
253 |     "def functionname([formal_args,] *var_args_tuple ):\n",
254 |     "   \"function_docstring\"\n",
255 |     "   function_suite\n",
256 |     "   return [expression]\n",
257 |     "```\n",
258 |     "\n",
259 |     "An asterisk (*) is placed before the variable name that holds the values of all nonkeyword variable arguments. This tuple remains empty if no additional arguments are specified during the function call. "
260 |    ]
261 |   },
262 |   {
263 |    "cell_type": "code",
264 |    "execution_count": 16,
265 |    "metadata": {
266 |     "collapsed": false
267 |    },
268 |    "outputs": [
269 |     {
270 |      "name": "stdout",
271 |      "output_type": "stream",
272 |      "text": [
273 |       "Output is: \n",
274 |       "10\n",
275 |       "Output is: \n",
276 |       "70\n",
277 |       "60\n",
278 |       "50\n"
279 |      ]
280 |     }
281 |    ],
282 |    "source": [
283 |     "# Function definition is here\n",
284 |     "def printinfo( arg1, *vartuple ):\n",
285 |     "   \"This prints a variable passed arguments\"\n",
286 |     "   print \"Output is: \"\n",
287 |     "   print arg1\n",
288 |     "   for var in vartuple:\n",
289 |     "      print var\n",
290 |     "   return;\n",
291 |     "\n",
292 |     "# Now you can call printinfo function\n",
293 |     "printinfo( 10 )\n",
294 |     "printinfo( 70, 60, 50 )"
295 |    ]
296 |   },
297 |   {
298 |    "cell_type": "markdown",
299 |    "metadata": {},
300 |    "source": [
301 |     "## 4. Anonymous Functions\n",
302 |     "\n",
303 |     "These functions are called anonymous because they are not declared in the standard manner by using the `def` keyword. You can use the `lambda` keyword to create small anonymous functions.\n",
304 |     "\n",
305 |     "Lambda forms can take any number of arguments but return just one value in the form of an expression. They cannot contain commands or multiple expressions."
306 |    ]
307 |   },
308 |   {
309 |    "cell_type": "markdown",
310 |    "metadata": {},
311 |    "source": [
312 |     "#### Syntax\n",
313 |     "\n",
314 |     "The syntax of `lambda` functions contains only a single statement, which is as follows:\n",
315 |     "\n",
316 |     "`lambda [arg1 [,arg2,.....argn]]:expression`"
317 |    ]
318 |   },
319 |   {
320 |    "cell_type": "code",
321 |    "execution_count": 12,
322 |    "metadata": {
323 |     "collapsed": false
324 |    },
325 |    "outputs": [
326 |     {
327 |      "name": "stdout",
328 |      "output_type": "stream",
329 |      "text": [
330 |       "Value of total :  30\n",
331 |       "Value of total :  40\n"
332 |      ]
333 |     }
334 |    ],
335 |    "source": [
336 |     "# Function definition is here - this function has two arguments and it adds them up\n",
337 |     "sum = lambda arg1, arg2: arg1 + arg2;\n",
338 |     "\n",
339 |     "# Now you can call sum as a function\n",
340 |     "print \"Value of total : \", sum( 10, 20 )\n",
341 |     "print \"Value of total : \", sum( 20, 20 )"
342 |    ]
343 |   },
344 |   {
345 |    "cell_type": "markdown",
346 |    "metadata": {},
347 |    "source": [
348 |     "## 5. The return Statement\n",
349 |     "\n",
350 |     "We briefly used the return statement `return [expression]` in the above functions, but let's try to explain it more explicitly: It exits a function, optionally, passing back an expression to the caller. A return statement with no arguments is the same as `return None`."
351 |    ]
352 |   },
353 |   {
354 |    "cell_type": "code",
355 |    "execution_count": 17,
356 |    "metadata": {
357 |     "collapsed": false
358 |    },
359 |    "outputs": [
360 |     {
361 |      "name": "stdout",
362 |      "output_type": "stream",
363 |      "text": [
364 |       "-10\n"
365 |      ]
366 |     }
367 |    ],
368 |    "source": [
369 |     "# This function returns an expression\n",
370 |     "def substractme( arg1, arg2 ):\n",
371 |     "   # Substracts the second parameter from the first and returns the result.\"\n",
372 |     "   total = arg1 - arg2\n",
373 |     "   return total;\n",
374 |     "\n",
375 |     "# Now you can call sum function\n",
376 |     "total = substractme( 10, 20 );\n",
377 |     "print total "
378 |    ]
379 |   },
380 |   {
381 |    "cell_type": "markdown",
382 |    "metadata": {},
383 |    "source": [
384 |     "Warning: The retuned arguments are also order-specific! See below:"
385 |    ]
386 |   },
387 |   {
388 |    "cell_type": "code",
389 |    "execution_count": 18,
390 |    "metadata": {
391 |     "collapsed": false
392 |    },
393 |    "outputs": [
394 |     {
395 |      "name": "stdout",
396 |      "output_type": "stream",
397 |      "text": [
398 |       "3\n",
399 |       "-1\n",
400 |       "2\n",
401 |       "\n",
402 |       "\n",
403 |       "3\n",
404 |       "2\n",
405 |       "-1\n"
406 |      ]
407 |     }
408 |    ],
409 |    "source": [
410 |     "# if you have a function:\n",
411 |     "def arithmetic( a, b ):\n",
412 |     "    sumab = a+b\n",
413 |     "    substractab = a-b\n",
414 |     "    multiplyab = a*b\n",
415 |     "    return sumab, substractab, multiplyab\n",
416 |     "    \n",
417 |     "# This\n",
418 |     "c, d, e = arithmetic(1,2)\n",
419 |     "print c\n",
420 |     "print d\n",
421 |     "print e\n",
422 |     "print '\\n'\n",
423 |     "\n",
424 |     "# does not alocate the same values to the varialbes c, d, e as this\n",
425 |     "c, e, d = arithmetic(1,2)\n",
426 |     "print c\n",
427 |     "print d\n",
428 |     "print e"
429 |    ]
430 |   },
431 |   {
432 |    "cell_type": "markdown",
433 |    "metadata": {},
434 |    "source": [
435 |     "## 5. Scope of Variables\n",
436 |     "\n",
437 |     "All variables in a program may not be accessible at all locations in that program. This depends on where you have declared a variable. The scope of a variable determines the portion of the program where you can access a particular identifier.\n",
438 |     "\n",
439 |     "There are two basic scopes of variables in Python:\n",
440 |     "\n",
441 |     "Global variables\n",
442 |     "\n",
443 |     "Local variables\n",
444 |     "\n",
445 |     "\n",
446 |     "#### Global vs. Local variables\n",
447 |     "\n",
448 |     "Variables that are defined inside a function body have a local scope, and those defined outside have a global scope.\n",
449 |     "\n",
450 |     "This means that local variables can be accessed only inside the function in which they are declared, whereas global variables can be accessed throughout the program body by all functions. When you call a function, the variables declared inside it are brought into scope. Following is a simple example −"
451 |    ]
452 |   },
453 |   {
454 |    "cell_type": "code",
455 |    "execution_count": 15,
456 |    "metadata": {
457 |     "collapsed": true
458 |    },
459 |    "outputs": [],
460 |    "source": [
461 |     "total = 0; # This is global variable."
462 |    ]
463 |   },
464 |   {
465 |    "cell_type": "code",
466 |    "execution_count": 17,
467 |    "metadata": {
468 |     "collapsed": false
469 |    },
470 |    "outputs": [
471 |     {
472 |      "name": "stdout",
473 |      "output_type": "stream",
474 |      "text": [
475 |       "Inside the function (local) total:  -10\n",
476 |       "Outside the function (global) total :  0\n"
477 |      ]
478 |     }
479 |    ],
480 |    "source": [
481 |     "# Function definition is here\n",
482 |     "def substractme( arg1, arg2 ):\n",
483 |     "   # Substracts the second parameter from the first and return them.\"\n",
484 |     "   total = arg1 - arg2; # Here total is a local variable.\n",
485 |     "   print \"Inside the function (local) total: \", total \n",
486 |     "   return total;\n",
487 |     "\n",
488 |     "# Now you can call sum function\n",
489 |     "substractme( 10, 20 );\n",
490 |     "print \"Outside the function (global) total : \", total "
491 |    ]
492 |   },
493 |   {
494 |    "cell_type": "markdown",
495 |    "metadata": {},
496 |    "source": [
497 |     "## 6. Python Modules\n",
498 |     "\n",
499 |     "Simply, a module is a file consisting of Python code. A module can define functions, classes and variables. A module can also include runnable code.\n",
500 |     "\n",
501 |     "A module allows you to logically organise your Python code. Grouping related code into a module makes the code easier to understand and use. \n",
502 |     "\n",
503 |     "For example, in a new file, copy and paste the following, making sure you undertand the code. Name the file module_example.py:\n",
504 |     "\n",
505 |     "```\n",
506 |     "def print_func( par ):\n",
507 |     "       print \"Hello : \", par\n",
508 |     "       return\n",
509 |     "```"
510 |    ]
511 |   },
512 |   {
513 |    "cell_type": "markdown",
514 |    "metadata": {},
515 |    "source": [
516 |     "#### The import Statement\n",
517 |     "\n",
518 |     "You can use any Python source file as a module by executing an import statement in some other Python source file. The import has the following syntax:\n",
519 |     "\n",
520 |     "`import module1[, module2[,... moduleN]`\n",
521 |     "\n",
522 |     "When the interpreter encounters an import statement, it imports the module if the module is present in the search path or the current directory.\n",
523 |     "\n",
524 |     "For example, to import the module module_example.py, you need to use the following command:"
525 |    ]
526 |   },
527 |   {
528 |    "cell_type": "code",
529 |    "execution_count": 18,
530 |    "metadata": {
531 |     "collapsed": false
532 |    },
533 |    "outputs": [
534 |     {
535 |      "name": "stdout",
536 |      "output_type": "stream",
537 |      "text": [
538 |       "Hello :  Zara\n"
539 |      ]
540 |     }
541 |    ],
542 |    "source": [
543 |     "# Import module support\n",
544 |     "import module_example as modex\n",
545 |     "\n",
546 |     "# Now you can call defined functions of the module, as follows\n",
547 |     "modex.print_func(\"Zara\")"
548 |    ]
549 |   },
550 |   {
551 |    "cell_type": "markdown",
552 |    "metadata": {},
553 |    "source": [
554 |     "A module is loaded only once, regardless of the number of times it is imported. This prevents the module execution from happening over and over again if multiple imports occur."
555 |    ]
556 |   },
557 |   {
558 |    "cell_type": "markdown",
559 |    "metadata": {},
560 |    "source": [
561 |     "#### The `from...import` statement and the `from...import *` statement:\n",
562 |     "\n",
563 |     "Python's `from import` statement lets you import specific attributes from a module into the current namespace. It has the following syntaxL\n",
564 |     "\n",
565 |     "`from modname import name1[, name2[, ... nameN]]`\n",
566 |     "\n",
567 |     "For example, to import the function `sum` from the module `scipy`, use the following statement:"
568 |    ]
569 |   },
570 |   {
571 |    "cell_type": "code",
572 |    "execution_count": 19,
573 |    "metadata": {
574 |     "collapsed": true
575 |    },
576 |    "outputs": [],
577 |    "source": [
578 |     "from scipy import sum as s"
579 |    ]
580 |   },
581 |   {
582 |    "cell_type": "markdown",
583 |    "metadata": {},
584 |    "source": [
585 |     "This statement does not import the entire module/package scipy into the current namespace; it just introduces the item `sum` from the module `scipy`. Note that in this example it is renamed to `s`.\n",
586 |     "\n",
587 |     "It is also possible to import all names from a module into the current namespace by using the following import statement:"
588 |    ]
589 |   },
590 |   {
591 |    "cell_type": "code",
592 |    "execution_count": 21,
593 |    "metadata": {
594 |     "collapsed": false,
595 |     "scrolled": true
596 |    },
597 |    "outputs": [],
598 |    "source": [
599 |     "from scipy import *"
600 |    ]
601 |   },
602 |   {
603 |    "cell_type": "markdown",
604 |    "metadata": {},
605 |    "source": [
606 |     "#### References:\n",
607 |     "\n",
608 |     "http://www.tutorialspoint.com/python/ , \n",
609 |     "https://github.com/tobyhodges/ITPP , \n",
610 |     "https://github.com/cmci/HTManalysisCourse/blob/master/CentreCourseProtocol.md#workflow-python-primer , \n",
611 |     "http://cmci.embl.de/documents/ijcourses"
612 |    ]
613 |   },
614 |   {
615 |    "cell_type": "code",
616 |    "execution_count": null,
617 |    "metadata": {
618 |     "collapsed": true
619 |    },
620 |    "outputs": [],
621 |    "source": []
622 |   }
623 |  ],
624 |  "metadata": {
625 |   "kernelspec": {
626 |    "display_name": "Python 2",
627 |    "language": "python",
628 |    "name": "python2"
629 |   },
630 |   "language_info": {
631 |    "codemirror_mode": {
632 |     "name": "ipython",
633 |     "version": 2
634 |    },
635 |    "file_extension": ".py",
636 |    "mimetype": "text/x-python",
637 |    "name": "python",
638 |    "nbconvert_exporter": "python",
639 |    "pygments_lexer": "ipython2",
640 |    "version": "2.7.11"
641 |   }
642 |  },
643 |  "nbformat": 4,
644 |  "nbformat_minor": 0
645 | }
646 | 


--------------------------------------------------------------------------------
/pre_tutorial/README.md:
--------------------------------------------------------------------------------
 1 | ## README Pre-Tutorial on Image Processing and Analysis with Python
 2 | 
 3 | 
 4 | ### DESCRIPTION
 5 | Here you can find a tutorial that introduces basic, but necessary, concepts of digital images and their manipulation with Python. It also has a short tutorial on the Python library NumPy and another on Python functions.
 6 | 
 7 | This course assumes a basic knowledge of the Python Programming Language.
 8 | 
 9 | 
10 | ### REQUIREMENTS
11 | - Python 2.7 (we recommend the Anaconda distribution, which includes most of the required modules)
12 | - Modules: NumPy, SciPy, scikit-image, tifffile
13 | - A text/code editor 
14 | 
15 | 
16 | ### PROGRAMMING CONCEPTS AND CONTENT DISCUSSED IN THIS TUTORIAL
17 | - The Python NumPy library to manipulate images
18 | - Making functions in Python
19 | - Manipulating images with Python
20 | 
21 | 
22 | ### IMAGE PROCESSING CONCEPTS AND CONTENT DISCUSSED IN THIS TUTORIAL
23 | - The numerical and array nature of digital images
24 | - Bit-depth, variable types and data types of images in Python
25 | - Grayness resolution, RGB format and look up tables
26 | - Image arithmetic and unexpected errors due to data type
27 | 
28 | 
29 | 
30 | ### HOW TO FOLLOW THIS TUTORIAL
31 | - Files you should have: ipython notebooks titled `pre-tutorial.ipynb`, `arrays-and-numpy.ipynb` and `tutorial-on-functions.ipyn`.
32 | - To start the ipython notebooks, start a new terminal window and type “jupyter notebook”, then press enter. When a new browser window opens, navigate to the folder where you have saved the tutorials. Click on the tutorial that you want to follow.  
33 | - Using Jupyter notebook: When you want to type code into a cell, simply click on it to activate it. When you want to run the code, press shift+enter. You can learn how to use Jupyter notebooks from, e.g., https://jupyter-notebook-beginner-guide.readthedocs.io/en/latest/
34 | - If you are following this tutorial in class, if you have any questions, raise your hand and someone will come to help you. Otherwise, feel free to send your query to one of these two email addresses:    
35 | jonas.hartmann@embl.de    
36 | karin.sasaki@embl.de
37 | 
38 | 


--------------------------------------------------------------------------------
/pre_tutorial/ext_nuc_AP2_beta_subunit.tif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/karinsasaki/python-workshop-image-processing/a6de0424bb184e55a769801bbedec1c8ce02dd5a/pre_tutorial/ext_nuc_AP2_beta_subunit.tif


--------------------------------------------------------------------------------
/pre_tutorial/figA.jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/karinsasaki/python-workshop-image-processing/a6de0424bb184e55a769801bbedec1c8ce02dd5a/pre_tutorial/figA.jpeg


--------------------------------------------------------------------------------
/pre_tutorial/figC.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/karinsasaki/python-workshop-image-processing/a6de0424bb184e55a769801bbedec1c8ce02dd5a/pre_tutorial/figC.png


--------------------------------------------------------------------------------
/pre_tutorial/figD.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/karinsasaki/python-workshop-image-processing/a6de0424bb184e55a769801bbedec1c8ce02dd5a/pre_tutorial/figD.png


--------------------------------------------------------------------------------
/pre_tutorial/figE.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/karinsasaki/python-workshop-image-processing/a6de0424bb184e55a769801bbedec1c8ce02dd5a/pre_tutorial/figE.png


--------------------------------------------------------------------------------
/pre_tutorial/figF.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/karinsasaki/python-workshop-image-processing/a6de0424bb184e55a769801bbedec1c8ce02dd5a/pre_tutorial/figF.png


--------------------------------------------------------------------------------
/pre_tutorial/module_example.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | 
3 | def print_func( par ):
4 |    print "Hello : ", par
5 |    return


--------------------------------------------------------------------------------
/pre_tutorial/nuclei.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/karinsasaki/python-workshop-image-processing/a6de0424bb184e55a769801bbedec1c8ce02dd5a/pre_tutorial/nuclei.png


--------------------------------------------------------------------------------
/pre_tutorial/randimg.txt:
--------------------------------------------------------------------------------
 1 | 8.900000000000000000e+01 9.700000000000000000e+01 6.600000000000000000e+01 1.420000000000000000e+02 2.100000000000000000e+02 4.300000000000000000e+01 2.270000000000000000e+02 2.340000000000000000e+02 9.100000000000000000e+01 1.470000000000000000e+02
 2 | 1.390000000000000000e+02 2.320000000000000000e+02 2.240000000000000000e+02 9.100000000000000000e+01 1.250000000000000000e+02 2.480000000000000000e+02 1.940000000000000000e+02 1.910000000000000000e+02 1.660000000000000000e+02 1.000000000000000000e+00
 3 | 0.000000000000000000e+00 1.290000000000000000e+02 1.130000000000000000e+02 2.510000000000000000e+02 2.500000000000000000e+02 1.000000000000000000e+02 2.310000000000000000e+02 1.320000000000000000e+02 1.990000000000000000e+02 8.000000000000000000e+00
 4 | 2.200000000000000000e+01 8.700000000000000000e+01 1.750000000000000000e+02 7.000000000000000000e+01 2.450000000000000000e+02 1.520000000000000000e+02 3.200000000000000000e+01 1.780000000000000000e+02 9.600000000000000000e+01 2.700000000000000000e+01
 5 | 7.000000000000000000e+00 7.300000000000000000e+01 5.800000000000000000e+01 9.600000000000000000e+01 2.460000000000000000e+02 1.210000000000000000e+02 2.020000000000000000e+02 2.130000000000000000e+02 2.250000000000000000e+02 1.120000000000000000e+02
 6 | 2.470000000000000000e+02 2.320000000000000000e+02 2.070000000000000000e+02 2.300000000000000000e+01 1.450000000000000000e+02 1.140000000000000000e+02 1.260000000000000000e+02 1.210000000000000000e+02 2.440000000000000000e+02 4.900000000000000000e+01
 7 | 2.010000000000000000e+02 2.600000000000000000e+01 1.930000000000000000e+02 1.850000000000000000e+02 2.030000000000000000e+02 1.130000000000000000e+02 1.830000000000000000e+02 1.020000000000000000e+02 7.400000000000000000e+01 1.130000000000000000e+02
 8 | 2.020000000000000000e+02 3.600000000000000000e+01 1.180000000000000000e+02 2.520000000000000000e+02 1.840000000000000000e+02 6.800000000000000000e+01 2.410000000000000000e+02 1.000000000000000000e+02 2.700000000000000000e+01 1.680000000000000000e+02
 9 | 1.960000000000000000e+02 4.800000000000000000e+01 1.920000000000000000e+02 2.100000000000000000e+01 6.200000000000000000e+01 1.680000000000000000e+02 1.050000000000000000e+02 2.300000000000000000e+01 1.570000000000000000e+02 2.150000000000000000e+02
10 | 8.600000000000000000e+01 7.900000000000000000e+01 1.320000000000000000e+02 2.210000000000000000e+02 1.100000000000000000e+01 1.800000000000000000e+01 1.610000000000000000e+02 2.100000000000000000e+01 1.750000000000000000e+02 8.600000000000000000e+01
11 | 


--------------------------------------------------------------------------------
/pre_tutorial/results.txt:
--------------------------------------------------------------------------------
1 | {"protein": ["AP2", "p150glued"], "intensity": [8.3106078793474918, 9.8909087411511241], "number": [154370, 140631, 95877]}


--------------------------------------------------------------------------------
/pre_tutorial/tutorial-on-functions.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
  7 |     "# Python functions and modules"
  8 |    ]
  9 |   },
 10 |   {
 11 |    "cell_type": "markdown",
 12 |    "metadata": {},
 13 |    "source": [
 14 |     "## 1. Defining a Function\n",
 15 |     "\n",
 16 |     "You can define functions to provide the required functionality. Here are simple rules to define a function in Python and the syntax:\n",
 17 |     "\n",
 18 |     "- Function blocks begin with the keyword `def` followed by the function name, parentheses ( ( ) ) and a colon (:).\n",
 19 |     "- Any input parameters or arguments should be placed within the parentheses. You can also define parameters inside these parentheses.\n",
 20 |     "- Next is the first statement of the function, which can be an optional statement - the documentation string of the function or docstring.\n",
 21 |     "- The code block is next. It needs to be indented.\n",
 22 |     "- The statement `return [expression]` exits a function and, optionally, passes back an expression to the caller. A return statement with no arguments is the same as `return None`.\n",
 23 |     "\n",
 24 |     "```\n",
 25 |     "def functionname( parameters ):\n",
 26 |     "    \"function_docstring\"\n",
 27 |     "    function_suite\n",
 28 |     "    return [expression]\n",
 29 |     "```"
 30 |    ]
 31 |   },
 32 |   {
 33 |    "cell_type": "markdown",
 34 |    "metadata": {},
 35 |    "source": [
 36 |     "#### Example\n",
 37 |     "\n",
 38 |     "The following function takes a string as input parameter and prints it on standard screen."
 39 |    ]
 40 |   },
 41 |   {
 42 |    "cell_type": "code",
 43 |    "execution_count": 10,
 44 |    "metadata": {
 45 |     "collapsed": true
 46 |    },
 47 |    "outputs": [],
 48 |    "source": [
 49 |     "def printme( str ):\n",
 50 |     "   \"This prints a passed string into this function\"\n",
 51 |     "   print str\n",
 52 |     "   return"
 53 |    ]
 54 |   },
 55 |   {
 56 |    "cell_type": "markdown",
 57 |    "metadata": {},
 58 |    "source": [
 59 |     "#### Example \n",
 60 |     "\n",
 61 |     "The following function takes two numbers and returns and prints their sum."
 62 |    ]
 63 |   },
 64 |   {
 65 |    "cell_type": "code",
 66 |    "execution_count": 11,
 67 |    "metadata": {
 68 |     "collapsed": true
 69 |    },
 70 |    "outputs": [],
 71 |    "source": [
 72 |     "def addme( a, b ):\n",
 73 |     "   \"This adds passed arguments.\"\n",
 74 |     "   return a+b"
 75 |    ]
 76 |   },
 77 |   {
 78 |    "cell_type": "markdown",
 79 |    "metadata": {},
 80 |    "source": [
 81 |     "## 2. Calling a Function\n",
 82 |     "\n",
 83 |     "Defining a function only gives it a name, specifies the parameters that are to be included in the function and structures the blocks of code.\n",
 84 |     "\n",
 85 |     "Once the basic structure of a function is finalized, you can execute it by calling it directly from the Python prompt. Note that by default, parameters have a positional behavior and, if there is more than one, you need to input them in the same order that they were defined. "
 86 |    ]
 87 |   },
 88 |   {
 89 |    "cell_type": "markdown",
 90 |    "metadata": {},
 91 |    "source": [
 92 |     "#### Example"
 93 |    ]
 94 |   },
 95 |   {
 96 |    "cell_type": "code",
 97 |    "execution_count": 12,
 98 |    "metadata": {
 99 |     "collapsed": false
100 |    },
101 |    "outputs": [
102 |     {
103 |      "name": "stdout",
104 |      "output_type": "stream",
105 |      "text": [
106 |       "I'm first call to user defined function!\n",
107 |       "Again second call to the same function\n",
108 |       "1 plus 2 is 3\n"
109 |      ]
110 |     }
111 |    ],
112 |    "source": [
113 |     "printme(\"I'm first call to user defined function!\")\n",
114 |     "printme(\"Again second call to the same function\") \n",
115 |     "print '1 plus 2 is', addme(1,2)"
116 |    ]
117 |   },
118 |   {
119 |    "cell_type": "markdown",
120 |    "metadata": {},
121 |    "source": [
122 |     "## 3. Function Arguments\n",
123 |     "\n",
124 |     "Functions have different types of arguments:\n",
125 |     "\n",
126 |     "#### Required arguments\n",
127 |     "\n",
128 |     "Required arguments are the arguments passed to a function in correct positional order. Here, the number of arguments in the function call should match exactly with the function definition.\n",
129 |     "\n",
130 |     "To call the function `printme()`, you definitely need to pass one argument, otherwise it gives a syntax error.\n",
131 |     "\n",
132 |     "Similarly, for `addme()` you need to pass two arguments."
133 |    ]
134 |   },
135 |   {
136 |    "cell_type": "markdown",
137 |    "metadata": {},
138 |    "source": [
139 |     "#### Keyword arguments\n",
140 |     "\n",
141 |     "Keyword arguments are related to the function calls. When you use keyword arguments in a function call, the caller identifies the arguments by the parameter name.\n",
142 |     "\n",
143 |     "This allows you to skip arguments or place them out of order because the Python interpreter is able to use the keywords provided to match the values with parameters. \n",
144 |     "\n",
145 |     "For example, the following code:"
146 |    ]
147 |   },
148 |   {
149 |    "cell_type": "code",
150 |    "execution_count": 13,
151 |    "metadata": {
152 |     "collapsed": false
153 |    },
154 |    "outputs": [
155 |     {
156 |      "name": "stdout",
157 |      "output_type": "stream",
158 |      "text": [
159 |       "My string\n"
160 |      ]
161 |     }
162 |    ],
163 |    "source": [
164 |     "printme( str = \"My string\")"
165 |    ]
166 |   },
167 |   {
168 |    "cell_type": "markdown",
169 |    "metadata": {},
170 |    "source": [
171 |     "The following example gives a more clear picture. Note that when keyword arguments are used, the order of parameters does not matter."
172 |    ]
173 |   },
174 |   {
175 |    "cell_type": "code",
176 |    "execution_count": 14,
177 |    "metadata": {
178 |     "collapsed": false
179 |    },
180 |    "outputs": [
181 |     {
182 |      "name": "stdout",
183 |      "output_type": "stream",
184 |      "text": [
185 |       "Name:  miki\n",
186 |       "Age:  50\n"
187 |      ]
188 |     }
189 |    ],
190 |    "source": [
191 |     "# Function definition is here\n",
192 |     "def printinfo( name, age ):\n",
193 |     "   \"This prints a passed info into this function\"\n",
194 |     "   print \"Name: \", name\n",
195 |     "   print \"Age: \", age\n",
196 |     "   return\n",
197 |     "\n",
198 |     "# Now you can call printinfo function\n",
199 |     "printinfo( age=50, name=\"miki\" )"
200 |    ]
201 |   },
202 |   {
203 |    "cell_type": "markdown",
204 |    "metadata": {},
205 |    "source": [
206 |     "#### Default arguments\n",
207 |     "\n",
208 |     "A default argument is an argument that assumes a default value if a value is not provided in the function call for that argument. The following example gives an idea on default arguments, it prints default age if it is not passed −"
209 |    ]
210 |   },
211 |   {
212 |    "cell_type": "code",
213 |    "execution_count": 15,
214 |    "metadata": {
215 |     "collapsed": false
216 |    },
217 |    "outputs": [
218 |     {
219 |      "name": "stdout",
220 |      "output_type": "stream",
221 |      "text": [
222 |       "Name:  miki\n",
223 |       "Age  50\n",
224 |       "Name:  miki\n",
225 |       "Age  35\n"
226 |      ]
227 |     }
228 |    ],
229 |    "source": [
230 |     "# Function definition is here\n",
231 |     "def printinfo( name, age = 35 ):\n",
232 |     "   \"This prints a passed info into this function\"\n",
233 |     "   print \"Name: \", name\n",
234 |     "   print \"Age \", age\n",
235 |     "   return;\n",
236 |     "\n",
237 |     "# Now you can call printinfo function\n",
238 |     "printinfo( age=50, name=\"miki\" )\n",
239 |     "printinfo( name=\"miki\" )"
240 |    ]
241 |   },
242 |   {
243 |    "cell_type": "markdown",
244 |    "metadata": {},
245 |    "source": [
246 |     "#### Variable-length arguments\n",
247 |     "\n",
248 |     "You may need to process a function for more arguments than you specified while defining the function. These arguments are called variable-length arguments and are not named in the function definition, unlike required and default arguments.\n",
249 |     "\n",
250 |     "Syntax for a function with non-keyword variable arguments is this −\n",
251 |     "\n",
252 |     "```\n",
253 |     "def functionname([formal_args,] *var_args_tuple ):\n",
254 |     "   \"function_docstring\"\n",
255 |     "   function_suite\n",
256 |     "   return [expression]\n",
257 |     "```\n",
258 |     "\n",
259 |     "An asterisk (*) is placed before the variable name that holds the values of all nonkeyword variable arguments. This tuple remains empty if no additional arguments are specified during the function call. "
260 |    ]
261 |   },
262 |   {
263 |    "cell_type": "code",
264 |    "execution_count": 16,
265 |    "metadata": {
266 |     "collapsed": false
267 |    },
268 |    "outputs": [
269 |     {
270 |      "name": "stdout",
271 |      "output_type": "stream",
272 |      "text": [
273 |       "Output is: \n",
274 |       "10\n",
275 |       "Output is: \n",
276 |       "70\n",
277 |       "60\n",
278 |       "50\n"
279 |      ]
280 |     }
281 |    ],
282 |    "source": [
283 |     "# Function definition is here\n",
284 |     "def printinfo( arg1, *vartuple ):\n",
285 |     "   \"This prints a variable passed arguments\"\n",
286 |     "   print \"Output is: \"\n",
287 |     "   print arg1\n",
288 |     "   for var in vartuple:\n",
289 |     "      print var\n",
290 |     "   return;\n",
291 |     "\n",
292 |     "# Now you can call printinfo function\n",
293 |     "printinfo( 10 )\n",
294 |     "printinfo( 70, 60, 50 )"
295 |    ]
296 |   },
297 |   {
298 |    "cell_type": "markdown",
299 |    "metadata": {},
300 |    "source": [
301 |     "## 4. Anonymous Functions\n",
302 |     "\n",
303 |     "These functions are called anonymous because they are not declared in the standard manner by using the `def` keyword. You can use the `lambda` keyword to create small anonymous functions.\n",
304 |     "\n",
305 |     "Lambda forms can take any number of arguments but return just one value in the form of an expression. They cannot contain commands or multiple expressions."
306 |    ]
307 |   },
308 |   {
309 |    "cell_type": "markdown",
310 |    "metadata": {},
311 |    "source": [
312 |     "#### Syntax\n",
313 |     "\n",
314 |     "The syntax of `lambda` functions contains only a single statement, which is as follows:\n",
315 |     "\n",
316 |     "`lambda [arg1 [,arg2,.....argn]]:expression`"
317 |    ]
318 |   },
319 |   {
320 |    "cell_type": "code",
321 |    "execution_count": 12,
322 |    "metadata": {
323 |     "collapsed": false
324 |    },
325 |    "outputs": [
326 |     {
327 |      "name": "stdout",
328 |      "output_type": "stream",
329 |      "text": [
330 |       "Value of total :  30\n",
331 |       "Value of total :  40\n"
332 |      ]
333 |     }
334 |    ],
335 |    "source": [
336 |     "# Function definition is here - this function has two arguments and it adds them up\n",
337 |     "sum = lambda arg1, arg2: arg1 + arg2;\n",
338 |     "\n",
339 |     "# Now you can call sum as a function\n",
340 |     "print \"Value of total : \", sum( 10, 20 )\n",
341 |     "print \"Value of total : \", sum( 20, 20 )"
342 |    ]
343 |   },
344 |   {
345 |    "cell_type": "markdown",
346 |    "metadata": {},
347 |    "source": [
348 |     "## 5. The return Statement\n",
349 |     "\n",
350 |     "We briefly used the return statement `return [expression]` in the above functions, but let's try to explain it more explicitly: It exits a function, optionally, passing back an expression to the caller. A return statement with no arguments is the same as `return None`."
351 |    ]
352 |   },
353 |   {
354 |    "cell_type": "code",
355 |    "execution_count": 17,
356 |    "metadata": {
357 |     "collapsed": false
358 |    },
359 |    "outputs": [
360 |     {
361 |      "name": "stdout",
362 |      "output_type": "stream",
363 |      "text": [
364 |       "-10\n"
365 |      ]
366 |     }
367 |    ],
368 |    "source": [
369 |     "# This function returns an expression\n",
370 |     "def substractme( arg1, arg2 ):\n",
371 |     "   # Substracts the second parameter from the first and returns the result.\"\n",
372 |     "   total = arg1 - arg2\n",
373 |     "   return total;\n",
374 |     "\n",
375 |     "# Now you can call sum function\n",
376 |     "total = substractme( 10, 20 );\n",
377 |     "print total "
378 |    ]
379 |   },
380 |   {
381 |    "cell_type": "markdown",
382 |    "metadata": {},
383 |    "source": [
384 |     "Warning: The retuned arguments are also order-specific! See below:"
385 |    ]
386 |   },
387 |   {
388 |    "cell_type": "code",
389 |    "execution_count": 18,
390 |    "metadata": {
391 |     "collapsed": false
392 |    },
393 |    "outputs": [
394 |     {
395 |      "name": "stdout",
396 |      "output_type": "stream",
397 |      "text": [
398 |       "3\n",
399 |       "-1\n",
400 |       "2\n",
401 |       "\n",
402 |       "\n",
403 |       "3\n",
404 |       "2\n",
405 |       "-1\n"
406 |      ]
407 |     }
408 |    ],
409 |    "source": [
410 |     "# if you have a function:\n",
411 |     "def arithmetic( a, b ):\n",
412 |     "    sumab = a+b\n",
413 |     "    substractab = a-b\n",
414 |     "    multiplyab = a*b\n",
415 |     "    return sumab, substractab, multiplyab\n",
416 |     "    \n",
417 |     "# This\n",
418 |     "c, d, e = arithmetic(1,2)\n",
419 |     "print c\n",
420 |     "print d\n",
421 |     "print e\n",
422 |     "print '\\n'\n",
423 |     "\n",
424 |     "# does not alocate the same values to the varialbes c, d, e as this\n",
425 |     "c, e, d = arithmetic(1,2)\n",
426 |     "print c\n",
427 |     "print d\n",
428 |     "print e"
429 |    ]
430 |   },
431 |   {
432 |    "cell_type": "markdown",
433 |    "metadata": {},
434 |    "source": [
435 |     "## 5. Scope of Variables\n",
436 |     "\n",
437 |     "All variables in a program may not be accessible at all locations in that program. This depends on where you have declared a variable. The scope of a variable determines the portion of the program where you can access a particular identifier.\n",
438 |     "\n",
439 |     "There are two basic scopes of variables in Python:\n",
440 |     "\n",
441 |     "Global variables\n",
442 |     "\n",
443 |     "Local variables\n",
444 |     "\n",
445 |     "\n",
446 |     "#### Global vs. Local variables\n",
447 |     "\n",
448 |     "Variables that are defined inside a function body have a local scope, and those defined outside have a global scope.\n",
449 |     "\n",
450 |     "This means that local variables can be accessed only inside the function in which they are declared, whereas global variables can be accessed throughout the program body by all functions. When you call a function, the variables declared inside it are brought into scope. Following is a simple example −"
451 |    ]
452 |   },
453 |   {
454 |    "cell_type": "code",
455 |    "execution_count": 15,
456 |    "metadata": {
457 |     "collapsed": true
458 |    },
459 |    "outputs": [],
460 |    "source": [
461 |     "total = 0; # This is global variable."
462 |    ]
463 |   },
464 |   {
465 |    "cell_type": "code",
466 |    "execution_count": 17,
467 |    "metadata": {
468 |     "collapsed": false
469 |    },
470 |    "outputs": [
471 |     {
472 |      "name": "stdout",
473 |      "output_type": "stream",
474 |      "text": [
475 |       "Inside the function (local) total:  -10\n",
476 |       "Outside the function (global) total :  0\n"
477 |      ]
478 |     }
479 |    ],
480 |    "source": [
481 |     "# Function definition is here\n",
482 |     "def substractme( arg1, arg2 ):\n",
483 |     "   # Substracts the second parameter from the first and return them.\"\n",
484 |     "   total = arg1 - arg2; # Here total is a local variable.\n",
485 |     "   print \"Inside the function (local) total: \", total \n",
486 |     "   return total;\n",
487 |     "\n",
488 |     "# Now you can call sum function\n",
489 |     "substractme( 10, 20 );\n",
490 |     "print \"Outside the function (global) total : \", total "
491 |    ]
492 |   },
493 |   {
494 |    "cell_type": "markdown",
495 |    "metadata": {},
496 |    "source": [
497 |     "## 6. Python Modules\n",
498 |     "\n",
499 |     "Simply, a module is a file consisting of Python code. A module can define functions, classes and variables. A module can also include runnable code.\n",
500 |     "\n",
501 |     "A module allows you to logically organise your Python code. Grouping related code into a module makes the code easier to understand and use. \n",
502 |     "\n",
503 |     "For example, in a new file, copy and paste the following, making sure you undertand the code. Name the file module_example.py:\n",
504 |     "\n",
505 |     "```\n",
506 |     "def print_func( par ):\n",
507 |     "       print \"Hello : \", par\n",
508 |     "       return\n",
509 |     "```"
510 |    ]
511 |   },
512 |   {
513 |    "cell_type": "markdown",
514 |    "metadata": {},
515 |    "source": [
516 |     "#### The import Statement\n",
517 |     "\n",
518 |     "You can use any Python source file as a module by executing an import statement in some other Python source file. The import has the following syntax:\n",
519 |     "\n",
520 |     "`import module1[, module2[,... moduleN]`\n",
521 |     "\n",
522 |     "When the interpreter encounters an import statement, it imports the module if the module is present in the search path or the current directory.\n",
523 |     "\n",
524 |     "For example, to import the module module_example.py, you need to use the following command:"
525 |    ]
526 |   },
527 |   {
528 |    "cell_type": "code",
529 |    "execution_count": 18,
530 |    "metadata": {
531 |     "collapsed": false
532 |    },
533 |    "outputs": [
534 |     {
535 |      "name": "stdout",
536 |      "output_type": "stream",
537 |      "text": [
538 |       "Hello :  Zara\n"
539 |      ]
540 |     }
541 |    ],
542 |    "source": [
543 |     "# Import module support\n",
544 |     "import module_example as modex\n",
545 |     "\n",
546 |     "# Now you can call defined functions of the module, as follows\n",
547 |     "modex.print_func(\"Zara\")"
548 |    ]
549 |   },
550 |   {
551 |    "cell_type": "markdown",
552 |    "metadata": {},
553 |    "source": [
554 |     "A module is loaded only once, regardless of the number of times it is imported. This prevents the module execution from happening over and over again if multiple imports occur."
555 |    ]
556 |   },
557 |   {
558 |    "cell_type": "markdown",
559 |    "metadata": {},
560 |    "source": [
561 |     "#### The `from...import` statement and the `from...import *` statement:\n",
562 |     "\n",
563 |     "Python's `from import` statement lets you import specific attributes from a module into the current namespace. It has the following syntaxL\n",
564 |     "\n",
565 |     "`from modname import name1[, name2[, ... nameN]]`\n",
566 |     "\n",
567 |     "For example, to import the function `sum` from the module `scipy`, use the following statement:"
568 |    ]
569 |   },
570 |   {
571 |    "cell_type": "code",
572 |    "execution_count": 19,
573 |    "metadata": {
574 |     "collapsed": true
575 |    },
576 |    "outputs": [],
577 |    "source": [
578 |     "from scipy import sum as s"
579 |    ]
580 |   },
581 |   {
582 |    "cell_type": "markdown",
583 |    "metadata": {},
584 |    "source": [
585 |     "This statement does not import the entire module/package scipy into the current namespace; it just introduces the item `sum` from the module `scipy`. Note that in this example it is renamed to `s`.\n",
586 |     "\n",
587 |     "It is also possible to import all names from a module into the current namespace by using the following import statement:"
588 |    ]
589 |   },
590 |   {
591 |    "cell_type": "code",
592 |    "execution_count": 21,
593 |    "metadata": {
594 |     "collapsed": false,
595 |     "scrolled": true
596 |    },
597 |    "outputs": [],
598 |    "source": [
599 |     "from scipy import *"
600 |    ]
601 |   },
602 |   {
603 |    "cell_type": "markdown",
604 |    "metadata": {},
605 |    "source": [
606 |     "#### References:\n",
607 |     "\n",
608 |     "http://www.tutorialspoint.com/python/ , \n",
609 |     "https://github.com/tobyhodges/ITPP , \n",
610 |     "https://github.com/cmci/HTManalysisCourse/blob/master/CentreCourseProtocol.md#workflow-python-primer , \n",
611 |     "http://cmci.embl.de/documents/ijcourses"
612 |    ]
613 |   },
614 |   {
615 |    "cell_type": "code",
616 |    "execution_count": null,
617 |    "metadata": {
618 |     "collapsed": true
619 |    },
620 |    "outputs": [],
621 |    "source": []
622 |   }
623 |  ],
624 |  "metadata": {
625 |   "kernelspec": {
626 |    "display_name": "Python 2",
627 |    "language": "python",
628 |    "name": "python2"
629 |   },
630 |   "language_info": {
631 |    "codemirror_mode": {
632 |     "name": "ipython",
633 |     "version": 2
634 |    },
635 |    "file_extension": ".py",
636 |    "mimetype": "text/x-python",
637 |    "name": "python",
638 |    "nbconvert_exporter": "python",
639 |    "pygments_lexer": "ipython2",
640 |    "version": "2.7.11"
641 |   }
642 |  },
643 |  "nbformat": 4,
644 |  "nbformat_minor": 0
645 | }
646 | 


--------------------------------------------------------------------------------