├── LICENSE
├── README.md
├── code
│   ├── CnnSupervisedClassification.py
│   ├── CnnSupervisedClassification_PyQGIS.py
│   ├── CompileClassificationReport.py
│   └── TrainCNN.py
├── docs
│   ├── _config.yml
│   └── readme.md
└── sample_data
    ├── Train
    │   ├── NASNetMobile_base_50px.h5
    │   ├── SCLS_StMarg26607.tif
    │   ├── SCLS_StMarg26626.tif
    │   ├── SCLS_StMarg26722.tif
    │   ├── SCLS_StMarg26734.tif
    │   ├── SCLS_StMarg26758.tif
    │   ├── SCLS_StMarg26776.tif
    │   ├── SCLS_StMarg26806.tif
    │   ├── SCLS_StMarg26818.tif
    │   ├── SCLS_StMarg26836.tif
    │   ├── SCLS_StMarg27113.tif
    │   ├── StMarg_26607.jpg
    │   ├── StMarg_26626.jpg
    │   ├── StMarg_26722.jpg
    │   ├── StMarg_26734.jpg
    │   ├── StMarg_26758.jpg
    │   ├── StMarg_26776.jpg
    │   ├── StMarg_26806.jpg
    │   ├── StMarg_26818.jpg
    │   ├── StMarg_26836.jpg
    │   └── StMarg_27113.jpg
    ├── Validate
    │   ├── SCLS_StMarg31162.tif
    │   ├── SCLS_StMarg31180.tif
    │   ├── SCLS_StMarg31198.tif
    │   ├── SCLS_StMarg31216.tif
    │   ├── SCLS_StMarg31234.tif
    │   ├── SCLS_StMarg31252.tif
    │   ├── SCLS_StMarg31270.tif
    │   ├── SCLS_StMarg31288.tif
    │   ├── SCLS_StMarg31306.tif
    │   ├── SCLS_StMarg31324.tif
    │   ├── StMarg_31162.jpg
    │   ├── StMarg_31180.jpg
    │   ├── StMarg_31198.jpg
    │   ├── StMarg_31216.jpg
    │   ├── StMarg_31234.jpg
    │   ├── StMarg_31252.jpg
    │   ├── StMarg_31270.jpg
    │   ├── StMarg_31288.jpg
    │   ├── StMarg_31306.jpg
    │   └── StMarg_31324.jpg
    └── readme.md
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 | 
3 | Copyright (c) 2019 James Dietrich
4 | 
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # CNN-Supervised Classification
2 | 
3 | [![DOI](https://zenodo.org/badge/178877929.svg)](https://zenodo.org/badge/latestdoi/178877929)
4 | 
5 | ### Python code for cnn-supervised classification of remotely sensed imagery with deep learning - part of the Deep Riverscapes project
6 | Supervised classification is a workflow in Remote Sensing (RS) whereby a human user draws training (i.e. labelled) areas, generally with a GIS vector polygon, on a RS image. The polygons are then used to extract pixel values and, with the labels, fed into a supervised machine learning algorithm for land-cover classification. The core idea behind *CNN-Supervised* Classification (CSC) is to replace the human user with a pre-trained convolutional neural network (CNN). Once a CNN is trained, CSC starts by running the trained CNN on an image. This results in a tiled image classification.
Then CSC runs a second phase where the CNN-derived tiled classification is reformed into a label raster and used to train and run a shallower machine learning algorithm, but only on the image pixels of that given image, making the result more customised to the specific radiometric properties of the image. The output is a pixel-level classification for land-cover. We have experimented with Random Forests and Multi Layer Perceptrons (MLP) and found that the MLP gives better results. Development of the CSC workflow was done in the context of fluvial remote sensing and aimed at improving the land-cover classification of the type of imagery obtained from drone surveys of river corridors. Our test dataset is compiled from high resolution aerial imagery of 11 rivers. It has 1 billion labelled pixels for training and another 4 billion labelled pixels for validation. If we train 11 CNN models, 1 for each river, then validate these CNN models only with the validation images of their respective rivers, we obtain an overall pixel-weighted F1 score of **94%**. If we train a single CNN with the data from 5 rivers, we find that the resulting CSC workflow can predict classes of the *other* 6 rivers (true out of sample data never seen during CNN training) with an overall pixel-weighted F1 score of **90%**. See citation below.
7 | 
8 | A short video introduction of CNN-Supervised Classification, aimed at a wide non-specialist audience, can be found [here](https://youtu.be/YbY1niHpSHY). Note that the video uses the former name of the method: Self-Supervised Classification.
9 | 
10 | ## Dependencies
11 | * Keras (we use TensorFlow-GPU v1.14 as the backend)
12 | * Scikit-Learn
13 | * Imbalanced-Learn toolbox
14 | * Scikit-Image
15 | * Pandas
16 | 
17 | ## Getting Started
18 | 
19 | ### Disclaimer
20 | This code is currently in the development stage and intended for research purposes. The coding structure is naive and not optimised for production. The process is not yet designed to output class rasters for new unclassified images and expects every image to have an accompanying class raster (i.e. a label image) for either training or for validation.
21 | 
22 | ### Basic Installation
23 | After installing dependencies, the code can be tested with the instructions, data and a NASNet Mobile base model provided in the sample_data folder.
24 | 
25 | ### Model and data download
26 | Once the code functions, users can use the base NASNet Mobile provided and/or download the pre-trained models from the data repository found [here](https://collections.durham.ac.uk/files/r2f1881k904). The NASNet_Models.zip file contains a base model for NASNet Large which can be trained with the imagery and labels provided in the data repository or with new data. NASNet_Models.zip also contains a set of pre-trained NASNet Mobile models which can be used to run 'CnnSupervisedClassification.py' with the 1100+ images provided in the repository and used in the work cited below. Due to file sizes, pre-trained NASNet Large models for all rivers are not provided.
27 | 
28 | ### Data preparation
29 | It is assumed that the data comes in the format that typically results from an airborne survey such as: root_number.jpg. We recommend that the data be structured as: RiverName_Number.jpg. The number must be at least 4 digits (RiverName_0022.jpg), but can be more if necessary (example 5-digit, RiverName_12345.jpg). The associated classification is expected to have the same filename but with a prefix of 'SCLS_' and a tif format (SCLS_RiverName_0022.tif), as sketched below.
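For example, a minimal sketch of this pairing convention (a hypothetical helper, not part of the toolkit):

```
import glob, os

def pair_images_and_labels(folder):
    # pair each RiverName_NNNN.jpg with its SCLS_RiverName_NNNN.tif class raster
    pairs = []
    for img in glob.glob(os.path.join(folder, '*.jpg')):
        label = os.path.join(folder, 'SCLS_' + os.path.basename(img)[:-4] + '.tif')
        if os.path.exists(label):
            pairs.append((img, label))
        else:
            print('WARNING: no class raster found for ' + img)
    return pairs
```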
The default number of land-cover classes in the code and in the label data found on the repository is 5: water, dry sediment, green vegetation, senescent vegetation and paved roads. Users can alter the number of classes for other studies as needed. However, all the code and models function by tiling the input imagery in sub-images of 50x50 pixels.
30 | 
31 | ### CNN Training
32 | Once image data is organised, the script TrainCNN.py can be used to train the NASNet Large or Mobile architectures with pretrained weights as downloaded. User options are at the start. Elements marked 'Path' or 'Empty' need to be edited. Multiple rivers can be included in the same folder; they will be separated based on the River Names included in the image file names (see above). On first running, it is recommended to set the ModelTuning variable to True and run the tuning procedure for the CNN. This will output a figure, and the correct number of tuning epochs can be set as the point where the loss and accuracy of the validation data begin to diverge from the loss and accuracy of the training data. Once this is established, the script must be run again with ModelTuning set to False and the correct value for Tuning. This will save the model with a .h5 extension and it will also save a class key as a small csv file. Once these options are edited in the code, no switches are required, e.g.:
33 | ```
34 | TrainCNN
35 | ```
36 | will work from an IPython console and:
37 | ```
38 | python C:\MyCode\TrainCNN.py
39 | ```
40 | will execute the script from a prompt provided the code path is correct. The easiest option is to use Spyder to edit, save and execute the script directly from the editor (Hotkey: F5). Note that in this case you must be sure that dependencies are correctly installed for use by Spyder. You may need to re-install another version of Spyder in the TensorFlow environment.
41 | 
42 | ### CSC execution
43 | Once a trained CNN model is in place, CSC performance can be evaluated with CnnSupervisedClassification.py. The images to test must follow the same naming convention and all have an existing set of manual labels as used in the CNN training phase above. Again, variables currently set to 'Path' or 'Empty' must be edited in the code. The CSC is currently set to use a Multilayer Perceptron (MLP) to perform the phase 2, pixel-level, classification. In this phase, the CNN classification output for a specific image will be used as training data for that specific image. The script will execute and output performance metrics for each image. csv files with a CNN_ prefix give performance metrics for the CNN model with F1 scores and support (# of pixels) for each class. MLP_ files give the same metrics for the final CSC result after the application of the MLP (see the short example after Figure 1). A 4-part figure will also be output showing the original image, the existing class labels, the CNN classification and the final CSC classification, labelled MLP. Optionally, a class raster can also be saved to disk for each processed image.
44 | ```
45 | CnnSupervisedClassification
46 | ```
47 | will work from an IPython console and:
48 | ```
49 | python C:\MyCode\CnnSupervisedClassification.py
50 | ```
51 | will execute the script from a prompt provided the code path is correct. The easiest option remains to use Spyder to edit, save and execute the script directly from the editor (Hotkey: F5).
52 | 
53 | ![GitHub_StMarg27170](https://user-images.githubusercontent.com/47110600/56954378-8bd66380-6b36-11e9-8396-8ba150c4c4aa.png)
54 | *Figure 1. Sample 4-part output*
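The per-image score files are plain csv tables with class, precision, recall, f1_score and support columns. A short example of inspecting one of them with Pandas (the file name below is hypothetical and follows the CNN_ImageName_Experiment.csv pattern):

```
import pandas as pd

scores = pd.read_csv('CNN_StMarg_31162_RGB_run1.csv')
# pixel-weighted mean F1 over the classes present in the image
weighted_f1 = (scores.f1_score * scores.support).sum() / scores.support.sum()
print(scores)
print('Weighted F1:', round(weighted_f1, 3))
```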
55 | 
56 | IMPORTANT: The CNN-Supervised Classification script will use the specified CNN to classify all the images in the PredictPath folder. Users needing to apply a specific CNN to a specific river dataset should save the imagery from separate rivers in separate folders.
57 | 
58 | ### Report Compilation
59 | The CSC execution will result in 3 files per classified image: separate classification score files for the CNN and MLP stages and an image file showing the input image, the validation data, the CNN classification (used as training data for the next step) and the MLP (or RF) classification. CompileClassificationReport.py can be edited and executed in a similar way and will output a single csv file whose format is intended for use with Pandas and Seaborn for visualisation.
60 | 
61 | 
62 | ![GitHub_SSCample](https://user-images.githubusercontent.com/47110600/56954483-c809c400-6b36-11e9-8d1a-fa19647ba524.png)
63 | *Figure 2. Sample of results as violin plots. Here we show the outputs for the Ouelle river in Canada*
64 | 
65 | ## GIS integration
66 | The script CnnSupervisedClassification_PyQGIS.py uses PyQGIS code to integrate the CSC process with QGIS. It is assumed that this will be used with single, presumably large, orthoimages that are geocoded.
67 | 
68 | ### GIS installation
69 | We recommend using the long term release of QGIS (currently 3.4). Additional Python libraries can be installed in the QGIS Python 3 environment with pip. Start the OSGEO4W shell as an administrator and proceed as follows:
70 | 
71 | 1. type: py3_env (this will pass commands to the Python 3 environment, used in the console)
72 | 2. pip install the same packages as above, include version specifications in the pip command. It is recommended to use the GPU version of tensorflow.
73 | 3. Downgrade the h5py library to version 2.9. This is needed to avoid a version clash (a quick check is sketched after the GIS usage section below).
74 | 4. If using the GPU version of tensorflow, we need to locate CUDA dlls. Open QGIS and open Settings > Options > System.
75 | 5. In Environments, append the CUDA locations to the PATH variable. These should replicate the paths set in Windows during CUDA and cudnn installation. NOTE: this can be a delicate process, any mistake may require a complete re-install of QGIS.
76 | 
77 | ### GIS usage
78 | Add your orthoimage as a raster layer in QGIS. Open the Python console and there open the CnnSupervisedClassification_PyQGIS script. Fill the user parameters on lines 75 to 85 of the script and execute. Geocoded class rasters for both the CNN (prefixed with CLASS_CNN_) and CNN+MLP stage (prefixed with CLASS_CSC_) of CSC will be displayed in QGIS and saved to disk. If you have specified a validation dataset in the form of a raster (line 79), some classification metrics and a confusion matrix will be displayed in the Python console.
79 | 
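After installation, a quick check (an illustrative snippet, not part of the toolkit) can be run from the QGIS Python console to confirm that the pip-installed packages resolve and that h5py is the 2.9 series:

```
import tensorflow, h5py, sklearn, skimage
print('tensorflow', tensorflow.__version__)
print('h5py', h5py.__version__)  # should report 2.9.x to avoid the version clash noted in step 3
```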
80 | 
81 | ## Authors:
82 | - Patrice E. Carbonneau, University of Durham, [e-mail](mailto:patrice.carbonneau@durham.ac.uk)
83 | - Toby P. Breckon, University of Durham
84 | - James T. Dietrich, University of Northern Iowa
85 | - Steven J. Dugdale, University of Nottingham
86 | - Mark A. Fonstad, University of Oregon
87 | - Hitoshi Miyamoto, Shibaura Institute of Technology
88 | - Amy S. Woodget, Loughborough University
89 | 
90 | ## Citation
91 | This work is currently in the process of publication where a full description of parameters will be available. The current best citation is:
92 | 
93 | Carbonneau et al, 2019, Generalised classification of hyperspatial resolution airborne imagery of fluvial scenes with deep convolutional neural networks. Geophysical Research Abstracts, EGU2019-1865, EGU General Assembly 2019.
94 | 
95 | This poster is available [here](https://drive.google.com/drive/folders/14nc600DprwxXdzHvIMdLBLH_xVX8pe30?usp=sharing)
96 | 
97 | 
98 | 
99 | 
--------------------------------------------------------------------------------
/code/CnnSupervisedClassification.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | # -*- coding: utf-8 -*-
3 | #------------------------------------------------------------------------------
4 | __author__ = 'Patrice Carbonneau'
5 | __contact__ = 'patrice.carbonneau@durham.ac.uk'
6 | __copyright__ = '(c) Patrice Carbonneau'
7 | __license__ = 'MIT'
8 | __date__ = '15 APR 2019'
9 | __version__ = '1.1'
10 | __status__ = "initial release"
11 | __url__ = "https://github.com/geojames/Self-Supervised-Classification"
12 | 
13 | 
14 | """
15 | Name: CnnSupervisedClassification.py
16 | Compatibility: Python 3.6
17 | Description: Performs Self-Supervised Image Classification with a
18 | pre-trained Convolutional Neural Network model.
19 | User options are in the first section of code.
20 | 
21 | Requires: keras, numpy, pandas, matplotlib, skimage, sklearn
22 | 
23 | Dev Revisions: JTD - 19/4/19 - Updated file paths, added auto detection of
24 | river names from input images, improved image reading loops
25 | 
26 | Licence: MIT
27 | Permission is hereby granted, free of charge, to any person obtaining a copy of
28 | this software and associated documentation files (the "Software"), to deal in
29 | the Software without restriction, including without limitation the rights to
30 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
31 | of the Software, and to permit persons to whom the Software is furnished to do
32 | so, subject to the following conditions:
33 | 
34 | The above copyright notice and this permission notice shall be included in all
35 | copies or substantial portions of the Software.
36 | 
37 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
38 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
39 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
40 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
41 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
42 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
43 | SOFTWARE.
44 | """ 45 | 46 | ############################################################################### 47 | """ Libraries""" 48 | from keras import regularizers 49 | from keras import optimizers 50 | from keras.models import load_model 51 | import numpy as np 52 | import matplotlib.pyplot as plt 53 | import matplotlib.colors as colors 54 | import matplotlib.patches as mpatches 55 | from skimage import io 56 | import skimage.transform as T 57 | import pandas as pd 58 | from keras.models import Sequential 59 | from keras.layers import Dense, Dropout, BatchNormalization 60 | from keras.wrappers.scikit_learn import KerasClassifier 61 | from sklearn.preprocessing import StandardScaler 62 | from skimage.filters.rank import median, entropy, modal 63 | import os.path 64 | from sklearn import metrics 65 | from skimage.morphology import disk 66 | from imblearn.combine import SMOTEENN 67 | import copy 68 | import sys 69 | from IPython import get_ipython #this can be removed if not using Spyder 70 | import glob 71 | 72 | 73 | 74 | ############################################################# 75 | """User data input. Fill in the info below before running""" 76 | ############################################################# 77 | 78 | ModelName = 'Empty' #should be the model name from previous run of TrainCNN.py 79 | TrainPath = 'Empty' 80 | PredictPath = 'Empty' #Location of the images 81 | ScorePath = 'Empty' #location of the output files and the model 82 | Experiment = 'Empty' #ID to append to output performance files 83 | 84 | '''BASIC PARAMETER CHOICES''' 85 | UseSmote = False #Turn SMOTE-ENN resampling on and off 86 | MLP = True #If false, the phase 2 class will be done with a random forest 87 | TrainingEpochs = 35 #Typically this can be reduced 88 | Ndims = 3 # Feature Dimensions. 3 if just RGB, 4 will add a co-occurence entropy on 11x11 pixels. There is NO evidence showing that this actually improves the outcomes. RGB is recommended. 89 | SubSample = 1 #0-1 percentage of the CNN output to use in the MLP. 1 gives the published results. 90 | NClasses = 5 #The number of classes in the data. This MUST be the same as the classes used to retrain the model 91 | SaveClassRaster = False #If true this will save each class image to disk. Outputs are not geocoded in this script. For GIS integration, see CnnSupervisedClassification_PyQGIS.py 92 | DisplayHoldout = False #Display the results figure which is saved to disk. 93 | OutDPI = 150 #Recommended 150 for inspection 1200 for papers. 94 | 95 | '''FILTERING OPTIONS''' 96 | #These parameters offer extra options to smooth the classification outputs. By default they are set 97 | #to be inactive. They are not used or discussed in the paper and poster associated to this code. 98 | MinTiles = 0 #The minimum number of contiguous tiles to consider as a significnat element in the image. 99 | RecogThresh = 0 #minimum prob of the top-1 predictions to keep 100 | SmallestElement = 0 # Despeckle the classification to the smallest length in pixels of element remaining, just enter linear units (e.g. 3 for 3X3 pixels) 101 | 102 | 103 | '''MODEL PARAMETERS''' #These would usually not be edited 104 | DropRate = 0.5 105 | ModelChoice = 2 # 2 for deep model and 3 for very deep model 106 | LearningRate = 0.001 107 | Chatty = 1 # set the verbosity of the model training. 
115 | # Path checks- checks for folder ending slash, adds if necessary
116 | 
117 | if not PredictPath.endswith('/') and not PredictPath.endswith('\\'):
118 |     PredictPath = PredictPath + '/'
119 | 
120 | if not ScorePath.endswith('/') and not ScorePath.endswith('\\'):
121 |     ScorePath = ScorePath +'/'
122 | 
123 | # create Score Directory if not present
124 | if os.path.exists(ScorePath) == False:
125 |     os.mkdir(ScorePath)
126 | 
127 | #####################################################################################################################
128 | 
129 | 
130 | ##################################################################
131 | """ HELPER FUNCTIONS SECTION"""
132 | ##################################################################
133 | # Helper function to crop images to have an integer number of tiles. No padding is used.
134 | def CropToTile (Im, size):
135 |     if len(Im.shape) == 2:#handle greyscale
136 |         Im = Im.reshape(Im.shape[0], Im.shape[1],1)
137 | 
138 |     crop_dim0 = size * (Im.shape[0]//size)
139 |     crop_dim1 = size * (Im.shape[1]//size)
140 |     return Im[0:crop_dim0, 0:crop_dim1, :]
141 | 
142 | 
143 | #Helper functions to move images in and out of tensor format
144 | def split_image_to_tiles(im, size):
145 | 
146 |     if len(im.shape) ==2:
147 |         h, w = im.shape
148 |         d = 1
149 |     else:
150 |         h, w, d = im.shape
151 | 
152 | 
153 |     nTiles_height = h//size
154 |     nTiles_width = w//size
155 |     TileTensor = np.zeros((nTiles_height*nTiles_width, size,size,d))
156 |     B=0
157 |     for y in range(0, nTiles_height):
158 |         for x in range(0, nTiles_width):
159 |             x1 = np.int32(x * size)
160 |             y1 = np.int32(y * size)
161 |             x2 = np.int32(x1 + size)
162 |             y2 = np.int32(y1 + size)
163 |             TileTensor[B,:,:,:] = im[y1:y2,x1:x2].reshape(size,size,d)
164 |             B+=1
165 | 
166 |     return TileTensor
167 | 
168 | #Create the label vector
169 | def PrepareTensorData(ImageTile, ClassTile, size):
170 |     #this takes the image tile tensor and the class tile tensor
171 |     #It produces a label vector from the tiles which have 90% of a pure class
172 |     #It then extracts the image tiles that have a classification value in the labels
173 |     LabelVector = np.zeros(ClassTile.shape[0])
174 | 
175 |     for v in range(0,ClassTile.shape[0]):
176 |         Tile = ClassTile[v,:,:,0]
177 |         vals, counts = np.unique(Tile, return_counts = True)
178 |         if (vals[0] == 0) and (counts[0] > 0.1 * size**2):
179 |             LabelVector[v] = 0
180 |         elif counts[np.argmax(counts)] >= 0.9 * size**2:
181 |             LabelVector[v] = vals[np.argmax(counts)]
182 | 
183 |     LabelVector = LabelVector[LabelVector > 0]
184 |     ClassifiedTiles = np.zeros((np.count_nonzero(LabelVector), size,size,3))
185 |     C = 0
186 |     for t in range(0,np.count_nonzero(LabelVector)):
187 |         if LabelVector[t] > 0:
188 |             ClassifiedTiles[C,:,:,:] = ImageTile[t,:,:,:]
189 |             C += 1
190 |     return LabelVector, ClassifiedTiles
191 | 
192 | 
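#For reference: an equivalent, vectorised form of split_image_to_tiles above
#(a sketch; it is not called anywhere in this script). It relies only on numpy
#reshape/transpose and returns tiles in the same row-major order as the loop version.
def split_image_to_tiles_fast(im, size):
    if len(im.shape) == 2:
        im = im.reshape(im.shape[0], im.shape[1], 1)
    h, w, d = im.shape
    im = im[0:size * (h//size), 0:size * (w//size), :] #crop to an integer number of tiles
    Tiles = im.reshape(h//size, size, w//size, size, d)
    return np.transpose(Tiles, (0, 2, 1, 3, 4)).reshape(-1, size, size, d)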
193 | #############################
194 | def class_prediction_to_image(im, PredictedTiles, size):
195 | 
196 |     if len(im.shape) ==2:
197 |         h, w = im.shape
198 |         d = 1
199 |     else:
200 |         h, w, d = im.shape
201 | 
202 | 
203 |     nTiles_height = h//size
204 |     nTiles_width = w//size
205 |     #TileTensor = np.zeros((nTiles_height*nTiles_width, size,size,d))
206 |     TileImage = np.zeros(im.shape)
207 |     B=0
208 |     for y in range(0, nTiles_height):
209 |         for x in range(0, nTiles_width):
210 |             x1 = np.int32(x * size)
211 |             y1 = np.int32(y * size)
212 |             x2 = np.int32(x1 + size)
213 |             y2 = np.int32(y1 + size)
214 |             #TileTensor[B,:,:,:] = im[y1:y2,x1:x2].reshape(size,size,d)
215 |             TileImage[y1:y2,x1:x2] = np.argmax(PredictedTiles[B,:])
216 |             B+=1
217 | 
218 |     return TileImage
219 | 
220 | # This is a helper function to repeat a filter on 3 colour bands. Avoids an extra loop in the big loops below
221 | def ColourFilter(Image):
222 |     med = np.zeros(np.shape(Image))
223 |     for b in range (0,3):
224 |         img = Image[:,:,b]
225 |         med[:,:,b] = median(img, disk(5))
226 |     return med
227 | 
228 | 
229 | ##################################################################
230 | #Save classification reports to csv with Pandas
231 | def classification_report_csv(report, filename):
232 |     report_data = []
233 |     lines = report.split('\n')
234 |     for line in lines[2:-5]:
235 |         row = {}
236 |         row_data = line.split(' ')
237 |         row_data = list(filter(None, row_data))
238 |         row['class'] = row_data[0]
239 |         row['precision'] = float(row_data[1])
240 |         row['recall'] = float(row_data[2])
241 |         row['f1_score'] = float(row_data[3])
242 |         row['support'] = float(row_data[4])
243 |         report_data.append(row)
244 |     dataframe = pd.DataFrame.from_dict(report_data)
245 |     dataframe.to_csv(filename, index = False)
246 | 
247 | 
248 | ###############################################################################
249 | # Return a class prediction to the 1-Nclasses hierarchical classes
250 | def SimplifyClass(ClassImage, ClassKey):
251 |     Iclasses = np.unique(ClassImage)
252 |     for c in range(0, len(Iclasses)):
253 |         KeyIndex = ClassKey.loc[ClassKey['LocalClass'] == Iclasses[c]]
254 |         Hclass = KeyIndex.iloc[0]['HierarchClass']
255 |         ClassImage[ClassImage == Iclasses[c]] = Hclass
256 |     return ClassImage
257 | 
258 | 
259 | 
260 | ##########################################
261 | #fetches the overall avg F1 score from a classification report
262 | def GetF1(report):
263 |     lines = report.split('\n')
264 |     for line in lines[0:-1]:
265 |         if 'weighted' in line:
266 |             dat = line.split(' ')
267 | 
268 |     return dat[17] #NB: relies on the fixed column spacing of sklearn's report string
269 | 
270 | ##############################################################################
271 | """Instantiate the MLP pixel-based classifier""" 
272 | 
273 | 
274 | # define the deep model with L2 regularization and dropout
275 | def deep_model_L2D():
276 |     # create model
277 |     model = Sequential()
278 |     model.add(Dense(256, kernel_regularizer= regularizers.l2(0.001), input_dim=Ndims, kernel_initializer='normal', activation='relu'))
279 |     model.add(Dropout(0.5))
280 |     model.add(Dense(128, kernel_regularizer= regularizers.l2(0.001), kernel_initializer='normal', activation='relu'))
281 |     model.add(Dropout(0.5))
282 |     model.add(Dense(64, kernel_regularizer= regularizers.l2(0.001), kernel_initializer='normal', activation='relu'))
283 |     model.add(Dropout(0.5))
284 |     model.add(Dense(32, kernel_regularizer= regularizers.l2(0.001), kernel_initializer='normal', activation='relu'))
285 |     #model.add(Dropout(0.5))
286 |     model.add(Dense(32, kernel_regularizer= regularizers.l2(0.001), kernel_initializer='normal', activation='relu'))
287 |     #model.add(Dropout(0.5))
288 |     model.add(Dense(NClasses, kernel_initializer='normal', activation='softmax'))
289 | 
290 |     #Tune an optimiser
291 |     Optim = optimizers.Adam(lr=LearningRate, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0, amsgrad=True)
292 | 
293 |     # Compile model
294 |     model.compile(loss='sparse_categorical_crossentropy', optimizer=Optim, metrics = ['accuracy'])
295 |     return model
296 | 
297 | # define the very deep model with L2 regularization and dropout
298 | def very_deep_model_L2D():
299 |     # create 
model 300 | model = Sequential() 301 | model.add(Dense(512, kernel_regularizer= regularizers.l2(0.001), input_dim=Ndims, kernel_initializer='normal', activation='relu')) 302 | model.add(Dense(256, kernel_regularizer= regularizers.l2(0.001), kernel_initializer='normal', activation='relu')) 303 | model.add(Dropout(DropRate)) 304 | model.add(Dense(128, kernel_regularizer= regularizers.l2(0.001), kernel_initializer='normal', activation='relu')) 305 | model.add(Dense(128, kernel_regularizer= regularizers.l2(0.001), kernel_initializer='normal', activation='relu')) 306 | model.add(BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001, center=True, scale=True, beta_initializer='zeros', gamma_initializer='ones', moving_mean_initializer='zeros', moving_variance_initializer='ones', beta_regularizer=None, gamma_regularizer=None, beta_constraint=None, gamma_constraint=None)) 307 | model.add(Dense(64, kernel_regularizer= regularizers.l2(0.001), kernel_initializer='normal', activation='relu')) 308 | model.add(Dense(32, kernel_regularizer= regularizers.l2(0.001), kernel_initializer='normal', activation='relu')) 309 | model.add(Dropout(DropRate)) 310 | model.add(Dense(32, kernel_regularizer= regularizers.l2(0.001), kernel_initializer='normal', activation='relu')) 311 | model.add(Dropout(DropRate)) 312 | model.add(Dense(NClasses, kernel_initializer='normal', activation='softmax')) 313 | 314 | #Tune an optimiser 315 | Optim = optimizers.Adam(lr=LearningRate, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0, amsgrad=True) 316 | 317 | # Compile model 318 | model.compile(loss='sparse_categorical_crossentropy', optimizer=Optim, metrics = ['accuracy']) 319 | return model 320 | 321 | 322 | 323 | # Instantiate and fit a MLP model estimator from choices above 324 | if ModelChoice == 1: 325 | print("Don't waste your time in shallow learning...") 326 | 327 | elif ModelChoice == 2: 328 | EstimatorNN = KerasClassifier(build_fn=deep_model_L2D, epochs=TrainingEpochs, batch_size=250000, verbose=Chatty) 329 | 330 | elif ModelChoice == 3: 331 | EstimatorNN = KerasClassifier(build_fn=very_deep_model_L2D, epochs=TrainingEpochs, batch_size=250000, verbose=Chatty) 332 | 333 | else: 334 | sys.exit("Invalid Model Choice") 335 | 336 | 337 | 338 | 339 | ############################################################################### 340 | """Load the convnet model""" 341 | #print('Loading re-trained convnet model produced by a run of TrainCNN.py') 342 | print('Loading ' + ModelName + '.h5') 343 | FullModelPath = TrainPath + ModelName + '.h5' 344 | ConvNetmodel = load_model(FullModelPath) 345 | 346 | ClassKeyPath = TrainPath + ModelName + '.csv' 347 | ClassKey = pd.read_csv(ClassKeyPath) 348 | 349 | ############################################################################### 350 | """Classify the holdout images with CNN-Supervised Classification""" 351 | size = 50 #Do not edit. The base models supplied all assume a tile size of 50. 
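#NB: the class key loaded above is the small csv saved by TrainCNN.py alongside
#the .h5 model. SimplifyClass() expects it to hold 'LocalClass' and
#'HierarchClass' columns that map raw CNN output labels to the hierarchical
#land-cover classes. To inspect it: print(ClassKey.head())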
352 | 
353 | # Getting River Names from the files
354 | # Glob list of all jpg images, get unique names from the total list
355 | img = glob.glob(PredictPath+"*.jpg")
356 | TestRiverTuple = []
357 | for im in img:
358 |     TestRiverTuple.append(os.path.basename(im).partition('_')[0])
359 | TestRiverTuple = np.unique(TestRiverTuple)
360 | 
361 | # Get training class images (covers tif and tiff file types)
362 | class_img = glob.glob(PredictPath + "SCLS_*.tif*")
363 | 
364 | 
365 | for f,riv in enumerate(TestRiverTuple):
366 |     for i,im in enumerate(img):
367 |         print('CNN-supervised classification of ' + os.path.basename(im))
368 |         Im3D = io.imread(im)
369 |         #print(isinstance(Im3D,uint8))
370 |         if len(Im3D) == 2: #in case the image loads as a 2-frame stack, keep the first frame
371 |             Im3D = Im3D[0]
372 |         Class = io.imread(class_img[i], as_gray=True)
373 |         if (Class.shape[0] != Im3D.shape[0]) or (Class.shape[1] != Im3D.shape[1]):
374 |             print('WARNING: inconsistent image and class mask sizes for ' + im)
375 |             Class = T.resize(Class, (Im3D.shape[0], Im3D.shape[1]), preserve_range = True) #bug handling for vector
376 |         ClassIm = copy.deepcopy(Class)
377 |         #Tile the images to run the convnet
378 |         ImCrop = CropToTile (Im3D, size)
379 |         I_tiles = split_image_to_tiles(ImCrop, size)
380 |         I_tiles = np.int16(I_tiles) / 255
381 |         #Apply the convnet
382 |         print('Detecting CNN-supervised training areas')
383 |         PredictedTiles = ConvNetmodel.predict(I_tiles, batch_size = 32, verbose = Chatty)
384 |         #Convert the convnet one-hot predictions to a new class label image
385 |         PredictedTiles[PredictedTiles < RecogThresh] = 0
386 |         PredictedClass = class_prediction_to_image(Class, PredictedTiles, size)
387 |         PredictedClass = SimplifyClass(PredictedClass, ClassKey)
388 |         #Set classes to 0 if they do not have MinTiles detected by the CNN
389 | 
390 |         #for c in range(0,NClasses+1):
391 |         #    count = np.sum(PredictedClass.reshape(-1,1) == c)
392 |         #    if count <= MinTiles*size*size:
393 |         #        PredictedClass[PredictedClass == c] = 0
394 |         if MinTiles > 0:
395 |             PredictedClass = modal(np.uint8(PredictedClass), np.ones((2*MinTiles*size+1,2*MinTiles*size+1)))
396 |             #PredictedClass = modal(np.uint8(PredictedClass), disk(2*(MinTiles*size*size)+1))
397 | 
398 |         #Prep the pixel data from the cropped image and new class label image
399 | 
400 |         if Ndims == 4:
401 |             print('Processing Entropy and Median filter')
402 |             Entropy = entropy(Im3D[:,:,0], selem = np.ones([11,11]), shift_x = 3, shift_y = 0)
403 |         else:
404 |             print('Processing Median filter')
405 |         MedImage = ColourFilter(Im3D) #Median filter on all 3 bands
406 |         r = MedImage[:,:,0]#Split the bands
407 |         g = MedImage[:,:,1]
408 |         b = MedImage[:,:,2]
409 |         #Vectorise the bands, use the classification predicted by the AI
410 |         m = np.ndarray.flatten(PredictedClass).reshape(-1,1)
411 |         rv = np.ndarray.flatten(r).reshape(-1,1)
412 |         gv = np.ndarray.flatten(g).reshape(-1,1)
413 |         bv = np.ndarray.flatten(b).reshape(-1,1)
414 |         if Ndims == 4:
415 |             Entropyv = np.ndarray.flatten(Entropy).reshape(-1,1)
416 |             ColumnDat = np.concatenate((rv,gv,bv,Entropyv,m), axis = 1)
417 |         else:
418 |             ColumnDat = np.concatenate((rv,gv,bv,m), axis = 1)
419 | 
420 |         del(r,g,b,m)
421 |         #Rescale the data for the fitting work
422 |         SCAL = StandardScaler()
423 |         ScaledValues = SCAL.fit_transform(ColumnDat[:,0:-1])
424 |         ColumnDat[:,0:-1] = ScaledValues
425 |         #Eliminate the zeros in the mask from minimum tiles
426 |         ColumnDat = ColumnDat[ColumnDat[:,-1]!=0] #the label is always the last column, whatever the value of Ndims
427 | 
428 |         m=ColumnDat[:,-1]
429 |         #Build the predictor from the CNN classified mask
430 |         #Subsample the pixels
431 |         sample_size = np.int64(SubSample * ColumnDat.shape[0])
432 |         if (sample_size < MinSample) and (ColumnDat.shape[0] > MinSample):
433 |             sample_size = MinSample
434 |         elif (sample_size < MinSample) and (ColumnDat.shape[0] < MinSample):
435 |             sample_size = ColumnDat.shape[0]
436 |             print('WARNING: small sample size for predictor fit')
437 |         idx = np.random.randint(low = 1, high = int(len(ColumnDat)-1), size=sample_size) #using numpy so should be fast
438 |         ColumnDat = ColumnDat[idx,:]
439 |         X_presmote = ColumnDat[:,0:-1] #features are all but the last column, for both Ndims settings
440 |         Y_presmote = ColumnDat[:,-1]
441 |         if UseSmote and len(np.unique(Y_presmote))>1: #SMOTE doesn't work with 1 class
442 |             print('Correcting class imbalance with SMOTE-ENN')
443 |             smote_enn = SMOTEENN(sampling_strategy ='auto', random_state=0)
444 |             X, Y = smote_enn.fit_resample(X_presmote, Y_presmote)
445 |         else:
446 |             print('not using SMOTE methods')
447 |             X = X_presmote
448 |             Y = Y_presmote
449 | 
450 | 
451 | 
452 |         print('Fitting MLP Classifier on ' + str(len(X)) + ' pixels')
453 |         EstimatorNN.fit(X, Y)
454 | 
455 | 
456 |         #Fit the predictor to all pixels
457 |         if Ndims == 4:
458 |             FullDat = np.concatenate((rv,gv,bv,Entropyv), axis = 1)
459 |             del(rv,gv,bv,Entropyv, MedImage)
460 |         else:
461 |             FullDat = np.concatenate((rv,gv,bv), axis = 1)
462 |             del(rv,gv,bv, MedImage)
463 | 
464 |         PredictedPixels = EstimatorNN.predict(SCAL.transform(FullDat)) #apply the same scaling used for the fit
465 | 
466 | 
467 |         #Reshape the predictions to image format and display
468 |         PredictedImage = PredictedPixels.reshape(Im3D.shape[0], Im3D.shape[1])
469 |         if SmallestElement > 0:
470 |             PredictedImage = modal(np.uint8(PredictedImage), disk(2*SmallestElement+1)) #clean up the class with a mode filter
471 | 
472 | 
473 |         #Produce classification reports 
474 |         Class = Class.reshape(-1,1)
475 |         PredictedImageVECT = PredictedImage.reshape(-1,1) #This is the pixel-based prediction
476 |         PredictedClassVECT = PredictedClass.reshape(-1,1) # This is the CNN tiles prediction
477 |         PredictedImageVECT = PredictedImageVECT[Class != 0]
478 |         PredictedClassVECT = PredictedClassVECT[Class != 0]
479 |         Class = Class[Class != 0]
480 |         Class = np.int32(Class)
481 |         PredictedImageVECT = np.int32(PredictedImageVECT)
482 |         reportSSC = metrics.classification_report(Class, PredictedImageVECT, digits = 3)
483 |         reportCNN = metrics.classification_report(Class, PredictedClassVECT, digits = 3)
484 |         print('CNN tiled classification results for ' + os.path.basename(im))
485 |         print(reportCNN)
486 |         print('\n')
487 |         print('CNN-Supervised classification results for ' + os.path.basename(im))
488 |         print(reportSSC)
489 |         #print('Confusion Matrix:')
490 |         #print(metrics.confusion_matrix(Class, PredictedImageVECT))
491 |         print('\n')
492 | 
493 |         CSCname = ScorePath + 'CSC_' + os.path.basename(im)[:-4] + '_' + Experiment + '.csv' #NB: CompileClassificationReport.py looks for MLP_ (or RF_) prefixes
494 |         classification_report_csv(reportSSC, CSCname)
495 | 
496 |         CNNname = ScorePath + 'CNN_' + os.path.basename(im)[:-4] + '_' + Experiment + '.csv'
497 |         classification_report_csv(reportCNN, CNNname)
498 | 
499 |         #Display and/or output figure results
500 |         #PredictedImage = PredictedPixels.reshape(Entropy.shape[0], Entropy.shape[1])
501 |         for c in range(0,6): #this sets 1 pixel to each class to standardise colour display
502 |             ClassIm[c,0] = c
503 |             PredictedClass[c,0] = c
504 |             PredictedImage[c,0] = c
505 |         #get_ipython().run_line_magic('matplotlib', 'qt')
506 |         plt.figure(figsize = (12, 9.5)) #reduce these values if you have a small screen
507 |         plt.subplot(2,2,1)
508 |         plt.imshow(Im3D)
509 |         plt.title('Classification results for ' + os.path.basename(im), fontweight='bold')
510 |         plt.xlabel('Input RGB Image', fontweight='bold')
511 |         plt.subplot(2,2,2)
512 |         cmapCHM = colors.ListedColormap(['black','lightblue','orange','green','yellow','red'])
513 |         plt.imshow(ClassIm, cmap=cmapCHM)
514 |         plt.xlabel('Validation Labels', fontweight='bold')
515 |         class0_box = mpatches.Patch(color='black', label='Unclassified')
516 |         class1_box = mpatches.Patch(color='lightblue', label='Water')
517 |         class2_box = mpatches.Patch(color='orange', label='Sediment')
518 |         class3_box = mpatches.Patch(color='green', label='Green Veg.')
519 |         class4_box = mpatches.Patch(color='yellow', label='Senesc. Veg.')
520 |         class5_box = mpatches.Patch(color='red', label='Paved Road')
521 |         ax=plt.gca()
522 |         ax.legend(handles=[class0_box, class1_box,class2_box,class3_box,class4_box,class5_box])
523 |         plt.subplot(2,2,3)
524 |         plt.imshow(PredictedClass, cmap=cmapCHM)
525 |         plt.xlabel('CNN tiles Classification. F1: ' + GetF1(reportCNN), fontweight='bold')
526 |         plt.subplot(2,2,4)
527 |         cmapCHM = colors.ListedColormap(['black', 'lightblue','orange','green','yellow','red'])
528 |         plt.imshow(PredictedImage, cmap=cmapCHM)
529 | 
530 |         plt.xlabel('CNN-Supervised Classification. F1: ' + GetF1(reportSSC), fontweight='bold' )
531 | 
532 |         FigName = ScorePath + 'CSC_'+ Experiment + '_'+ os.path.basename(im)[:-4] +'.png'
533 |         plt.savefig(FigName, dpi=OutDPI)
534 |         if not DisplayHoldout:
535 |             plt.close()
536 |         # Figure output below is NOT geocoded.
537 |         if SaveClassRaster:
538 |             ClassRasterName = ScorePath + 'CSC_'+ os.path.basename(im)[:-4] +'.tif'
539 |             PredictedImage = PredictedPixels.reshape(Im3D.shape[0], Im3D.shape[1]) #This removes the single pixels set above to control the colour scheme
540 |             io.imsave(ClassRasterName, PredictedImage)
541 | 
542 | 
543 | 
544 | 
545 | 
546 | 
547 | 
--------------------------------------------------------------------------------
/code/CnnSupervisedClassification_PyQGIS.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | #------------------------------------------------------------------------------
3 | __author__ = 'Patrice Carbonneau'
4 | __contact__ = 'patrice.carbonneau@durham.ac.uk'
5 | __copyright__ = '(c) Patrice Carbonneau'
6 | __license__ = 'MIT'
7 | __date__ = '26 SEPT 2019'
8 | __version__ = '1.2'
9 | __status__ = "stable release"
10 | __url__ = "https://github.com/geojames/Self-Supervised-Classification"
11 | 
12 | 
13 | """
14 | Name: CnnSupervisedClassification_PyQGIS.py
15 | 
16 | Compatibility: Python 3.6
17 | 
18 | Description: Performs Self-Supervised Image Classification with a
19 | pre-trained Convolutional Neural Network model.
20 | User options are in the first section of code.
21 | 
22 | Requires: keras, numpy, pandas, matplotlib, skimage, sklearn, gdal
23 | 
24 | Dev Revisions: PC - First version
25 | 
26 | 
27 | Licence: MIT
28 | 
29 | Permission is hereby granted, free of charge, to any person obtaining a copy of
30 | this software and associated documentation files (the "Software"), to deal in
31 | the Software without restriction, including without limitation the rights to
32 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
33 | of the Software, and to permit persons to whom the Software is furnished to do
34 | so, subject to the following conditions:
35 | The above copyright notice and this permission notice shall be included in all
36 | copies or substantial portions of the Software.
37 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 38 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 39 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 40 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 41 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 42 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 43 | SOFTWARE. 44 | """ 45 | 46 | # ------------------------------------------------------------------------------------------------ 47 | # IMPORTS 48 | # ------------------------------------------------------------------------------------------------ 49 | """ Libraries""" 50 | import os.path 51 | import datetime 52 | import time 53 | import sys 54 | import numpy as np 55 | import pandas as pd 56 | from keras import regularizers 57 | from keras import optimizers 58 | from keras.models import load_model 59 | from keras.models import Sequential 60 | from keras.layers import Dense, Dropout 61 | from keras.wrappers.scikit_learn import KerasClassifier 62 | from skimage import io 63 | from skimage.filters.rank import median 64 | from skimage.morphology import disk 65 | from sklearn import metrics 66 | from sklearn.utils import class_weight 67 | import gdal, osr 68 | from qgis.utils import iface 69 | 70 | 71 | # ------------------------------------------------------------------------------------------------ 72 | # USER INPUT 73 | """User data input. Fill in the info below before running""" 74 | # ------------------------------------------------------------------------------------------------ 75 | 76 | ModelName = 'Empty' #should be the model name from previous run of TrainCNN.py (NO FILE ENDING) 77 | ModelPath = 'Empty' # path to the model 78 | ImageFolder = 'Empty' 79 | ImageFile = 'Empty' 80 | ClassFile = 'Empty'#leave empty if none. In that case no validation will be performed. 81 | NoData = 0 #no data values if mosaic has no alpha layer 82 | 83 | '''BASIC PARAMETER CHOICES''' 84 | TrainingEpochs = 15 #This requires minimal experimentation to tune 85 | Ndims = 3 # Feature Dimensions. should be 3 for RGB 86 | NClasses = 5 #The number of classes in the data. This MUST be the same as the classes used to retrain the model 87 | 88 | 89 | '''MODEL PARAMETERS''' #These would usually not be edited 90 | DropRate = 0.5 91 | LearningRate = 0.005 92 | 93 | 94 | 95 | # timer 96 | start_time = datetime.datetime.now() 97 | loop_time = 0 98 | 99 | 100 | # END USER INPUT 101 | # ------------------------------------------------------------------------------------------------ 102 | 103 | 104 | 105 | # ------------------------------------------------------------------------------------------------ 106 | """ HELPER FUNCTIONS SECTION""" 107 | # ------------------------------------------------------------------------------------------------ 108 | # Helper function to crop images to have an integer number of tiles. No padding is used. 
109 | def CropToTile (Im, size): 110 | if len(Im.shape) == 2:#handle greyscale 111 | Im = Im.reshape(Im.shape[0], Im.shape[1],1) 112 | 113 | crop_dim0 = size * (Im.shape[0]//size) 114 | crop_dim1 = size * (Im.shape[1]//size) 115 | return Im[0:crop_dim0, 0:crop_dim1, :] 116 | #END - CropToTile 117 | 118 | #Helper functions to move images in and out of tensor format 119 | def split_image_to_tiles(im, size): 120 | 121 | if len(im.shape) ==2: 122 | h, w = im.shape 123 | d = 1 124 | else: 125 | h, w, d = im.shape 126 | 127 | 128 | nTiles_height = h//size 129 | nTiles_width = w//size 130 | TileTensor = np.zeros((nTiles_height*nTiles_width, size,size,d)) 131 | B=0 132 | for y in range(0, nTiles_height): 133 | for x in range(0, nTiles_width): 134 | x1 = np.int32(x * size) 135 | y1 = np.int32(y * size) 136 | x2 = np.int32(x1 + size) 137 | y2 = np.int32(y1 + size) 138 | TileTensor[B,:,:,:] = im[y1:y2,x1:x2].reshape(size,size,d) 139 | B+=1 140 | 141 | return TileTensor 142 | # END - split_image_to_tiles 143 | 144 | 145 | 146 | # Takes the predicted tiles and reconstructs an image from them 147 | def class_prediction_to_image(im, PredictedTiles, size): 148 | 149 | if len(im.shape) ==2: 150 | h, w = im.shape 151 | d = 1 152 | else: 153 | h, w, d = im.shape 154 | 155 | nTiles_height = h//size 156 | nTiles_width = w//size 157 | #TileTensor = np.zeros((nTiles_height*nTiles_width, size,size,d)) 158 | TileImage = np.zeros(im.shape) 159 | B=0 160 | for y in range(0, nTiles_height): 161 | for x in range(0, nTiles_width): 162 | x1 = np.int32(x * size) 163 | y1 = np.int32(y * size) 164 | x2 = np.int32(x1 + size) 165 | y2 = np.int32(y1 + size) 166 | #TileTensor[B,:,:,:] = im[y1:y2,x1:x2].reshape(size,size,d) 167 | #TileImage[y1:y2,x1:x2] = np.nanmax(PredictedTiles[B,:]) 168 | TileImage[y1:y2,x1:x2] = np.argmax(PredictedTiles[B,:]) 169 | B+=1 170 | 171 | return TileImage 172 | # END - class_prediction_to_image 173 | 174 | # This is a helper function to repeat a filter on 3 colour bands. 
175 | # Avoids an extra loop in the big loops below 176 | def ColourFilter(Image): 177 | med = np.zeros(np.shape(Image)) 178 | for b in range (0,3): 179 | img = Image[:,:,b] 180 | med[:,:,b] = median(img, disk(5)) 181 | return med 182 | # END - ColourFilter 183 | 184 | #Save classification reports to csv with Pandas 185 | def classification_report_csv(report, filename): 186 | report_data = [] 187 | lines = report.split('\n') 188 | for line in lines[2:-5]: 189 | row = {} 190 | row_data = line.split(' ') 191 | row_data = list(filter(None, row_data)) 192 | row['class'] = row_data[0] 193 | row['precision'] = float(row_data[1]) 194 | row['recall'] = float(row_data[2]) 195 | row['f1_score'] = float(row_data[3]) 196 | row['support'] = float(row_data[4]) 197 | report_data.append(row) 198 | dataframe = pd.DataFrame.from_dict(report_data) 199 | dataframe.to_csv(filename, index = False) 200 | # END - classification_report_csv 201 | 202 | # Return a class prediction to the 1-Nclasses hierarchical classes 203 | def SimplifyClass(ClassImage, ClassKey): 204 | Iclasses = np.unique(ClassImage) 205 | for c in range(0, len(Iclasses)): 206 | KeyIndex = ClassKey.loc[ClassKey['LocalClass'] == Iclasses[c]] 207 | Hclass = KeyIndex.iloc[0]['HierarchClass'] 208 | ClassImage[ClassImage == Iclasses[c]] = Hclass 209 | return ClassImage 210 | # END - SimplifyClass 211 | 212 | #fetches the overall avg F1 score from a classification report 213 | def GetF1(report): 214 | lines = report.split('\n') 215 | for line in lines[0:-1]: 216 | if 'weighted' in line: 217 | dat = line.split(' ') 218 | 219 | return dat[17] 220 | # END - GetF1 221 | 222 | # ------------------------------------------------------------------------------------------------ 223 | """Instantiate MLP pixel-based classifiers""" 224 | # ------------------------------------------------------------------------------------------------ 225 | 226 | 227 | MLP = True 228 | def best_model_L2D(): 229 | # create model 230 | model = Sequential() 231 | model.add(Dense(512, kernel_regularizer= regularizers.l2(0.001), input_dim=Ndims, kernel_initializer='normal', activation='relu')) 232 | model.add(Dense(256, kernel_regularizer= regularizers.l2(0.001), kernel_initializer='normal', activation='relu')) 233 | model.add(Dropout(0.5)) 234 | model.add(Dense(128, kernel_regularizer= regularizers.l2(0.001), kernel_initializer='normal', activation='relu')) 235 | model.add(Dense(64, kernel_regularizer= regularizers.l2(0.001), kernel_initializer='normal', activation='relu')) 236 | model.add(Dense(32, kernel_regularizer= regularizers.l2(0.001), kernel_initializer='normal', activation='relu')) 237 | 238 | model.add(Dense(NClasses, kernel_initializer='normal', activation='softmax')) 239 | 240 | #Tune an optimiser 241 | Optim = optimizers.Adam(lr=LearningRate, beta_1=0.9, beta_2=0.999, decay=0.0, amsgrad=True) 242 | 243 | # Compile model 244 | model.compile(loss='sparse_categorical_crossentropy', optimizer=Optim, metrics = ['accuracy']) 245 | return model 246 | 247 | EstimatorNN = KerasClassifier(build_fn=best_model_L2D, epochs=TrainingEpochs, 248 | batch_size=250000, verbose=0) 249 | 250 | # ------------------------------------------------------------------------------------------------ 251 | # MAIN 252 | # ------------------------------------------------------------------------------------------------ 253 | 254 | 255 | if not QgsProject.instance().mapLayers().values(): 256 | print('ERROR: this script assumes that the image to be classified is open in QGIS with project CRS correctly set') 
257 | else:
258 |     '''Form the file names'''
259 |     ImageName = os.path.join(ImageFolder, ImageFile)
260 |     ClassName = os.path.join(ImageFolder, ClassFile)
261 |     OutputFile1 = 'CLASS_CNN_'+ImageFile
262 |     OutputFile2 = 'CLASS_CSC_'+ImageFile
263 |     OutputName1 = os.path.join(ImageFolder, OutputFile1) #Location of the CNN class with fullpath
264 |     OutputName2 = os.path.join(ImageFolder, OutputFile2) #Location of the MLP class with fullpath
265 |     Chatty = 0
266 | 
267 |     """Load the convnet model"""
268 |     #print('Loading re-trained convnet model produced by a run of TrainCNN.py')
269 |     print('Loading ' + ModelName + '.h5')
270 |     time.sleep(1)
271 |     FullModelPath = os.path.join(ModelPath, ModelName + '.h5')
272 |     ConvNetmodel = load_model(FullModelPath)
273 |     ClassKeyPath = os.path.join(ModelPath, ModelName + '.csv')
274 |     ClassKey = pd.read_csv(ClassKeyPath)
275 | 
276 | 
277 |     """Classify the holdout images with CNN-Supervised Classification"""
278 |     size = 50 #Do not edit. The base models supplied all assume a tile size of 50.
279 | 
280 |     # timer
281 |     start_time = datetime.datetime.now()
282 |     loop_time = 0
283 | 
284 | 
285 |     print('CNN-supervised classification of ' + ImageName)
286 |     time.sleep(1)
287 |     Im3D = io.imread(ImageName)
288 |     if Im3D.shape[2]>3:
289 |         ALPHA = Im3D[:,:,3]
290 |         Im3D = Im3D[:,:,0:3]
291 |     else:
292 |         ALPHA = ~(((Im3D[:,:,0]==NoData) & (Im3D[:,:,1]==NoData) & (Im3D[:,:,2]==NoData)))
293 | 
294 | 
295 | 
296 | 
297 | 
298 |     #Tile the images to run the convnet
299 |     ImCrop = CropToTile (Im3D, size)
300 |     I_tiles = split_image_to_tiles(ImCrop, size)
301 |     I_tiles = np.int16(I_tiles) / 255
302 |     del(Im3D, ImCrop)
303 |     #Apply the convnet
304 |     print('Detecting CNN-supervised training areas')
305 |     time.sleep(1)
306 |     PredictedTiles = ConvNetmodel.predict(I_tiles, batch_size = 64, verbose = Chatty)
307 |     del(I_tiles)
308 |     #Convert the convnet one-hot predictions to a new class label image
309 |     PredictedClass = class_prediction_to_image(ALPHA, PredictedTiles, size)
310 |     del(PredictedTiles, ConvNetmodel)
311 |     PredictedClass = SimplifyClass(PredictedClass, ClassKey)
312 |     PredictedClass[ALPHA==0]=0 #Set classes to 0 for off-image patches (transparent values in the alpha)
313 |     '''Georeferenced Export of CNN class'''
314 | 
315 |     ImageFile = gdal.Open(ImageName)
316 |     driver = gdal.GetDriverByName("GTiff")
317 |     outRaster = driver.Create(OutputName1, PredictedClass.shape[1], PredictedClass.shape[0], 1, gdal.GDT_Byte) #1 band, 8-bit type
318 |     outRaster.SetGeoTransform(ImageFile.GetGeoTransform())
319 |     outRasterSRS = osr.SpatialReference()
320 |     project_crs_name = iface.mapCanvas().mapSettings().destinationCrs().authid()
321 |     project_crs = int(project_crs_name[5:]) #assumes an authid of the form 'EPSG:XXXX'
322 |     outRasterSRS.ImportFromEPSG(project_crs)
323 |     outRaster.SetProjection(outRasterSRS.ExportToWkt())
324 |     outRaster.GetRasterBand(1).WriteArray(PredictedClass)
325 |     outRaster.FlushCache() # saves image to disk
326 |     outRaster = None # close output file
327 |     ImageFile = None # close input file
328 |     ##Open the new Class Raster data in QGIS
329 |     print('Displaying Classification')
330 |     time.sleep(1)
331 |     fileInfo = QFileInfo(OutputName1)
332 |     baseName = fileInfo.baseName()
333 |     rlayer = QgsRasterLayer(OutputName1, baseName)
334 |     if not rlayer.isValid():
335 |         print('Layer failed to load!')
336 |     else:
337 |         QgsProject.instance().addMapLayer(rlayer)
338 | 
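    #Optional sanity check (an illustrative aside, commented out): confirm that
    #the exported CNN class raster inherited the source geotransform, e.g.:
    #check = gdal.Open(OutputName1)
    #print(check.GetGeoTransform())
    #check = None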
339 |     '''MLP part'''
340 |     Im3D = io.imread(ImageName)
341 |     Im3D = Im3D[:,:,0:3]
342 |     print('3x3 Median filter' )
343 |     time.sleep(1)
344 |     MedImage = ColourFilter(Im3D) #Median filter on all 3 bands
345 |     del(Im3D)
346 |     print('Preparing data for MLP')
347 |     time.sleep(1)
348 |     rv = MedImage[:,:,0].reshape(-1,1) / 255#Split and vectorise the bands
349 |     gv = MedImage[:,:,1].reshape(-1,1) / 255
350 |     bv = MedImage[:,:,2].reshape(-1,1) / 255
351 |     del(MedImage)
352 |     PredictedClass = PredictedClass.reshape(-1,1)
353 | 
354 | 
355 |     ColumnDat = np.concatenate((rv,gv,bv,PredictedClass), axis = 1)
356 |     ColumnDat = ColumnDat[ColumnDat[:,-1]!=0]#ignore class 0
357 |     del(PredictedClass)
358 |     X= ColumnDat[:,0:-1]
359 |     Y=ColumnDat[:,-1]
360 |     del(ColumnDat)
361 |     #Adjust class weights
362 |     weights = class_weight.compute_class_weight('balanced', np.unique(Y),Y)
363 | 
364 |     #Fit the training
365 |     EstimatorNN.fit(X, Y, class_weight=weights)
366 |     del(X,Y)
367 |     #Estimate for all pixels in the mosaic
368 |     FullDat = np.concatenate((rv,gv,bv), axis = 1)
369 |     del(rv,gv,bv)
370 |     PredictedPixels = EstimatorNN.predict(FullDat)
371 |     del(FullDat)
372 |     PredictedImage = PredictedPixels.reshape(ALPHA.shape[0], ALPHA.shape[1])
373 |     PredictedImage[ALPHA==0]=0
374 |     del(ALPHA, EstimatorNN, PredictedPixels)
375 |     #plt.imshow(PredictedImage)
376 |     '''Georeferenced Export of the final CSC class'''
377 |     ImageFile = gdal.Open(ImageName)
378 |     driver = gdal.GetDriverByName("GTiff")
379 |     outRaster = driver.Create(OutputName2, PredictedImage.shape[1], PredictedImage.shape[0], 1, gdal.GDT_Byte) #1 band, 8-bit type
380 |     outRaster.SetGeoTransform(ImageFile.GetGeoTransform())
381 |     outRasterSRS = osr.SpatialReference()
382 |     project_crs_name = iface.mapCanvas().mapSettings().destinationCrs().authid()
383 |     project_crs = int(project_crs_name[5:])
384 |     outRasterSRS.ImportFromEPSG(project_crs)
385 |     outRaster.SetProjection(outRasterSRS.ExportToWkt())
386 |     outRaster.GetRasterBand(1).WriteArray(PredictedImage)
387 |     outRaster.FlushCache() # saves image to disk
388 |     outRaster = None # close output file
389 |     ImageFile = None # close input file
390 |     ##Open the new Class Raster data in QGIS
391 |     print('Displaying CSC Classification')
392 |     time.sleep(1)
393 |     fileInfo = QFileInfo(OutputName2)
394 |     baseName = fileInfo.baseName()
395 |     rlayer = QgsRasterLayer(OutputName2, baseName)
396 |     if not rlayer.isValid():
397 |         print('Layer failed to load!')
398 |     else:
399 |         QgsProject.instance().addMapLayer(rlayer)
400 | 
401 | 
402 | 
403 |     '''Quality report if validation available'''
404 | 
405 |     try:
406 |         Class = io.imread(ClassName)
407 |         #Produce classification reports
408 |         Class = Class.reshape(-1,1)
409 |         PredictedImageVECT = PredictedImage.reshape(-1,1) #This is the pixel-based prediction
410 |         PredictedImageVECT = PredictedImageVECT[Class != 0]
411 |         Class = Class[Class != 0]
412 |         Class = np.int32(Class)
413 |         PredictedImageVECT = np.int32(PredictedImageVECT)
414 |         reportSSC = metrics.classification_report(Class, PredictedImageVECT, digits = 3)
415 | 
416 | 
417 |         print('CNN-Supervised classification results for ' + ImageName)
418 |         print(reportSSC)
419 |         print('Confusion Matrix:')
420 |         print(metrics.confusion_matrix(Class, PredictedImageVECT))
421 |         print('\n')
422 | 
423 | 
424 | 
425 | 
426 |     except:
427 |         print('no validation data found')
428 | 
429 | #
430 | #
431 | 
432 | print("*-*-*-*-*")
433 | print("|")
434 | print("| Total Classification Time =",datetime.datetime.now() - start_time)
435 | print("|")
436 | print("*-*-*-*-*")
437 | 
--------------------------------------------------------------------------------
/code/CompileClassificationReport.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | # -*- coding: utf-8 -*-
3 | #------------------------------------------------------------------------------
4 | __author__ = 'Patrice Carbonneau'
5 | __contact__ = 'patrice.carbonneau@durham.ac.uk'
6 | __copyright__ = '(c) Patrice Carbonneau'
7 | __license__ = 'MIT'
8 | __date__ = '15 APR 2019'
9 | __version__ = '1.1'
10 | __status__ = "initial release"
11 | __url__ = "https://github.com/geojames/Self-Supervised-Classification"
12 | 
13 | """
14 | Name: CompileClassificationReport.py
15 | Compatibility: Python 3.6
16 | Description: this utility runs through the individual outputs of
17 | SelfSupervisedClassification and compiles the results into a
18 | single spreadsheet with a column structure that works well with
19 | Pandas row selection and Seaborn switches for hues and other
20 | visualisation options
21 | Requires: numpy, pandas, glob
22 | 
23 | Dev Revisions: JTD - 19/6/10 - Updated and optimised for Pandas
24 | 
25 | Licence: MIT
26 | Permission is hereby granted, free of charge, to any person obtaining a copy of
27 | this software and associated documentation files (the "Software"), to deal in
28 | the Software without restriction, including without limitation the rights to
29 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
30 | of the Software, and to permit persons to whom the Software is furnished to do
31 | so, subject to the following conditions:
32 | The above copyright notice and this permission notice shall be included in all
33 | copies or substantial portions of the Software.
34 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
35 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
36 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
37 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
38 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
39 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
40 | SOFTWARE.
41 | """
42 | 
43 | ###############################################################################
44 | """ Libraries"""
45 | 
46 | import numpy as np
47 | import pandas as pd
48 | import glob
49 | import os.path
50 | 
51 | 
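# Example (hypothetical, not executed): the compiled csv is designed for
# Seaborn-style plotting, e.g. the violin plots of Figure 2 in the README,
# using the ScorePath and Experiment variables defined below:
#
#   import seaborn as sns
#   import matplotlib.pyplot as plt
#   df = pd.read_csv(ScorePath + 'Compiled_' + Experiment + '.csv')
#   sns.violinplot(x='class', y='f1_score', hue='type', data=df[df['class'] != 'ALL'])
#   plt.ylabel('F1 score (%)') #f1_score is stored as a percentage by this script
#   plt.show()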
Fill in the info below before running""" 54 | ############################################################# 55 | 56 | ScorePath = "Empty" #output folder 57 | Experiment = 'Empty' 58 | model = 'Empty' #this will be a label, not the exact model name 59 | Phase2Type = 'MLP' #indicate whether the phase 2 ML algorithm was MLP (default) or RF 60 | 61 | # Path checks- checks for folder ending slash, adds if necessary 62 | 63 | if ScorePath[-1] not in ('/', '\\'): 64 | ScorePath = ScorePath +'/' 65 | 66 | ############################################################################### 67 | """Compile a single classification report from all the individual image reports""" 68 | 69 | # Getting River Names from the input files 70 | # Glob list of all jpg images, get unique names from the total list 71 | cnn_csv = glob.glob(ScorePath+"CNN_*.csv") 72 | TestRiverTuple = [] 73 | for c in cnn_csv: 74 | TestRiverTuple.append(os.path.basename(c).split('_')[1]) 75 | TestRiverTuple = np.unique(TestRiverTuple) 76 | 77 | # Get the CNN (phase 1) reports compiled 78 | 79 | #establish a DF for the CNN results 80 | CNN_df = pd.DataFrame(np.zeros([1,8]),columns = ['class','f1_score','support','RiverName','Image','type','experiment','model']) 81 | 82 | # for each river, get all the csv files that match river name 83 | # for each csv file, extract: class, f1 and support scores to new DF 84 | # Clean and concat the CNN csv files as we go 85 | # Rename classes and add type, exp, model column info 86 | for f,riv in enumerate(TestRiverTuple): 87 | print('Compiling CNN reports for ' + riv) 88 | cnn_csv = glob.glob(ScorePath+"CNN_" + riv + "*.csv") 89 | 90 | for i,csv in enumerate(cnn_csv): 91 | DF = pd.read_csv(csv) 92 | FileDf = DF.filter(['class','f1_score','support'],axis=1) 93 | FileDf = FileDf.drop(FileDf[(FileDf.f1_score == 0) & (FileDf.support == 0)].index) 94 | FileDf = FileDf.append(pd.DataFrame([['ALL',np.sum((FileDf.f1_score*FileDf.support))/np.sum(FileDf.support),np.sum(FileDf.support)]],columns=['class','f1_score','support'])) 95 | FileDf['RiverName'] = riv 96 | FileDf['Image'] = os.path.basename(csv).split('_')[1] + "_" + os.path.basename(csv).split('_')[2] 97 | CNN_df = pd.concat([CNN_df, FileDf], sort = False) 98 | 99 | CNN_df = CNN_df[1:] 100 | CNN_df['f1_score'] = 100 * CNN_df['f1_score'] 101 | CNN_df['class'][CNN_df['class']==1] = 'Water' 102 | CNN_df['class'][CNN_df['class']==2] = 'Sediment' 103 | CNN_df['class'][CNN_df['class']==3] = 'Green Veg.' 104 | CNN_df['class'][CNN_df['class']==4] = 'Senesc. Veg.' 105 | CNN_df['class'][CNN_df['class']==5] = 'Paved Road' 106 | CNN_df['type'] = 'CNN' 107 | CNN_df['experiment'] = Experiment 108 | CNN_df['model'] = model 109 | 110 | # Get the MLP (phase 2) classification reports compiled.
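Both compilation loops (the CNN loop above and the phase 2 loop below) append an 'ALL' row holding the support-weighted mean of the per-class F1 scores. A minimal standalone sketch of that aggregation, with made-up numbers for illustration:

```python
# Illustration of the support-weighted 'ALL' score used in both loops
# (the values are invented for the example).
import numpy as np

f1 = np.array([0.90, 0.80])       # per-class F1 scores from one report
support = np.array([100, 300])    # pixels per class
overall = np.sum(f1 * support) / np.sum(support)
print(overall)                    # 0.825: classes with more pixels dominate
```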
111 | 112 | # establish Phase 2 DF 113 | p2_df = pd.DataFrame(np.zeros([1,8]),columns = ['class','f1_score','support','RiverName','Image','type','experiment','model']) 114 | 115 | # for each river, get all the csv files that match river name 116 | # for each csv file, extract: class, f1 and support scores to new DF 117 | # Clean and concat the Phase 2 csv files as we go 118 | # Rename classes and add type, exp, model column info 119 | for f,riv in enumerate(TestRiverTuple): 120 | print('Compiling Phase 2 reports for ' + riv) 121 | 122 | if Phase2Type == 'MLP': 123 | p2_csv = glob.glob(ScorePath+"MLP_" + riv + "*.csv") 124 | elif Phase2Type == 'RF': 125 | p2_csv = glob.glob(ScorePath+"RF_" + riv + "*.csv") 126 | 127 | for i,csv in enumerate(p2_csv): 128 | 129 | DF = pd.read_csv(csv) 130 | FileDf = DF.filter(['class','f1_score','support'],axis=1) 131 | FileDf = FileDf.drop(FileDf[(FileDf.f1_score == 0) & (FileDf.support == 0)].index) 132 | FileDf = FileDf.append(pd.DataFrame([['ALL',np.sum((FileDf.f1_score*FileDf.support))/np.sum(FileDf.support),np.sum(FileDf.support)]],columns=['class','f1_score','support'])) 133 | FileDf['RiverName'] = riv 134 | FileDf['Image'] = os.path.basename(csv).split('_')[1] + "_" + os.path.basename(csv).split('_')[2] 135 | p2_df = pd.concat([p2_df, FileDf], sort = False) 136 | 137 | p2_df = p2_df[1:] 138 | p2_df['f1_score'] = 100 * p2_df['f1_score'] 139 | p2_df['class'][p2_df['class']==1] = 'Water' 140 | p2_df['class'][p2_df['class']==2] = 'Sediment' 141 | p2_df['class'][p2_df['class']==3] = 'Green Veg.' 142 | p2_df['class'][p2_df['class']==4] = 'Senesc. Veg.' 143 | p2_df['class'][p2_df['class']==5] = 'Paved Road' 144 | p2_df['experiment'] = Experiment 145 | p2_df['model'] = model 146 | 147 | if Phase2Type == 'MLP': 148 | p2_df['type'] = 'MLP' 149 | elif Phase2Type == 'RF': 150 | p2_df['type'] = 'RF' 151 | 152 | SaveName = ScorePath + 'Compiled_' + Experiment + '.csv' 153 | pd.concat([CNN_df,p2_df]).to_csv(SaveName,index=False) 154 | -------------------------------------------------------------------------------- /code/TrainCNN.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # -*- coding: utf-8 -*- 3 | #------------------------------------------------------------------------------ 4 | __author__ = 'Patrice Carbonneau' 5 | __contact__ = 'patrice.carbonneau@durham.ac.uk' 6 | __copyright__ = '(c) Patrice Carbonneau' 7 | __license__ = 'MIT' 8 | __date__ = '15 APR 2019' 9 | __version__ = '1.1' 10 | __status__ = "initial release" 11 | __url__ = "https://github.com/geojames/Self-Supervised-Classification" 12 | 13 | """ 14 | Name: TrainCNN.py 15 | Compatibility: Python 3.6 16 | Description: This file trains and outputs a NASNet Large model for the 17 | task of river classification from RGB images 18 | 19 | Requires: keras, TensorFlow-GPU, numpy, pandas, matplotlib, skimage, sklearn 20 | 21 | Dev Revisions: JTD - 19/4/19 - Updated file paths, added auto detection of 22 | river names from input images, improved image reading loops 23 | 24 | Licence: MIT 25 | Permission is hereby granted, free of charge, to any person obtaining a copy of 26 | this software and associated documentation files (the "Software"), to deal in 27 | the Software without restriction, including without limitation the rights to 28 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 29 | of the Software, and to permit persons to whom the Software is furnished to do 30 | so, subject to the following conditions:
31 | 32 | The above copyright notice and this permission notice shall be included in all 33 | copies or substantial portions of the Software. 34 | 35 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 36 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 37 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 38 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 39 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 40 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 41 | SOFTWARE. 42 | """ 43 | 44 | from keras import layers 45 | from keras import models 46 | from keras import regularizers 47 | from keras import optimizers 48 | from keras.applications import nasnet 49 | from keras.models import load_model 50 | import numpy as np 51 | import pandas as pd 52 | import matplotlib.pyplot as plt 53 | from skimage import io 54 | from keras.utils.np_utils import to_categorical 55 | from keras.models import Sequential 56 | from keras.layers import Dense, Dropout, BatchNormalization 57 | from keras.wrappers.scikit_learn import KerasClassifier 58 | from sklearn.preprocessing import StandardScaler 59 | from sklearn.pipeline import Pipeline 60 | from sklearn.ensemble import RandomForestClassifier as RFC 61 | from sklearn.model_selection import train_test_split 62 | import os.path 63 | import sys 64 | from sklearn import metrics 65 | import copy 66 | from skimage.morphology import disk 67 | from skimage.filters.rank import median, entropy 68 | from IPython import get_ipython 69 | import glob 70 | import fnmatch 71 | 72 | ######################################################### 73 | #Image loading and data preparation. 74 | #This loads all images and class label masks in the TRAIN folder and it: 75 | # 1 - Tiles the images into a 4D tensor of (nTiles, Width, Height, Bands), channel-last format 76 | # 2 - Looks at the classification mask and, when a tile is 90% classified, produces a label vector value (0 if less than 90%) 77 | # 3 - Feeds this into the NASNet Large convnet. Level of trainability of the convnet can be set by user 78 | 79 | #------------------------------------------------------------ 80 | """User data input. Fill in the info below before running""" 81 | #------------------------------------------------------------ 82 | #Watch the \\ separators and, if there is a bug, switch to single forward slashes. This folder should have the training images, labels and the pre-trained NASNet model. 83 | #Trained model and class key will also be written out to the training folder 84 | TrainPath = 'E:\\DeepRiverscapes\\MargSix' 85 | #Input and Output name for the model. It will be saved in the training folder 86 | ModelInputName = 'NASNetLarge_base_50px' #no extensions 87 | ModelOutputName = 'Marg6' 88 | #where the model will be saved 89 | 90 | NClasses = 5 #The number of classes in the imagery. Adjust as needed 91 | 92 | 93 | 94 | #use this option to train the model with a validation subset. Accuracy and loss checks will be displayed. 95 | #When the tuning is satisfactory, set to False and train with the whole dataset. Model will only be saved if this is set to False 96 | #When True the script will exit with a system error and display loss/acc vs epoch figures. This is intentional. 97 | ModelTuning = False 98 | TuningEpochs = 20 99 | TuningSubSamp = 0.55 # Subsample of data, 0-1, to be used in tuning. 100 | 101 | #If the model is tuned, enter the right number of epochs.
102 | #This is only used when ModelTuning is False. 103 | TrainingEpochs = 7 104 | BatchSize = 100 #will depend on your GPU 105 | 106 | # Path checks- checks for folder ending slash, adds if necessary 107 | if TrainPath[-1] not in ('/', '\\'): 108 | TrainPath = TrainPath + '/' 109 | 110 | 111 | 112 | ################################################################## 113 | """ HELPER FUNCTIONS SECTION""" 114 | ################################################################## 115 | # Helper function to crop images to have an integer number of tiles. No padding is used. 116 | def CropToTile (Im, size): 117 | if len(Im.shape) == 2: #handle greyscale 118 | Im = Im.reshape(Im.shape[0], Im.shape[1],1) 119 | 120 | crop_dim0 = size * (Im.shape[0]//size) 121 | crop_dim1 = size * (Im.shape[1]//size) 122 | return Im[0:crop_dim0, 0:crop_dim1, :] 123 | 124 | 125 | #Helper functions to move images in and out of tensor format 126 | def split_image_to_tiles(im, size): 127 | 128 | if len(im.shape) ==2: 129 | h, w = im.shape 130 | d = 1 131 | else: 132 | h, w, d = im.shape 133 | 134 | 135 | nTiles_height = h//size 136 | nTiles_width = w//size 137 | TileTensor = np.zeros((nTiles_height*nTiles_width, size,size,d)) 138 | B=0 139 | for y in range(0, nTiles_height): 140 | for x in range(0, nTiles_width): 141 | x1 = np.int32(x * size) 142 | y1 = np.int32(y * size) 143 | x2 = np.int32(x1 + size) 144 | y2 = np.int32(y1 + size) 145 | TileTensor[B,:,:,:] = im[y1:y2,x1:x2].reshape(size,size,d) 146 | B+=1 147 | 148 | return TileTensor 149 | 150 | #Create the label vector 151 | def PrepareTensorData(ImageTile, ClassTile, size): 152 | #this takes the image tile tensor and the class tile tensor 153 | #It produces a label vector from the tiles which have 90% of a pure class 154 | #It then extracts the image tiles that have a classification value in the labels 155 | LabelVector = np.zeros(ClassTile.shape[0]) 156 | 157 | for v in range(0,ClassTile.shape[0]): 158 | Tile = ClassTile[v,:,:,0] 159 | vals, counts = np.unique(Tile, return_counts = True) 160 | if (vals[0] == 0) and (counts[0] > 0.1 * size**2): #Handle unlabelled (class = 0) patches 161 | LabelVector[v] = 0 162 | elif counts[np.argmax(counts)] >= 0.9 * size**2: 163 | LabelVector[v] = vals[np.argmax(counts)] 164 | 165 | ClassifiedTiles = np.zeros((np.count_nonzero(LabelVector), size,size,3)) 166 | ClassifiedLabels = np.zeros((np.count_nonzero(LabelVector),1)) 167 | C = 0 168 | for t in range(0,len(LabelVector)): 169 | if LabelVector[t] > 0: 170 | ClassifiedTiles[C,:,:,:] = ImageTile[t,:,:,:] 171 | ClassifiedLabels[C,0] = LabelVector[t] 172 | C += 1 173 | return ClassifiedLabels, ClassifiedTiles 174 | 175 | 176 | ############################# 177 | def class_prediction_to_image(im, PredictedTiles, size): 178 | 179 | if len(im.shape) ==2: 180 | h, w = im.shape 181 | d = 1 182 | else: 183 | h, w, d = im.shape 184 | 185 | 186 | nTiles_height = h//size 187 | nTiles_width = w//size 188 | #TileTensor = np.zeros((nTiles_height*nTiles_width, size,size,d)) 189 | TileImage = np.zeros(im.shape) 190 | B=0 191 | for y in range(0, nTiles_height): 192 | for x in range(0, nTiles_width): 193 | x1 = np.int32(x * size) 194 | y1 = np.int32(y * size) 195 | x2 = np.int32(x1 + size) 196 | y2 = np.int32(y1 + size) 197 | TileImage[y1:y2,x1:x2] = np.argmax(PredictedTiles[B,:]) 198 | B+=1 199 | 200 | return TileImage 201 | 202 | 203 | 204 | ################################################################################ 205 | #Keep track of the new classes in a Pandas DataFrame 206 | def
MakePandasKey(ClassKey, NClasses, f, r): 207 | for c in range(1,NClasses+1): 208 | ClassKey['River'].loc[c+(f*NClasses)] = r 209 | ClassKey['LocalClass'].loc[c+(f*NClasses)] = (f*NClasses+1) + (c-1) 210 | ClassKey['HierarchClass'].loc[c+(f*NClasses)] = 1 + (c-1) 211 | if c==1: 212 | ClassKey['ClassName'].loc[c+(f*NClasses)] = 'water' 213 | elif c==2: 214 | ClassKey['ClassName'].loc[c+(f*NClasses)] = 'sediment' 215 | elif c==3: 216 | ClassKey['ClassName'].loc[c+(f*NClasses)] = 'green_vegetation' 217 | elif c==4: 218 | ClassKey['ClassName'].loc[c+(f*NClasses)] = 'senescent_vegetation' 219 | else: 220 | ClassKey['ClassName'].loc[c+(f*NClasses)] = 'Road' 221 | return ClassKey 222 | 223 | ############################################################################### 224 | # Return a class prediction to the 1-NClasses macro classes 225 | def SimplifyClass(ClassImage, ClassKey): 226 | Iclasses = np.unique(ClassImage) 227 | for c in range(0, len(Iclasses)): 228 | KeyIndex = ClassKey.loc[ClassKey['LocalClass'] == Iclasses[c]] 229 | Hclass = KeyIndex.iloc[0]['HierarchClass'] 230 | ClassImage[ClassImage == Iclasses[c]] = Hclass 231 | return ClassImage 232 | 233 | ############################################################################### 234 | # Checks 235 | def checkInputImgNames(path): 236 | # Glob list of all jpg images 237 | img = glob.glob(path+"*.jpg") 238 | 239 | # test file names using fnmatch for a number of at least 4 digits 240 | # if any fail, raise an exception and stop 241 | for im in img: 242 | if fnmatch.fnmatch(im,'*_[0-9][0-9][0-9][0-9]*.jpg') == False: 243 | raise Exception("Image: %s is not in the correct format. All images need to have a number of at least four digits after the RiverName, e.g. RiverName_0000.jpg"%(os.path.basename(im))) 244 | sys.exit(1) 245 | 246 | 247 | ############################################################################### 248 | """Data Preparation Section""" 249 | ############################################################################### 250 | size = 50 #Do not edit. The base models supplied all assume a tile size of 50.
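As an aside, the tile/label round trip implemented by the helpers above can be checked in isolation. A minimal standalone sketch (a test snippet, not part of the training script; the dummy image size is arbitrary):

```python
# Shape check for the tiling helpers defined above, assuming they are in scope.
import numpy as np

dummy = np.random.randint(0, 255, (130, 170, 3), dtype=np.uint8)
cropped = CropToTile(dummy, 50)             # cropped to (100, 150, 3)
tiles = split_image_to_tiles(cropped, 50)   # tensor of (6, 50, 50, 3)
print(cropped.shape, tiles.shape)
```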
251 | 252 | # Check Files Name Pattern 253 | checkInputImgNames(TrainPath) 254 | 255 | # Getting River Names from the files 256 | # Glob list of all jpg images, get unique names from the total list 257 | img = glob.glob(TrainPath + "*.jpg") 258 | RiverTuple = [] 259 | for im in img: 260 | RiverTuple.append(os.path.basename(im).partition('_')[0]) 261 | RiverTuple = np.unique(RiverTuple) 262 | 263 | # Get training class images 264 | class_img = glob.glob(TrainPath + "SCLS_*.tif*") 265 | 266 | if len(img) != len(class_img): 267 | raise Exception("Training and Classified Images Counts do not match, Please Double-check and Rerun") 268 | sys.exit(1) 269 | 270 | 271 | ClassKeyDict ={'LocalClass': np.zeros(1+(len(RiverTuple)*5)).ravel(), 'HierarchClass' : np.zeros(1+(len(RiverTuple)*5)).ravel()} 272 | ClassKey = pd.DataFrame(ClassKeyDict) 273 | ClassKey['River'] = "" 274 | ClassKey['ClassName'] = "" 275 | 276 | LocalClasses = np.int16(NClasses*len(RiverTuple)) 277 | ImageTensor = np.zeros((1,size,size,3)) 278 | LabelTensor = np.zeros((1,(NClasses*len(RiverTuple))+1)) 279 | 280 | for f,riv in enumerate(RiverTuple): 281 | ClassKey = MakePandasKey(ClassKey, NClasses, f, riv) 282 | for i,im in enumerate(img): 283 | print('processing image ' + os.path.basename(im)) 284 | Im3D = io.imread(im) 285 | if len(Im3D) == 2: #some formats load as a 2-element sequence; keep the image 286 | Im3D = Im3D[0] 287 | 288 | Class = io.imread(class_img[i]) 289 | if Im3D.shape[0] != Class.shape[0]: 290 | sys.exit("Error, Image and class mask do not have the same dimensions") 291 | 292 | NewClass = Class + f*NClasses #Transform the macro-classes (1 to NClasses) into river-specific micro-classes 293 | NewClass[Class == 0] = 0 #this step avoids the unclassified areas becoming class f*(NClasses) 294 | Class = copy.deepcopy(NewClass) 295 | ImCrop = CropToTile (Im3D, size) 296 | ClsCrop = CropToTile (Class, size) 297 | I_tiles = split_image_to_tiles(ImCrop, size) 298 | I_tiles = np.int16(I_tiles) / 255 299 | C_tiles = split_image_to_tiles(ClsCrop, size) 300 | CLabelVector, ClassifiedTiles = PrepareTensorData(I_tiles, C_tiles, size) 301 | Label1hot= to_categorical(CLabelVector, num_classes = LocalClasses+1) #convert to one-hot encoding 302 | ImageTensor = np.concatenate((ImageTensor,ClassifiedTiles), axis = 0) 303 | LabelTensor = np.concatenate((LabelTensor, Label1hot), axis = 0) 304 | del(ImCrop,ClsCrop,I_tiles, C_tiles, Label1hot, Class, NewClass, Im3D, CLabelVector, ClassifiedTiles) 305 | 306 | 307 | #Delete the first blank tile from initialisation 308 | ImageTensor = ImageTensor[1:,:,:,:] 309 | LabelTensor = LabelTensor[1:,:] 310 | print(str(ImageTensor.shape[0]) + ' image tiles of ' + str(size) + ' X ' + str(size) + ' extracted') 311 | 312 | 313 | ########################################################## 314 | """Convnet section""" 315 | ########################################################## 316 | #Setup the convnet and add dense layers 317 | print('Loading ' + ModelInputName + '.h5') #be sure the pre-trained CNN is in the same folder as the training images 318 | FullModelPath = TrainPath + ModelInputName + '.h5' 319 | conv_base = load_model(FullModelPath) 320 | 321 | 322 | model = models.Sequential() 323 | model.add(conv_base) 324 | model.add(layers.Flatten()) 325 | model.add(layers.Dense(256, activation='relu', kernel_regularizer= regularizers.l2(0.001))) 326 | model.add(layers.Dropout(0.5)) 327 | model.add(layers.Dense(128, activation='relu', kernel_regularizer=regularizers.l2(0.001))) 328 | model.add(layers.Dense(LabelTensor.shape[1],
activation='softmax')) 329 | 330 | #Freeze all or part of the convolutional base to keep the ImageNet weights intact 331 | conv_base.trainable = True 332 | set_trainable = False 333 | for layer in conv_base.layers: 334 | set_trainable = False 335 | if ('separable_conv_2_normal_left2' in layer.name) or ('separable_conv_2_normal_right2' in layer.name): #only these blocks are unfrozen 336 | set_trainable = True 337 | if set_trainable: 338 | layer.trainable = True 339 | else: 340 | layer.trainable = False 341 | 342 | 343 | 344 | #Tune an optimiser 345 | Optim = optimizers.Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0, amsgrad=True) 346 | 347 | #Compile and display the model 348 | model.compile(optimizer=Optim,loss='categorical_crossentropy', metrics=['acc']) 349 | model.summary() 350 | 351 | 352 | if ModelTuning: 353 | #Split the data for tuning. Use a double pass of train_test_split to shave off some data 354 | (trainX, testX, trainY, testY) = train_test_split(ImageTensor, LabelTensor, test_size=TuningSubSamp-0.001) 355 | (Partiel_trainX, Partiel_testX, Partiel_trainY, Partiel_testY) = train_test_split(testX, testY, test_size=0.25) 356 | del(ImageTensor, LabelTensor) 357 | history = model.fit(Partiel_trainX, Partiel_trainY, epochs = TuningEpochs, batch_size = 75, validation_data = (Partiel_testX, Partiel_testY)) 358 | #Plot the test results 359 | history_dict = history.history 360 | loss_values = history_dict['loss'] 361 | val_loss_values = history_dict['val_loss'] 362 | 363 | epochs = range(1, len(loss_values) + 1) 364 | get_ipython().run_line_magic('matplotlib', 'qt') 365 | plt.figure(figsize = (12, 9.5)) 366 | plt.subplot(1,2,1) 367 | plt.plot(epochs, loss_values, 'bo', label = 'Training loss') 368 | plt.plot(epochs,val_loss_values, 'b', label = 'Validation loss') 369 | plt.title('Training and Validation Loss') 370 | plt.xlabel('Epochs') 371 | plt.ylabel('Loss') 372 | plt.legend() 373 | 374 | acc_values = history_dict['acc'] 375 | val_acc_values = history_dict['val_acc'] 376 | plt.subplot(1,2,2) 377 | plt.plot(epochs, acc_values, 'go', label = 'Training acc') 378 | plt.plot(epochs, val_acc_values, 'g', label = 'Validation acc') 379 | plt.title('Training and Validation Accuracy') 380 | plt.xlabel('Epochs') 381 | plt.ylabel('Accuracy') 382 | plt.legend() 383 | plt.show() 384 | 385 | 386 | sys.exit("Tuning Finished, adjust parameters and re-train the model") # stop the code if still in tuning phase.
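As an optional aside before the final fit, a small sketch (not part of the original script) to confirm which layers the freeze loop above left trainable:

```python
# Hypothetical check: list the base-model layers that will be fine-tuned,
# assuming conv_base is in scope as loaded above.
for layer in conv_base.layers:
    if layer.trainable:
        print(layer.name)
```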
387 | 388 | '''Fit all the data for transfer learning and train the final CNN''' 389 | model.fit(ImageTensor, LabelTensor, batch_size=BatchSize, epochs=TrainingEpochs, verbose=1) 390 | 391 | 392 | """Save the model and class key""" 393 | FullModelPath = TrainPath + ModelOutputName + '.h5' 394 | model.save(FullModelPath) 395 | 396 | ClassKeyName = TrainPath + ModelOutputName + '.csv' 397 | ClassKey.to_csv(path_or_buf = ClassKeyName, index = False) 398 | 399 | 400 | 401 | 402 | 403 | -------------------------------------------------------------------------------- /docs/_config.yml: -------------------------------------------------------------------------------- 1 | theme: jekyll-theme-cayman -------------------------------------------------------------------------------- /docs/readme.md: -------------------------------------------------------------------------------- 1 | new docs folder for github pages 2 | -------------------------------------------------------------------------------- /sample_data/Train/NASNetMobile_base_50px.h5: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Train/NASNetMobile_base_50px.h5 -------------------------------------------------------------------------------- /sample_data/Train/SCLS_StMarg26607.tif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Train/SCLS_StMarg26607.tif -------------------------------------------------------------------------------- /sample_data/Train/SCLS_StMarg26626.tif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Train/SCLS_StMarg26626.tif -------------------------------------------------------------------------------- /sample_data/Train/SCLS_StMarg26722.tif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Train/SCLS_StMarg26722.tif -------------------------------------------------------------------------------- /sample_data/Train/SCLS_StMarg26734.tif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Train/SCLS_StMarg26734.tif -------------------------------------------------------------------------------- /sample_data/Train/SCLS_StMarg26758.tif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Train/SCLS_StMarg26758.tif -------------------------------------------------------------------------------- /sample_data/Train/SCLS_StMarg26776.tif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Train/SCLS_StMarg26776.tif -------------------------------------------------------------------------------- /sample_data/Train/SCLS_StMarg26806.tif: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Train/SCLS_StMarg26806.tif -------------------------------------------------------------------------------- /sample_data/Train/SCLS_StMarg26818.tif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Train/SCLS_StMarg26818.tif -------------------------------------------------------------------------------- /sample_data/Train/SCLS_StMarg26836.tif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Train/SCLS_StMarg26836.tif -------------------------------------------------------------------------------- /sample_data/Train/SCLS_StMarg27113.tif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Train/SCLS_StMarg27113.tif -------------------------------------------------------------------------------- /sample_data/Train/StMarg_26607.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Train/StMarg_26607.jpg -------------------------------------------------------------------------------- /sample_data/Train/StMarg_26626.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Train/StMarg_26626.jpg -------------------------------------------------------------------------------- /sample_data/Train/StMarg_26722.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Train/StMarg_26722.jpg -------------------------------------------------------------------------------- /sample_data/Train/StMarg_26734.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Train/StMarg_26734.jpg -------------------------------------------------------------------------------- /sample_data/Train/StMarg_26758.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Train/StMarg_26758.jpg -------------------------------------------------------------------------------- /sample_data/Train/StMarg_26776.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Train/StMarg_26776.jpg -------------------------------------------------------------------------------- /sample_data/Train/StMarg_26806.jpg: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Train/StMarg_26806.jpg -------------------------------------------------------------------------------- /sample_data/Train/StMarg_26818.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Train/StMarg_26818.jpg -------------------------------------------------------------------------------- /sample_data/Train/StMarg_26836.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Train/StMarg_26836.jpg -------------------------------------------------------------------------------- /sample_data/Train/StMarg_27113.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Train/StMarg_27113.jpg -------------------------------------------------------------------------------- /sample_data/Validate/SCLS_StMarg31162.tif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Validate/SCLS_StMarg31162.tif -------------------------------------------------------------------------------- /sample_data/Validate/SCLS_StMarg31180.tif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Validate/SCLS_StMarg31180.tif -------------------------------------------------------------------------------- /sample_data/Validate/SCLS_StMarg31198.tif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Validate/SCLS_StMarg31198.tif -------------------------------------------------------------------------------- /sample_data/Validate/SCLS_StMarg31216.tif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Validate/SCLS_StMarg31216.tif -------------------------------------------------------------------------------- /sample_data/Validate/SCLS_StMarg31234.tif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Validate/SCLS_StMarg31234.tif -------------------------------------------------------------------------------- /sample_data/Validate/SCLS_StMarg31252.tif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Validate/SCLS_StMarg31252.tif -------------------------------------------------------------------------------- 
/sample_data/Validate/SCLS_StMarg31270.tif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Validate/SCLS_StMarg31270.tif -------------------------------------------------------------------------------- /sample_data/Validate/SCLS_StMarg31288.tif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Validate/SCLS_StMarg31288.tif -------------------------------------------------------------------------------- /sample_data/Validate/SCLS_StMarg31306.tif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Validate/SCLS_StMarg31306.tif -------------------------------------------------------------------------------- /sample_data/Validate/SCLS_StMarg31324.tif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Validate/SCLS_StMarg31324.tif -------------------------------------------------------------------------------- /sample_data/Validate/StMarg_31162.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Validate/StMarg_31162.jpg -------------------------------------------------------------------------------- /sample_data/Validate/StMarg_31180.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Validate/StMarg_31180.jpg -------------------------------------------------------------------------------- /sample_data/Validate/StMarg_31198.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Validate/StMarg_31198.jpg -------------------------------------------------------------------------------- /sample_data/Validate/StMarg_31216.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Validate/StMarg_31216.jpg -------------------------------------------------------------------------------- /sample_data/Validate/StMarg_31234.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Validate/StMarg_31234.jpg -------------------------------------------------------------------------------- /sample_data/Validate/StMarg_31252.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Validate/StMarg_31252.jpg 
-------------------------------------------------------------------------------- /sample_data/Validate/StMarg_31270.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Validate/StMarg_31270.jpg -------------------------------------------------------------------------------- /sample_data/Validate/StMarg_31288.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Validate/StMarg_31288.jpg -------------------------------------------------------------------------------- /sample_data/Validate/StMarg_31306.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Validate/StMarg_31306.jpg -------------------------------------------------------------------------------- /sample_data/Validate/StMarg_31324.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/geojames/CNN-Supervised-Classification/594326487372ff06e45aa8a06913fdd9b2aa90fe/sample_data/Validate/StMarg_31324.jpg -------------------------------------------------------------------------------- /sample_data/readme.md: -------------------------------------------------------------------------------- 1 | ### CNN-Supervised Classification Sample Data 2 | 3 | The data in this folder is intended to allow users to verify that the code functions correctly on their system without the need to download the large data volume from the repository. 4 | 5 | 1) in the 'Train' folder, there are samples of image/label pairs for training (with representative filenames). Please note that this small sample was not selected to produce a well trained CNN model. The only purpose of the sample data is for testing and debugging. 6 | - Download this data and, on line 84 of TrainCNN.py, enter the correct file path. 7 | 8 | 2) Also in the 'Train' folder, `NASNetMobile_base_50px.h5` is a base NASNet Mobile model with weights initialised from the ImageNet database. The model is set to work with image tiles of 50 x 50 pixels. Download this model to the same folder as the training images. Enter this name (without the .h5 extension) on line 86 of TrainCNN.py. Enter a different name on line 87 for the final trained version of the CNN model. 9 | 10 | 3) run TrainCNN.py. Monitor your GPU usage and ensure that TensorFlow-GPU is working correctly. 11 | 12 | 4) in the 'Validate' folder, there are sample images that can be used for a mock validation of the trained model. Download these to a new folder and edit lines 79 to 83 of `CnnSupervisedClassification.py`. Line 79 should have the model name selected in TrainCNN.py. Line 80 should point to the same file path as line 84 in TrainCNN.py. The other parameters can be left as default in the first instance. 13 | 14 | 5) At the end, the folder designated as the ScorePath on line 82 of `CnnSupervisedClassification.py` should have 10 4-part figures in png format and 20 small csv files. Use CompileClassificationReport.py to merge the csv files into a single database formatted for use with the Seaborn visualisation library (a short usage sketch follows at the end of this document).
15 | 16 | 17 | 18 | --------------------------------------------------------------------------------
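As referenced in step 5 above, a minimal sketch of how the compiled spreadsheet might be explored with Pandas and Seaborn. The csv name below is a placeholder that depends on the `Experiment` label set in CompileClassificationReport.py, and the plot choice is only one possibility:

```python
# Hypothetical exploration of the compiled report produced by
# CompileClassificationReport.py; adjust the file name to your Experiment label.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.read_csv('Compiled_MyExperiment.csv')
# Per-class F1 scores, with CNN (phase 1) and MLP/RF (phase 2) side by side.
sns.boxplot(x='class', y='f1_score', hue='type', data=df)
plt.ylabel('F1 score (%)')
plt.show()
```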