├── .gitignore
├── README.md
├── Report.pdf
├── Slides.pdf
├── res
│   └── images
│       ├── example-tree.png
│       └── tier.png
└── src
    ├── ConvGP.py
    ├── deapfix.py
    ├── evolution.py
    ├── helpers.py
    ├── search.py
    └── stgp.py

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
.DS_Store

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# ConvGP

This was my project for my Honours degree in Computer Science at Victoria University of Wellington: a novel method for binary image classification, using a memetic approach (genetic programming combined with gradient descent).

## How to use?

The API is modelled on scikit-learn, so the model can be trained and predictions made in only three lines of code:

```python
gp = stgp.ConvGP()
gp.fit(trainingX, trainingY)
predictions = gp.predict(testingX)
```

If you have any questions, feel free to open an issue.

## About
The key idea is to combine aspects of genetic programming and convolutional neural networks to overcome various limitations of ConvNets, namely:

- The need for manually crafted architectures
- Poor interpretability. Google Brain appears to be doing some promising research in this area ([feature visualisation](https://distill.pub/2017/feature-visualization/)), but the interpretability of feature interactions remains a large limitation
- The need for large amounts of training data

The developed method uses strongly typed genetic programming to automatically evolve trees which can be used for binary image classification. An example of an evolved tree is shown below.

![Example Tree](res/images/example-tree.png "A sample solution for the JAFFE dataset")

A breakdown of the tree architecture is given below; the structure is enforced using strongly typed genetic programming.

![Example Architecture](res/images/tier.png "Example tree demonstrating the architecture")

The proposed method overcomes some of the aforementioned problems:

- The architecture is automatically evolved rather than manually crafted
- The solution offers high interpretability, as shown with the example above

Filter/kernel values are learnt through a combination of evolution and gradient descent: gradient descent is run periodically throughout the evolutionary process to optimise the filter values.

Papers to come, pending publication.

## Citing

Part of this work was published in the 2018 IEEE Congress on Evolutionary Computation (CEC) at https://ieeexplore.ieee.org/abstract/document/8477933, which can be cited as:

```latex
@INPROCEEDINGS{8477933,
  author={B. {Evans} and H. {Al-Sahaf} and B. {Xue} and M. {Zhang}},
  booktitle={2018 IEEE Congress on Evolutionary Computation (CEC)},
  title={Evolutionary Deep Learning: A Genetic Programming Approach to Image Classification},
  year={2018},
  volume={},
  number={},
  pages={1-6},
  keywords={convolution;feedforward neural nets;genetic algorithms;handwritten character recognition;image classification;learning (artificial intelligence);medical image processing;deep learning;genetic programming approach;image classification;cell images;convolutional neural networks;CNNs;genetic programming solution;image datasets;recognising handwritten digits;medical diagnosis;Computer architecture;Feature extraction;Machine learning;Visualization;Genetic programming;Task analysis;Image recognition;Genetic programming;Image classification;Deep learning;Feature extraction},
  doi={10.1109/CEC.2018.8477933},
  ISSN={},
  month={July},
}
```

The portion which incorporates gradient descent was made available online at https://arxiv.org/abs/1909.13030 and can be cited as follows (but please cite the paper above unless your work concerns the gradient descent component specifically):

```latex
@misc{evans2019genetic,
  title={Genetic Programming and Gradient Descent: A Memetic Approach to Binary Image Classification},
  author={Benjamin Patrick Evans and Harith Al-Sahaf and Bing Xue and Mengjie Zhang},
  year={2019},
  eprint={1909.13030},
  archivePrefix={arXiv},
  primaryClass={cs.NE}
}
```


#### Footnote
This work was originally written in [ECJ](https://cs.gmu.edu/~sean/papers/gecco17-ecj.pdf), and was based on some existing code from my supervisors' work on [2TGP](http://www.sciencedirect.com/science/article/pii/S0957417412003867). The code base has since been ported to Python 3, which is what you see here. This was done for a number of reasons, mainly the large number of available libraries, which reduce the overall code size (the original code was written entirely from scratch) and improve readability.
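
#### Extended example

A fuller (hedged) end-to-end sketch of how the pieces in `src/` fit together. The `data/jaffe` path, seeds and parameter values below are illustrative assumptions rather than shipped defaults:

```python
import random
import stgp
from ConvGP import read_data, format_and_split_data

random.seed(1)  # read_data shuffles with the global random module

# Images are expected in one subfolder per class, e.g. data/jaffe/happy, data/jaffe/sad
data, class_names = read_data("data/jaffe", scale=True, scaled_height=64, scaled_width=64)
trainingX, trainingY, testingX, testingY = format_and_split_data(data, class_names, split=0.5, seed=1)

gp = stgp.ConvGP(gd_frequency=10, lr=0.05)  # run gradient descent every 10 generations
gp.fit(trainingX, trainingY, seed=1)
print((gp.predict(testingX) == testingY).mean())  # test accuracy
```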
--------------------------------------------------------------------------------
/Report.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/benjaminpatrickevans/ConvGP/dade8af0e7bf686b5615ec6f60a256af489c50eb/Report.pdf

--------------------------------------------------------------------------------
/Slides.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/benjaminpatrickevans/ConvGP/dade8af0e7bf686b5615ec6f60a256af489c50eb/Slides.pdf

--------------------------------------------------------------------------------
/res/images/example-tree.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/benjaminpatrickevans/ConvGP/dade8af0e7bf686b5615ec6f60a256af489c50eb/res/images/example-tree.png

--------------------------------------------------------------------------------
/res/images/tier.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/benjaminpatrickevans/ConvGP/dade8af0e7bf686b5615ec6f60a256af489c50eb/res/images/tier.png

--------------------------------------------------------------------------------
/src/ConvGP.py:
--------------------------------------------------------------------------------

# coding: utf-8

import scipy as sp
from scipy import ndimage
import random
from glob import glob
import stgp
from sklearn import neighbors, svm, tree, naive_bayes, ensemble
from sklearn import metrics
from sklearn.utils import shuffle
from skimage import transform
import numpy as np
import search
import time
import sys
import math
import helpers
import os
import pickle

# # Read in the data for training/testing
# 1. Read the images from disk
# 2. Save these in a dict from label -> images
# 3. Split these based off label into training/testing images
#
# It needs to be done in this order to ensure we get an equal split of instances from each class;
# since classification accuracy is used as the fitness, this is important.

# Reads in all the data as a dict from label -> [images]
def read_data(directory, scale=False, scaled_height=None, scaled_width=None):
    data = {}

    # Assumes the images are in subfolders, where the folder name is the image's label
    for subdir in glob(directory+"/*/"):
        label = subdir.split("/")[-2]  # Second to last element is the class/subfolder name
        images = [ndimage.imread(image, flatten=True) for image in glob(subdir+"/*.*")]  # Read in all the images from subdirectories. Flatten for greyscale

        images = [image.astype(float) / 255. for image in images]  # Store in range 0..1 rather than 0..255

        if scale:  # Resize images
            images = [transform.resize(image, (scaled_height, scaled_width)) for image in images]

        # Shuffle the images (seed specified in run() so this will be reproducible)
        random.shuffle(images)
        data[label] = images

    # Set of all class names
    class_names = list(data.keys())

    # Sanity check (note: this previously printed an undefined "labels" variable)
    if len(class_names) != 2:
        print("Binary classification only! But labels found were:", class_names)

    return data, class_names

# Splits the data into four arrays trainingX, trainingY, testingX, testingY
def format_and_split_data(data, class_names, split, seed):
    trainingX = []
    trainingY = []

    testingX = []
    testingY = []

    # For all the classes, split into training/testing (need to do it per class to ensure we get a good split of all classes)
    for label in class_names:
        x = data[label]
        length = len(x)
        y = [label] * length

        training_length = int(length * split)
        trainingX.extend(x[:training_length])
        trainingY.extend(y[:training_length])

        testingX.extend(x[training_length:])
        testingY.extend(y[training_length:])

    # And just so the order isn't all of class one then all of class two, shuffle the data in unison
    trainingX, trainingY = shuffle(trainingX, trainingY, random_state=seed)
    testingX, testingY = shuffle(testingX, testingY, random_state=seed)

    return trainingX, trainingY, testingX, testingY


# # Run the various models
#
# Now that we have the data, we can run and evaluate the various algorithms

def pretty_float(f):
    return "{0:.2f}".format(f)

# The method of comparison
def classification_accuracy(real_labels, predicted_labels):
    return metrics.accuracy_score(real_labels, predicted_labels)

def fit_and_evaluate(model, trainingX, trainingY, testingX, testingY, seed=None, verbose=False):
    start = time.time()  # Track the time taken

    if seed is not None:
        model.fit(trainingX, trainingY, seed=seed, verbose=verbose)
    else:
        model.fit(trainingX, trainingY)

    training_time = time.time() - start

    predicted_training = model.predict(trainingX)

    start = time.time()
    predicted_testing = model.predict(testingX)
    testing_time = time.time() - start

    return classification_accuracy(trainingY, predicted_training), classification_accuracy(testingY, predicted_testing), training_time, testing_time


def print_stats(title, arr):
    print(title, pretty_float(np.min(arr)), pretty_float(np.mean(arr)), pretty_float(np.max(arr)), pretty_float(np.std(arr)), len(arr))


def run_general_classifiers(trainingX, trainingY, testingX, testingY):
    print("General Classifiers")

    # The general classification methods require a list of features rather than a 2d array, so we need to flatten these
    flattened_trainingX = [image.flatten() for image in trainingX]
    flattened_testingX = [image.flatten() for image in testingX]

    # The general classifiers to compare against
    general_classifiers = {
        "Nearest Neighbour": neighbors.KNeighborsClassifier(1),
        "SVM": svm.SVC(),
        "Decision Tree": tree.DecisionTreeClassifier(),
        "Naive Bayes": naive_bayes.GaussianNB(),
        "Adaboost": ensemble.AdaBoostClassifier()
    }

    print("Name, Training accuracy, Testing Accuracy, Training Time, Testing Time")

    # These methods are deterministic, so only need to be run once
    for classifier in general_classifiers:
        model = general_classifiers[classifier]
        training_accuracy, testing_accuracy, training_time, testing_time = fit_and_evaluate(model, flattened_trainingX, trainingY, flattened_testingX, testingY)
        print(classifier, pretty_float(training_accuracy * 100), pretty_float(testing_accuracy * 100), pretty_float(training_time * 1000), pretty_float(testing_time * 1000))


def run_convgp(trainingX, trainingY, testingX, testingY, evolution_seed, lr, gd_frequency, extended):
    print("ConvGP")

    convgp = stgp.ConvGP(lr=lr, gd_frequency=gd_frequency, extended=extended)

    # Print out the parameters for reference
    convgp.print_info()
    print("\tEvolution seed", evolution_seed)

    training_accuracy, testing_accuracy, training_time, testing_time = fit_and_evaluate(convgp, trainingX, trainingY, testingX, testingY, seed=evolution_seed, verbose=False)

    stats = [training_accuracy * 100, testing_accuracy * 100, training_time, testing_time]

    return convgp, stats

def save_stats(stats, file_name):
    with open(file_name, 'wb') as fp:
        pickle.dump(stats, fp)
    print(stats)

def run(dataset_name, training_seed, evolution_seed, lr, gd_frequency, scale=False, extended=False, training_split=0.5):
    # Reproducibility for shuffle
    random.seed(training_seed)

    # Which data to use
    data_directory = "data/"

    # Where to save the output
    output_directory = "out/"

    # If the dir doesn't exist, make it
    if not os.path.exists(output_directory):
        os.makedirs(output_directory)

    print("Data is:", dataset_name)
    print("Seed for data shuffle is:", training_seed)

    # Used only if scale is set to True. Must be used if images are of different sizes
    scaled_width = 64
    scaled_height = 64

    # Read and split data into training and testing.
    # Keyword arguments are used here: the previous positional call passed scaled_width into the scaled_height parameter (and vice versa)
    data, class_names = read_data(data_directory+dataset_name, scale=scale, scaled_height=scaled_height, scaled_width=scaled_width)
    trainingX, trainingY, testingX, testingY = format_and_split_data(data, class_names, training_split, training_seed)

    #run_general_classifiers(trainingX, trainingY, testingX, testingY)
    convgp, stats = run_convgp(trainingX, trainingY, testingX, testingY, evolution_seed, lr, gd_frequency, extended)

    extended_text = "E" if extended else ""
    lr_text = "-"+str(lr)+"-" if gd_frequency > 0 else ""
    out_prefix = output_directory + dataset_name+"-"+str(training_seed)+"-"+str(evolution_seed)+lr_text+str(gd_frequency)+extended_text

    convgp.save_logbook(out_prefix+"-logbook.txt")
    save_stats(stats, out_prefix+"-stats.txt")
    convgp.save_tree(out_prefix+"-best.png")


if __name__ == "__main__":
    if len(sys.argv) != 8:  # Program name plus seven arguments
        print("You must run the program with 7 args: dataset_name (str), training seed (int), evolutionary seed (int), learning rate (float), gradient descent frequency (int), scale (true/false), extended (true/false).")
        print(sys.argv)
        sys.exit()

    dataset_name = sys.argv[1]
    training_seed = int(sys.argv[2])
    evolution_seed = int(sys.argv[3])
    lr = float(sys.argv[4])
    gd_frequency = int(sys.argv[5])
    scale = sys.argv[6].upper() == "TRUE"
    extended = sys.argv[7].upper() == "TRUE"

    run(dataset_name, training_seed, evolution_seed, lr, gd_frequency, scale, extended)
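# Example invocation (a hedged sketch; the dataset name "jaffe" is a hypothetical
# subfolder of data/ containing one subdirectory per class, not something shipped here):
#   python ConvGP.py jaffe 1 2 0.05 10 false false
# i.e. dataset, training seed, evolution seed, learning rate, gd frequency, scale, extended.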
--------------------------------------------------------------------------------
/src/deapfix.py:
--------------------------------------------------------------------------------
import random
from inspect import isclass
import sys

# DEAP has some issues with strongly typed GP and tree generation (see here:
# https://groups.google.com/forum/#!searchin/deap-users/stgp/deap-users/YOeb65eRNG4/AYUMcNldhdwJ
# and here: https://groups.google.com/forum/#!msg/deap-users/adq50--lzJ4/hefHPJKpBQAJ ).
# The following code replaces some of the code in DEAP to make this work.

# The block of code below can largely be ignored. Essentially it is a workaround that removes
# the need for "identity nodes" (which bloat the trees) by fixing the issue of strongly typed
# trees failing to generate in particular circumstances (i.e. the full method requesting a
# primitive of a type that has none available).

# genHalfAndHalf, genFull and genGrow copied directly from DEAP (https://github.com/DEAP/deap/blob/master/deap/gp.py)
def genFull(pset, min_, max_, type_=None):
    def condition(height, depth):
        """Expression generation stops when the depth is equal to height."""
        return depth == height
    return generate(pset, min_, max_, condition, type_)

def genGrow(pset, min_, max_, type_=None):
    def condition(height, depth):
        """Expression generation stops when the depth is equal to height
        or when it is randomly determined that a node should be a terminal.
        """
        return depth == height or (depth >= min_ and random.random() < pset.terminalRatio)
    return generate(pset, min_, max_, condition, type_)


def genHalfAndHalf(pset, min_, max_, type_=None):
    method = random.choice((genGrow, genFull))
    return method(pset, min_, max_, type_)

# Small change made to the method below from the DEAP version: if you try to add a primitive,
# but none of the appropriate type is available, then try to add a terminal instead.
def generate(pset, min_, max_, condition, type_=None):
    """Generate a tree as a list of lists. The tree is built
    from the root to the leaves, and it stops growing when the
    condition is fulfilled.
    :param pset: Primitive set from which primitives are selected.
    :param min_: Minimum height of the produced trees.
    :param max_: Maximum height of the produced trees.
    :param condition: The condition is a function that takes two arguments,
                      the height of the tree to build and the current
                      depth in the tree.
    :param type_: The type that should return the tree when called, when
                  :obj:`None` (default) the type of :pset: (pset.ret)
                  is assumed.
    :returns: A grown tree with leaves at possibly different depths
              depending on the condition function.
    """
    if type_ is None:
        type_ = pset.ret
    expr = []
    height = random.randint(min_, max_)
    stack = [(0, type_)]
    while len(stack) != 0:
        depth, type_ = stack.pop()

        # If we are at the end of a branch, add a terminal
        if condition(height, depth):
            term = add_terminal(pset, type_)
            expr.append(term)

        # Otherwise add a function
        else:
            try:
                prim = random.choice(pset.primitives[type_])
                expr.append(prim)
                for arg in reversed(prim.args):
                    stack.append((depth + 1, arg))
            except IndexError:
                # This is where the change occurs: if no primitive is available, try to add a terminal instead
                term = add_terminal(pset, type_)
                expr.append(term)

    return expr


def add_terminal(pset, type_):
    try:
        term = random.choice(pset.terminals[type_])
    except IndexError:
        _, _, traceback = sys.exc_info()
        raise IndexError("The custom generate function tried to add a terminal of type '%s', but there is none available." % (type_,), traceback)
    if isclass(term):
        term = term()

    return term
--------------------------------------------------------------------------------
/src/evolution.py:
--------------------------------------------------------------------------------
from deap import tools, algorithms
import search
from copy import deepcopy

# This is a modified version of the eaSimple algorithm from the DEAP library
# (https://github.com/DEAP/deap/blob/master/deap/algorithms.py), modified to include gradient descent.
def gradientEvolution(population, toolbox, cxpb, mutpb, ngen, xs, ys, context, arguments, classes, gd_frequency, epochs, lr, extended=False, patience=10, stats=None,
                      halloffame=None, verbose=__debug__):

    logbook = tools.Logbook()
    logbook.header = ['gen', 'nevals'] + (stats.fields if stats else [])

    # Evaluate the individuals with an invalid fitness
    invalid_ind = [ind for ind in population if not ind.fitness.valid]
    fitnesses = toolbox.map(toolbox.evaluate, invalid_ind)
    for ind, fit in zip(invalid_ind, fitnesses):
        ind.fitness.values = fit

    if halloffame is not None:
        halloffame.update(population)

    record = stats.compile(population) if stats else {}
    logbook.record(gen=0, nevals=len(invalid_ind), **record)
    if verbose:
        print(logbook.stream)

    no_improvement = 0  # Number of generations without improvement in max fitness
    max_fitness = 0
    original_mutpb = mutpb

    # Run gradient descent on the entire population
    num_best = len(population)  # // 10

    # Begin the generational process. Note: starting from 1 (as eaSimple does) means generation 0
    # is not recorded twice, and the gen == ngen check below actually fires on the final generation
    # (previously the loop ran over range(0, ngen), so gen never reached ngen).
    for gen in range(1, ngen + 1):
        # Select the next generation individuals
        offspring = toolbox.select(population, len(population))

        # Vary the pool of individuals
        offspring = algorithms.varAnd(offspring, toolbox, cxpb, mutpb)

        # Evaluate the individuals with an invalid fitness
        invalid_ind = [ind for ind in offspring if not ind.fitness.valid]
        fitnesses = toolbox.map(toolbox.evaluate, invalid_ind)

        last_max = max_fitness

        for ind, fit in zip(invalid_ind, fitnesses):
            max_fitness = max(max_fitness, fit[0])
            ind.fitness.values = fit

        # Track how many generations have passed without improvement
        if max_fitness == last_max:
            no_improvement += 1
        else:
            no_improvement = 0
            mutpb = original_mutpb  # Reset mutation rate

        # If we haven't been progressing, keep increasing the mutation rate
        if no_improvement >= patience:
            increase = original_mutpb * 0.1  # 10% increase
            cxpb -= increase  # Decrease crossover
            if verbose:
                print("Increasing mutation due to no improvement", mutpb, increase)
            mutpb += increase  # Increase mutation

        # On the final generation apply gradient descent to the fittest individual only
        if gd_frequency != -1 and (gen == ngen or max_fitness == 1):
            if verbose:
                print("Applying gradient descent on fittest individual")

            fittest_index = 0
            highest_fitness = 0

            # Find the fittest individual (this previously read ind.fitness, a leftover loop
            # variable, rather than the individual being enumerated)
            for idx, individual in enumerate(offspring):
                fitness = individual.fitness.values[0]

                if fitness > highest_fitness:
                    highest_fitness = fitness
                    fittest_index = idx

            fittest_ind = offspring[fittest_index]

            if extended:
                epochs = 100

            updated_fittest_ind = search.gradient_descent(fittest_ind, xs, ys, context, arguments, classes, epochs, lr)

            # Update the individual in the offspring
            offspring[fittest_index] = updated_fittest_ind

        # Apply gradient descent every n generations
        elif gd_frequency != -1 and gen % gd_frequency == 0:
            if verbose:
                print("Applying gradient descent", gen)

            # Sort the offspring in descending fitness
            best_individuals = sorted(offspring, reverse=True, key=lambda ind: ind.fitness.values[0])

            # Run gradient descent on the best, leaving the rest unchanged
            cache = {}  # Map from tree string -> updated tree, so duplicated trees among the best individuals do not need gradient descent run again
            updated_best = []

            for ind in best_individuals[:num_best]:
                tree_str = str(ind)

                # If we have already seen this tree, use the cached value (don't run gradient descent!)
                if tree_str in cache:
                    updated = deepcopy(cache[tree_str])
                # Otherwise we need to run gradient descent and store the result in the cache
                else:
                    updated = search.gradient_descent(ind, xs, ys, context, arguments, classes, epochs, lr)
                    cache[tree_str] = updated

                updated_best.append(updated)

            best_individuals[:num_best] = updated_best

            # Update the offspring
            offspring = best_individuals


        # Update the hall of fame with the generated individuals
        if halloffame is not None:
            halloffame.update(offspring)


        # Replace the current population by the offspring
        population[:] = offspring

        # Append the current generation statistics to the logbook
        record = stats.compile(population) if stats else {}
        logbook.record(gen=gen, nevals=len(invalid_ind), **record)
        if verbose:
            print(logbook.stream)

        # Early exit, we achieved top fitness
        if max_fitness == 1:
            break


    return population, logbook
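
# Schedule sketch (illustrative, describing the behaviour above rather than adding to it):
# with ngen=50, gd_frequency=10 and patience=10, gradient descent runs after generations
# 10, 20, 30, 40 and on the final generation, while mutpb grows by 10% of its original value
# each generation once ten generations pass without the best fitness improving (and resets
# as soon as it improves).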
--------------------------------------------------------------------------------
/src/helpers.py:
--------------------------------------------------------------------------------
import autograd.numpy as np
from scipy import ndimage, ndarray
from autograd import grad
from autograd.scipy import signal
from autograd.scipy.special import expit
from autograd.extend import defvjp
import autograd.numpy.numpy_vjps as vjps
import skimage.measure

class OperatorOut:
    def __init__(self, value, features):
        self.value = value
        self.features = features

    def __str__(self):
        return "OperatorOut("+repr(self.value)+")"  # Was "Size(...)", a copy-paste slip

    __repr__ = __str__

# Numerically stable sigmoid function. Used for the output of the tree
def sigmoid(x):
    return expit(x)

# Performs an arithmetic operation, but also needs to deal with (value, features) pairs as inputs, not just floats
def arithmetic_op(fn, l, r):
    # First item in the tuple is the value, second is the list of features
    l_value, l_features = l
    r_value, r_features = r
    out = fn(l_value, r_value)

    return (out, l_features + r_features)


def extract_ellipse(pixels, start_x, start_y, agg_width, agg_height):
    # The top left was passed in, but it's nicer for computing if we just use the centre instead
    x_radius = agg_width // 2
    center_x = start_x + x_radius
    y_radius = agg_height // 2  # y centre, half height
    center_y = start_y + y_radius

    # Protect against any divide by zeros, by treating the radius as 1 instead
    if x_radius == 0:
        x_radius = 1

    if y_radius == 0:
        y_radius = 1

    # This formula gives a true value for all points within the ellipse, false for those outside (a basic ellipse with no rotation)
    ellipse = [((x-center_x)/x_radius)**2 + ((y-center_y)/y_radius)**2 <= 1 for (y, x), _ in np.ndenumerate(pixels)]
    ellipse = np.reshape(ellipse, pixels.shape)  # Back to an nd array rather than a list

    return pixels[ellipse]


def extract_window(pixels, shape, x, y, w, h):
    shape = shape.value
    dimensions = pixels.shape

    # Image is in row, col order
    img_width = dimensions[1]
    img_height = dimensions[0]

    # Convert to integers so we can use them for indexing
    x_start = int(x * img_width)
    y_start = int(y * img_height)
    agg_width = int(w * img_width)
    agg_height = int(h * img_height)

    # Ensure we are within the image's bounds
    x_end = min(x_start + agg_width, img_width)
    y_end = min(y_start + agg_height, img_height)

    values = None

    # Need to extract different regions based off the shape
    if shape == "Rectangle":
        values = pixels[y_start:y_end, x_start:x_end]
    elif shape == "Column":
        values = pixels[y_start:y_end, x_start:min(x_start+1, img_width)]
    elif shape == "Row":
        values = pixels[y_start:min(y_start+1, img_height), x_start:x_end]
    elif shape == "Ellipse":
        values = extract_ellipse(pixels, x_start, y_start, agg_width, agg_height)
    else:
        print("Shape not found!!")

    # Convert to a 1d array (the original call discarded the result, as flatten is not in-place)
    if values is not None:
        values = values.flatten()

    return values

# Convolve takes an image and a filter to apply, and returns the convolved image (with ReLU applied).
def convolve(image, kernel, filter_size):
    # Currently the kernel values are a list, but we want a 2d array
    kernel = np.reshape(kernel, (filter_size, filter_size))

    # TODO: APPLY PADDING
    convolved = signal.convolve(image, kernel, mode='valid')

    # ReLU
    activated = np.maximum(0, convolved)

    return activated

# Pooling takes an image, and returns a subsampled version. Uses max pooling of the specified size
def pooling(image, size):
    m, n = image.shape

    # NOTE: This has only been tested with size=2; it will likely need changes for different pooling sizes
    if m % size != 0:
        image = image[:-1, :]

    if n % size != 0:
        image = image[:, :-1]

    m, n = image.shape

    out = image.reshape(m//size, size, n//size, size).max(axis=(1, 3))
    return out

# Take a shape/region of the image and apply the given function to this region
def agg(fn, image, shape, x, y, width, height):
    # These are wrapper objects (Position/Size), so unwrap the raw float values when passing through
    window = extract_window(image, shape, x.value, y.value, width.value, height.value)
    out = fn(window) if window.size > 0 else 0
    return (out, [out])


# Reference copy of autograd's mean gradient (unused directly; the defvjp call below uses
# autograd's own vjps.grad_np_mean). The undefined anp/repeat_to_match_shape names in the
# original have been pointed at the imported modules.
def grad_np_mean(ans, x, axis=None, keepdims=False):
    shape, dtype = np.shape(x), np.result_type(x)
    def vjp(g):
        g_repeated, num_reps = vjps.repeat_to_match_shape(g, shape, dtype, axis, keepdims)
        return g_repeated / num_reps
    return vjp

# Important - we need to redefine the gradient for standard deviation, as it will produce nans if the std is 0.
# We can just use the mean's gradient, as the magnitude of each value is not hugely important for gradient descent (rather the direction)
defvjp(np.std, vjps.grad_np_mean)

def protectedDiv(num, den):
    if den != 0:
        return num / den
    else:
        return 0.

def mse_loss(real_label, predicted_label):
    return 1/2 * (real_label - predicted_label)**2

def ce_loss_old(real_label, predicted_label):
    # Just to be safe, so we don't take the log of zero
    predicted_label = bound_check(predicted_label)
    return - np.sum(np.multiply(real_label, np.log(predicted_label)) + np.multiply((1-real_label), np.log(1-predicted_label)))

# Works for only a single label input
def ce_loss(real_label, predicted_label):
    # Just to be safe, so we don't take the log of zero
    predicted_label = bound_check(predicted_label)
    class_zero = real_label * np.log(predicted_label)  # Need to use np.log rather than math.log for autograd to work
    class_one = (1-real_label) * np.log(1-predicted_label)
    return - (class_zero + class_one)


# Clamp 1s to 0.99999999 and 0s to 0.00000001 - no effect on other inputs
def bound_check(x):
    if x == 1:
        return 0.99999999
    elif x == 0:
        return 0.00000001
    else:
        return x
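
# Minimal sketch of the convolution + pooling pipeline on a toy image (illustrative values only):
if __name__ == "__main__":
    img = np.arange(16, dtype=float).reshape(4, 4) / 16.
    kernel = [0., 0., 0., 0., 1., 0., 0., 0., 0.]  # 3x3 "identity" kernel as a flat list, as evolved filters are stored
    conv = convolve(img, kernel, filter_size=3)    # 'valid' convolution + ReLU -> 2x2
    print(pooling(conv, 2))                        # 2x2 max pooling -> 1x1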
--------------------------------------------------------------------------------
/src/search.py:
--------------------------------------------------------------------------------
from deap import gp
import helpers
import autograd.numpy as np
import autograd.scipy.signal
import sys
from copy import deepcopy
from autograd import grad
from multiprocessing import Pool
from functools import partial
from sklearn.utils import shuffle


def compile(code, context, arguments):
    if len(arguments) > 0:
        args = ",".join(arg for arg in arguments)
        code = "lambda {args}: {code}".format(args=args, code=code)
    try:
        return eval(code, context, {})
    except MemoryError:
        _, _, traceback = sys.exc_info()  # sys import added above; it was missing in the original
        raise MemoryError("Gradient Descent : Error in tree evaluation :"
                          " Python cannot evaluate a tree higher than 90. "
                          "To avoid this problem, you should use bloat control on your "
                          "operators. See the DEAP documentation for more information. "
                          "DEAP will now abort.", traceback)


def one_example_loss(pair, tree, filters, classes):
    x, y = pair
    real_class = 1. if y == classes[0] else 0.
    out = tree(x, *filters)[0]
    predicted_class = helpers.sigmoid(out)  # Pass through a sigmoid, since we treat the output as a probability
    loss = helpers.ce_loss(real_class, predicted_class)
    return loss

def compute_loss(tree, xs, ys, filters, classes):
    length = len(xs)

    # Need to pass in the tree, filters and classes
    partial_func = partial(one_example_loss, tree=tree, filters=filters, classes=classes)

    # Now pass each of the images to the function. Using map means this could be parallelised (e.g. with a process pool)
    losses = map(partial_func, zip(xs, ys))
    total_loss = sum(losses)

    # Average the loss so the batch size doesn't affect the magnitude
    return total_loss / float(length)


# Replace all filters with placeholders, so we can pass them in as arguments for computing derivatives easily
def replace_filters_with_args(tree_str, labels, context, arguments):
    filters = {}  # Keep a copy of the original filters
    prefix = "Filter"

    for node, label in labels.items():
        # Filters are represented as lists, so we only care about lists
        if type(label) == list:
            arg_name = prefix + str(node)
            filters[node] = label  # Save the original value
            labels[node] = arg_name  # Replace the value
            arguments.append(arg_name)  # Add to the list of arguments, prefixed with "Filter"

    # Replace all the filters in the tree with their placeholders
    for arg, value in filters.items():
        tree_str = tree_str.replace(str(value), prefix + str(arg), 1)  # Replace only one occurrence, in case two filters have the same value

    callable_tree = compile(tree_str, context, arguments)

    return callable_tree, filters

# Creates a generator of batches
def make_batches(x, y, batch_size):
    length = len(x)
    for i in range(length//batch_size):
        yield x[batch_size*i:batch_size*(i+1)], y[batch_size*i:batch_size*(i+1)]

def split_and_shuffle(x, y, split=0.8):
    x, y = shuffle(x, y)
    length = int(len(x) * split)

    return x[:length], y[:length], x[length:], y[length:]

# Takes a tree, and performs gradient descent on its filter parameters
def gradient_descent(original_tree, xs, ys, context, arguments, classes, epochs, lr):
    tree = deepcopy(original_tree)  # Don't modify the original tree

    tree_str = str(tree)

    # If there are no convolutions, don't bother performing gradient descent as there are no updateable params
    if "Conv" not in tree_str:
        return tree

    _, _, labels = gp.graph(tree)

    arguments = list(arguments)  # Make a copy of the arguments so we don't modify the original args

    callable_tree, original_filters = replace_filters_with_args(tree_str, labels, context, arguments)

    keys = sorted(list(original_filters.keys()))  # Make sure the order is correct, should be ascending
    filters = [original_filters[key] for key in keys]  # Retrieve the original filter values

    # Split into validation and training sets
    trainX, trainY, validX, validY = split_and_shuffle(xs, ys)
    batch_size = len(trainX) // 10

    # Whether or not to decay the learning rate
    decay = True

    # Flag for when to exit
    finished = False

    # Start with the largest loss
    lowest_loss = float("inf")
    best_filters = list(filters)

    # Number of iterations without improvement to wait before exiting
    patience = epochs // 5  # 20%
    num_iterations_without_improv = 0

    last_loss = lowest_loss

    # Gradient descent
    for i in range(epochs):

        # Learning rate decay: halve every 10 epochs
        if i > 0 and i % 10 == 0:
            lr /= 2

        for batchXs, batchYs in make_batches(trainX, trainY, batch_size):
            # Compute changes
            grad_fn = grad(compute_loss, 3)  # Compute the derivative w.r.t. the filters - the third index is the filters, see the line below
            deltas = grad_fn(callable_tree, batchXs, batchYs, filters, classes)
            deltas = np.asarray(deltas)

            # Apply changes
            filters -= (lr * deltas)

            # Calculate the loss on the validation set
            validation_loss = compute_loss(callable_tree, validX, validY, filters, classes)

            # If this is the lowest loss we've seen, save the filters
            if validation_loss < lowest_loss:
                lowest_loss = validation_loss
                best_filters = list(filters)

            # If we are getting worse on the validation set, record how many epochs this has occurred for
            if validation_loss > last_loss:
                num_iterations_without_improv += 1
            else:
                num_iterations_without_improv = 0

            # If we haven't improved for patience epochs, then exit
            if num_iterations_without_improv >= patience:
                filters = best_filters
                finished = True
                break

            last_loss = validation_loss

        if finished:
            break


    # Need to modify the tree to have the new filter values - done in this fashion to preserve the order
    for idx, key in enumerate(keys):
        tree[key] = deepcopy(tree[key])  # For some reason deepcopy doesn't go DEEP, so copy the node again before mutating it
        tree[key].value = list(filters[idx])

    return tree
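
# Minimal sketch of the compile helper above (the expression and context here are hypothetical):
if __name__ == "__main__":
    fn = compile("add(x, y)", {"add": lambda a, b: a + b}, ["x", "y"])
    print(fn(1.0, 2.0))  # 3.0 - trees are turned into callables the same way, with filters as extra args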
--------------------------------------------------------------------------------
/src/stgp.py:
--------------------------------------------------------------------------------

# coding: utf-8

#import numpy as np
import autograd.numpy as np
from scipy import ndimage, ndarray
import skimage.measure
from sklearn import metrics
from deap import algorithms, base, creator, tools, gp
import random, operator, math
import matplotlib.pyplot as plt
from glob import glob
import pygraphviz as pgv
from autograd import grad
import datetime
import deapfix
import helpers
import evolution
from enum import Enum
import sys
from scoop import futures
import pickle

# Custom classes for the strongly typed GP structure

class Size:
    def __init__(self, value):
        self.value = value

    def __str__(self):
        return "Size("+repr(self.value)+")"

    __repr__ = __str__

class Position:
    def __init__(self, value):
        self.value = value

    def __str__(self):
        return "Position("+repr(self.value)+")"

    __repr__ = __str__


class Shape:

    allowable_shapes = {"Rectangle", "Ellipse", "Column", "Row"}

    def __init__(self, value):
        if value not in Shape.allowable_shapes:
            raise Exception("Invalid shape: ", value)
        self.value = value

    def __str__(self):
        return "Shape("+repr(self.value)+")"

    __repr__ = __str__


class ConvGP():

    """ Classifier implementing the ConvGP method. Implemented using the DEAP library. """

    def __init__(self, pooling_size=2, filter_size=3,
                 pop_size=1024, generations=50, tourn_size=7, num_best=5, crs_rate=0.75, mut_rate=0.2, gd_frequency=10, epochs=100, lr=0.05, extended=False):

        self.pooling_size = pooling_size
        self.filter_size = filter_size
        self.pop_size = pop_size
        self.generations = generations
        self.tourn_size = tourn_size
        self.num_best = num_best
        self.crs_rate = crs_rate
        self.mut_rate = mut_rate

        self.gd_frequency = gd_frequency  # When to run gradient descent; 5 means every 5 generations. Set to -1 to disable gradient descent

        self.epochs = epochs
        self.lr = lr
        self.extended = extended  # Whether or not to apply gradient descent for an extended period on the final generation

        # Set before fit, so the "fit before predict/save" checks below raise their intended messages
        self.tree = None
        self.logbook = None

        self.pset = self.create_pset()
        self.mstats = self.create_stats()
        self.toolbox = self.create_toolbox()

    # Prints out the parameters for reference
    def print_info(self):
        print("ConvGP Settings")
        print("\tPooling size:", self.pooling_size)
        print("\tFilter size:", self.filter_size)
        print("\tPopulation size:", self.pop_size)
        print("\tTournament size:", self.tourn_size)
        print("\tGenerations:", self.generations)

        print("\tCrossover rate:", self.crs_rate)
        print("\tMutation rate:", self.mut_rate)
        print("\tReproduction rate:", 1 - self.mut_rate - self.crs_rate)

        print("\tGradient descent frequency:", self.gd_frequency)
        print("\tEpochs:", self.epochs)
        print("\tLearning Rate:", self.lr)

    # Chooses the appropriate class based on a probability value (classes_[0] if > 0.5, else classes_[1]).
    # Defined here so it can be used from outside the class in a consistent way
    def determine_class(self, value):
        return self.classes_[0] if value > 0.5 else self.classes_[1]

    # Use the tree to predict class labels
    def predict_labels(self, individual, data):
        # Transform the tree expression into a callable function
        tree = self.toolbox.compile(expr=individual)

        predicted_labels = [self.determine_class(helpers.sigmoid(tree(image)[0])) for image in data]

        return np.asarray(predicted_labels)

    def predict_probabilities(self, individual, data):
        tree = self.toolbox.compile(expr=individual)

        predicted_labels = []

        for image in data:
            out = tree(image)[0]

            # Probability for the two classes - since binary
            zero_probability = helpers.sigmoid(out)
            one_probability = 1 - zero_probability
            predicted_labels.append([zero_probability, one_probability])

        return np.asarray(predicted_labels)

    # How should the fitness of an individual be determined? In this case use classification accuracy
    def fitness_function(self, individual, data, real_labels):
        predicted_labels = self.predict_labels(individual, data)

        # Percentage of elementwise matches between real and predicted labels
        classification_accuracy = metrics.accuracy_score(real_labels, predicted_labels)

        # Deap requires multiple values to be returned, so the comma is important!
        return classification_accuracy,


    def fit(self, trainingX, trainingY, seed=1, verbose=True):
        # Reproducibility
        random.seed(seed)
        np.random.seed(seed)

        # The possible class labels/names as an alphabetical list. This is first converted to a set to get the unique values.
        self.classes_ = sorted(set(trainingY))

        if len(self.classes_) != 2:
            raise Exception("This method only supports binary classification! But labels found were: " + str(self.classes_))

        self.toolbox.register("evaluate", self.fitness_function, data=trainingX, real_labels=trainingY)

        # Run the training process
        pop = self.toolbox.population(n=self.pop_size)
        hof = tools.HallOfFame(self.num_best)
        pop, log = evolution.gradientEvolution(pop, self.toolbox, self.crs_rate, self.mut_rate, self.generations,
                                               trainingX, trainingY, self.pset.context, self.pset.arguments, self.classes_, self.gd_frequency, self.epochs, self.lr, self.extended, stats=self.mstats, halloffame=hof, verbose=verbose)

        # Save the results
        self.logbook = log
        self.hof = hof
        self.tree = hof[0]


    def predict(self, data):
        if self.tree is None:
            raise Exception("You must call fit before predict!")

        # Predict the labels using the saved tree
        return self.predict_labels(self.tree, data)

    # Return probabilities
    def predict_proba(self, data):
        if self.tree is None:
            raise Exception("You must call fit before predict!")

        return self.predict_probabilities(self.tree, data)


    def create_pset(self):
        # The program takes in a single image as input, and outputs a float which is then used for classification by thresholding
        pset = gp.PrimitiveSetTyped("MAIN", [ndarray], tuple)

        # Need to add the custom types, so deap is able to compile these
        pset.context["Size"] = Size
        pset.context["Position"] = Position
        pset.context["Shape"] = Shape

        # ================
        # Terminal set
        # ================

        # Convolution tier
        pset.addEphemeralConstant("Filter", lambda: [random.uniform(-1, 1) for _ in range(self.filter_size * self.filter_size)], list)

        # Aggregation tier
        pset.addEphemeralConstant("Shape", lambda: Shape(random.choice(tuple(Shape.allowable_shapes))), Shape)  # Shape of window

        pset.addEphemeralConstant("Size", lambda: Size(random.uniform(0.15, 0.75)), Size)

        pset.addEphemeralConstant("Pos", lambda: Position(random.uniform(0.05, 0.90)), Position)  # Size and position of window

        # Classification tier
        pset.addEphemeralConstant("Random", lambda: random.uniform(-1, 1), float)

        # To respect half and half generation
        pset.addEphemeralConstant("RandomTuple", lambda: (random.uniform(-1, 1), []), tuple)

        # ================
        # Function set
        # ================

        # Convolution tier
        pset.addPrimitive(lambda image, kernel: helpers.convolve(image, kernel, self.filter_size), [ndarray, list], ndarray, name="Convolution")
        pset.addPrimitive(lambda image: helpers.pooling(image, self.pooling_size), [ndarray], ndarray, name="Pooling")

        # Aggregation tier - the inputs correspond to: Image, Shape, X, Y, Width, Height.
        # The output is a pair containing the output of the aggregation function, and the same output stored in a list for feature construction purposes
        pset.addPrimitive(lambda *args: helpers.agg(np.min, *args), [ndarray, Shape, Position, Position, Size, Size], tuple, name="aggmin")
        pset.addPrimitive(lambda *args: helpers.agg(np.mean, *args), [ndarray, Shape, Position, Position, Size, Size], tuple, name="aggmean")
        pset.addPrimitive(lambda *args: helpers.agg(np.max, *args), [ndarray, Shape, Position, Position, Size, Size], tuple, name="aggmax")
        pset.addPrimitive(lambda *args: helpers.agg(np.std, *args), [ndarray, Shape, Position, Position, Size, Size], tuple, name="aggstd")

        # Classification tier - the basic arithmetic operators, however they need to take tuples since the features are passed through the tree
        pset.addPrimitive(lambda x, y: helpers.arithmetic_op(operator.add, x, y), [tuple, tuple], tuple, name="add")
        pset.addPrimitive(lambda x, y: helpers.arithmetic_op(operator.sub, x, y), [tuple, tuple], tuple, name="sub")
        pset.addPrimitive(lambda x, y: helpers.arithmetic_op(operator.mul, x, y), [tuple, tuple], tuple, name="mul")
        pset.addPrimitive(lambda x, y: helpers.arithmetic_op(helpers.protectedDiv, x, y), [tuple, tuple], tuple, name="div")

        return pset

    def create_stats(self):
        # Data to track per generation. Track the min, mean, std, max of the fitness and tree sizes
        stats_fit = tools.Statistics(lambda ind: ind.fitness.values)
        stats_size = tools.Statistics(len)

        mstats = tools.MultiStatistics(fitness=stats_fit, size=stats_size)
        mstats.register("avg", np.mean)
        mstats.register("std", np.std)
        mstats.register("min", np.min)
        mstats.register("max", np.max)

        return mstats

    def create_toolbox(self):
        # This is a maximisation problem, as fitness is classification accuracy (the higher the better)
        creator.create("FitnessMax", base.Fitness, weights=(1.0,))

        # Individuals in the population should be represented as tree structures (standard GP)
        creator.create("Individual", gp.PrimitiveTree, fitness=creator.FitnessMax)

        toolbox = base.Toolbox()

        # Ramped half and half generation (full and grow)
        toolbox.register("expr", deapfix.genHalfAndHalf, pset=self.pset, min_=2, max_=5)
        toolbox.register("individual", tools.initIterate, creator.Individual, toolbox.expr)
        toolbox.register("population", tools.initRepeat, list, toolbox.individual)
        toolbox.register("compile", gp.compile, pset=self.pset)

        # Tournament selection
        toolbox.register("select", tools.selTournament, tournsize=self.tourn_size)
        toolbox.register("mate", gp.cxOnePoint)
        toolbox.register("expr_mut", deapfix.genFull, min_=0, max_=2)
        toolbox.register("mutate", gp.mutUniform, expr=toolbox.expr_mut, pset=self.pset)

        # Max tree heights for crossover and mutation
        toolbox.decorate("mate", gp.staticLimit(key=operator.attrgetter("height"), max_value=10))
        toolbox.decorate("mutate", gp.staticLimit(key=operator.attrgetter("height"), max_value=10))


        return toolbox

    # Used to return a list of constructed features for an image, using the best individual
    def construct_features(self, image):
        if self.tree is None:
            raise Exception("You must call fit before attempting to construct features!")

        callable_tree = self.to_callable(self.tree)

        # 0 is the output, 1 is the constructed features
        return callable_tree(image)[1]

    # Convert a tree to a callable function
    def to_callable(self, individual):
        return self.toolbox.compile(expr=individual)

    def save_logbook(self, file_name):
        if self.logbook is None:
            raise Exception("You must call fit before save!")

        with open(file_name, 'wb') as fp:
            pickle.dump(self.logbook, fp)

    # To plot/draw the resulting trees
    def save_tree(self, file_name):
        if self.tree is None:
            raise Exception("You must call fit before save!")

        nodes, edges, labels = gp.graph(self.tree)

        g = pgv.AGraph()
        g.add_nodes_from(nodes)
        g.add_edges_from(edges)
        g.layout(prog="dot")

        for i in nodes:
            n = g.get_node(i)
            label = labels[i]
            label_type = type(label)

            # Pretty formatting
            if label_type in [Position, Size]:
                # 2 decimal points
                label = "{0:.2f}".format(label.value)
            elif label_type == Shape:
                label = label.value
            elif label_type == tuple:
                # 2DP; only the first part of the tuple matters (the second part is the constructed features)
                label = "{0:.2f}".format(label[0])
            elif label_type == list:
                formatted_list = ["{0:.2f}".format(elem) for elem in label]
                # Reshape to the filter dimensions (was hard-coded to (3, 3), which only held for the default filter_size)
                formatted_list = np.reshape(formatted_list, (self.filter_size, self.filter_size))
                label = formatted_list

            n.attr["label"] = label


        g.draw(file_name)

--------------------------------------------------------------------------------