├── .gitignore ├── LICENSE ├── Makefile ├── README.md ├── deepswarm ├── __init__.py ├── aco.py ├── backends.py ├── deepswarm.py ├── log.py ├── nodes.py └── storage.py ├── examples ├── cifar10.py ├── context.py ├── fashion-mnist.py └── mnist.py ├── requirements.txt ├── settings ├── cifar10.yaml ├── default.yaml ├── fashion-mnist.yaml └── mnist.yaml ├── setup.py └── tests ├── test_aco.py ├── test_graph.py └── test_nodes.py /.gitignore: -------------------------------------------------------------------------------- 1 | # macOS 2 | .DS_Store 3 | 4 | # Python 5 | *.pyc 6 | __pycache__/ 7 | deepswarm-env/ 8 | 9 | # Build files 10 | build/ 11 | *.egg-info/ 12 | dist/ 13 | 14 | # Log files 15 | saves/ 16 | 17 | # Temporary files 18 | temp-model -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2018 Edvinas Byla 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | .PHONY: test clean upload 2 | 3 | test: 4 | python -m unittest discover tests 5 | 6 | clean: 7 | rm -rf build *.egg-info dist 8 | find . -name '*.pyc' -exec rm -f {} + 9 | find . -name '*.pyo' -exec rm -f {} + 10 | find . -name '*~' -exec rm -f {} + 11 | 12 | upload: clean 13 | python setup.py sdist bdist_wheel 14 | twine upload dist/* -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 |

2 |
3 |
4 |
5 |
6 | Neural Architecture Search Powered by Swarm Intelligence 🐜
7 |
8 |
9 |
10 | # DeepSwarm [![](https://img.shields.io/badge/python-3.6+-brightgreen.svg)](https://www.python.org/downloads/release/python-360/) [![](https://img.shields.io/badge/TensorFlow-1.13.1-brightgreen.svg)](https://www.tensorflow.org/)
11 |
12 | DeepSwarm is an open-source library that uses Ant Colony Optimization to tackle the neural architecture search problem. Its main goal is to automate one of the most tedious and daunting tasks, designing a neural network architecture, so people can spend more of their time on more important and interesting things. DeepSwarm offers a powerful configuration system which allows you to fine-tune the search space to your needs.
13 |
14 | ## Example 🖼
15 |
16 | ```python
17 | from deepswarm.backends import Dataset, TFKerasBackend
18 | from deepswarm.deepswarm import DeepSwarm
19 |
20 | dataset = Dataset(training_examples=x_train, training_labels=y_train, testing_examples=x_test, testing_labels=y_test)
21 | backend = TFKerasBackend(dataset=dataset)
22 | deepswarm = DeepSwarm(backend=backend)
23 | topology = deepswarm.find_topology()
24 | trained_topology = deepswarm.train_topology(topology, 50)
25 |
26 | ```
27 |
28 | ## Installation 💾
29 |
30 | 1. Install the package
31 |
32 | ```sh
33 | pip install deepswarm
34 | ```
35 | 2. Install the backend that you want to use
36 |
37 | ```sh
38 | pip install tensorflow-gpu==1.13.1
39 | ```
40 |
41 | ## Usage 🕹
42 |
43 | 1. Create a new file containing the example code
44 |
45 | ```sh
46 | touch train.py
47 | ```
48 | 2. Create a settings directory which contains a `default.yaml` file. Alternatively, you can run the script and immediately stop it, as this should automatically create a settings directory containing a `default.yaml` file
49 |
50 | 3. Update the newly created YAML file to match your dataset. The only two important changes you must make are: (1) change the loss function to reflect your task, and (2) change the shapes of the input and output nodes, as sketched below
51 |
52 |
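For example, for 28x28 grayscale images and 10 classes, the relevant parts of the YAML might look roughly like this (a sketch only: the node names and attribute syntax here are illustrative assumptions, so take the exact structure from the `default.yaml` that DeepSwarm generates):

```yaml
DeepSwarm:
    backend:
        loss: sparse_categorical_crossentropy

Nodes:
    InputNode:
        type: Input
        attributes:
            shape: [!!python/tuple [28, 28, 1]]
    OutputNode:
        type: Output
        attributes:
            output_size: [10]
            activation: [Softmax]
```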

53 | ## Search 🔎
54 |
55 |
56 |
57 |
58 |
59 | (1) The ant is placed on the input node. (2) The ant checks what transitions are available. (3) The ant uses the ACS selection rule to choose the next node. (4) After choosing the next node, the ant selects the node's attributes. (5) After all ants have finished their tours, the pheromone is updated. (6) The maximum allowed depth is increased and a new ant population is generated.
60 |
61 | Note: Arrow thickness indicates the pheromone amount, meaning that thicker arrows have more pheromone.
62 |
63 | ## Configuration 🛠
64 |
65 | | Node type | Attributes |
66 | | :------------- |:-------------|
67 | | Input | **shape**: tuple which defines the input shape; depending on the backend this could be (width, height, channels) or (channels, width, height). |
68 | | Conv2D | **filter_count**: defines how many filters can be used. <br> **kernel_size**: defines what size kernels can be used. For example, if it is set to [1, 3], then only 1x1 and 3x3 kernels will be used. <br> **activation**: defines what activation functions can be used. Allowed values are: ReLU, ELU, LeakyReLU, Sigmoid and Softmax. |
69 | | Dropout | **rate**: defines the allowed dropout rates. For example, if it is set to [0.1, 0.3], then either 10% or 30% of the input units will be dropped. |
70 | | BatchNormalization | - |
71 | | Pool2D | **pool_type**: defines the types of allowed pooling nodes. Allowed values are: max (max pooling) and average (average pooling). <br> **pool_size**: defines the allowed pooling window sizes. For example, if it is set to [2], then only 2x2 pooling windows will be used. <br> **stride**: defines the allowed stride sizes. |
72 | | Flatten | - |
73 | | Dense | **output_size**: defines the allowed output space dimensionality. <br> **activation**: defines what activation functions can be used. Allowed values are: ReLU, ELU, LeakyReLU, Sigmoid and Softmax. |
74 | | Output | **output_size**: defines the output size (how many different classes to classify). <br> **activation**: defines what activation functions can be used. Allowed values are: ReLU, ELU, LeakyReLU, Sigmoid and Softmax. |
75 |
76 | | Setting | Description |
77 | | :------------- |:-------------|
78 | | save_folder | Specifies the name of the folder which should be used to load a backup. If not specified, the search starts from zero. |
79 | | metrics | Specifies which metric the algorithm should use to evaluate the models. Currently available options are: accuracy and loss. |
80 | | max_depth | Specifies the maximum allowed network depth (how deeply the graph can be expanded). The search is performed until the maximum depth is reached. However, this does not mean that the depth of the best architecture will be equal to max_depth. |
81 | | reuse_patience | Specifies the maximum number of times that weights can be reused without improving the cost. For example, if it is set to 1, it means that when some model X reuses weights from model Y and model X's cost did not improve compared to model Y, next time, instead of reusing model Y's weights, new random weights will be generated. |
82 | | start | Specifies the starting pheromone value for all new connections. |
83 | | decay | Specifies the local pheromone decay rate, given as a fraction. For example, if it is set to 0.1, during the local pheromone update the pheromone value will be decreased by 10%. |
84 | | evaporation | Specifies the global pheromone evaporation rate, given as a fraction. For example, if it is set to 0.1, during the global pheromone update the pheromone value will be decreased by 10%. |
85 | | greediness | Specifies how greedy ants should be during edge selection, given as a fraction. For example, 0.5 means that 50% of the time, when an ant selects a new edge, it selects the one with the highest associated probability. |
86 | | ant_count | Specifies how many ants should be generated during each generation (the time before the depth is increased). |
87 | | epochs | Specifies for how many epochs each candidate architecture should be trained. |
88 | | batch_size | Specifies the batch size (number of samples used to calculate a single gradient step) used during the training process. |
89 | | patience | Specifies the early stopping patience used during training (after how many epochs without cost improvement the training process should be stopped). |
90 | | loss | Specifies which loss function should be used during training. Currently available options are sparse_categorical_crossentropy and categorical_crossentropy. |
91 | | spatial_nodes | Specifies which nodes can be placed before the flattening node. Values in this array must correspond to node names. |
92 | | flat_nodes | Specifies which nodes can be placed after the flattening node (the array should also include the flattening node itself). Values in this array must correspond to node names. |
93 | | verbose | Specifies if the associated component should log its output. |
94 |
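The decay and evaporation values plug directly into the two update rules implemented in `deepswarm/aco.py` (`local_update` and `global_update`, shown later in this listing). A small worked example of that arithmetic, with illustrative values:

```python
start, decay, evaporation = 0.1, 0.1, 0.1
old = 0.5  # current pheromone on some edge

# Local update, applied after every ant: decays the edge towards the start value
local = (1 - decay) * old + decay * start
print(local)  # ~0.46

# Global update, applied only along the best ant's path: mixes in the ant's cost
# (when metrics is loss, the code uses 1 / (cost * 10) as the added pheromone instead)
cost = 0.9  # e.g. accuracy when metrics is set to accuracy
new = (1 - evaporation) * old + evaporation * cost
print(new)  # ~0.54
```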
95 | ## Future goals 🌟
96 |
97 | - [ ] Add a node which can combine the input from the two previous nodes.
98 | - [ ] Add a node which can skip depth n in order to connect to a node at depth n+1.
99 | - [ ] Delete the models which are no longer referenced.
100 | - [ ] Add an option to assemble the best n models into one model.
101 | - [ ] Add functionality to reuse the weights from non-contiguous blocks, i.e. take the best weights for depth n-1 from one model and then take the best weights for depth n+1 from another model.
102 |
103 | ## Citation 🖋
104 |
105 | Online version is available at: [arXiv:1905.07350](https://arxiv.org/abs/1905.07350)
106 | ```bibtex
107 | @article{byla2019deepswarm,
108 |   title = {DeepSwarm: Optimising Convolutional Neural Networks using Swarm Intelligence},
109 |   author = {Edvinas Byla and Wei Pang},
110 |   journal = {arXiv preprint arXiv:1905.07350},
111 |   year = {2019}
112 | }
113 | ```
114 |
115 | ## Acknowledgments 🎓
116 |
117 | DeepSwarm was developed under the supervision of [Dr Wei Pang](https://www.abdn.ac.uk/ncs/people/profiles/pang.wei) in partial fulfilment of the requirements for the degree of Bachelor of Science of the [University of Aberdeen](https://www.abdn.ac.uk).
118 |
--------------------------------------------------------------------------------
/deepswarm/__init__.py:
--------------------------------------------------------------------------------
1 | # Copyright (c) 2019 Edvinas Byla
2 | # Licensed under MIT License
3 |
4 | import argparse
5 | import operator
6 | import os
7 | import sys
8 |
9 | from pathlib import Path
10 | from shutil import copyfile
11 | from yaml import load, Loader
12 |
13 |
14 | # Create an argument parser which allows users to pass a custom settings file name
15 | # If the user didn't pass a custom settings file name, use the invoked script's name
16 | parser = argparse.ArgumentParser()
17 | parser.add_argument('-s', '--settings_file_name', default=os.path.basename(sys.argv[0]),
18 |     help='Settings file name. The default value is the name of the invoked script without the .py extension')
19 | args, _ = parser.parse_known_args()
20 |
21 | # Retrieve the filename without the extension
22 | filename = os.path.splitext(args.settings_file_name)[0]
23 |
24 | # If mnist.yaml doesn't exist, it means that the package was installed via pip, in
25 | # which case we should use the current working directory as the base path
26 | base_path = Path(os.path.dirname(os.path.dirname(__file__)))
27 | if not (base_path / 'settings' / 'mnist.yaml').exists():
28 |     module_path = base_path
29 |
30 |     # Change the base path to the current working directory
31 |     base_path = Path(os.getcwd())
32 |     settings_directory = (base_path / 'settings')
33 |
34 |     # Create the settings directory if it doesn't exist
35 |     if not settings_directory.exists():
36 |         settings_directory.mkdir()
37 |
38 |     # If the default settings file doesn't exist, copy one from the module directory
39 |     module_default_config = module_path / 'settings/default.yaml'
40 |     settings_default_config = settings_directory / 'default.yaml'
41 |     if not settings_default_config.exists() and module_default_config.exists():
42 |         copyfile(module_default_config, settings_default_config)
43 |
44 | # As the base path is now configured, we try to load the configuration file
45 | # associated with the filename
46 | settings_directory = base_path / 'settings'
47 | settings_file_path = Path(settings_directory, filename).with_suffix('.yaml')
48 |
49 | # If the file doesn't exist, fall back to the default settings file
50 | if not settings_file_path.exists():
51 |     settings_file_path = Path(settings_directory, 'default').with_suffix('.yaml')
52 |
53 | # Read the settings file
54 | with open(settings_file_path, 'r') as settings_file:
55 |     settings = load(settings_file, Loader=Loader)
56 |
57 | # Add the script name to the settings, so it's added to the log
58 | settings['script'] = os.path.basename(sys.argv[0])
59 | settings['settings_file'] = str(settings_file_path)
60 |
61 | # Create convenient variables
62 | cfg = settings["DeepSwarm"]
63 | nodes = settings["Nodes"]
64 |
left_cost_is_better = operator.le if cfg['metrics'] == 'loss' else operator.ge 65 | -------------------------------------------------------------------------------- /deepswarm/aco.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) 2019 Edvinas Byla 2 | # Licensed under MIT License 3 | 4 | import math 5 | import random 6 | 7 | from . import cfg, left_cost_is_better 8 | from .log import Log 9 | from .nodes import Node, NeighbourNode 10 | 11 | 12 | class ACO: 13 | """Class responsible for performing Ant Colony Optimization.""" 14 | 15 | def __init__(self, backend, storage): 16 | self.graph = Graph() 17 | self.current_depth = 0 18 | self.backend = backend 19 | self.storage = storage 20 | 21 | def search(self): 22 | """Performs neural architecture search using Ant colony optimization. 23 | 24 | Returns: 25 | ant which found the best network topology. 26 | """ 27 | 28 | # Generate random ant only if the search started from zero 29 | if not self.storage.loaded_from_save: 30 | Log.header("STARTING ACO SEARCH", type="GREEN") 31 | self.best_ant = Ant(self.graph.generate_path(self.random_select)) 32 | self.best_ant.evaluate(self.backend, self.storage) 33 | Log.info(self.best_ant) 34 | else: 35 | Log.header("RESUMING ACO SEARCH", type="GREEN") 36 | 37 | while self.graph.current_depth <= cfg['max_depth']: 38 | Log.header("Current search depth is %i" % self.graph.current_depth, type="GREEN") 39 | ants = self.generate_ants() 40 | 41 | # Sort ants using user selected metric 42 | ants.sort() if cfg['metrics'] == 'loss' else ants.sort(reverse=True) 43 | 44 | # Update the best ant if new better ant is found 45 | if left_cost_is_better(ants[0].cost, self.best_ant.cost): 46 | self.best_ant = ants[0] 47 | Log.header("NEW BEST ANT FOUND", type="GREEN") 48 | 49 | # Log best ant information 50 | Log.header("BEST ANT DURING ITERATION") 51 | Log.info(self.best_ant) 52 | 53 | # Perform global pheromone update 54 | self.update_pheromone(ant=self.best_ant, update_rule=self.global_update) 55 | 56 | # Print pheromone information and increase the graph's depth 57 | self.graph.show_pheromone() 58 | self.graph.increase_depth() 59 | 60 | # Perform a backup 61 | self.storage.perform_backup() 62 | return self.best_ant 63 | 64 | def generate_ants(self): 65 | """Generates a new ant population. 66 | 67 | Returns: 68 | list containing different evaluated ants. 69 | """ 70 | 71 | ants = [] 72 | for ant_number in range(cfg['aco']['ant_count']): 73 | Log.header("GENERATING ANT %i" % (ant_number + 1)) 74 | ant = Ant() 75 | # Generate ant's path using ACO selection rule 76 | ant.path = self.graph.generate_path(self.aco_select) 77 | # Evaluate how good is the new path 78 | ant.evaluate(self.backend, self.storage) 79 | ants.append(ant) 80 | Log.info(ant) 81 | # Perform local pheromone update 82 | self.update_pheromone(ant=ant, update_rule=self.local_update) 83 | return ants 84 | 85 | def random_select(self, neighbours): 86 | """Randomly selects one neighbour node and its attributes. 87 | 88 | Args: 89 | neighbours [NeighbourNode]: list of neighbour nodes. 90 | Returns: 91 | a randomly selected neighbour node. 92 | """ 93 | 94 | current_node = random.choice(neighbours).node 95 | current_node.select_random_attributes() 96 | return current_node 97 | 98 | def aco_select(self, neighbours): 99 | """Selects one neighbour node and its attributes using ACO selection rule. 100 | 101 | Args: 102 | neighbours [NeighbourNode]: list of neighbour nodes. 103 | Returns: 104 | selected neighbour node. 
105 |         """
106 |
107 |         # Transform a list of NeighbourNode objects to a list of tuples
108 |         # (Node, pheromone, heuristic)
109 |         tuple_neighbours = [(n.node, n.pheromone, n.heuristic) for n in neighbours]
110 |         # Select node using ant colony selection rule
111 |         current_node = self.aco_select_rule(tuple_neighbours)
112 |         # Select custom attributes using ant colony selection rule
113 |         current_node.select_custom_attributes(self.aco_select_rule)
114 |         return current_node
115 |
116 |     def aco_select_rule(self, neighbours):
117 |         """Selects a neighbour using the ACO transition rule.
118 |
119 |         Args:
120 |             neighbours [(Object, float, float)]: list of tuples, where each tuple
121 |             contains: an object to be selected, object's pheromone value and
122 |             object's heuristic value.
123 |         Returns:
124 |             selected object.
125 |         """
126 |
127 |         probabilities = []
128 |         denominator = 0.0
129 |
130 |         # Calculate probability for each neighbour
131 |         for (_, pheromone, heuristic) in neighbours:
132 |             probability = pheromone * heuristic
133 |             probabilities.append(probability)
134 |             denominator += probability
135 |
136 |         # Try to perform greedy select: exploitation
137 |         random_variable = random.uniform(0, 1)
138 |         if random_variable <= cfg['aco']['greediness']:
139 |             # Find max probability
140 |             max_probability = max(probabilities)
141 |             # Gather the indices of probabilities that are equal to the max probability
142 |             max_indices = [i for i, j in enumerate(probabilities) if j == max_probability]
143 |             # From those max indices select random index
144 |             neighbour_index = random.choice(max_indices)
145 |             return neighbours[neighbour_index][0]
146 |
147 |         # Otherwise perform select using roulette wheel: exploration
148 |         probabilities = [x / denominator for x in probabilities]
149 |         probability_sum = sum(probabilities)
150 |         random_threshold = random.uniform(0, probability_sum)
151 |         current_value = 0
152 |         for neighbour_index, probability in enumerate(probabilities):
153 |             current_value += probability
154 |             if current_value > random_threshold:
155 |                 return neighbours[neighbour_index][0]
156 |
157 |     def update_pheromone(self, ant, update_rule):
158 |         """Updates the pheromone using the given update rule.
159 |
160 |         Args:
161 |             ant: ant which should perform the pheromone update.
162 |             update_rule: function which takes a pheromone value and the ant's cost,
163 |             and returns a new pheromone value. 
164 | """ 165 | 166 | current_node = self.graph.input_node 167 | # Skip the input node as it's not connected to any previous node 168 | for node in ant.path[1:]: 169 | # Use a node from the path to retrieve its corresponding instance from the graph 170 | neighbour = next((x for x in current_node.neighbours if x.node.name == node.name), None) 171 | 172 | # If the path was closed using complete_path method, ignore the rest of the path 173 | if neighbour is None: 174 | break 175 | 176 | # Update pheromone connecting to a neighbour 177 | neighbour.pheromone = update_rule( 178 | old_value=neighbour.pheromone, 179 | cost=ant.cost 180 | ) 181 | 182 | # Update attribute's pheromone values 183 | for attribute in neighbour.node.attributes: 184 | # Find what attribute value was used for node 185 | attribute_value = getattr(node, attribute.name) 186 | # Retrieve pheromone for that value 187 | old_pheromone_value = attribute.dict[attribute_value] 188 | # Update pheromone 189 | attribute.dict[attribute_value] = update_rule( 190 | old_value=old_pheromone_value, 191 | cost=ant.cost 192 | ) 193 | 194 | # Advance the current node 195 | current_node = neighbour.node 196 | 197 | def local_update(self, old_value, cost): 198 | """Performs local pheromone update.""" 199 | 200 | decay = cfg['aco']['pheromone']['decay'] 201 | pheromone_0 = cfg['aco']['pheromone']['start'] 202 | return (1 - decay) * old_value + (decay * pheromone_0) 203 | 204 | def global_update(self, old_value, cost): 205 | """Performs global pheromone update.""" 206 | 207 | # Calculate solution cost based on metrics 208 | added_pheromone = (1 / (cost * 10)) if cfg['metrics'] == 'loss' else cost 209 | evaporation = cfg['aco']['pheromone']['evaporation'] 210 | return (1 - evaporation) * old_value + (evaporation * added_pheromone) 211 | 212 | def __getstate__(self): 213 | d = dict(self.__dict__) 214 | del d['backend'] 215 | return d 216 | 217 | 218 | class Ant: 219 | """Class responsible for representing the ant.""" 220 | 221 | def __init__(self, path=[]): 222 | self.path = path 223 | self.loss = math.inf 224 | self.accuracy = 0.0 225 | self.path_description = None 226 | self.path_hash = None 227 | 228 | def evaluate(self, backend, storage): 229 | """Evaluates how good ant's path is. 230 | 231 | Args: 232 | backend: Backend object. 233 | storage: Storage object. 
234 | """ 235 | 236 | # Extract path information 237 | self.path_description, path_hashes = storage.hash_path(self.path) 238 | self.path_hash = path_hashes[-1] 239 | 240 | # Check if the model already exists if yes, then just re-use it 241 | existing_model, existing_model_hash = storage.load_model(backend, path_hashes, self.path) 242 | if existing_model is None: 243 | # Generate model 244 | new_model = backend.generate_model(self.path) 245 | else: 246 | # Re-use model 247 | new_model = existing_model 248 | 249 | # Train model 250 | new_model = backend.train_model(new_model) 251 | # Evaluate model 252 | self.loss, self.accuracy = backend.evaluate_model(new_model) 253 | 254 | # If the new model was created from the older model, record older model progress 255 | if existing_model_hash is not None: 256 | storage.record_model_performance(existing_model_hash, self.cost) 257 | 258 | # Save model 259 | storage.save_model(backend, new_model, path_hashes, self.cost) 260 | 261 | @property 262 | def cost(self): 263 | """Returns value which represents ant's cost.""" 264 | 265 | return self.loss if cfg['metrics'] == 'loss' else self.accuracy 266 | 267 | def __lt__(self, other): 268 | return self.cost < other.cost 269 | 270 | def __str__(self): 271 | return "======= \n Ant: %s \n Loss: %f \n Accuracy: %f \n Path: %s \n Hash: %s \n=======" % ( 272 | hex(id(self)), 273 | self.loss, 274 | self.accuracy, 275 | self.path_description, 276 | self.path_hash, 277 | ) 278 | 279 | 280 | class Graph: 281 | """Class responsible for representing the graph.""" 282 | 283 | def __init__(self, current_depth=0): 284 | self.topology = [] 285 | self.current_depth = current_depth 286 | self.input_node = self.get_node(Node.create_using_type('Input'), current_depth) 287 | self.increase_depth() 288 | 289 | def get_node(self, node, depth): 290 | """Tries to retrieve a given node from the graph. If the node does not 291 | exist then the node is inserted into the graph before being retrieved. 292 | 293 | Args: 294 | node: Node which should be found in the graph. 295 | depth: depth at which the node should be stored. 296 | """ 297 | 298 | # If we are trying to insert the node into a not existing layer, we pad the 299 | # topology by adding empty dictionaries, until the required depth is reached 300 | while depth > (len(self.topology) - 1): 301 | self.topology.append({}) 302 | 303 | # If the node already exists return it, otherwise add it to the topology first 304 | return self.topology[depth].setdefault(node.name, node) 305 | 306 | def increase_depth(self): 307 | """Increases the depth of the graph.""" 308 | 309 | self.current_depth += 1 310 | 311 | def generate_path(self, select_rule): 312 | """Generates path through the graph based on given selection rule. 313 | 314 | Args: 315 | select_rule ([NeigbourNode]): function which receives a list of 316 | neighbours. 317 | 318 | Returns: 319 | a path which contains Node objects. 
320 | """ 321 | 322 | current_node = self.input_node 323 | path = [current_node.create_deepcopy()] 324 | for depth in range(self.current_depth): 325 | # If the node doesn't have any neighbours stop expanding the path 326 | if not self.has_neighbours(current_node, depth): 327 | break 328 | 329 | # Select node using given rule 330 | current_node = select_rule(current_node.neighbours) 331 | # Add only the copy of the node, so that original stays unmodified 332 | path.append(current_node.create_deepcopy()) 333 | 334 | completed_path = self.complete_path(path) 335 | return completed_path 336 | 337 | def has_neighbours(self, node, depth): 338 | """Checks if the node has any neighbours. 339 | 340 | Args: 341 | node: Node that needs to be checked. 342 | depth: depth at which the node is stored in the graph. 343 | 344 | Returns: 345 | a boolean value which indicates if the node has any neighbours. 346 | """ 347 | 348 | # Expand only if it hasn't been expanded 349 | if node.is_expanded is False: 350 | available_transitions = node.available_transitions 351 | for (transition_name, heuristic_value) in available_transitions: 352 | neighbour_node = self.get_node(Node(transition_name), depth + 1) 353 | node.neighbours.append(NeighbourNode(neighbour_node, heuristic_value)) 354 | node.is_expanded = True 355 | 356 | # Return value indicating if the node has neighbours after being expanded 357 | return len(node.neighbours) > 0 358 | 359 | def complete_path(self, path): 360 | """Completes the path if it is not fully completed (i.e. missing OutputNode). 361 | 362 | Args: 363 | path [Node]: list of nodes defining the path. 364 | 365 | Returns: 366 | completed path which contains list of nodes. 367 | """ 368 | 369 | # If the path is not completed, then complete it and return completed path 370 | # We intentionally don't add these ending nodes as neighbours to the last node 371 | # in the path, because during the first few iterations these nodes will always be part 372 | # of the best path (as it's impossible to close path automatically when it's so short) 373 | # this would result in bias pheromone received by these nodes during later iterations 374 | if path[-1].name in cfg['spatial_nodes']: 375 | path.append(self.get_node(Node.create_using_type('Flatten'), len(path))) 376 | if path[-1].name in cfg['flat_nodes']: 377 | path.append(self.get_node(Node.create_using_type('Output'), len(path))) 378 | return path 379 | 380 | def show_pheromone(self): 381 | """Logs the pheromone information for the graph.""" 382 | 383 | # If the output is disabled by the user then don't log the pheromone 384 | if cfg['aco']['pheromone']['verbose'] is False: 385 | return 386 | 387 | Log.header("PHEROMONE START", type="RED") 388 | for idx, layer in enumerate(self.topology): 389 | info = [] 390 | for node in layer.values(): 391 | for neighbour in node.neighbours: 392 | info.append("%s [%s] -> %f -> %s [%s]" % (node.name, hex(id(node)), 393 | neighbour.pheromone, neighbour.node.name, hex(id(neighbour.node)))) 394 | 395 | # If neighbour node doesn't have any attributes skip attribute info 396 | if not neighbour.node.attributes: 397 | continue 398 | 399 | info.append("\t%s [%s]:" % (neighbour.node.name, hex(id(neighbour.node)))) 400 | for attribute in neighbour.node.attributes: 401 | info.append("\t\t%s: %s" % (attribute.name, attribute.dict)) 402 | if info: 403 | Log.header("Layer %d" % (idx + 1)) 404 | Log.info('\n'.join(info)) 405 | Log.header("PHEROMONE END", type="RED") 406 | 
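Taken together, `aco_select_rule` above implements the classic ACS pseudo-random-proportional rule: with probability `greediness` it exploits the highest pheromone-times-heuristic score, otherwise it explores via roulette wheel. A standalone, runnable sketch of the same logic, stripped of the surrounding classes (the names and example values here are illustrative, not part of the library API):

```python
import random

def acs_select(neighbours, greediness=0.5):
    """neighbours: list of (value, pheromone, heuristic) tuples."""
    scores = [pheromone * heuristic for _, pheromone, heuristic in neighbours]
    if random.uniform(0, 1) <= greediness:
        # Exploitation: pick (one of) the highest scoring options
        best = max(scores)
        best_indices = [i for i, s in enumerate(scores) if s == best]
        return neighbours[random.choice(best_indices)][0]
    # Exploration: roulette wheel selection, proportional to score
    threshold = random.uniform(0, sum(scores))
    running = 0.0
    for (value, _, _), score in zip(neighbours, scores):
        running += score
        if running > threshold:
            return value
    return neighbours[-1][0]  # guard against floating-point edge cases

# An edge with more pheromone wins most of the time
choices = [('Conv2D', 0.8, 1.0), ('Dropout', 0.1, 1.0), ('Pool2D', 0.1, 1.0)]
picks = [acs_select(choices) for _ in range(10000)]
print(picks.count('Conv2D') / len(picks))  # roughly 0.9 with greediness=0.5
```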
-------------------------------------------------------------------------------- /deepswarm/backends.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) 2019 Edvinas Byla 2 | # Licensed under MIT License 3 | 4 | import os 5 | import tensorflow as tf 6 | import time 7 | 8 | from abc import ABC, abstractmethod 9 | from sklearn.model_selection import train_test_split 10 | from tensorflow.keras import backend as K 11 | 12 | from . import cfg 13 | 14 | 15 | class Dataset: 16 | """Class responsible for encapsulating all the required data.""" 17 | 18 | def __init__(self, training_examples, training_labels, testing_examples, testing_labels, 19 | validation_data=None, validation_split=0.1): 20 | self.x_train = training_examples 21 | self.y_train = training_labels 22 | self.x_test = testing_examples 23 | self.y_test = testing_labels 24 | self.validation_data = validation_data 25 | self.validation_split = validation_split 26 | 27 | 28 | class BaseBackend(ABC): 29 | """Abstract class used to define Backend API.""" 30 | 31 | def __init__(self, dataset, optimizer=None): 32 | self.dataset = dataset 33 | self.optimizer = optimizer 34 | 35 | @abstractmethod 36 | def generate_model(self, path): 37 | """Create and return a backend model representation. 38 | 39 | Args: 40 | path [Node]: list of nodes where each node represents a single 41 | network layer, the path starts with InputNode and ends with EndNode. 42 | Returns: 43 | model which represents neural network structure in the implemented 44 | backend, this model can be evaluated using evaluate_model method. 45 | """ 46 | 47 | @abstractmethod 48 | def reuse_model(self, old_model, new_model_path, distance): 49 | """Create a new model by reusing layers (and their weights) from the old model. 50 | 51 | Args: 52 | old_model: old model which represents neural network structure. 53 | new_model_path [Node]: path representing new model. 54 | distance (int): distance which shows how many layers from old model need 55 | to be removed in order to create a base for new model i.e. if old model is 56 | NodeA->NodeB->NodeC->NodeD and new model is NodeA->NodeB->NodeC->NodeE, 57 | distance = 1. 58 | Returns: 59 | model which represents neural network structure. 60 | """ 61 | 62 | @abstractmethod 63 | def train_model(self, model): 64 | """Train model which was created using generate_model method. 65 | 66 | Args: 67 | model: model which represents neural network structure. 68 | Returns: 69 | model which represents neural network structure. 70 | """ 71 | 72 | @abstractmethod 73 | def fully_train_model(self, model, epochs, augment): 74 | """Fully trains the model without early stopping. At the end of the 75 | training, the model with the best performing weights on the validation 76 | set is returned. 77 | 78 | Args: 79 | model: model which represents neural network structure. 80 | epochs (int): for how many epoch train the model. 81 | augment (kwargs): augmentation arguments. 82 | Returns: 83 | model which represents neural network structure. 84 | """ 85 | 86 | @abstractmethod 87 | def evaluate_model(self, model): 88 | """Evaluate model which was created using generate_model method. 89 | 90 | Args: 91 | model: model which represents neural network structure. 92 | Returns: 93 | loss & accuracy tuple. 94 | """ 95 | 96 | @abstractmethod 97 | def save_model(self, model, path): 98 | """Saves model on disk. 99 | 100 | Args: 101 | model: model which represents neural network structure. 
102 | path: string which represents model location. 103 | """ 104 | 105 | @abstractmethod 106 | def load_model(self, path): 107 | """Load model from disk, in case of fail should return None. 108 | 109 | Args: 110 | path: string which represents model location. 111 | Returns: 112 | model: model which represents neural network structure, or in case 113 | fail None. 114 | """ 115 | 116 | @abstractmethod 117 | def free_gpu(self): 118 | """Frees GPU memory.""" 119 | 120 | 121 | class TFKerasBackend(BaseBackend): 122 | """Backend based on TensorFlow Keras API""" 123 | 124 | def __init__(self, dataset, optimizer=None): 125 | # If the user passes custom optimizer we serialize it, as reusing the 126 | # same optimizer instance causes crash in TensorFlow 1.13.1, see issue 127 | # https://github.com/Pattio/DeepSwarm/issues/3 128 | if optimizer is not None: 129 | optimizer = tf.keras.optimizers.serialize(optimizer) 130 | 131 | super().__init__(dataset, optimizer) 132 | self.data_format = K.image_data_format() 133 | 134 | def generate_model(self, path): 135 | # Create an input layer 136 | input_layer = self.create_layer(path[0]) 137 | layer = input_layer 138 | 139 | # Convert each node to layer and then connect it to the previous layer 140 | for node in path[1:]: 141 | layer = self.create_layer(node)(layer) 142 | 143 | # Return generated model 144 | model = tf.keras.Model(inputs=input_layer, outputs=layer) 145 | self.compile_model(model) 146 | return model 147 | 148 | def reuse_model(self, old_model, new_model_path, distance): 149 | # Find the starting point of the new model 150 | starting_point = len(new_model_path) - distance 151 | last_layer = old_model.layers[starting_point - 1].output 152 | 153 | # Append layers from the new model to the old model 154 | for node in new_model_path[starting_point:]: 155 | last_layer = self.create_layer(node)(last_layer) 156 | 157 | # Return new model 158 | model = tf.keras.Model(inputs=old_model.inputs, outputs=last_layer) 159 | self.compile_model(model) 160 | return model 161 | 162 | def compile_model(self, model): 163 | optimizer_parameters = { 164 | 'optimizer': 'adam', 165 | 'loss': cfg['backend']['loss'], 166 | 'metrics': ['accuracy'], 167 | } 168 | 169 | # If user specified custom optimizer, use it instead of the default one 170 | # we also need to deserialize optimizer as it was serialized during init 171 | if self.optimizer is not None: 172 | optimizer_parameters['optimizer'] = tf.keras.optimizers.deserialize(self.optimizer) 173 | model.compile(**optimizer_parameters) 174 | 175 | def create_layer(self, node): 176 | # Workaround to prevent Keras from throwing an exception ("All layer 177 | # names should be unique.") It happens when new layers are appended to 178 | # an existing model, but Keras fails to increment repeating layer names 179 | # i.e. 
conv_1 -> conv_2 180 | parameters = {'name': str(time.time())} 181 | 182 | if node.type == 'Input': 183 | parameters['shape'] = node.shape 184 | return tf.keras.Input(**parameters) 185 | 186 | if node.type == 'Conv2D': 187 | parameters.update({ 188 | 'filters': node.filter_count, 189 | 'kernel_size': node.kernel_size, 190 | 'padding': 'same', 191 | 'data_format': self.data_format, 192 | 'activation': self.map_activation(node.activation), 193 | }) 194 | return tf.keras.layers.Conv2D(**parameters) 195 | 196 | if node.type == 'Pool2D': 197 | parameters.update({ 198 | 'pool_size': node.pool_size, 199 | 'strides': node.stride, 200 | 'padding': 'same', 201 | 'data_format': self.data_format, 202 | }) 203 | if node.pool_type == 'max': 204 | return tf.keras.layers.MaxPooling2D(**parameters) 205 | elif node.pool_type == 'average': 206 | return tf.keras.layers.AveragePooling2D(**parameters) 207 | 208 | if node.type == 'BatchNormalization': 209 | return tf.keras.layers.BatchNormalization(**parameters) 210 | 211 | if node.type == 'Flatten': 212 | return tf.keras.layers.Flatten(**parameters) 213 | 214 | if node.type == 'Dense': 215 | parameters.update({ 216 | 'units': node.output_size, 217 | 'activation': self.map_activation(node.activation), 218 | }) 219 | return tf.keras.layers.Dense(**parameters) 220 | 221 | if node.type == 'Dropout': 222 | parameters.update({ 223 | 'rate': node.rate, 224 | }) 225 | return tf.keras.layers.Dropout(**parameters) 226 | 227 | if node.type == 'Output': 228 | parameters.update({ 229 | 'units': node.output_size, 230 | 'activation': self.map_activation(node.activation), 231 | }) 232 | return tf.keras.layers.Dense(**parameters) 233 | 234 | raise Exception('Not handled node type: %s' % str(node)) 235 | 236 | def map_activation(self, activation): 237 | if activation == "ReLU": 238 | return tf.keras.activations.relu 239 | if activation == "ELU": 240 | return tf.keras.activations.elu 241 | if activation == "LeakyReLU": 242 | return tf.nn.leaky_relu 243 | if activation == "Sigmoid": 244 | return tf.keras.activations.sigmoid 245 | if activation == "Softmax": 246 | return tf.keras.activations.softmax 247 | raise Exception('Not handled activation: %s' % str(activation)) 248 | 249 | def train_model(self, model): 250 | # Create a checkpoint path 251 | checkpoint_path = 'temp-model' 252 | 253 | # Setup training parameters 254 | fit_parameters = { 255 | 'x': self.dataset.x_train, 256 | 'y': self.dataset.y_train, 257 | 'epochs': cfg['backend']['epochs'], 258 | 'batch_size': cfg['backend']['batch_size'], 259 | 'callbacks': [ 260 | self.create_early_stop_callback(), 261 | self.create_checkpoint_callback(checkpoint_path), 262 | ], 263 | 'validation_split': self.dataset.validation_split, 264 | 'verbose': cfg['backend']['verbose'], 265 | } 266 | 267 | # If validation data is given then override validation_split 268 | if self.dataset.validation_data is not None: 269 | fit_parameters['validation_data'] = self.dataset.validation_data 270 | 271 | # Train model 272 | model.fit(**fit_parameters) 273 | 274 | # Load model from checkpoint 275 | checkpoint_model = self.load_model(checkpoint_path) 276 | # Delete checkpoint 277 | if os.path.isfile(checkpoint_path): 278 | os.remove(checkpoint_path) 279 | # Return checkpoint model if it exists, otherwise return trained model 280 | return checkpoint_model if checkpoint_model is not None else model 281 | 282 | def fully_train_model(self, model, epochs, augment): 283 | # Setup validation data 284 | if self.dataset.validation_data is not None: 285 | x_val, 
y_val = self.dataset.validation_data 286 | x_train, y_train = self.dataset.x_train, self.dataset.y_train 287 | else: 288 | x_train, x_val, y_train, y_val = train_test_split( 289 | self.dataset.x_train, 290 | self.dataset.y_train, 291 | test_size=self.dataset.validation_split, 292 | ) 293 | 294 | # Create checkpoint path 295 | checkpoint_path = 'temp-model' 296 | 297 | # Create and fit data generator 298 | datagen = tf.keras.preprocessing.image.ImageDataGenerator(**augment) 299 | datagen.fit(x_train) 300 | 301 | # Train model 302 | model.fit_generator( 303 | generator=datagen.flow(x_train, y_train, batch_size=cfg['backend']['batch_size']), 304 | steps_per_epoch=len(self.dataset.x_train) / cfg['backend']['batch_size'], 305 | epochs=epochs, 306 | callbacks=[self.create_checkpoint_callback(checkpoint_path)], 307 | validation_data=(x_val, y_val), 308 | verbose=cfg['backend']['verbose'], 309 | ) 310 | 311 | # Load model from checkpoint 312 | checkpoint_model = self.load_model(checkpoint_path) 313 | # Delete checkpoint 314 | if os.path.isfile(checkpoint_path): 315 | os.remove(checkpoint_path) 316 | # Return checkpoint model if it exists, otherwise return trained model 317 | return checkpoint_model if checkpoint_model is not None else model 318 | 319 | def create_early_stop_callback(self): 320 | early_stop_parameters = { 321 | 'patience': cfg['backend']['patience'], 322 | 'verbose': cfg['backend']['verbose'], 323 | 'restore_best_weights': True, 324 | } 325 | early_stop_parameters['monitor'] = 'val_loss' if cfg['metrics'] == 'loss' else 'val_acc' 326 | return tf.keras.callbacks.EarlyStopping(**early_stop_parameters) 327 | 328 | def create_checkpoint_callback(self, checkpoint_path): 329 | checkpoint_parameters = { 330 | 'filepath': checkpoint_path, 331 | 'verbose': cfg['backend']['verbose'], 332 | 'save_best_only': True, 333 | } 334 | checkpoint_parameters['monitor'] = 'val_loss' if cfg['metrics'] == 'loss' else 'val_acc' 335 | return tf.keras.callbacks.ModelCheckpoint(**checkpoint_parameters) 336 | 337 | def evaluate_model(self, model): 338 | loss, accuracy = model.evaluate( 339 | x=self.dataset.x_test, 340 | y=self.dataset.y_test, 341 | verbose=cfg['backend']['verbose'] 342 | ) 343 | return (loss, accuracy) 344 | 345 | def save_model(self, model, path): 346 | model.save(path) 347 | self.free_gpu() 348 | 349 | def load_model(self, path): 350 | try: 351 | model = tf.keras.models.load_model(path) 352 | return model 353 | except: 354 | return None 355 | 356 | def free_gpu(self): 357 | K.clear_session() 358 | -------------------------------------------------------------------------------- /deepswarm/deepswarm.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) 2019 Edvinas Byla 2 | # Licensed under MIT License 3 | 4 | from . 
import settings, left_cost_is_better 5 | from .aco import ACO 6 | from .log import Log 7 | from .storage import Storage 8 | 9 | 10 | class DeepSwarm: 11 | """Class responsible for providing user facing interface.""" 12 | 13 | def __init__(self, backend): 14 | self.backend = backend 15 | self.storage = Storage(self) 16 | 17 | # Enable logging and log current settings 18 | self.setup_logging() 19 | 20 | # Try to load from the backup and restore backend as it was not saved 21 | if self.storage.loaded_from_save: 22 | self.__dict__ = self.storage.backup.__dict__ 23 | self.backend = backend 24 | self.aco.backend = backend 25 | 26 | def setup_logging(self): 27 | """Enables logging and logs current settings.""" 28 | 29 | Log.enable(self.storage) 30 | Log.header("DeepSwarm settings") 31 | Log.info(settings) 32 | 33 | def find_topology(self): 34 | """Finds the best neural network topology. 35 | 36 | Returns: 37 | network model in the format of backend which was used during 38 | initialization. 39 | """ 40 | 41 | # Create a new object only if there are no backups 42 | if not self.storage.loaded_from_save: 43 | self.aco = ACO(backend=self.backend, storage=self.storage) 44 | 45 | best_ant = self.aco.search() 46 | best_model = self.storage.load_specified_model(self.backend, best_ant.path_hash) 47 | return best_model 48 | 49 | def train_topology(self, model, epochs, augment={}): 50 | """Trains given neural network topology for a specified number of epochs. 51 | 52 | Args: 53 | model: model which represents neural network structure. 54 | epochs (int): for how many epoch train the model. 55 | augment (kwargs): augmentation arguments. 56 | Returns: 57 | network model in the format of backend which was used during 58 | initialization. 59 | """ 60 | 61 | # Before training make a copy of old weights in case performance 62 | # degrades during the training 63 | loss, accuracy = self.backend.evaluate_model(model) 64 | old_weights = model.get_weights() 65 | 66 | # Train the network 67 | model_name = 'best-trained-topology' 68 | trained_topology = self.backend.fully_train_model(model, epochs, augment) 69 | loss_new, accuracy_new = self.backend.evaluate_model(trained_topology) 70 | 71 | # Setup the metrics 72 | if settings['DeepSwarm']['metrics'] == 'loss': 73 | metrics_old = loss 74 | metrics_new = loss_new 75 | else: 76 | metrics_old = accuracy 77 | metrics_new = accuracy_new 78 | 79 | # Restore the weights if performance did not improve 80 | if left_cost_is_better(metrics_old, metrics_new): 81 | trained_topology.set_weights(old_weights) 82 | 83 | # Save and return the best topology 84 | self.storage.save_specified_model(self.backend, model_name, trained_topology) 85 | return self.storage.load_specified_model(self.backend, model_name) 86 | 87 | def evaluate_topology(self, model): 88 | """Evaluates neural network performance.""" 89 | 90 | Log.header('EVALUATING PERFORMANCE ON TEST SET') 91 | loss, accuracy = self.backend.evaluate_model(model) 92 | Log.info('Accuracy is %f and loss is %f' % (accuracy, loss)) 93 | 94 | def __getstate__(self): 95 | d = dict(self.__dict__) 96 | del d['backend'] 97 | return d 98 | -------------------------------------------------------------------------------- /deepswarm/log.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) 2019 Edvinas Byla 2 | # Licensed under MIT License 3 | 4 | import json 5 | import logging 6 | import re 7 | 8 | from colorama import init as colorama_init 9 | from colorama import Fore, Back, Style 10 | 11 | 12 | 
class Log: 13 | """Class responsible for logging information.""" 14 | 15 | # Define header styles 16 | HEADER_W = [Fore.BLACK, Back.WHITE, Style.BRIGHT] 17 | HEADER_R = [Fore.WHITE, Back.RED, Style.BRIGHT] 18 | HEADER_G = [Fore.WHITE, Back.GREEN, Style.BRIGHT] 19 | 20 | @classmethod 21 | def enable(cls, storage): 22 | """Initializes the logger. 23 | 24 | Args: 25 | storage: Storage object. 26 | """ 27 | 28 | # Init colorama to enable colors 29 | colorama_init() 30 | # Get deepswarm logger 31 | cls.logger = logging.getLogger("deepswarm") 32 | 33 | # Create stream handler 34 | stream_handler = logging.StreamHandler() 35 | stream_formater = logging.Formatter("%(message)s") 36 | stream_handler.setFormatter(stream_formater) 37 | # Add stream handler to logger 38 | cls.logger.addHandler(stream_handler) 39 | 40 | # Create and setup file handler 41 | file_handler = logging.FileHandler(storage.current_path / "deepswarm.log") 42 | file_formater = FileFormatter("%(asctime)s\n%(message)s") 43 | file_handler.setFormatter(file_formater) 44 | # Add file handle to logger 45 | cls.logger.addHandler(file_handler) 46 | 47 | # Set logger level to debug 48 | cls.logger.setLevel(logging.DEBUG) 49 | 50 | @classmethod 51 | def header(cls, message, type="WHITE"): 52 | if type == "RED": 53 | options = cls.HEADER_R 54 | elif type == "GREEN": 55 | options = cls.HEADER_G 56 | else: 57 | options = cls.HEADER_W 58 | 59 | cls.info(message.center(80, '-'), options) 60 | 61 | @classmethod 62 | def debug(cls, message, options=[Fore.CYAN]): 63 | formated_message = cls.create_message(message, options) 64 | cls.logger.debug(formated_message) 65 | 66 | @classmethod 67 | def info(cls, message, options=[Fore.GREEN]): 68 | formated_message = cls.create_message(message, options) 69 | cls.logger.info(formated_message) 70 | 71 | @classmethod 72 | def warning(cls, message, options=[Fore.YELLOW]): 73 | formated_message = cls.create_message(message, options) 74 | cls.logger.warning(formated_message) 75 | 76 | @classmethod 77 | def error(cls, message, options=[Fore.MAGENTA]): 78 | formated_message = cls.create_message(message, options) 79 | cls.logger.error(formated_message) 80 | 81 | @classmethod 82 | def critical(cls, message, options=[Fore.RED, Style.BRIGHT]): 83 | formated_message = cls.create_message(message, options) 84 | cls.logger.critical(formated_message) 85 | 86 | @classmethod 87 | def create_message(cls, message, options): 88 | # Convert dictionary to nicely formatted JSON 89 | if isinstance(message, dict): 90 | message = json.dumps(message, indent=4, sort_keys=True) 91 | 92 | # Convert all objects that are not strings to strings 93 | if isinstance(message, str) is False: 94 | message = str(message) 95 | 96 | return ''.join(options) + message + '\033[0m' 97 | 98 | 99 | class FileFormatter(logging.Formatter): 100 | """Class responsible for removing ANSI characters from the log file.""" 101 | 102 | def plain(self, string): 103 | # Regex code adapted from Martijn Pieters https://stackoverflow.com/a/14693789 104 | ansi_escape = re.compile(r'\x1B\[[0-?]*[ -/]*[@-~]|[-]{2,}') 105 | return ansi_escape.sub('', string) 106 | 107 | def format(self, record): 108 | message = super(FileFormatter, self).format(record) 109 | plain_message = self.plain(message) 110 | separator = '=' * 80 111 | return ''.join((separator, "\n", plain_message, "\n", separator)) 112 | -------------------------------------------------------------------------------- /deepswarm/nodes.py: 
-------------------------------------------------------------------------------- 1 | # Copyright (c) 2019 Edvinas Byla 2 | # Licensed under MIT License 3 | 4 | import copy 5 | import random 6 | 7 | from . import cfg, nodes 8 | 9 | 10 | class NodeAttribute: 11 | """Class responsible for encapsulating Node's attribute.""" 12 | 13 | def __init__(self, name, options): 14 | self.name = name 15 | self.dict = {option: cfg['aco']['pheromone']['start'] for option in options} 16 | 17 | 18 | class NeighbourNode: 19 | """Class responsible for encapsulating Node's neighbour.""" 20 | 21 | def __init__(self, node, heuristic, pheromone=cfg['aco']['pheromone']['start']): 22 | self.node = node 23 | self.heuristic = heuristic 24 | self.pheromone = pheromone 25 | 26 | 27 | class Node: 28 | """Class responsible for representing Node.""" 29 | 30 | def __init__(self, name): 31 | self.name = name 32 | self.neighbours = [] 33 | self.is_expanded = False 34 | self.type = nodes[self.name]['type'] 35 | self.setup_attributes() 36 | self.setup_transitions() 37 | self.select_random_attributes() 38 | 39 | @classmethod 40 | def create_using_type(cls, type): 41 | """Create Node's instance using given type. 42 | 43 | Args: 44 | type (str): type defined in .yaml file. 45 | Returns: 46 | Node's instance. 47 | """ 48 | 49 | for node in nodes: 50 | if nodes[node]['type'] == type: 51 | return cls(node) 52 | raise Exception('Type does not exist: %s' % str(type)) 53 | 54 | def setup_attributes(self): 55 | """Adds attributes from the settings file.""" 56 | 57 | self.attributes = [] 58 | for attribute_name in nodes[self.name]['attributes']: 59 | attribute_value = nodes[self.name]['attributes'][attribute_name] 60 | self.attributes.append(NodeAttribute(attribute_name, attribute_value)) 61 | 62 | def setup_transitions(self): 63 | """Adds transitions from the settings file.""" 64 | 65 | self.available_transitions = [] 66 | for transition_name in nodes[self.name]['transitions']: 67 | heuristic_value = nodes[self.name]['transitions'][transition_name] 68 | self.available_transitions.append((transition_name, heuristic_value)) 69 | 70 | def select_attributes(self, custom_select): 71 | """Selects attributes using a given select rule. 72 | 73 | Args: 74 | custom_select: select function which takes dictionary containing 75 | (attribute, value) pairs and returns selected value. 76 | """ 77 | 78 | selected_attributes = {} 79 | for attribute in self.attributes: 80 | value = custom_select(attribute.dict) 81 | selected_attributes[attribute.name] = value 82 | 83 | # For each selected attribute create class attribute 84 | for key, value in selected_attributes.items(): 85 | setattr(self, key, value) 86 | 87 | def select_custom_attributes(self, custom_select): 88 | """Wraps select_attributes method by converting the attribute dictionary 89 | to list of tuples (attribute_value, pheromone, heuristic). 90 | 91 | Args: 92 | custom_select: selection function which takes a list of tuples 93 | containing (attribute_value, pheromone, heuristic). 
94 | """ 95 | 96 | # Define a function which transforms attributes before selecting them 97 | def select_transformed_custom_attributes(attribute_dictionary): 98 | # Convert to list of tuples containing (attribute_value, pheromone, heuristic) 99 | values = [(value, pheromone, 1.0) for value, pheromone in attribute_dictionary.items()] 100 | # Return value, which was selected using custom select 101 | return custom_select(values) 102 | self.select_attributes(select_transformed_custom_attributes) 103 | 104 | def select_random_attributes(self): 105 | """Selects random attributes.""" 106 | 107 | self.select_attributes(lambda dict: random.choice(list(dict.keys()))) 108 | 109 | def create_deepcopy(self): 110 | """Returns a newly created copy of Node object.""" 111 | 112 | return copy.deepcopy(self) 113 | 114 | def __deepcopy__(self, memo): 115 | cls = self.__class__ 116 | result = cls.__new__(cls) 117 | memo[id(self)] = result 118 | for k, v in self.__dict__.items(): 119 | # Skip unnecessary stuff in order to make copying more efficient 120 | if k in ["neighbours", "available_transitions"]: 121 | v = [] 122 | setattr(result, k, copy.deepcopy(v, memo)) 123 | return result 124 | 125 | def __str__(self): 126 | attributes = ', '.join([a.name + ":" + str(getattr(self, a.name)) for a in self.attributes]) 127 | return self.name + "(" + attributes + ")" 128 | -------------------------------------------------------------------------------- /deepswarm/storage.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) 2019 Edvinas Byla 2 | # Licensed under MIT License 3 | 4 | import hashlib 5 | import pickle 6 | 7 | from datetime import datetime 8 | 9 | from . import base_path, cfg, left_cost_is_better 10 | 11 | 12 | class Storage: 13 | """Class responsible for backups and weight reuse.""" 14 | 15 | DIR = { 16 | "MODEL": "models", 17 | "OBJECT": "objects", 18 | } 19 | 20 | ITEM = {"BACKUP": "backup"} 21 | 22 | def __init__(self, deepswarm): 23 | self.loaded_from_save = False 24 | self.backup = None 25 | self.path_lookup = {} 26 | self.models = {} 27 | self.deepswarm = deepswarm 28 | self.setup_path() 29 | self.setup_directories() 30 | 31 | def setup_path(self): 32 | """Loads existing backup or creates a new backup directory.""" 33 | 34 | # If storage directory doesn't exist create one 35 | storage_path = base_path / 'saves' 36 | if not storage_path.exists(): 37 | storage_path.mkdir() 38 | 39 | # Check if user specified save folder which should be used to load the data 40 | user_folder = cfg['save_folder'] 41 | if user_folder is not None and (storage_path / user_folder).exists(): 42 | self.current_path = storage_path / user_folder 43 | self.loaded_from_save = True 44 | # Store deepswarm object to backup 45 | self.backup = self.load_object(Storage.ITEM["BACKUP"]) 46 | self.backup.storage.loaded_from_save = True 47 | return 48 | 49 | # Otherwise create a new directory 50 | directory_path = storage_path / datetime.now().strftime('%Y-%m-%d-%H-%M-%S') 51 | if not directory_path.exists(): 52 | directory_path.mkdir() 53 | self.current_path = directory_path 54 | return 55 | 56 | def setup_directories(self): 57 | """Creates all the required directories.""" 58 | 59 | for directory in Storage.DIR.values(): 60 | directory_path = self.current_path / directory 61 | if not directory_path.exists(): 62 | directory_path.mkdir() 63 | 64 | def perform_backup(self): 65 | """Saves DeepSwarm object to the backup directory.""" 66 | 67 | self.save_object(self.deepswarm, 
Storage.ITEM["BACKUP"])
68 |
69 |     def save_model(self, backend, model, path_hashes, cost):
70 |         """Saves the model and adds its information to the dictionaries.
71 |
72 |         Args:
73 |             backend: Backend object.
74 |             model: model which represents neural network structure.
75 |             path_hashes [string]: list of hashes, where each hash represents a
76 |             sub-path.
77 |             cost: cost associated with the model.
78 |         """
79 |
80 |         sub_path_associated = False
81 |         # The last element describes the whole path
82 |         model_hash = path_hashes[-1]
83 |
84 |         # For each sub-path find its corresponding entry in the hash table
85 |         for path_hash in path_hashes:
86 |             # Check if there already exists a model for this sub-path
87 |             existing_model_hash = self.path_lookup.get(path_hash)
88 |             model_info = self.models.get(existing_model_hash)
89 |
90 |             # If the old model is better, then skip this sub-path
91 |             if model_info is not None and left_cost_is_better(model_info[0], cost):
92 |                 continue
93 |
94 |             # Otherwise associate this sub-path with the new model
95 |             self.path_lookup[path_hash] = model_hash
96 |             sub_path_associated = True
97 |
98 |         # Save model on disk only if it was associated with some sub-path
99 |         if sub_path_associated:
100 |             # Add an entry to models dictionary
101 |             self.models[model_hash] = (cost, 0)
102 |             # Save to disk
103 |             self.save_specified_model(backend, model_hash, model)
104 |
105 |     def load_model(self, backend, path_hashes, path):
106 |         """Loads model with the best weights.
107 |
108 |         Args:
109 |             backend: Backend object.
110 |             path_hashes [string]: list of hashes, where each hash represents a
111 |             sub-path.
112 |             path [Node]: a path which represents the model.
113 |         Returns:
114 |             if the model exists returns a tuple containing the model and its hash,
115 |             otherwise returns a tuple containing None values.
116 |         """
117 |
118 |         # Go through all hashes backwards
119 |         for idx, path_hash in enumerate(path_hashes[::-1]):
120 |             # See if this particular hash is associated with some model
121 |             model_hash = self.path_lookup.get(path_hash)
122 |             model_info = self.models.get(model_hash)
123 |
124 |             # Don't reuse the model if it hasn't improved for longer than allowed by reuse_patience
125 |             if model_hash is not None and model_info[1] < cfg['reuse_patience']:
126 |                 model = self.load_specified_model(backend, model_hash)
127 |                 # If we failed to load the model, skip to the next hash
128 |                 if model is None:
129 |                     continue
130 |
131 |                 # If there is no difference between the models, just return the old model,
132 |                 # otherwise create a new model by reusing the old model. Even though
133 |                 # backend.reuse_model could be called to handle both
134 |                 # cases, this approach saves some unnecessary computation
135 |                 new_model = model if idx == 0 else backend.reuse_model(model, path, idx)
136 |
137 |                 # We also return the base model hash (the model which was used as a base to
138 |                 # create the new model). This hash information is used later to
139 |                 # track whether the base model is improving over time or is stuck
140 |                 return (new_model, model_hash)
141 |         return (None, None)
142 |
143 |     def load_specified_model(self, backend, model_name):
144 |         """Loads specified model using its name.
145 |
146 |         Args:
147 |             backend: Backend object.
148 |             model_name: name of the model.
149 |         Returns:
150 |             model which represents neural network structure.
151 |         """
152 |
153 |         file_path = self.current_path / Storage.DIR["MODEL"] / model_name
154 |         model = backend.load_model(file_path)
155 |         return model
156 |
157 |     def save_specified_model(self, backend, model_name, model):
158 |         """Saves specified model using its name, without adding its information
159 |         to the dictionaries.
160 |
161 |         Args:
162 |             backend: Backend object.
163 |             model_name: name of the model.
164 |             model: model which represents neural network structure.
165 |         """
166 |
167 |         save_path = self.current_path / Storage.DIR["MODEL"] / model_name
168 |         backend.save_model(model, save_path)
169 |
170 |     def record_model_performance(self, path_hash, cost):
171 |         """Records how many times the model cost didn't improve.
172 |
173 |         Args:
174 |             path_hash: hash value associated with the model.
175 |             cost: cost value associated with the model.
176 |         """
177 |
178 |         model_hash = self.path_lookup.get(path_hash)
179 |         old_cost, no_improvements = self.models.get(model_hash)
180 |
181 |         # If the cost hasn't changed at all, increment the no improvement count
182 |         if old_cost is not None and old_cost == cost:
183 |             self.models[model_hash] = (old_cost, (no_improvements + 1))
184 |
185 |     def hash_path(self, path):
186 |         """Takes a path and returns a tuple containing the path description and a
187 |         list of sub-path hashes.
188 |
189 |         Args:
190 |             path [Node]: path which represents the model.
191 |         Returns:
192 |             tuple where the first element is a string representing the path
193 |             description and the second element is a list of sub-path hashes.
194 |         """
195 |
196 |         hashes = []
197 |         path_description = str(path[0])
198 |         for node in path[1:]:
199 |             path_description += ' -> %s' % (node)
200 |             current_hash = hashlib.sha3_256(path_description.encode('utf-8')).hexdigest()
201 |             hashes.append(current_hash)
202 |         return (path_description, hashes)
203 |
204 |     def save_object(self, data, name):
205 |         """Saves given object to the object backup directory.
206 |
207 |         Args:
208 |             data: object that needs to be saved.
209 |             name: string value representing the name of the object.
210 |         """
211 |
212 |         with open(self.current_path / Storage.DIR["OBJECT"] / name, 'wb') as f:
213 |             pickle.dump(data, f, pickle.HIGHEST_PROTOCOL)
214 |
215 |     def load_object(self, name):
216 |         """Loads a given object from the object backup directory.
217 |
218 |         Args:
219 |             name: string value representing the name of the object.
220 |         Returns:
221 |             object which has the same name as the given argument. 
222 | """ 223 | 224 | with open(self.current_path / Storage.DIR["OBJECT"] / name, 'rb') as f: 225 | data = pickle.load(f) 226 | return data 227 | -------------------------------------------------------------------------------- /examples/cifar10.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) 2019 Edvinas Byla 2 | # Licensed under MIT License 3 | 4 | import context 5 | import tensorflow as tf 6 | 7 | from deepswarm.backends import Dataset, TFKerasBackend 8 | from deepswarm.deepswarm import DeepSwarm 9 | 10 | # Load CIFAR-10 dataset 11 | cifar10 = tf.keras.datasets.cifar10 12 | (x_train, y_train), (x_test, y_test) = cifar10.load_data() 13 | # Convert class vectors to binary class matrices 14 | y_train = tf.keras.utils.to_categorical(y_train, 10) 15 | y_test = tf.keras.utils.to_categorical(y_test, 10) 16 | # Create dataset object, which controls all the data 17 | dataset = Dataset( 18 | training_examples=x_train, 19 | training_labels=y_train, 20 | testing_examples=x_test, 21 | testing_labels=y_test, 22 | validation_split=0.1, 23 | ) 24 | # Create backend responsible for training & validating 25 | backend = TFKerasBackend(dataset=dataset) 26 | # Create DeepSwarm object responsible for optimization 27 | deepswarm = DeepSwarm(backend=backend) 28 | # Find the topology for a given dataset 29 | topology = deepswarm.find_topology() 30 | # Evaluate discovered topology 31 | deepswarm.evaluate_topology(topology) 32 | # Train topology on augmented data for additional 50 epochs 33 | trained_topology = deepswarm.train_topology(topology, 50, augment={ 34 | 'rotation_range': 15, 35 | 'width_shift_range': 0.1, 36 | 'height_shift_range': 0.1, 37 | 'horizontal_flip': True, 38 | }) 39 | # Evaluate the final topology 40 | deepswarm.evaluate_topology(trained_topology) 41 | -------------------------------------------------------------------------------- /examples/context.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) 2019 Edvinas Byla 2 | # Licensed under MIT License 3 | 4 | import os 5 | import sys 6 | 7 | sys.path.append(os.path.dirname(os.path.dirname(os.path.realpath(__file__)))) 8 | -------------------------------------------------------------------------------- /examples/fashion-mnist.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) 2019 Edvinas Byla 2 | # Licensed under MIT License 3 | 4 | import context 5 | import tensorflow as tf 6 | 7 | from deepswarm.backends import Dataset, TFKerasBackend 8 | from deepswarm.deepswarm import DeepSwarm 9 | 10 | # Load Fashion MNIST dataset 11 | fashion_mnist = tf.keras.datasets.fashion_mnist 12 | (x_train, y_train), (x_test, y_test) = fashion_mnist.load_data() 13 | # Normalize and reshape data 14 | x_train, x_test = x_train / 255.0, x_test / 255.0 15 | x_train = x_train.reshape(x_train.shape[0], 28, 28, 1) 16 | x_test = x_test.reshape(x_test.shape[0], 28, 28, 1) 17 | # Create dataset object, which controls all the data 18 | normalized_dataset = Dataset( 19 | training_examples=x_train, 20 | training_labels=y_train, 21 | testing_examples=x_test, 22 | testing_labels=y_test, 23 | validation_split=0.1, 24 | ) 25 | # Create backend responsible for training & validating 26 | backend = TFKerasBackend(dataset=normalized_dataset) 27 | # Create DeepSwarm object responsible for optimization 28 | deepswarm = DeepSwarm(backend=backend) 29 | # Find the topology for a given dataset 30 | topology = 
deepswarm.find_topology() 31 | # Evaluate discovered topology 32 | deepswarm.evaluate_topology(topology) 33 | # Train topology for additional 50 epochs 34 | trained_topology = deepswarm.train_topology(topology, 50) 35 | # Evaluate the final topology 36 | deepswarm.evaluate_topology(trained_topology) 37 | -------------------------------------------------------------------------------- /examples/mnist.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) 2019 Edvinas Byla 2 | # Licensed under MIT License 3 | 4 | import context 5 | import tensorflow as tf 6 | 7 | from deepswarm.backends import Dataset, TFKerasBackend 8 | from deepswarm.deepswarm import DeepSwarm 9 | 10 | # Load MNIST dataset 11 | mnist = tf.keras.datasets.mnist 12 | (x_train, y_train), (x_test, y_test) = mnist.load_data() 13 | # Normalize and reshape data 14 | x_train, x_test = x_train / 255.0, x_test / 255.0 15 | x_train = x_train.reshape(x_train.shape[0], 28, 28, 1) 16 | x_test = x_test.reshape(x_test.shape[0], 28, 28, 1) 17 | # Create dataset object, which controls all the data 18 | normalized_dataset = Dataset( 19 | training_examples=x_train, 20 | training_labels=y_train, 21 | testing_examples=x_test, 22 | testing_labels=y_test, 23 | validation_split=0.1, 24 | ) 25 | # Create backend responsible for training & validating 26 | backend = TFKerasBackend(dataset=normalized_dataset) 27 | # Create DeepSwarm object responsible for optimization 28 | deepswarm = DeepSwarm(backend=backend) 29 | # Find the topology for a given dataset 30 | topology = deepswarm.find_topology() 31 | # Evaluate discovered topology 32 | deepswarm.evaluate_topology(topology) 33 | # Train topology for additional 30 epochs 34 | trained_topology = deepswarm.train_topology(topology, 30) 35 | # Evaluate the final topology 36 | deepswarm.evaluate_topology(trained_topology) 37 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | colorama==0.4.1 2 | pyyaml==5.1 3 | scikit-learn==0.20.3 -------------------------------------------------------------------------------- /settings/cifar10.yaml: -------------------------------------------------------------------------------- 1 | DeepSwarm: 2 | save_folder: 3 | metrics: accuracy 4 | max_depth: 20 5 | reuse_patience: 1 6 | 7 | aco: 8 | pheromone: 9 | start: 0.1 10 | decay: 0.1 11 | evaporation: 0.1 12 | verbose: False 13 | greediness: 0.5 14 | ant_count: 16 15 | 16 | backend: 17 | epochs: 20 18 | batch_size: 64 19 | patience: 5 20 | loss: categorical_crossentropy 21 | verbose: False 22 | 23 | spatial_nodes: [InputNode, Conv2DNode, DropoutSpatialNode, BatchNormalizationNode, Pool2DNode] 24 | flat_nodes: [FlattenNode, DenseNode, DropoutFlatNode, BatchNormalizationFlatNode] 25 | 26 | Nodes: 27 | 28 | InputNode: 29 | type: Input 30 | attributes: 31 | shape: [!!python/tuple [32, 32, 3]] 32 | transitions: 33 | Conv2DNode: 1.0 34 | 35 | Conv2DNode: 36 | type: Conv2D 37 | attributes: 38 | filter_count: [64, 128, 256] 39 | kernel_size: [1, 3, 5] 40 | activation: [ReLU] 41 | transitions: 42 | Conv2DNode: 0.8 43 | Pool2DNode: 1.2 44 | FlattenNode: 1.0 45 | DropoutSpatialNode: 1.1 46 | BatchNormalizationNode: 1.2 47 | 48 | DropoutSpatialNode: 49 | type: Dropout 50 | attributes: 51 | rate: [0.1, 0.3, 0.5] 52 | transitions: 53 | Conv2DNode: 1.1 54 | Pool2DNode: 1.0 55 | FlattenNode: 1.0 56 | BatchNormalizationNode: 1.1 57 | 58 | BatchNormalizationNode: 
59 | type: BatchNormalization 60 | attributes: {} 61 | transitions: 62 | Conv2DNode: 1.1 63 | Pool2DNode: 1.1 64 | DropoutSpatialNode: 1.0 65 | FlattenNode: 1.0 66 | 67 | Pool2DNode: 68 | type: Pool2D 69 | attributes: 70 | pool_type: [max, average] 71 | pool_size: [2] 72 | stride: [2, 3] 73 | transitions: 74 | Conv2DNode: 1.1 75 | FlattenNode: 1.0 76 | BatchNormalizationNode: 1.1 77 | 78 | FlattenNode: 79 | type: Flatten 80 | attributes: {} 81 | transitions: 82 | DenseNode: 1.0 83 | OutputNode: 0.8 84 | BatchNormalizationFlatNode: 0.9 85 | 86 | DenseNode: 87 | type: Dense 88 | attributes: 89 | output_size: [64, 128, 256] 90 | activation: [ReLU, Sigmoid] 91 | transitions: 92 | DenseNode: 0.8 93 | DropoutFlatNode: 1.2 94 | BatchNormalizationFlatNode: 1.2 95 | OutputNode: 1.0 96 | 97 | DropoutFlatNode: 98 | type: Dropout 99 | attributes: 100 | rate: [0.3, 0.5, 0.7] 101 | transitions: 102 | DenseNode: 1.0 103 | BatchNormalizationFlatNode: 1.0 104 | OutputNode: 0.9 105 | 106 | BatchNormalizationFlatNode: 107 | type: BatchNormalization 108 | attributes: {} 109 | transitions: 110 | DenseNode: 1.1 111 | DropoutFlatNode: 1.1 112 | OutputNode: 0.9 113 | 114 | OutputNode: 115 | type: Output 116 | attributes: 117 | output_size: [10] 118 | activation: [Softmax] 119 | transitions: {} 120 | -------------------------------------------------------------------------------- /settings/default.yaml: -------------------------------------------------------------------------------- 1 | DeepSwarm: 2 | save_folder: 3 | metrics: accuracy 4 | max_depth: 15 5 | reuse_patience: 1 6 | 7 | aco: 8 | pheromone: 9 | start: 0.1 10 | decay: 0.1 11 | evaporation: 0.1 12 | verbose: False 13 | greediness: 0.5 14 | ant_count: 16 15 | 16 | backend: 17 | epochs: 15 18 | batch_size: 64 19 | patience: 5 20 | loss: sparse_categorical_crossentropy 21 | verbose: False 22 | 23 | spatial_nodes: [InputNode, Conv2DNode, DropoutSpatialNode, BatchNormalizationNode, Pool2DNode] 24 | flat_nodes: [FlattenNode, DenseNode, DropoutFlatNode, BatchNormalizationFlatNode] 25 | 26 | Nodes: 27 | 28 | InputNode: 29 | type: Input 30 | attributes: 31 | shape: [!!python/tuple [28, 28, 1]] 32 | transitions: 33 | Conv2DNode: 1.0 34 | 35 | Conv2DNode: 36 | type: Conv2D 37 | attributes: 38 | filter_count: [32, 64, 128] 39 | kernel_size: [1, 3, 5] 40 | activation: [ReLU] 41 | transitions: 42 | Conv2DNode: 0.8 43 | Pool2DNode: 1.2 44 | FlattenNode: 1.0 45 | DropoutSpatialNode: 1.1 46 | BatchNormalizationNode: 1.2 47 | 48 | DropoutSpatialNode: 49 | type: Dropout 50 | attributes: 51 | rate: [0.1, 0.3] 52 | transitions: 53 | Conv2DNode: 1.1 54 | Pool2DNode: 1.0 55 | FlattenNode: 1.0 56 | BatchNormalizationNode: 1.1 57 | 58 | BatchNormalizationNode: 59 | type: BatchNormalization 60 | attributes: {} 61 | transitions: 62 | Conv2DNode: 1.1 63 | Pool2DNode: 1.1 64 | DropoutSpatialNode: 1.0 65 | FlattenNode: 1.0 66 | 67 | Pool2DNode: 68 | type: Pool2D 69 | attributes: 70 | pool_type: [max, average] 71 | pool_size: [2] 72 | stride: [2, 3] 73 | transitions: 74 | Conv2DNode: 1.1 75 | FlattenNode: 1.0 76 | BatchNormalizationNode: 1.1 77 | 78 | FlattenNode: 79 | type: Flatten 80 | attributes: {} 81 | transitions: 82 | DenseNode: 1.0 83 | OutputNode: 0.8 84 | BatchNormalizationFlatNode: 0.9 85 | 86 | DenseNode: 87 | type: Dense 88 | attributes: 89 | output_size: [64, 128] 90 | activation: [ReLU, Sigmoid] 91 | transitions: 92 | DenseNode: 0.8 93 | DropoutFlatNode: 1.2 94 | BatchNormalizationFlatNode: 1.2 95 | OutputNode: 1.0 96 | 97 | DropoutFlatNode: 98 | type: Dropout 99 | 
attributes: 100 | rate: [0.1, 0.3] 101 | transitions: 102 | DenseNode: 1.0 103 | BatchNormalizationFlatNode: 1.0 104 | OutputNode: 0.9 105 | 106 | BatchNormalizationFlatNode: 107 | type: BatchNormalization 108 | attributes: {} 109 | transitions: 110 | DenseNode: 1.1 111 | DropoutFlatNode: 1.1 112 | OutputNode: 0.9 113 | 114 | OutputNode: 115 | type: Output 116 | attributes: 117 | output_size: [10] 118 | activation: [Softmax] 119 | transitions: {} 120 | -------------------------------------------------------------------------------- /settings/fashion-mnist.yaml: -------------------------------------------------------------------------------- 1 | DeepSwarm: 2 | save_folder: 3 | metrics: accuracy 4 | max_depth: 15 5 | reuse_patience: 1 6 | 7 | aco: 8 | pheromone: 9 | start: 0.1 10 | decay: 0.1 11 | evaporation: 0.1 12 | verbose: False 13 | greediness: 0.5 14 | ant_count: 16 15 | 16 | backend: 17 | epochs: 20 18 | batch_size: 64 19 | patience: 5 20 | loss: sparse_categorical_crossentropy 21 | verbose: False 22 | 23 | spatial_nodes: [InputNode, Conv2DNode, DropoutSpatialNode, BatchNormalizationNode, Pool2DNode] 24 | flat_nodes: [FlattenNode, DenseNode, DropoutFlatNode, BatchNormalizationFlatNode] 25 | 26 | Nodes: 27 | 28 | InputNode: 29 | type: Input 30 | attributes: 31 | shape: [!!python/tuple [28, 28, 1]] 32 | transitions: 33 | Conv2DNode: 1.0 34 | 35 | Conv2DNode: 36 | type: Conv2D 37 | attributes: 38 | filter_count: [64, 128, 256] 39 | kernel_size: [1, 3, 5] 40 | activation: [ReLU] 41 | transitions: 42 | Conv2DNode: 0.8 43 | Pool2DNode: 1.2 44 | FlattenNode: 1.0 45 | DropoutSpatialNode: 1.1 46 | BatchNormalizationNode: 1.2 47 | 48 | DropoutSpatialNode: 49 | type: Dropout 50 | attributes: 51 | rate: [0.1, 0.3] 52 | transitions: 53 | Conv2DNode: 1.1 54 | Pool2DNode: 1.0 55 | FlattenNode: 1.0 56 | BatchNormalizationNode: 1.1 57 | 58 | BatchNormalizationNode: 59 | type: BatchNormalization 60 | attributes: {} 61 | transitions: 62 | Conv2DNode: 1.1 63 | Pool2DNode: 1.1 64 | DropoutSpatialNode: 1.0 65 | FlattenNode: 1.0 66 | 67 | Pool2DNode: 68 | type: Pool2D 69 | attributes: 70 | pool_type: [max, average] 71 | pool_size: [2] 72 | stride: [2, 3] 73 | transitions: 74 | Conv2DNode: 1.1 75 | FlattenNode: 1.0 76 | BatchNormalizationNode: 1.1 77 | 78 | FlattenNode: 79 | type: Flatten 80 | attributes: {} 81 | transitions: 82 | DenseNode: 1.0 83 | OutputNode: 0.8 84 | BatchNormalizationFlatNode: 0.9 85 | 86 | DenseNode: 87 | type: Dense 88 | attributes: 89 | output_size: [64, 128] 90 | activation: [ReLU, Sigmoid] 91 | transitions: 92 | DenseNode: 0.8 93 | DropoutFlatNode: 1.2 94 | BatchNormalizationFlatNode: 1.2 95 | OutputNode: 1.0 96 | 97 | DropoutFlatNode: 98 | type: Dropout 99 | attributes: 100 | rate: [0.1, 0.3] 101 | transitions: 102 | DenseNode: 1.0 103 | BatchNormalizationFlatNode: 1.0 104 | OutputNode: 0.9 105 | 106 | BatchNormalizationFlatNode: 107 | type: BatchNormalization 108 | attributes: {} 109 | transitions: 110 | DenseNode: 1.1 111 | DropoutFlatNode: 1.1 112 | OutputNode: 0.9 113 | 114 | OutputNode: 115 | type: Output 116 | attributes: 117 | output_size: [10] 118 | activation: [Softmax] 119 | transitions: {} 120 | -------------------------------------------------------------------------------- /settings/mnist.yaml: -------------------------------------------------------------------------------- 1 | DeepSwarm: 2 | save_folder: 3 | metrics: accuracy 4 | max_depth: 15 5 | reuse_patience: 1 6 | 7 | aco: 8 | pheromone: 9 | start: 0.1 10 | decay: 0.1 11 | evaporation: 0.1 12 | verbose: False 
13 | greediness: 0.5 14 | ant_count: 16 15 | 16 | backend: 17 | epochs: 15 18 | batch_size: 64 19 | patience: 5 20 | loss: sparse_categorical_crossentropy 21 | verbose: False 22 | 23 | spatial_nodes: [InputNode, Conv2DNode, DropoutSpatialNode, BatchNormalizationNode, Pool2DNode] 24 | flat_nodes: [FlattenNode, DenseNode, DropoutFlatNode, BatchNormalizationFlatNode] 25 | 26 | Nodes: 27 | 28 | InputNode: 29 | type: Input 30 | attributes: 31 | shape: [!!python/tuple [28, 28, 1]] 32 | transitions: 33 | Conv2DNode: 1.0 34 | 35 | Conv2DNode: 36 | type: Conv2D 37 | attributes: 38 | filter_count: [32, 64, 128] 39 | kernel_size: [1, 3, 5] 40 | activation: [ReLU] 41 | transitions: 42 | Conv2DNode: 0.8 43 | Pool2DNode: 1.2 44 | FlattenNode: 1.0 45 | DropoutSpatialNode: 1.1 46 | BatchNormalizationNode: 1.2 47 | 48 | DropoutSpatialNode: 49 | type: Dropout 50 | attributes: 51 | rate: [0.1, 0.3] 52 | transitions: 53 | Conv2DNode: 1.1 54 | Pool2DNode: 1.0 55 | FlattenNode: 1.0 56 | BatchNormalizationNode: 1.1 57 | 58 | BatchNormalizationNode: 59 | type: BatchNormalization 60 | attributes: {} 61 | transitions: 62 | Conv2DNode: 1.1 63 | Pool2DNode: 1.1 64 | DropoutSpatialNode: 1.0 65 | FlattenNode: 1.0 66 | 67 | Pool2DNode: 68 | type: Pool2D 69 | attributes: 70 | pool_type: [max, average] 71 | pool_size: [2] 72 | stride: [2, 3] 73 | transitions: 74 | Conv2DNode: 1.1 75 | FlattenNode: 1.0 76 | BatchNormalizationNode: 1.1 77 | 78 | FlattenNode: 79 | type: Flatten 80 | attributes: {} 81 | transitions: 82 | DenseNode: 1.0 83 | OutputNode: 0.8 84 | BatchNormalizationFlatNode: 0.9 85 | 86 | DenseNode: 87 | type: Dense 88 | attributes: 89 | output_size: [64, 128] 90 | activation: [ReLU, Sigmoid] 91 | transitions: 92 | DenseNode: 0.8 93 | DropoutFlatNode: 1.2 94 | BatchNormalizationFlatNode: 1.2 95 | OutputNode: 1.0 96 | 97 | DropoutFlatNode: 98 | type: Dropout 99 | attributes: 100 | rate: [0.1, 0.3] 101 | transitions: 102 | DenseNode: 1.0 103 | BatchNormalizationFlatNode: 1.0 104 | OutputNode: 0.9 105 | 106 | BatchNormalizationFlatNode: 107 | type: BatchNormalization 108 | attributes: {} 109 | transitions: 110 | DenseNode: 1.1 111 | DropoutFlatNode: 1.1 112 | OutputNode: 0.9 113 | 114 | OutputNode: 115 | type: Output 116 | attributes: 117 | output_size: [10] 118 | activation: [Softmax] 119 | transitions: {} 120 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | import setuptools 2 | 3 | with open("README.md", "r") as fh: 4 | long_description = fh.read() 5 | 6 | setuptools.setup( 7 | name="deepswarm", 8 | version="0.0.10", 9 | author="Edvinas Byla", 10 | author_email="edvinasbyla@gmail.com", 11 | description="Neural Architecture Search Powered by Swarm Intelligence", 12 | long_description=long_description, 13 | long_description_content_type="text/markdown", 14 | url="https://github.com/Pattio/DeepSwarm", 15 | packages=setuptools.find_packages(), 16 | package_data={'deepswarm': ['../settings/default.yaml']}, 17 | install_requires=[ 18 | 'colorama==0.4.1', 19 | 'pyyaml==5.1', 20 | 'scikit-learn==0.20.3', 21 | ], 22 | classifiers=[ 23 | "Programming Language :: Python :: 3.6", 24 | "License :: OSI Approved :: MIT License", 25 | "Operating System :: OS Independent", 26 | ], 27 | ) 28 | -------------------------------------------------------------------------------- /tests/test_aco.py: -------------------------------------------------------------------------------- 1 | import math 2 | import 
unittest 3 | 4 | from deepswarm import cfg 5 | from deepswarm.aco import ACO, Ant 6 | 7 | 8 | class TestACO(unittest.TestCase): 9 | 10 | def setUp(self): 11 | self.aco = ACO(None, None) 12 | 13 | def test_ant_init(self): 14 | # Test if the ant is initialized properly 15 | ant = Ant() 16 | self.assertEqual(ant.loss, math.inf) 17 | self.assertEqual(ant.accuracy, 0.0) 18 | self.assertEqual(ant.path, []) 19 | if cfg['metrics'] == 'loss': 20 | self.assertEqual(ant.cost, ant.loss) 21 | else: 22 | self.assertEqual(ant.cost, ant.accuracy) 23 | 24 | def test_ant_init_with_path(self): 25 | # Test if the ant is initialized properly when a path is given 26 | self.aco.graph.increase_depth() 27 | path = self.aco.graph.generate_path(self.aco.aco_select) 28 | ant = Ant(path) 29 | self.assertEqual(ant.loss, math.inf) 30 | self.assertEqual(ant.accuracy, 0.0) 31 | self.assertEqual(ant.path, path) 32 | 33 | def test_ant_comparison(self): 34 | # Test if ants are compared properly 35 | ant_1 = Ant() 36 | ant_1.accuracy = 0.8 37 | ant_2 = Ant() 38 | ant_2.loss = 0.8 39 | self.assertTrue(ant_2 < ant_1) 40 | 41 | def test_local_update(self): 42 | # Test if the local update rule works properly 43 | new_value = self.aco.local_update(11.23, None) 44 | self.assertEqual(new_value, 10.117) 45 | 46 | def test_global_update(self): 47 | # Test if the global update rule works properly 48 | new_value = self.aco.global_update(11.23, 13.79) 49 | self.assertEqual(new_value, 11.486) 50 | 51 | def test_pheromone_update(self): 52 | # Test pheromone update 53 | self.aco.graph.increase_depth() 54 | self.aco.graph.increase_depth() 55 | path = self.aco.graph.generate_path(self.aco.aco_select) 56 | ant = Ant(path) 57 | self.aco.update_pheromone(ant, self.aco.local_update) 58 | self.aco.update_pheromone(ant, self.aco.global_update) 59 | -------------------------------------------------------------------------------- /tests/test_graph.py: -------------------------------------------------------------------------------- 1 | import unittest 2 | 3 | from deepswarm.aco import Graph 4 | from deepswarm.nodes import Node 5 | 6 | 7 | class TestGraph(unittest.TestCase): 8 | 9 | def setUp(self): 10 | self.graph = Graph() 11 | 12 | def test_graph_init(self): 13 | # Test if the newly created graph contains the input node 14 | self.assertEqual(len(self.graph.topology), 1) 15 | self.assertEqual(self.graph.current_depth, 1) 16 | input_node = self.graph.input_node 17 | self.assertIs(self.graph.topology[0][input_node.name], input_node) 18 | 19 | def test_depth_increase(self): 20 | # Test if the depth is increased correctly 21 | self.assertEqual(self.graph.current_depth, 1) 22 | self.graph.increase_depth() 23 | self.assertEqual(self.graph.current_depth, 2) 24 | 25 | def test_path_generation(self): 26 | # Create a rule which selects the first available node 27 | def select_rule(neighbours): 28 | return neighbours[0].node 29 | 30 | # Generate the path 31 | path = self.graph.generate_path(select_rule) 32 | # Test if the path is not empty 33 | self.assertNotEqual(path, []) 34 | # Test if the path starts with an input node 35 | self.assertEqual(path[0].type, 'Input') 36 | # Test if the path ends with an output node 37 | self.assertEqual(path[-1].type, 'Output') 38 | 39 | def test_path_completion(self): 40 | # Create a path containing only the input node 41 | old_path = [self.graph.input_node] 42 | # Complete that path 43 | new_path = self.graph.complete_path(old_path) 44 | # Test if the path starts with an input node 45 | self.assertEqual(new_path[0].type, 'Input') 46 | # Test if 
the path ends with an output node 47 | self.assertEqual(new_path[-1].type, 'Output') 48 | 49 | def test_node_retrieval(self): 50 | # Test if the newly created graph contains the input node 51 | self.assertEqual(len(self.graph.topology), 1) 52 | # Retrieve the first available transition from the input node 53 | available_transition = self.graph.input_node.available_transitions[0] 54 | # Use its name to initialize a Node object 55 | available_transition_name = available_transition[0] 56 | available_transition_node = Node(available_transition_name) 57 | self.graph.get_node(available_transition_node, 1) 58 | # Test if the graph's depth increased after adding a new node 59 | self.assertEqual(len(self.graph.topology), 2) 60 | # Test if the node was added correctly 61 | self.assertIs(self.graph.topology[1][available_transition_name], available_transition_node) 62 | 63 | def test_node_expansion(self): 64 | # Test if the input node was not expanded yet 65 | input_node = self.graph.input_node 66 | self.assertFalse(input_node.is_expanded) 67 | self.assertEqual(input_node.neighbours, []) 68 | # Try to expand it 69 | has_neighbours = self.graph.has_neighbours(input_node, 0) 70 | # Test if the input node was expanded successfully 71 | self.assertTrue(input_node.is_expanded) 72 | # Test if the input node has neighbours 73 | self.assertTrue(has_neighbours) 74 | self.assertNotEqual(input_node.neighbours, []) 75 | # Test if the neighbour node was added to the topology 76 | neighbour_node = input_node.neighbours[0].node 77 | self.assertIs(self.graph.topology[1][neighbour_node.name], neighbour_node) 78 | -------------------------------------------------------------------------------- /tests/test_nodes.py: -------------------------------------------------------------------------------- 1 | import unittest 2 | 3 | from deepswarm.nodes import Node, NodeAttribute, NeighbourNode 4 | 5 | 6 | class TestNodes(unittest.TestCase): 7 | 8 | def setUp(self): 9 | self.input_node = Node.create_using_type('Input') 10 | 11 | def test_create_using_type(self): 12 | # Test default values 13 | self.assertEqual(self.input_node.neighbours, []) 14 | self.assertEqual(self.input_node.type, 'Input') 15 | self.assertFalse(self.input_node.is_expanded) 16 | self.assertNotEqual(self.input_node.attributes, []) 17 | self.assertNotEqual(self.input_node.available_transitions, []) 18 | # Test if the generated description is correct 19 | description = self.input_node.name + '(' + 'shape:' + str(self.input_node.shape) + ')' 20 | self.assertEqual(description, str(self.input_node)) 21 | 22 | def test_init(self): 23 | # Test if you can create a node just by using its name 24 | input_node_new = Node(self.input_node.name) 25 | self.assertEqual(input_node_new.type, self.input_node.type) 26 | 27 | def test_deepcopy(self): 28 | # Test if the copied object is an instance of Node 29 | input_node_copy = self.input_node.create_deepcopy() 30 | self.assertIsInstance(input_node_copy, Node) 31 | # Test if unnecessary attributes were removed 32 | self.assertNotEqual(input_node_copy.available_transitions, self.input_node.available_transitions) 33 | # Test if unnecessary attributes are empty arrays 34 | self.assertEqual(input_node_copy.neighbours, []) 35 | self.assertEqual(input_node_copy.available_transitions, []) 36 | 37 | def test_available_transition(self): 38 | # Retrieve the first available transition 39 | available_transition = self.input_node.available_transitions[0] 40 | # Use its name to initialize a Node object 41 | available_transition_name = available_transition[0] 42 | available_transition_node 
= Node(available_transition_name) 43 | self.assertIsInstance(available_transition_node, Node) 44 | # Check if the node was properly initialized 45 | self.assertNotEqual(available_transition_node.attributes, []) 46 | self.assertNotEqual(available_transition_node.available_transitions, []) 47 | # Check if the available transition contains a heuristic value 48 | self.assertIsInstance(available_transition[1], float) 49 | 50 | def test_custom_attribute_selection(self): 51 | # Initialize a node which connects to the input node 52 | node = Node(self.input_node.available_transitions[0][0]) 53 | # For each attribute select the first available value 54 | node.select_custom_attributes(lambda values: values[0][0]) 55 | # Collect the selected values 56 | old_attribute_values = [getattr(node, attribute.name) for attribute in node.attributes] 57 | # For each attribute select the second value if one is available 58 | node.select_custom_attributes(lambda values: values[1][0] if len(values) > 1 else values[0][0]) 59 | # Collect the newly selected values 60 | new_attribute_values = [getattr(node, attribute.name) for attribute in node.attributes] 61 | # The newly selected values should be different from the old values 62 | self.assertNotEqual(old_attribute_values, new_attribute_values) 63 | 64 | def test_adding_neighbour_node(self): 65 | # Find the first available transition 66 | transition_name, transition_pheromone = self.input_node.available_transitions[0] 67 | # Initialize a node object using the transition's name 68 | node = Node(transition_name) 69 | # Create a NeighbourNode object 70 | neighbour_node = NeighbourNode(node, transition_pheromone) 71 | # Check if the NeighbourNode object was created properly 72 | self.assertIsInstance(neighbour_node.node, Node) 73 | self.assertIsInstance(neighbour_node.heuristic, float) 74 | self.assertIsInstance(neighbour_node.pheromone, float) 75 | # Add the NeighbourNode object to the neighbours list 76 | self.input_node.neighbours.append(neighbour_node) 77 | 78 | def test_node_attributes_init(self): 79 | # Create a test attribute 80 | attribute_name = 'filter_count' 81 | attribute_values = [16, 32, 64] 82 | attribute = NodeAttribute(attribute_name, attribute_values) 83 | # Check if the attribute name was set correctly 84 | self.assertEqual(attribute.name, attribute_name) 85 | # Check if each attribute value was added to the dictionary 86 | for attribute_value in attribute_values: 87 | self.assertIn(attribute_value, attribute.dict) 88 | # Gather all unique pheromone values 89 | pheromone_values = list(set(attribute.dict.values())) 90 | # Because the NodeAttribute object was just initialized and no changes to 91 | # pheromone values were performed, all pheromone values must be the same, 92 | # meaning that pheromone_values must contain only 1 element 93 | self.assertEqual(len(pheromone_values), 1) 94 | --------------------------------------------------------------------------------
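
To make the weight-reuse bookkeeping in `deepswarm/storage.py` concrete, here is a small, self-contained sketch that replays the `Storage.hash_path` logic outside the library: every prefix of an architecture path gets its own SHA3-256 digest, which is why `save_model` can register a model under all of its sub-paths and `load_model` can walk the hashes backwards to reuse the longest already-trained prefix. `FakeNode` and the three-node path below are hypothetical stand-ins for real `Node` objects, not part of the library's API.

```python
import hashlib


class FakeNode:
    """Hypothetical stand-in for deepswarm.nodes.Node; only __str__ matters here."""

    def __init__(self, description):
        self.description = description

    def __str__(self):
        return self.description


def hash_path(path):
    # Mirrors Storage.hash_path: one digest per path prefix
    hashes = []
    path_description = str(path[0])
    for node in path[1:]:
        path_description += ' -> %s' % (node)
        current_hash = hashlib.sha3_256(path_description.encode('utf-8')).hexdigest()
        hashes.append(current_hash)
    return (path_description, hashes)


path = [
    FakeNode('InputNode(shape:(28, 28, 1))'),
    FakeNode('Conv2DNode(filter_count:32, kernel_size:3, activation:ReLU)'),
    FakeNode('FlattenNode()'),
]
description, hashes = hash_path(path)
print(description)  # InputNode(...) -> Conv2DNode(...) -> FlattenNode()
print(len(hashes))  # 2 -- hashes[-1] covers the whole path, as path_hashes[-1] does in save_model
```

Note that the input node alone never gets a hash: the first digest covers `Input -> Conv2D`, and the final digest always describes the complete path, which is the hash that `save_model` stores the model under.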