├── .gitignore
├── LICENSE
├── Makefile
├── README.md
├── deepswarm
│   ├── __init__.py
│   ├── aco.py
│   ├── backends.py
│   ├── deepswarm.py
│   ├── log.py
│   ├── nodes.py
│   └── storage.py
├── examples
│   ├── cifar10.py
│   ├── context.py
│   ├── fashion-mnist.py
│   └── mnist.py
├── requirements.txt
├── settings
│   ├── cifar10.yaml
│   ├── default.yaml
│   ├── fashion-mnist.yaml
│   └── mnist.yaml
├── setup.py
└── tests
    ├── test_aco.py
    ├── test_graph.py
    └── test_nodes.py
/.gitignore:
--------------------------------------------------------------------------------
1 | # macOS
2 | .DS_Store
3 |
4 | # Python
5 | *.pyc
6 | __pycache__/
7 | deepswarm-env/
8 |
9 | # Build files
10 | build/
11 | *.egg-info/
12 | dist/
13 |
14 | # Log files
15 | saves/
16 |
17 | # Temporary files
18 | temp-model
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2018 Edvinas Byla
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
1 | .PHONY: test clean upload
2 |
3 | test:
4 | python -m unittest discover tests
5 |
6 | clean:
7 | rm -rf build *.egg-info dist
8 | find . -name '*.pyc' -exec rm -f {} +
9 | find . -name '*.pyo' -exec rm -f {} +
10 | find . -name '*~' -exec rm -f {} +
11 |
12 | upload: clean
13 | python setup.py sdist bdist_wheel
14 | twine upload dist/*
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
5 |
6 | Neural Architecture Search Powered by Swarm Intelligence 🐜
7 |
8 |
9 |
10 | # DeepSwarm [Python 3.6](https://www.python.org/downloads/release/python-360/) [TensorFlow](https://www.tensorflow.org/)
11 |
12 | DeepSwarm is an open-source library which uses Ant Colony Optimization to tackle the neural architecture search problem. The main goal of DeepSwarm is to automate one of the most tedious and daunting tasks, so people can spend more of their time on more important and interesting things. DeepSwarm offers a powerful configuration system which allows you to fine-tune the search space to your needs.
13 |
14 | ## Example 🖼
15 |
16 | ```python
17 | from deepswarm.backends import Dataset, TFKerasBackend
18 | from deepswarm.deepswarm import DeepSwarm
19 |
20 | dataset = Dataset(training_examples=x_train, training_labels=y_train, testing_examples=x_test, testing_labels=y_test)
21 | backend = TFKerasBackend(dataset=dataset)
22 | deepswarm = DeepSwarm(backend=backend)
23 | topology = deepswarm.find_topology()
24 | trained_topology = deepswarm.train_topology(topology, 50)
25 |
26 | ```
27 |
28 | ## Installation 💾
29 |
30 | 1. Install the package
31 |
32 | ```sh
33 | pip install deepswarm
34 | ```
35 | 2. Install one of the implemented backends that you want to use
36 |
37 | ```sh
38 | pip install tensorflow-gpu==1.13.1
39 | ```
40 |
41 | ## Usage 🕹
42 |
43 | 1. Create a new file containing the example code
44 |
45 | ```sh
46 | touch train.py
47 | ```
48 | 2. Create a settings directory which contains the `default.yaml` file. Alternatively, you can run the script and stop it straight away, as this should automatically create a settings directory containing the `default.yaml` file
49 |
50 | 3. Update the newly created YAML file to fit your dataset. The only two important changes you must make are: (1) change the loss function to reflect your task, and (2) change the shape of the input and output nodes (a sketch of such a settings file follows this README listing)
51 |
52 |
53 | ## Search 🔎
54 |
55 |
56 |
57 |
58 |
59 | (1) The ant is placed on the input node. (2) The ant checks what transitions are available. (3) The ant uses the ACS selection rule to choose the next node. (4) After choosing the next node the ant selects the node's attributes. (5) After all ants have finished their tours the pheromone is updated. (6) The maximum allowed depth is increased and the new ant population is generated.
60 |
61 | Note: Arrow thickness indicates the pheromone amount, meaning that thicker arrows have more pheromone.
62 |
63 | ## Configuration 🛠
64 |
65 | | Node type | Attributes |
66 | | :------------- |:-------------|
67 | | Input | **shape**: tuple which defines the input shape, depending on the backend could be (width, height, channels) or (channels, width, height). |
68 | | Conv2D | **filter_count**: defines how many filters can be used. <br> **kernel_size**: defines what size kernels can be used. For example, if it is set to [1, 3], then only 1x1 and 3x3 kernels will be used. <br> **activation**: defines what activation functions can be used. Allowed values are: ReLU, ELU, LeakyReLU, Sigmoid and Softmax. |
69 | | Dropout | **rate**: defines the allowed dropout rates. For example, if it is set to [0.1, 0.3], then either 10% or 30% of input units will be dropped. |
70 | | BatchNormalization | - |
71 | | Pool2D | **pool_type**: defines the types of allowed pooling nodes. Allowed values are: max (max pooling) and average (average pooling). <br> **pool_size**: defines the allowed pooling window sizes. For example, if it is set to [2], then only 2x2 pooling windows will be used. <br> **stride**: defines the allowed stride sizes. |
72 | | Flatten | - |
73 | | Dense | **output_size**: defines the allowed output space dimensionality. <br> **activation**: defines what activation functions can be used. Allowed values are: ReLU, ELU, LeakyReLU, Sigmoid and Softmax. |
74 | | Output | **output_size**: defines the output size (how many different classes to classify). <br> **activation**: defines what activation functions can be used. Allowed values are: ReLU, ELU, LeakyReLU, Sigmoid and Softmax. |
75 |
76 | | Setting | Description |
77 | | :------------- |:-------------|
78 | | save_folder | Specifies the name of the folder which should be used to load the backup. If not specified the search will start from zero. |
79 | | metrics | Specifies what metric the algorithm should use to evaluate the models. Currently available options are: accuracy and loss. |
80 | | max_depth | Specifies the maximum allowed network depth (how deeply the graph can be expanded). The search is performed until the maximum depth is reached. However, it does not mean that the depth of the best architecture will be equal to the max_depth. |
81 | | reuse_patience | Specifies the maximum number of times that weights can be reused without improving the cost. For example, if it is set to 1 it means that when some model X reuses weights from model Y and model X cost did not improve compared to model Y, next time instead of reusing model Y weights, new random weights will be generated.|
82 | | start | Specifies the starting pheromone value for all the new connections. |
83 | | decay | Specifies the local pheromone decay rate in percentage. For example, if it is set to 0.1 it means that during the local pheromone update the pheromone value will be decreased by 10%. |
84 | | evaporation | Specifies the global pheromone evaporation rate in percentage. For example, if it is set to 0.1 it means that during the global pheromone update the pheromone value will be decreased by 10%. |
85 | | greediness | Specifies how greedy ants should be during the edge selection (the value is given as a fraction). For example, 0.5 means that 50% of the time when an ant selects a new edge it should select the one with the highest associated probability. |
86 | | ant_count | Specifies how many ants should be generated during each generation (time before the depth is increased). |
87 | | epochs | Specifies for how many epochs each candidate architecture should be trained. |
88 | | batch_size | Specifies the batch size (number of samples used to calculate a single gradient step) used during the training process. |
89 | | patience | Specifies the early stopping patience used during the training (after how many epochs without improvement the training process should be stopped). |
90 | | loss | Specifies what loss function should be used during the training. Currently available options are sparse_categorical_crossentropy and categorical_crossentropy. |
91 | | spatial_nodes | Specifies which nodes are placed before the flattening node. Values in this array must correspond to node names. |
92 | | flat_nodes | Specifies which nodes are placed after the flattening node (array should also include the flattening node). Values in this array must correspond to node names. |
93 | | verbose | Specifies if the associated component should log the output. |
94 |
95 | ## Future goals 🌟
96 |
97 | - [ ] Add a node which can combine the input from the two previous nodes.
98 | - [ ] Add a node which can skip the depth n in order to connect to the node in depth n+1.
99 | - [ ] Delete the models which are not referenced anymore.
100 | - [ ] Add an option to assemble the best n models into one model.
101 | - [ ] Add functionality to reuse the weights from the non-contiguous blocks, i.e. take the best weights for depth n-1 from one model and then take the best weights for depth n+1 from another model.
102 |
103 | ## Citation 🖋
104 |
105 | Online version is available at: [arXiv:1905.07350](https://arxiv.org/abs/1905.07350)
106 | ```bibtex
107 | @article{byla2019deepswarm,
108 | title = {DeepSwarm: Optimising Convolutional Neural Networks using Swarm Intelligence},
109 | author = {Edvinas Byla and Wei Pang},
110 | journal = {arXiv preprint arXiv:1905.07350},
111 | year = {2019}
112 | }
113 | ```
114 |
115 | ## Acknowledgments 🎓
116 |
117 | DeepSwarm was developed under the supervision of [Dr Wei Pang](https://www.abdn.ac.uk/ncs/people/profiles/pang.wei) in partial fulfilment of the requirements for the degree of Bachelor of Science of the [University of Aberdeen](https://www.abdn.ac.uk).
118 |
--------------------------------------------------------------------------------
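To make step 3 of the Usage section concrete, here is a minimal sketch of a settings file. The section and key names mirror the lookups in `deepswarm/__init__.py` and the configuration tables above, but the node names and all values are illustrative assumptions, not the contents of the shipped `default.yaml`:

```yaml
# Illustrative sketch only; values and node names are assumptions, and the
# full node list with its transitions is abridged.
DeepSwarm:
    save_folder:
    metrics: accuracy
    max_depth: 15
    reuse_patience: 1
    aco:
        ant_count: 16
        greediness: 0.5
        pheromone:
            start: 0.1
            decay: 0.1
            evaporation: 0.1
            verbose: false
    backend:
        epochs: 15
        batch_size: 64
        patience: 5
        loss: sparse_categorical_crossentropy     # (1) change to reflect your task
        verbose: false
    spatial_nodes: [InputNode, Conv2DNode]
    flat_nodes: [FlattenNode, DenseNode, OutputNode]

Nodes:
    InputNode:
        type: Input
        attributes:
            shape: [!!python/tuple [28, 28, 1]]   # (2) change to your input shape
        transitions:
            Conv2DNode: 1.0
    OutputNode:
        type: Output
        attributes:
            output_size: [10]                     # (2) change to your class count
            activation: [Softmax]
        transitions: {}
```

Note that `deepswarm/__init__.py` loads this file with PyYAML's full `Loader`, so Python-specific tags such as `!!python/tuple` can be used for the input shape.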
/deepswarm/__init__.py:
--------------------------------------------------------------------------------
1 | # Copyright (c) 2019 Edvinas Byla
2 | # Licensed under MIT License
3 |
4 | import argparse
5 | import operator
6 | import os
7 | import sys
8 |
9 | from pathlib import Path
10 | from shutil import copyfile
11 | from yaml import load, Loader
12 |
13 |
14 | # Create argument parser which allows users to pass a custom settings file name
15 | # If the user didn't pass a custom settings file name, use sys.argv[0] instead
16 | parser = argparse.ArgumentParser()
17 | parser.add_argument('-s', '--settings_file_name', default=os.path.basename(sys.argv[0]),
18 |                     help='Settings file name. The default value is the name of the invoked script without the .py extension')
19 | args, _ = parser.parse_known_args()
20 |
21 | # Retrieve filename without the extension
22 | filename = os.path.splitext(args.settings_file_name)[0]
23 |
24 | # If mnist.yaml doesn't exist it means that the package was installed via pip in
25 | # which case we should use the current working directory as the base path
26 | base_path = Path(os.path.dirname(os.path.dirname(__file__)))
27 | if not (base_path / 'settings' / 'mnist.yaml').exists():
28 | module_path = base_path
29 |
30 | # Change the base path to the current working directory
31 | base_path = Path(os.getcwd())
32 | settings_directory = (base_path / 'settings')
33 |
34 | # Create settings directory if it doesn't exist
35 | if not settings_directory.exists():
36 | settings_directory.mkdir()
37 |
38 | # If default settings file doesn't exist, copy one from the module directory
39 | module_default_config = module_path / 'settings/default.yaml'
40 | settings_default_config = settings_directory / 'default.yaml'
41 | if not settings_default_config.exists() and module_default_config.exists():
42 | copyfile(module_default_config, settings_default_config)
43 |
44 | # As the base path is now configured we try to load configuration file
45 | # associated with the filename
46 | settings_directory = base_path / 'settings'
47 | settings_file_path = Path(settings_directory, filename).with_suffix('.yaml')
48 |
49 | # If the file doesn't exist fallback to the default settings file
50 | if not settings_file_path.exists():
51 | settings_file_path = Path(settings_directory, 'default').with_suffix('.yaml')
52 |
53 | # Read settings file
54 | with open(settings_file_path, 'r') as settings_file:
55 | settings = load(settings_file, Loader=Loader)
56 |
57 | # Add script name to settings, so it's added to the log
58 | settings['script'] = os.path.basename(sys.argv[0])
59 | settings['settings_file'] = str(settings_file_path)
60 |
61 | # Create convenient variables
62 | cfg = settings["DeepSwarm"]
63 | nodes = settings["Nodes"]
64 | left_cost_is_better = operator.le if cfg['metrics'] == 'loss' else operator.ge
65 |
--------------------------------------------------------------------------------
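For illustration, the file-resolution rule implemented above can be summarised in a few lines (a hypothetical helper, not part of the package): running `python mnist.py` loads `settings/mnist.yaml`, running `python train.py -s cifar10` loads `settings/cifar10.yaml`, and anything without a matching file falls back to `settings/default.yaml`.

```python
from pathlib import Path

# Hypothetical helper mirroring the lookup performed in deepswarm/__init__.py.
def resolve_settings(script_name, settings_dir=Path('settings')):
    candidate = (settings_dir / Path(script_name).stem).with_suffix('.yaml')
    return candidate if candidate.exists() else settings_dir / 'default.yaml'
```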
/deepswarm/aco.py:
--------------------------------------------------------------------------------
1 | # Copyright (c) 2019 Edvinas Byla
2 | # Licensed under MIT License
3 |
4 | import math
5 | import random
6 |
7 | from . import cfg, left_cost_is_better
8 | from .log import Log
9 | from .nodes import Node, NeighbourNode
10 |
11 |
12 | class ACO:
13 | """Class responsible for performing Ant Colony Optimization."""
14 |
15 | def __init__(self, backend, storage):
16 | self.graph = Graph()
17 | self.current_depth = 0
18 | self.backend = backend
19 | self.storage = storage
20 |
21 | def search(self):
22 | """Performs neural architecture search using Ant colony optimization.
23 |
24 | Returns:
25 | ant which found the best network topology.
26 | """
27 |
28 | # Generate random ant only if the search started from zero
29 | if not self.storage.loaded_from_save:
30 | Log.header("STARTING ACO SEARCH", type="GREEN")
31 | self.best_ant = Ant(self.graph.generate_path(self.random_select))
32 | self.best_ant.evaluate(self.backend, self.storage)
33 | Log.info(self.best_ant)
34 | else:
35 | Log.header("RESUMING ACO SEARCH", type="GREEN")
36 |
37 | while self.graph.current_depth <= cfg['max_depth']:
38 | Log.header("Current search depth is %i" % self.graph.current_depth, type="GREEN")
39 | ants = self.generate_ants()
40 |
41 | # Sort ants using user selected metric
42 | ants.sort() if cfg['metrics'] == 'loss' else ants.sort(reverse=True)
43 |
44 | # Update the best ant if new better ant is found
45 | if left_cost_is_better(ants[0].cost, self.best_ant.cost):
46 | self.best_ant = ants[0]
47 | Log.header("NEW BEST ANT FOUND", type="GREEN")
48 |
49 | # Log best ant information
50 | Log.header("BEST ANT DURING ITERATION")
51 | Log.info(self.best_ant)
52 |
53 | # Perform global pheromone update
54 | self.update_pheromone(ant=self.best_ant, update_rule=self.global_update)
55 |
56 | # Print pheromone information and increase the graph's depth
57 | self.graph.show_pheromone()
58 | self.graph.increase_depth()
59 |
60 | # Perform a backup
61 | self.storage.perform_backup()
62 | return self.best_ant
63 |
64 | def generate_ants(self):
65 | """Generates a new ant population.
66 |
67 | Returns:
68 | list containing different evaluated ants.
69 | """
70 |
71 | ants = []
72 | for ant_number in range(cfg['aco']['ant_count']):
73 | Log.header("GENERATING ANT %i" % (ant_number + 1))
74 | ant = Ant()
75 | # Generate ant's path using ACO selection rule
76 | ant.path = self.graph.generate_path(self.aco_select)
77 |             # Evaluate how good the new path is
78 | ant.evaluate(self.backend, self.storage)
79 | ants.append(ant)
80 | Log.info(ant)
81 | # Perform local pheromone update
82 | self.update_pheromone(ant=ant, update_rule=self.local_update)
83 | return ants
84 |
85 | def random_select(self, neighbours):
86 | """Randomly selects one neighbour node and its attributes.
87 |
88 | Args:
89 | neighbours [NeighbourNode]: list of neighbour nodes.
90 | Returns:
91 | a randomly selected neighbour node.
92 | """
93 |
94 | current_node = random.choice(neighbours).node
95 | current_node.select_random_attributes()
96 | return current_node
97 |
98 | def aco_select(self, neighbours):
99 | """Selects one neighbour node and its attributes using ACO selection rule.
100 |
101 | Args:
102 | neighbours [NeighbourNode]: list of neighbour nodes.
103 | Returns:
104 | selected neighbour node.
105 | """
106 |
107 | # Transform a list of NeighbourNode objects to list of tuples
108 | # (Node, pheromone, heuristic)
109 | tuple_neighbours = [(n.node, n.pheromone, n.heuristic) for n in neighbours]
110 | # Select node using ant colony selection rule
111 | current_node = self.aco_select_rule(tuple_neighbours)
112 | # Select custom attributes using ant colony selection rule
113 | current_node.select_custom_attributes(self.aco_select_rule)
114 | return current_node
115 |
116 | def aco_select_rule(self, neighbours):
117 |         """Selects a neighbour using the ACO transition rule.
118 |
119 | Args:
120 | neighbours [(Object, float, float)]: list of tuples, where each tuple
121 | contains: an object to be selected, object's pheromone value and
122 | object's heuristic value.
123 | Returns:
124 | selected object.
125 | """
126 |
127 | probabilities = []
128 | denominator = 0.0
129 |
130 | # Calculate probability for each neighbour
131 | for (_, pheromone, heuristic) in neighbours:
132 | probability = pheromone * heuristic
133 | probabilities.append(probability)
134 | denominator += probability
135 |
136 | # Try to perform greedy select: exploitation
137 | random_variable = random.uniform(0, 1)
138 | if random_variable <= cfg['aco']['greediness']:
139 | # Find max probability
140 | max_probability = max(probabilities)
141 | # Gather the indices of probabilities that are equal to the max probability
142 | max_indices = [i for i, j in enumerate(probabilities) if j == max_probability]
143 | # From those max indices select random index
144 | neighbour_index = random.choice(max_indices)
145 | return neighbours[neighbour_index][0]
146 |
147 | # Otherwise perform select using roulette wheel: exploration
148 |         probabilities = [x / denominator for x in probabilities]
149 |         probability_sum = sum(probabilities)
150 |         random_threshold = random.uniform(0, probability_sum)
151 |         current_value = 0
152 |         for neighbour_index, probability in enumerate(probabilities):
153 |             current_value += probability
154 |             if current_value > random_threshold:
155 |                 return neighbours[neighbour_index][0]
156 |
157 | def update_pheromone(self, ant, update_rule):
158 | """Updates the pheromone using given update rule.
159 |
160 | Args:
161 | ant: ant which should perform the pheromone update.
162 | update_rule: function which takes pheromone value and ant's cost,
163 | and returns a new pheromone value.
164 | """
165 |
166 | current_node = self.graph.input_node
167 | # Skip the input node as it's not connected to any previous node
168 | for node in ant.path[1:]:
169 | # Use a node from the path to retrieve its corresponding instance from the graph
170 | neighbour = next((x for x in current_node.neighbours if x.node.name == node.name), None)
171 |
172 | # If the path was closed using complete_path method, ignore the rest of the path
173 | if neighbour is None:
174 | break
175 |
176 | # Update pheromone connecting to a neighbour
177 | neighbour.pheromone = update_rule(
178 | old_value=neighbour.pheromone,
179 | cost=ant.cost
180 | )
181 |
182 | # Update attribute's pheromone values
183 | for attribute in neighbour.node.attributes:
184 | # Find what attribute value was used for node
185 | attribute_value = getattr(node, attribute.name)
186 | # Retrieve pheromone for that value
187 | old_pheromone_value = attribute.dict[attribute_value]
188 | # Update pheromone
189 | attribute.dict[attribute_value] = update_rule(
190 | old_value=old_pheromone_value,
191 | cost=ant.cost
192 | )
193 |
194 | # Advance the current node
195 | current_node = neighbour.node
196 |
197 | def local_update(self, old_value, cost):
198 | """Performs local pheromone update."""
199 |
200 | decay = cfg['aco']['pheromone']['decay']
201 | pheromone_0 = cfg['aco']['pheromone']['start']
202 | return (1 - decay) * old_value + (decay * pheromone_0)
203 |
204 | def global_update(self, old_value, cost):
205 | """Performs global pheromone update."""
206 |
207 | # Calculate solution cost based on metrics
208 | added_pheromone = (1 / (cost * 10)) if cfg['metrics'] == 'loss' else cost
209 | evaporation = cfg['aco']['pheromone']['evaporation']
210 | return (1 - evaporation) * old_value + (evaporation * added_pheromone)
211 |
212 | def __getstate__(self):
213 | d = dict(self.__dict__)
214 | del d['backend']
215 | return d
216 |
217 |
218 | class Ant:
219 | """Class responsible for representing the ant."""
220 |
221 | def __init__(self, path=[]):
222 | self.path = path
223 | self.loss = math.inf
224 | self.accuracy = 0.0
225 | self.path_description = None
226 | self.path_hash = None
227 |
228 | def evaluate(self, backend, storage):
229 | """Evaluates how good ant's path is.
230 |
231 | Args:
232 | backend: Backend object.
233 | storage: Storage object.
234 | """
235 |
236 | # Extract path information
237 | self.path_description, path_hashes = storage.hash_path(self.path)
238 | self.path_hash = path_hashes[-1]
239 |
240 |         # Check if the model already exists; if so, just re-use it
241 | existing_model, existing_model_hash = storage.load_model(backend, path_hashes, self.path)
242 | if existing_model is None:
243 | # Generate model
244 | new_model = backend.generate_model(self.path)
245 | else:
246 | # Re-use model
247 | new_model = existing_model
248 |
249 | # Train model
250 | new_model = backend.train_model(new_model)
251 | # Evaluate model
252 | self.loss, self.accuracy = backend.evaluate_model(new_model)
253 |
254 | # If the new model was created from the older model, record older model progress
255 | if existing_model_hash is not None:
256 | storage.record_model_performance(existing_model_hash, self.cost)
257 |
258 | # Save model
259 | storage.save_model(backend, new_model, path_hashes, self.cost)
260 |
261 | @property
262 | def cost(self):
263 | """Returns value which represents ant's cost."""
264 |
265 | return self.loss if cfg['metrics'] == 'loss' else self.accuracy
266 |
267 | def __lt__(self, other):
268 | return self.cost < other.cost
269 |
270 | def __str__(self):
271 | return "======= \n Ant: %s \n Loss: %f \n Accuracy: %f \n Path: %s \n Hash: %s \n=======" % (
272 | hex(id(self)),
273 | self.loss,
274 | self.accuracy,
275 | self.path_description,
276 | self.path_hash,
277 | )
278 |
279 |
280 | class Graph:
281 | """Class responsible for representing the graph."""
282 |
283 | def __init__(self, current_depth=0):
284 | self.topology = []
285 | self.current_depth = current_depth
286 | self.input_node = self.get_node(Node.create_using_type('Input'), current_depth)
287 | self.increase_depth()
288 |
289 | def get_node(self, node, depth):
290 | """Tries to retrieve a given node from the graph. If the node does not
291 | exist then the node is inserted into the graph before being retrieved.
292 |
293 | Args:
294 | node: Node which should be found in the graph.
295 | depth: depth at which the node should be stored.
296 | """
297 |
298 |         # If we are trying to insert the node into a non-existent layer, we pad the
299 |         # topology by adding empty dictionaries, until the required depth is reached
300 | while depth > (len(self.topology) - 1):
301 | self.topology.append({})
302 |
303 | # If the node already exists return it, otherwise add it to the topology first
304 | return self.topology[depth].setdefault(node.name, node)
305 |
306 | def increase_depth(self):
307 | """Increases the depth of the graph."""
308 |
309 | self.current_depth += 1
310 |
311 | def generate_path(self, select_rule):
312 | """Generates path through the graph based on given selection rule.
313 |
314 | Args:
315 |             select_rule ([NeighbourNode]): function which receives a list of
316 | neighbours.
317 |
318 | Returns:
319 | a path which contains Node objects.
320 | """
321 |
322 | current_node = self.input_node
323 | path = [current_node.create_deepcopy()]
324 | for depth in range(self.current_depth):
325 | # If the node doesn't have any neighbours stop expanding the path
326 | if not self.has_neighbours(current_node, depth):
327 | break
328 |
329 | # Select node using given rule
330 | current_node = select_rule(current_node.neighbours)
331 | # Add only the copy of the node, so that original stays unmodified
332 | path.append(current_node.create_deepcopy())
333 |
334 | completed_path = self.complete_path(path)
335 | return completed_path
336 |
337 | def has_neighbours(self, node, depth):
338 | """Checks if the node has any neighbours.
339 |
340 | Args:
341 | node: Node that needs to be checked.
342 | depth: depth at which the node is stored in the graph.
343 |
344 | Returns:
345 | a boolean value which indicates if the node has any neighbours.
346 | """
347 |
348 | # Expand only if it hasn't been expanded
349 | if node.is_expanded is False:
350 | available_transitions = node.available_transitions
351 | for (transition_name, heuristic_value) in available_transitions:
352 | neighbour_node = self.get_node(Node(transition_name), depth + 1)
353 | node.neighbours.append(NeighbourNode(neighbour_node, heuristic_value))
354 | node.is_expanded = True
355 |
356 | # Return value indicating if the node has neighbours after being expanded
357 | return len(node.neighbours) > 0
358 |
359 | def complete_path(self, path):
360 | """Completes the path if it is not fully completed (i.e. missing OutputNode).
361 |
362 | Args:
363 | path [Node]: list of nodes defining the path.
364 |
365 | Returns:
366 | completed path which contains list of nodes.
367 | """
368 |
369 | # If the path is not completed, then complete it and return completed path
370 | # We intentionally don't add these ending nodes as neighbours to the last node
371 | # in the path, because during the first few iterations these nodes will always be part
372 |         # of the best path (as it's impossible to close the path automatically when it's so short);
373 |         # this would result in biased pheromone being received by these nodes during later iterations
374 | if path[-1].name in cfg['spatial_nodes']:
375 | path.append(self.get_node(Node.create_using_type('Flatten'), len(path)))
376 | if path[-1].name in cfg['flat_nodes']:
377 | path.append(self.get_node(Node.create_using_type('Output'), len(path)))
378 | return path
379 |
380 | def show_pheromone(self):
381 | """Logs the pheromone information for the graph."""
382 |
383 | # If the output is disabled by the user then don't log the pheromone
384 | if cfg['aco']['pheromone']['verbose'] is False:
385 | return
386 |
387 | Log.header("PHEROMONE START", type="RED")
388 | for idx, layer in enumerate(self.topology):
389 | info = []
390 | for node in layer.values():
391 | for neighbour in node.neighbours:
392 | info.append("%s [%s] -> %f -> %s [%s]" % (node.name, hex(id(node)),
393 | neighbour.pheromone, neighbour.node.name, hex(id(neighbour.node))))
394 |
395 | # If neighbour node doesn't have any attributes skip attribute info
396 | if not neighbour.node.attributes:
397 | continue
398 |
399 | info.append("\t%s [%s]:" % (neighbour.node.name, hex(id(neighbour.node))))
400 | for attribute in neighbour.node.attributes:
401 | info.append("\t\t%s: %s" % (attribute.name, attribute.dict))
402 | if info:
403 | Log.header("Layer %d" % (idx + 1))
404 | Log.info('\n'.join(info))
405 | Log.header("PHEROMONE END", type="RED")
406 |
--------------------------------------------------------------------------------
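A toy walk-through of the selection and update rules above, with invented numbers (three neighbours, `greediness: 0.5`, and the tuple layout that `aco_select_rule` expects):

```python
import random

# Toy neighbours as (object, pheromone, heuristic) tuples; values are invented.
neighbours = [('Conv2D', 0.6, 1.0), ('Pool2D', 0.3, 1.0), ('Dropout', 0.1, 1.0)]
products = [pheromone * heuristic for (_, pheromone, heuristic) in neighbours]

if random.uniform(0, 1) <= 0.5:  # exploitation branch (probability = greediness)
    # Greedy: pick the neighbour with the highest pheromone * heuristic ('Conv2D').
    choice = neighbours[products.index(max(products))][0]
else:
    # Exploration: roulette wheel over the normalised products, i.e.
    # P(Conv2D) = 0.6, P(Pool2D) = 0.3, P(Dropout) = 0.1.
    choice = random.choices([name for (name, _, _) in neighbours], weights=products, k=1)[0]

# Pheromone arithmetic for the same assumed settings (start = 0.1, decay = 0.1,
# evaporation = 0.1): a local update maps a pheromone value p to 0.9 * p + 0.01,
# while a global update with an accuracy cost of 0.9 maps p to 0.9 * p + 0.09.
```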
/deepswarm/backends.py:
--------------------------------------------------------------------------------
1 | # Copyright (c) 2019 Edvinas Byla
2 | # Licensed under MIT License
3 |
4 | import os
5 | import tensorflow as tf
6 | import time
7 |
8 | from abc import ABC, abstractmethod
9 | from sklearn.model_selection import train_test_split
10 | from tensorflow.keras import backend as K
11 |
12 | from . import cfg
13 |
14 |
15 | class Dataset:
16 | """Class responsible for encapsulating all the required data."""
17 |
18 | def __init__(self, training_examples, training_labels, testing_examples, testing_labels,
19 | validation_data=None, validation_split=0.1):
20 | self.x_train = training_examples
21 | self.y_train = training_labels
22 | self.x_test = testing_examples
23 | self.y_test = testing_labels
24 | self.validation_data = validation_data
25 | self.validation_split = validation_split
26 |
27 |
28 | class BaseBackend(ABC):
29 | """Abstract class used to define Backend API."""
30 |
31 | def __init__(self, dataset, optimizer=None):
32 | self.dataset = dataset
33 | self.optimizer = optimizer
34 |
35 | @abstractmethod
36 | def generate_model(self, path):
37 | """Create and return a backend model representation.
38 |
39 | Args:
40 | path [Node]: list of nodes where each node represents a single
41 |             network layer, the path starts with InputNode and ends with OutputNode.
42 | Returns:
43 | model which represents neural network structure in the implemented
44 | backend, this model can be evaluated using evaluate_model method.
45 | """
46 |
47 | @abstractmethod
48 | def reuse_model(self, old_model, new_model_path, distance):
49 | """Create a new model by reusing layers (and their weights) from the old model.
50 |
51 | Args:
52 | old_model: old model which represents neural network structure.
53 | new_model_path [Node]: path representing new model.
54 | distance (int): distance which shows how many layers from old model need
55 | to be removed in order to create a base for new model i.e. if old model is
56 | NodeA->NodeB->NodeC->NodeD and new model is NodeA->NodeB->NodeC->NodeE,
57 | distance = 1.
58 | Returns:
59 | model which represents neural network structure.
60 | """
61 |
62 | @abstractmethod
63 | def train_model(self, model):
64 | """Train model which was created using generate_model method.
65 |
66 | Args:
67 | model: model which represents neural network structure.
68 | Returns:
69 | model which represents neural network structure.
70 | """
71 |
72 | @abstractmethod
73 | def fully_train_model(self, model, epochs, augment):
74 | """Fully trains the model without early stopping. At the end of the
75 | training, the model with the best performing weights on the validation
76 | set is returned.
77 |
78 | Args:
79 | model: model which represents neural network structure.
80 |             epochs (int): for how many epochs to train the model.
81 | augment (kwargs): augmentation arguments.
82 | Returns:
83 | model which represents neural network structure.
84 | """
85 |
86 | @abstractmethod
87 | def evaluate_model(self, model):
88 | """Evaluate model which was created using generate_model method.
89 |
90 | Args:
91 | model: model which represents neural network structure.
92 | Returns:
93 | loss & accuracy tuple.
94 | """
95 |
96 | @abstractmethod
97 | def save_model(self, model, path):
98 | """Saves model on disk.
99 |
100 | Args:
101 | model: model which represents neural network structure.
102 | path: string which represents model location.
103 | """
104 |
105 | @abstractmethod
106 | def load_model(self, path):
107 |         """Loads model from disk; in case of failure returns None.
108 | 
109 |         Args:
110 |             path: string which represents model location.
111 |         Returns:
112 |             model: model which represents neural network structure, or None
113 |             in case of failure.
114 | """
115 |
116 | @abstractmethod
117 | def free_gpu(self):
118 | """Frees GPU memory."""
119 |
120 |
121 | class TFKerasBackend(BaseBackend):
122 | """Backend based on TensorFlow Keras API"""
123 |
124 | def __init__(self, dataset, optimizer=None):
125 | # If the user passes custom optimizer we serialize it, as reusing the
126 | # same optimizer instance causes crash in TensorFlow 1.13.1, see issue
127 | # https://github.com/Pattio/DeepSwarm/issues/3
128 | if optimizer is not None:
129 | optimizer = tf.keras.optimizers.serialize(optimizer)
130 |
131 | super().__init__(dataset, optimizer)
132 | self.data_format = K.image_data_format()
133 |
134 | def generate_model(self, path):
135 | # Create an input layer
136 | input_layer = self.create_layer(path[0])
137 | layer = input_layer
138 |
139 | # Convert each node to layer and then connect it to the previous layer
140 | for node in path[1:]:
141 | layer = self.create_layer(node)(layer)
142 |
143 | # Return generated model
144 | model = tf.keras.Model(inputs=input_layer, outputs=layer)
145 | self.compile_model(model)
146 | return model
147 |
148 | def reuse_model(self, old_model, new_model_path, distance):
149 | # Find the starting point of the new model
150 | starting_point = len(new_model_path) - distance
151 | last_layer = old_model.layers[starting_point - 1].output
152 |
153 | # Append layers from the new model to the old model
154 | for node in new_model_path[starting_point:]:
155 | last_layer = self.create_layer(node)(last_layer)
156 |
157 | # Return new model
158 | model = tf.keras.Model(inputs=old_model.inputs, outputs=last_layer)
159 | self.compile_model(model)
160 | return model
161 |
162 | def compile_model(self, model):
163 | optimizer_parameters = {
164 | 'optimizer': 'adam',
165 | 'loss': cfg['backend']['loss'],
166 | 'metrics': ['accuracy'],
167 | }
168 |
169 | # If user specified custom optimizer, use it instead of the default one
170 | # we also need to deserialize optimizer as it was serialized during init
171 | if self.optimizer is not None:
172 | optimizer_parameters['optimizer'] = tf.keras.optimizers.deserialize(self.optimizer)
173 | model.compile(**optimizer_parameters)
174 |
175 | def create_layer(self, node):
176 | # Workaround to prevent Keras from throwing an exception ("All layer
177 | # names should be unique.") It happens when new layers are appended to
178 | # an existing model, but Keras fails to increment repeating layer names
179 | # i.e. conv_1 -> conv_2
180 | parameters = {'name': str(time.time())}
181 |
182 | if node.type == 'Input':
183 | parameters['shape'] = node.shape
184 | return tf.keras.Input(**parameters)
185 |
186 | if node.type == 'Conv2D':
187 | parameters.update({
188 | 'filters': node.filter_count,
189 | 'kernel_size': node.kernel_size,
190 | 'padding': 'same',
191 | 'data_format': self.data_format,
192 | 'activation': self.map_activation(node.activation),
193 | })
194 | return tf.keras.layers.Conv2D(**parameters)
195 |
196 | if node.type == 'Pool2D':
197 | parameters.update({
198 | 'pool_size': node.pool_size,
199 | 'strides': node.stride,
200 | 'padding': 'same',
201 | 'data_format': self.data_format,
202 | })
203 | if node.pool_type == 'max':
204 | return tf.keras.layers.MaxPooling2D(**parameters)
205 | elif node.pool_type == 'average':
206 | return tf.keras.layers.AveragePooling2D(**parameters)
207 |
208 | if node.type == 'BatchNormalization':
209 | return tf.keras.layers.BatchNormalization(**parameters)
210 |
211 | if node.type == 'Flatten':
212 | return tf.keras.layers.Flatten(**parameters)
213 |
214 | if node.type == 'Dense':
215 | parameters.update({
216 | 'units': node.output_size,
217 | 'activation': self.map_activation(node.activation),
218 | })
219 | return tf.keras.layers.Dense(**parameters)
220 |
221 | if node.type == 'Dropout':
222 | parameters.update({
223 | 'rate': node.rate,
224 | })
225 | return tf.keras.layers.Dropout(**parameters)
226 |
227 | if node.type == 'Output':
228 | parameters.update({
229 | 'units': node.output_size,
230 | 'activation': self.map_activation(node.activation),
231 | })
232 | return tf.keras.layers.Dense(**parameters)
233 |
234 | raise Exception('Not handled node type: %s' % str(node))
235 |
236 | def map_activation(self, activation):
237 | if activation == "ReLU":
238 | return tf.keras.activations.relu
239 | if activation == "ELU":
240 | return tf.keras.activations.elu
241 | if activation == "LeakyReLU":
242 | return tf.nn.leaky_relu
243 | if activation == "Sigmoid":
244 | return tf.keras.activations.sigmoid
245 | if activation == "Softmax":
246 | return tf.keras.activations.softmax
247 | raise Exception('Not handled activation: %s' % str(activation))
248 |
249 | def train_model(self, model):
250 | # Create a checkpoint path
251 | checkpoint_path = 'temp-model'
252 |
253 | # Setup training parameters
254 | fit_parameters = {
255 | 'x': self.dataset.x_train,
256 | 'y': self.dataset.y_train,
257 | 'epochs': cfg['backend']['epochs'],
258 | 'batch_size': cfg['backend']['batch_size'],
259 | 'callbacks': [
260 | self.create_early_stop_callback(),
261 | self.create_checkpoint_callback(checkpoint_path),
262 | ],
263 | 'validation_split': self.dataset.validation_split,
264 | 'verbose': cfg['backend']['verbose'],
265 | }
266 |
267 | # If validation data is given then override validation_split
268 | if self.dataset.validation_data is not None:
269 | fit_parameters['validation_data'] = self.dataset.validation_data
270 |
271 | # Train model
272 | model.fit(**fit_parameters)
273 |
274 | # Load model from checkpoint
275 | checkpoint_model = self.load_model(checkpoint_path)
276 | # Delete checkpoint
277 | if os.path.isfile(checkpoint_path):
278 | os.remove(checkpoint_path)
279 | # Return checkpoint model if it exists, otherwise return trained model
280 | return checkpoint_model if checkpoint_model is not None else model
281 |
282 | def fully_train_model(self, model, epochs, augment):
283 | # Setup validation data
284 | if self.dataset.validation_data is not None:
285 | x_val, y_val = self.dataset.validation_data
286 | x_train, y_train = self.dataset.x_train, self.dataset.y_train
287 | else:
288 | x_train, x_val, y_train, y_val = train_test_split(
289 | self.dataset.x_train,
290 | self.dataset.y_train,
291 | test_size=self.dataset.validation_split,
292 | )
293 |
294 | # Create checkpoint path
295 | checkpoint_path = 'temp-model'
296 |
297 | # Create and fit data generator
298 | datagen = tf.keras.preprocessing.image.ImageDataGenerator(**augment)
299 | datagen.fit(x_train)
300 |
301 | # Train model
302 | model.fit_generator(
303 | generator=datagen.flow(x_train, y_train, batch_size=cfg['backend']['batch_size']),
304 | steps_per_epoch=len(self.dataset.x_train) / cfg['backend']['batch_size'],
305 | epochs=epochs,
306 | callbacks=[self.create_checkpoint_callback(checkpoint_path)],
307 | validation_data=(x_val, y_val),
308 | verbose=cfg['backend']['verbose'],
309 | )
310 |
311 | # Load model from checkpoint
312 | checkpoint_model = self.load_model(checkpoint_path)
313 | # Delete checkpoint
314 | if os.path.isfile(checkpoint_path):
315 | os.remove(checkpoint_path)
316 | # Return checkpoint model if it exists, otherwise return trained model
317 | return checkpoint_model if checkpoint_model is not None else model
318 |
319 | def create_early_stop_callback(self):
320 | early_stop_parameters = {
321 | 'patience': cfg['backend']['patience'],
322 | 'verbose': cfg['backend']['verbose'],
323 | 'restore_best_weights': True,
324 | }
325 | early_stop_parameters['monitor'] = 'val_loss' if cfg['metrics'] == 'loss' else 'val_acc'
326 | return tf.keras.callbacks.EarlyStopping(**early_stop_parameters)
327 |
328 | def create_checkpoint_callback(self, checkpoint_path):
329 | checkpoint_parameters = {
330 | 'filepath': checkpoint_path,
331 | 'verbose': cfg['backend']['verbose'],
332 | 'save_best_only': True,
333 | }
334 | checkpoint_parameters['monitor'] = 'val_loss' if cfg['metrics'] == 'loss' else 'val_acc'
335 | return tf.keras.callbacks.ModelCheckpoint(**checkpoint_parameters)
336 |
337 | def evaluate_model(self, model):
338 | loss, accuracy = model.evaluate(
339 | x=self.dataset.x_test,
340 | y=self.dataset.y_test,
341 | verbose=cfg['backend']['verbose']
342 | )
343 | return (loss, accuracy)
344 |
345 | def save_model(self, model, path):
346 | model.save(path)
347 | self.free_gpu()
348 |
349 | def load_model(self, path):
350 | try:
351 | model = tf.keras.models.load_model(path)
352 | return model
353 |         except Exception:
354 | return None
355 |
356 | def free_gpu(self):
357 | K.clear_session()
358 |
--------------------------------------------------------------------------------
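A short usage sketch for the backend above. The dataset variables are assumed to be loaded already, and the SGD optimizer is just an example; the point is that a custom optimizer instance is serialized in `__init__` and deserialized in `compile_model`, per the TensorFlow 1.13.1 workaround referenced in the code:

```python
import tensorflow as tf
from deepswarm.backends import Dataset, TFKerasBackend

# x_train, y_train, x_test, y_test are assumed to exist (e.g. from tf.keras.datasets).
dataset = Dataset(training_examples=x_train, training_labels=y_train,
                  testing_examples=x_test, testing_labels=y_test)

# Passing a custom optimizer is optional; when omitted, compile_model uses 'adam'.
backend = TFKerasBackend(dataset=dataset,
                         optimizer=tf.keras.optimizers.SGD(lr=0.01, momentum=0.9))
```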
/deepswarm/deepswarm.py:
--------------------------------------------------------------------------------
1 | # Copyright (c) 2019 Edvinas Byla
2 | # Licensed under MIT License
3 |
4 | from . import settings, left_cost_is_better
5 | from .aco import ACO
6 | from .log import Log
7 | from .storage import Storage
8 |
9 |
10 | class DeepSwarm:
11 | """Class responsible for providing user facing interface."""
12 |
13 | def __init__(self, backend):
14 | self.backend = backend
15 | self.storage = Storage(self)
16 |
17 | # Enable logging and log current settings
18 | self.setup_logging()
19 |
20 | # Try to load from the backup and restore backend as it was not saved
21 | if self.storage.loaded_from_save:
22 | self.__dict__ = self.storage.backup.__dict__
23 | self.backend = backend
24 | self.aco.backend = backend
25 |
26 | def setup_logging(self):
27 | """Enables logging and logs current settings."""
28 |
29 | Log.enable(self.storage)
30 | Log.header("DeepSwarm settings")
31 | Log.info(settings)
32 |
33 | def find_topology(self):
34 | """Finds the best neural network topology.
35 |
36 | Returns:
37 | network model in the format of backend which was used during
38 | initialization.
39 | """
40 |
41 | # Create a new object only if there are no backups
42 | if not self.storage.loaded_from_save:
43 | self.aco = ACO(backend=self.backend, storage=self.storage)
44 |
45 | best_ant = self.aco.search()
46 | best_model = self.storage.load_specified_model(self.backend, best_ant.path_hash)
47 | return best_model
48 |
49 | def train_topology(self, model, epochs, augment={}):
50 | """Trains given neural network topology for a specified number of epochs.
51 |
52 | Args:
53 | model: model which represents neural network structure.
54 |             epochs (int): for how many epochs to train the model.
55 | augment (kwargs): augmentation arguments.
56 | Returns:
57 | network model in the format of backend which was used during
58 | initialization.
59 | """
60 |
61 | # Before training make a copy of old weights in case performance
62 | # degrades during the training
63 | loss, accuracy = self.backend.evaluate_model(model)
64 | old_weights = model.get_weights()
65 |
66 | # Train the network
67 | model_name = 'best-trained-topology'
68 | trained_topology = self.backend.fully_train_model(model, epochs, augment)
69 | loss_new, accuracy_new = self.backend.evaluate_model(trained_topology)
70 |
71 | # Setup the metrics
72 | if settings['DeepSwarm']['metrics'] == 'loss':
73 | metrics_old = loss
74 | metrics_new = loss_new
75 | else:
76 | metrics_old = accuracy
77 | metrics_new = accuracy_new
78 |
79 | # Restore the weights if performance did not improve
80 | if left_cost_is_better(metrics_old, metrics_new):
81 | trained_topology.set_weights(old_weights)
82 |
83 | # Save and return the best topology
84 | self.storage.save_specified_model(self.backend, model_name, trained_topology)
85 | return self.storage.load_specified_model(self.backend, model_name)
86 |
87 | def evaluate_topology(self, model):
88 | """Evaluates neural network performance."""
89 |
90 | Log.header('EVALUATING PERFORMANCE ON TEST SET')
91 | loss, accuracy = self.backend.evaluate_model(model)
92 | Log.info('Accuracy is %f and loss is %f' % (accuracy, loss))
93 |
94 | def __getstate__(self):
95 | d = dict(self.__dict__)
96 | del d['backend']
97 | return d
98 |
--------------------------------------------------------------------------------
/deepswarm/log.py:
--------------------------------------------------------------------------------
1 | # Copyright (c) 2019 Edvinas Byla
2 | # Licensed under MIT License
3 |
4 | import json
5 | import logging
6 | import re
7 |
8 | from colorama import init as colorama_init
9 | from colorama import Fore, Back, Style
10 |
11 |
12 | class Log:
13 | """Class responsible for logging information."""
14 |
15 | # Define header styles
16 | HEADER_W = [Fore.BLACK, Back.WHITE, Style.BRIGHT]
17 | HEADER_R = [Fore.WHITE, Back.RED, Style.BRIGHT]
18 | HEADER_G = [Fore.WHITE, Back.GREEN, Style.BRIGHT]
19 |
20 | @classmethod
21 | def enable(cls, storage):
22 | """Initializes the logger.
23 |
24 | Args:
25 | storage: Storage object.
26 | """
27 |
28 | # Init colorama to enable colors
29 | colorama_init()
30 | # Get deepswarm logger
31 | cls.logger = logging.getLogger("deepswarm")
32 |
33 | # Create stream handler
34 | stream_handler = logging.StreamHandler()
35 |         stream_formatter = logging.Formatter("%(message)s")
36 |         stream_handler.setFormatter(stream_formatter)
37 | # Add stream handler to logger
38 | cls.logger.addHandler(stream_handler)
39 |
40 | # Create and setup file handler
41 | file_handler = logging.FileHandler(storage.current_path / "deepswarm.log")
42 |         file_formatter = FileFormatter("%(asctime)s\n%(message)s")
43 |         file_handler.setFormatter(file_formatter)
44 |         # Add file handler to logger
45 | cls.logger.addHandler(file_handler)
46 |
47 | # Set logger level to debug
48 | cls.logger.setLevel(logging.DEBUG)
49 |
50 | @classmethod
51 | def header(cls, message, type="WHITE"):
52 | if type == "RED":
53 | options = cls.HEADER_R
54 | elif type == "GREEN":
55 | options = cls.HEADER_G
56 | else:
57 | options = cls.HEADER_W
58 |
59 | cls.info(message.center(80, '-'), options)
60 |
61 | @classmethod
62 |     def debug(cls, message, options=[Fore.CYAN]):
63 |         formatted_message = cls.create_message(message, options)
64 |         cls.logger.debug(formatted_message)
65 | 
66 |     @classmethod
67 |     def info(cls, message, options=[Fore.GREEN]):
68 |         formatted_message = cls.create_message(message, options)
69 |         cls.logger.info(formatted_message)
70 | 
71 |     @classmethod
72 |     def warning(cls, message, options=[Fore.YELLOW]):
73 |         formatted_message = cls.create_message(message, options)
74 |         cls.logger.warning(formatted_message)
75 | 
76 |     @classmethod
77 |     def error(cls, message, options=[Fore.MAGENTA]):
78 |         formatted_message = cls.create_message(message, options)
79 |         cls.logger.error(formatted_message)
80 | 
81 |     @classmethod
82 |     def critical(cls, message, options=[Fore.RED, Style.BRIGHT]):
83 |         formatted_message = cls.create_message(message, options)
84 |         cls.logger.critical(formatted_message)
85 |
86 | @classmethod
87 | def create_message(cls, message, options):
88 | # Convert dictionary to nicely formatted JSON
89 | if isinstance(message, dict):
90 | message = json.dumps(message, indent=4, sort_keys=True)
91 |
92 | # Convert all objects that are not strings to strings
93 | if isinstance(message, str) is False:
94 | message = str(message)
95 |
96 | return ''.join(options) + message + '\033[0m'
97 |
98 |
99 | class FileFormatter(logging.Formatter):
100 | """Class responsible for removing ANSI characters from the log file."""
101 |
102 | def plain(self, string):
103 | # Regex code adapted from Martijn Pieters https://stackoverflow.com/a/14693789
104 | ansi_escape = re.compile(r'\x1B\[[0-?]*[ -/]*[@-~]|[-]{2,}')
105 | return ansi_escape.sub('', string)
106 |
107 | def format(self, record):
108 | message = super(FileFormatter, self).format(record)
109 | plain_message = self.plain(message)
110 | separator = '=' * 80
111 | return ''.join((separator, "\n", plain_message, "\n", separator))
112 |
--------------------------------------------------------------------------------
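For illustration, how the logger above is driven once `Log.enable(storage)` has been called (DeepSwarm does this in `setup_logging`); the messages below are invented:

```python
from deepswarm.log import Log

# Assumes Log.enable(storage) has already attached the stream and file handlers.
Log.header("GENERATING ANT 1")              # centred header, black-on-white by default
Log.info({'loss': 0.05, 'accuracy': 0.98})  # dicts are pretty-printed as JSON
Log.warning("Failed to load checkpoint")    # coloured in the console, while
                                            # FileFormatter strips the ANSI codes
                                            # before writing to deepswarm.log
```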
/deepswarm/nodes.py:
--------------------------------------------------------------------------------
1 | # Copyright (c) 2019 Edvinas Byla
2 | # Licensed under MIT License
3 |
4 | import copy
5 | import random
6 |
7 | from . import cfg, nodes
8 |
9 |
10 | class NodeAttribute:
11 | """Class responsible for encapsulating Node's attribute."""
12 |
13 | def __init__(self, name, options):
14 | self.name = name
15 | self.dict = {option: cfg['aco']['pheromone']['start'] for option in options}
16 |
17 |
18 | class NeighbourNode:
19 | """Class responsible for encapsulating Node's neighbour."""
20 |
21 | def __init__(self, node, heuristic, pheromone=cfg['aco']['pheromone']['start']):
22 | self.node = node
23 | self.heuristic = heuristic
24 | self.pheromone = pheromone
25 |
26 |
27 | class Node:
28 | """Class responsible for representing Node."""
29 |
30 | def __init__(self, name):
31 | self.name = name
32 | self.neighbours = []
33 | self.is_expanded = False
34 | self.type = nodes[self.name]['type']
35 | self.setup_attributes()
36 | self.setup_transitions()
37 | self.select_random_attributes()
38 |
39 | @classmethod
40 | def create_using_type(cls, type):
41 | """Create Node's instance using given type.
42 |
43 | Args:
44 | type (str): type defined in .yaml file.
45 | Returns:
46 | Node's instance.
47 | """
48 |
49 | for node in nodes:
50 | if nodes[node]['type'] == type:
51 | return cls(node)
52 | raise Exception('Type does not exist: %s' % str(type))
53 |
54 | def setup_attributes(self):
55 | """Adds attributes from the settings file."""
56 |
57 | self.attributes = []
58 | for attribute_name in nodes[self.name]['attributes']:
59 | attribute_value = nodes[self.name]['attributes'][attribute_name]
60 | self.attributes.append(NodeAttribute(attribute_name, attribute_value))
61 |
62 | def setup_transitions(self):
63 | """Adds transitions from the settings file."""
64 |
65 | self.available_transitions = []
66 | for transition_name in nodes[self.name]['transitions']:
67 | heuristic_value = nodes[self.name]['transitions'][transition_name]
68 | self.available_transitions.append((transition_name, heuristic_value))
69 |
70 | def select_attributes(self, custom_select):
71 | """Selects attributes using a given select rule.
72 |
73 | Args:
74 | custom_select: select function which takes dictionary containing
75 | (attribute, value) pairs and returns selected value.
76 | """
77 |
78 | selected_attributes = {}
79 | for attribute in self.attributes:
80 | value = custom_select(attribute.dict)
81 | selected_attributes[attribute.name] = value
82 |
83 | # For each selected attribute create class attribute
84 | for key, value in selected_attributes.items():
85 | setattr(self, key, value)
86 |
87 | def select_custom_attributes(self, custom_select):
88 | """Wraps select_attributes method by converting the attribute dictionary
89 | to list of tuples (attribute_value, pheromone, heuristic).
90 |
91 | Args:
92 | custom_select: selection function which takes a list of tuples
93 | containing (attribute_value, pheromone, heuristic).
94 | """
95 |
96 | # Define a function which transforms attributes before selecting them
97 | def select_transformed_custom_attributes(attribute_dictionary):
98 | # Convert to list of tuples containing (attribute_value, pheromone, heuristic)
99 | values = [(value, pheromone, 1.0) for value, pheromone in attribute_dictionary.items()]
100 | # Return value, which was selected using custom select
101 | return custom_select(values)
102 | self.select_attributes(select_transformed_custom_attributes)
103 |
104 | def select_random_attributes(self):
105 | """Selects random attributes."""
106 |
107 | self.select_attributes(lambda dict: random.choice(list(dict.keys())))
108 |
109 | def create_deepcopy(self):
110 | """Returns a newly created copy of Node object."""
111 |
112 | return copy.deepcopy(self)
113 |
114 | def __deepcopy__(self, memo):
115 | cls = self.__class__
116 | result = cls.__new__(cls)
117 | memo[id(self)] = result
118 | for k, v in self.__dict__.items():
119 | # Skip unnecessary stuff in order to make copying more efficient
120 | if k in ["neighbours", "available_transitions"]:
121 | v = []
122 | setattr(result, k, copy.deepcopy(v, memo))
123 | return result
124 |
125 | def __str__(self):
126 | attributes = ', '.join([a.name + ":" + str(getattr(self, a.name)) for a in self.attributes])
127 | return self.name + "(" + attributes + ")"
128 |
--------------------------------------------------------------------------------
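A sketch of how a node carries its search space (the node name below is hypothetical; it must match a key under `Nodes` in the loaded YAML settings, and the printed values depend on those settings):

```python
from deepswarm.nodes import Node, NodeAttribute

# Every attribute option starts with the same pheromone value, taken from the
# aco -> pheromone -> start setting (0.1 is assumed here).
attribute = NodeAttribute('kernel_size', [1, 3])
print(attribute.dict)        # e.g. {1: 0.1, 3: 0.1}

# Constructing a node immediately selects random attribute values (see __init__).
node = Node('Conv2DNode')    # hypothetical name from the settings file
print(node)                  # e.g. Conv2DNode(filter_count:64, kernel_size:3, activation:ReLU)
```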
/deepswarm/storage.py:
--------------------------------------------------------------------------------
1 | # Copyright (c) 2019 Edvinas Byla
2 | # Licensed under MIT License
3 |
4 | import hashlib
5 | import pickle
6 |
7 | from datetime import datetime
8 |
9 | from . import base_path, cfg, left_cost_is_better
10 |
11 |
12 | class Storage:
13 | """Class responsible for backups and weight reuse."""
14 |
15 | DIR = {
16 | "MODEL": "models",
17 | "OBJECT": "objects",
18 | }
19 |
20 | ITEM = {"BACKUP": "backup"}
21 |
22 | def __init__(self, deepswarm):
23 | self.loaded_from_save = False
24 | self.backup = None
25 | self.path_lookup = {}
26 | self.models = {}
27 | self.deepswarm = deepswarm
28 | self.setup_path()
29 | self.setup_directories()
30 |
31 | def setup_path(self):
32 | """Loads existing backup or creates a new backup directory."""
33 |
34 | # If storage directory doesn't exist create one
35 | storage_path = base_path / 'saves'
36 | if not storage_path.exists():
37 | storage_path.mkdir()
38 |
39 | # Check if user specified save folder which should be used to load the data
40 | user_folder = cfg['save_folder']
41 | if user_folder is not None and (storage_path / user_folder).exists():
42 | self.current_path = storage_path / user_folder
43 | self.loaded_from_save = True
44 |             # Load the previously backed-up DeepSwarm object
45 | self.backup = self.load_object(Storage.ITEM["BACKUP"])
46 | self.backup.storage.loaded_from_save = True
47 | return
48 |
49 | # Otherwise create a new directory
50 | directory_path = storage_path / datetime.now().strftime('%Y-%m-%d-%H-%M-%S')
51 | if not directory_path.exists():
52 | directory_path.mkdir()
53 | self.current_path = directory_path
54 | return
55 |
56 | def setup_directories(self):
57 | """Creates all the required directories."""
58 |
59 | for directory in Storage.DIR.values():
60 | directory_path = self.current_path / directory
61 | if not directory_path.exists():
62 | directory_path.mkdir()
63 |
64 | def perform_backup(self):
65 | """Saves DeepSwarm object to the backup directory."""
66 |
67 | self.save_object(self.deepswarm, Storage.ITEM["BACKUP"])
68 |
69 | def save_model(self, backend, model, path_hashes, cost):
70 | """Saves the model and adds its information to the dictionaries.
71 |
72 | Args:
73 | backend: Backend object.
74 | model: model which represents neural network structure.
75 | path_hashes [string]: list of hashes, where each hash represents a
76 | sub-path.
77 | cost: cost associated with the model.
78 | """
79 |
80 | sub_path_associated = False
81 | # The last element describes the whole path
82 | model_hash = path_hashes[-1]
83 |
84 |         # For each sub-path find its corresponding entry in the hash table
85 | for path_hash in path_hashes:
86 | # Check if there already exists model for this sub-path
87 | existing_model_hash = self.path_lookup.get(path_hash)
88 | model_info = self.models.get(existing_model_hash)
89 |
90 | # If the old model is better then skip this sub-path
91 | if model_info is not None and left_cost_is_better(model_info[0], cost):
92 | continue
93 |
94 |             # Otherwise associate this sub-path with a new model
95 | self.path_lookup[path_hash] = model_hash
96 | sub_path_associated = True
97 |
98 | # Save model on disk only if it was associated with some sub-path
99 | if sub_path_associated:
100 | # Add an entry to models dictionary
101 | self.models[model_hash] = (cost, 0)
102 | # Save to disk
103 | self.save_specified_model(backend, model_hash, model)
104 |
105 | def load_model(self, backend, path_hashes, path):
106 | """Loads model with the best weights.
107 |
108 | Args:
109 | backend: Backend object.
110 | path_hashes [string]: list of hashes, where each hash represents a
111 | sub-path.
112 | path [Node]: a path which represents the model.
113 | Returns:
114 | if the model exists returns a tuple containing model and its hash,
115 | otherwise returns a tuple containing None values.
116 | """
117 |
118 | # Go through all hashes backwards
119 | for idx, path_hash in enumerate(path_hashes[::-1]):
120 | # See if particular hash is associated with some model
121 | model_hash = self.path_lookup.get(path_hash)
122 | model_info = self.models.get(model_hash)
123 |
124 | # Don't reuse model if it hasn't improved for longer than allowed in patience
125 | if model_hash is not None and model_info[1] < cfg['reuse_patience']:
126 | model = self.load_specified_model(backend, model_hash)
127 |                 # If the model failed to load, skip to the next hash
128 | if model is None:
129 | continue
130 |
131 |                 # If there is no difference between the models, just return the old
132 |                 # model, otherwise create a new model by reusing the old one. Even
133 |                 # though backend.reuse_model could be called to handle both cases,
134 |                 # this approach avoids some unnecessary computation
135 | new_model = model if idx == 0 else backend.reuse_model(model, path, idx)
136 |
137 |                 # We also return the hash of the base model (the model which was
138 |                 # used as a base to create the new model). This hash is used later
139 |                 # to track whether the base model keeps improving or is stuck
140 | return (new_model, model_hash)
141 | return (None, None)
142 |
143 | def load_specified_model(self, backend, model_name):
144 |         """Loads the specified model using its name.
145 |
146 | Args:
147 | backend: Backend object.
148 | model_name: name of the model.
149 | Returns:
150 |             model which represents the neural network structure.
151 | """
152 |
153 | file_path = self.current_path / Storage.DIR["MODEL"] / model_name
154 | model = backend.load_model(file_path)
155 | return model
156 |
157 | def save_specified_model(self, backend, model_name, model):
158 | """Saves specified model using its name without and adding its information
159 | to the dictionaries.
160 |
161 | Args:
162 | backend: Backend object.
163 | model_name: name of the model.
164 |             model: model which represents the neural network structure.
165 | """
166 |
167 | save_path = self.current_path / Storage.DIR["MODEL"] / model_name
168 | backend.save_model(model, save_path)
169 |
170 | def record_model_performance(self, path_hash, cost):
171 |         """Records how many times the model's cost failed to improve.
172 |
173 | Args:
174 | path_hash: hash value associated with the model.
175 | cost: cost value associated with the model.
176 | """
177 |
178 | model_hash = self.path_lookup.get(path_hash)
179 |         model_info = self.models.get(model_hash)
180 |
181 |         # If the cost hasn't changed at all, increment the no-improvement count
182 |         if model_info is not None and model_info[0] == cost:
183 |             self.models[model_hash] = (model_info[0], model_info[1] + 1)
184 |
185 | def hash_path(self, path):
186 | """Takes a path and returns a tuple containing path description and
187 | list of sub-path hashes.
188 |
189 | Args:
190 | path [Node]: path which represents the model.
191 | Returns:
192 | tuple where the first element is a string representing the path
193 | description and the second element is a list of sub-path hashes.
194 | """
195 |
196 | hashes = []
197 | path_description = str(path[0])
198 | for node in path[1:]:
199 | path_description += ' -> %s' % (node)
200 | current_hash = hashlib.sha3_256(path_description.encode('utf-8')).hexdigest()
201 | hashes.append(current_hash)
202 | return (path_description, hashes)
203 |
204 | def save_object(self, data, name):
205 | """Saves given object to the object backup directory.
206 |
207 | Args:
208 | data: object that needs to be saved.
209 | name: string value representing the name of the object.
210 | """
211 |
212 | with open(self.current_path / Storage.DIR["OBJECT"] / name, 'wb') as f:
213 | pickle.dump(data, f, pickle.HIGHEST_PROTOCOL)
214 |
215 | def load_object(self, name):
216 |         """Loads the given object from the object backup directory.
217 |
218 | Args:
219 | name: string value representing the name of the object.
220 | Returns:
221 | object which has the same name as the given argument.
222 | """
223 |
224 | with open(self.current_path / Storage.DIR["OBJECT"] / name, 'rb') as f:
225 | data = pickle.load(f)
226 | return data
227 |
--------------------------------------------------------------------------------
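
The cumulative hashing in `hash_path` above is the backbone of model reuse: every node appended to the path description produces a fresh SHA3-256 digest, so the returned list identifies each prefix (sub-path) of an architecture, and `save_model`/`load_model` use those prefix hashes to share trained weights between architectures with a common beginning. A minimal standalone sketch of the scheme (the node descriptions are made up for illustration):

```python
import hashlib

# Hypothetical node descriptions, mimicking what str(node) might produce
path = ['InputNode(shape:(28, 28, 1))', 'Conv2DNode(filter_count:32)', 'FlattenNode()']

hashes = []
description = path[0]
for node in path[1:]:
    description += ' -> %s' % node
    # Each prefix of the path gets its own digest, exactly as hash_path does
    hashes.append(hashlib.sha3_256(description.encode('utf-8')).hexdigest())

# The last hash describes the whole path; save_model maps every prefix hash
# to it via path_lookup, so load_model can later reuse the longest match
print(description)
print(hashes[-1])
```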
/examples/cifar10.py:
--------------------------------------------------------------------------------
1 | # Copyright (c) 2019 Edvinas Byla
2 | # Licensed under MIT License
3 |
4 | import context
5 | import tensorflow as tf
6 |
7 | from deepswarm.backends import Dataset, TFKerasBackend
8 | from deepswarm.deepswarm import DeepSwarm
9 |
10 | # Load CIFAR-10 dataset
11 | cifar10 = tf.keras.datasets.cifar10
12 | (x_train, y_train), (x_test, y_test) = cifar10.load_data()
13 | # Convert class vectors to binary class matrices
14 | y_train = tf.keras.utils.to_categorical(y_train, 10)
15 | y_test = tf.keras.utils.to_categorical(y_test, 10)
16 | # Create a dataset object, which manages all the data
17 | dataset = Dataset(
18 | training_examples=x_train,
19 | training_labels=y_train,
20 | testing_examples=x_test,
21 | testing_labels=y_test,
22 | validation_split=0.1,
23 | )
24 | # Create backend responsible for training & validating
25 | backend = TFKerasBackend(dataset=dataset)
26 | # Create DeepSwarm object responsible for optimization
27 | deepswarm = DeepSwarm(backend=backend)
28 | # Find the topology for a given dataset
29 | topology = deepswarm.find_topology()
30 | # Evaluate discovered topology
31 | deepswarm.evaluate_topology(topology)
32 | # Train the topology on augmented data for an additional 50 epochs
33 | trained_topology = deepswarm.train_topology(topology, 50, augment={
34 | 'rotation_range': 15,
35 | 'width_shift_range': 0.1,
36 | 'height_shift_range': 0.1,
37 | 'horizontal_flip': True,
38 | })
39 | # Evaluate the final topology
40 | deepswarm.evaluate_topology(trained_topology)
41 |
--------------------------------------------------------------------------------
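
Two details in this example are worth noting. The labels are one-hot encoded with `to_categorical`, which pairs with the `categorical_crossentropy` loss configured in settings/cifar10.yaml. And the keys of the `augment` dictionary match `tf.keras.preprocessing.image.ImageDataGenerator` keyword arguments, so the backend presumably forwards them to an `ImageDataGenerator` during the extra training epochs (an assumption based on the key names, not a documented contract). A standalone equivalent of that configuration:

```python
import tensorflow as tf

# Assumption: these keys are forwarded to ImageDataGenerator; they are
# standard ImageDataGenerator keyword arguments either way
augment = {
    'rotation_range': 15,
    'width_shift_range': 0.1,
    'height_shift_range': 0.1,
    'horizontal_flip': True,
}
datagen = tf.keras.preprocessing.image.ImageDataGenerator(**augment)
# datagen.flow(x_train, y_train, batch_size=64) then yields augmented batches
```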
/examples/context.py:
--------------------------------------------------------------------------------
1 | # Copyright (c) 2019 Edvinas Byla
2 | # Licensed under MIT License
3 |
4 | import os
5 | import sys
6 |
7 | sys.path.append(os.path.dirname(os.path.dirname(os.path.realpath(__file__))))
8 |
--------------------------------------------------------------------------------
/examples/fashion-mnist.py:
--------------------------------------------------------------------------------
1 | # Copyright (c) 2019 Edvinas Byla
2 | # Licensed under MIT License
3 |
4 | import context
5 | import tensorflow as tf
6 |
7 | from deepswarm.backends import Dataset, TFKerasBackend
8 | from deepswarm.deepswarm import DeepSwarm
9 |
10 | # Load Fashion MNIST dataset
11 | fashion_mnist = tf.keras.datasets.fashion_mnist
12 | (x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()
13 | # Normalize and reshape data
14 | x_train, x_test = x_train / 255.0, x_test / 255.0
15 | x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
16 | x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)
17 | # Create a dataset object, which manages all the data
18 | normalized_dataset = Dataset(
19 | training_examples=x_train,
20 | training_labels=y_train,
21 | testing_examples=x_test,
22 | testing_labels=y_test,
23 | validation_split=0.1,
24 | )
25 | # Create backend responsible for training & validating
26 | backend = TFKerasBackend(dataset=normalized_dataset)
27 | # Create DeepSwarm object responsible for optimization
28 | deepswarm = DeepSwarm(backend=backend)
29 | # Find the topology for a given dataset
30 | topology = deepswarm.find_topology()
31 | # Evaluate discovered topology
32 | deepswarm.evaluate_topology(topology)
33 | # Train the topology for an additional 50 epochs
34 | trained_topology = deepswarm.train_topology(topology, 50)
35 | # Evaluate the final topology
36 | deepswarm.evaluate_topology(trained_topology)
37 |
--------------------------------------------------------------------------------
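
Unlike the CIFAR-10 example, the labels here are left as integer class indices instead of being one-hot encoded, which matches the `sparse_categorical_crossentropy` loss configured in settings/fashion-mnist.yaml (and in settings/mnist.yaml below). A small sketch of the two label formats and the losses they pair with:

```python
import tensorflow as tf

# Integer labels of shape (N,) pair with sparse_categorical_crossentropy
sparse_labels = [5, 0, 4]
# One-hot labels of shape (N, 10) pair with categorical_crossentropy,
# which is why the CIFAR-10 example calls to_categorical first
one_hot_labels = tf.keras.utils.to_categorical(sparse_labels, 10)
print(one_hot_labels.shape)  # (3, 10)
```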
/examples/mnist.py:
--------------------------------------------------------------------------------
1 | # Copyright (c) 2019 Edvinas Byla
2 | # Licensed under MIT License
3 |
4 | import context
5 | import tensorflow as tf
6 |
7 | from deepswarm.backends import Dataset, TFKerasBackend
8 | from deepswarm.deepswarm import DeepSwarm
9 |
10 | # Load MNIST dataset
11 | mnist = tf.keras.datasets.mnist
12 | (x_train, y_train), (x_test, y_test) = mnist.load_data()
13 | # Normalize and reshape data
14 | x_train, x_test = x_train / 255.0, x_test / 255.0
15 | x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
16 | x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)
17 | # Create a dataset object, which manages all the data
18 | normalized_dataset = Dataset(
19 | training_examples=x_train,
20 | training_labels=y_train,
21 | testing_examples=x_test,
22 | testing_labels=y_test,
23 | validation_split=0.1,
24 | )
25 | # Create backend responsible for training & validating
26 | backend = TFKerasBackend(dataset=normalized_dataset)
27 | # Create DeepSwarm object responsible for optimization
28 | deepswarm = DeepSwarm(backend=backend)
29 | # Find the topology for a given dataset
30 | topology = deepswarm.find_topology()
31 | # Evaluate discovered topology
32 | deepswarm.evaluate_topology(topology)
33 | # Train the topology for an additional 30 epochs
34 | trained_topology = deepswarm.train_topology(topology, 30)
35 | # Evaluate the final topology
36 | deepswarm.evaluate_topology(trained_topology)
37 |
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | colorama==0.4.1
2 | pyyaml==5.1
3 | scikit-learn==0.20.3
--------------------------------------------------------------------------------
/settings/cifar10.yaml:
--------------------------------------------------------------------------------
1 | DeepSwarm:
2 | save_folder:
3 | metrics: accuracy
4 | max_depth: 20
5 | reuse_patience: 1
6 |
7 | aco:
8 | pheromone:
9 | start: 0.1
10 | decay: 0.1
11 | evaporation: 0.1
12 | verbose: False
13 | greediness: 0.5
14 | ant_count: 16
15 |
16 | backend:
17 | epochs: 20
18 | batch_size: 64
19 | patience: 5
20 | loss: categorical_crossentropy
21 | verbose: False
22 |
23 | spatial_nodes: [InputNode, Conv2DNode, DropoutSpatialNode, BatchNormalizationNode, Pool2DNode]
24 | flat_nodes: [FlattenNode, DenseNode, DropoutFlatNode, BatchNormalizationFlatNode]
25 |
26 | Nodes:
27 |
28 | InputNode:
29 | type: Input
30 | attributes:
31 | shape: [!!python/tuple [32, 32, 3]]
32 | transitions:
33 | Conv2DNode: 1.0
34 |
35 | Conv2DNode:
36 | type: Conv2D
37 | attributes:
38 | filter_count: [64, 128, 256]
39 | kernel_size: [1, 3, 5]
40 | activation: [ReLU]
41 | transitions:
42 | Conv2DNode: 0.8
43 | Pool2DNode: 1.2
44 | FlattenNode: 1.0
45 | DropoutSpatialNode: 1.1
46 | BatchNormalizationNode: 1.2
47 |
48 | DropoutSpatialNode:
49 | type: Dropout
50 | attributes:
51 | rate: [0.1, 0.3, 0.5]
52 | transitions:
53 | Conv2DNode: 1.1
54 | Pool2DNode: 1.0
55 | FlattenNode: 1.0
56 | BatchNormalizationNode: 1.1
57 |
58 | BatchNormalizationNode:
59 | type: BatchNormalization
60 | attributes: {}
61 | transitions:
62 | Conv2DNode: 1.1
63 | Pool2DNode: 1.1
64 | DropoutSpatialNode: 1.0
65 | FlattenNode: 1.0
66 |
67 | Pool2DNode:
68 | type: Pool2D
69 | attributes:
70 | pool_type: [max, average]
71 | pool_size: [2]
72 | stride: [2, 3]
73 | transitions:
74 | Conv2DNode: 1.1
75 | FlattenNode: 1.0
76 | BatchNormalizationNode: 1.1
77 |
78 | FlattenNode:
79 | type: Flatten
80 | attributes: {}
81 | transitions:
82 | DenseNode: 1.0
83 | OutputNode: 0.8
84 | BatchNormalizationFlatNode: 0.9
85 |
86 | DenseNode:
87 | type: Dense
88 | attributes:
89 | output_size: [64, 128, 256]
90 | activation: [ReLU, Sigmoid]
91 | transitions:
92 | DenseNode: 0.8
93 | DropoutFlatNode: 1.2
94 | BatchNormalizationFlatNode: 1.2
95 | OutputNode: 1.0
96 |
97 | DropoutFlatNode:
98 | type: Dropout
99 | attributes:
100 | rate: [0.3, 0.5, 0.7]
101 | transitions:
102 | DenseNode: 1.0
103 | BatchNormalizationFlatNode: 1.0
104 | OutputNode: 0.9
105 |
106 | BatchNormalizationFlatNode:
107 | type: BatchNormalization
108 | attributes: {}
109 | transitions:
110 | DenseNode: 1.1
111 | DropoutFlatNode: 1.1
112 | OutputNode: 0.9
113 |
114 | OutputNode:
115 | type: Output
116 | attributes:
117 | output_size: [10]
118 | activation: [Softmax]
119 | transitions: {}
120 |
--------------------------------------------------------------------------------
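
The `transitions` mappings above define the search graph one edge at a time, and the attached numbers appear to act as initial heuristic weights that bias the ants toward certain edges (Conv2D -> Pool2D at 1.2 versus Conv2D -> Conv2D at 0.8, for example). A toy walk over a hand-copied fragment of this table shows why those weights alone are not enough: a purely greedy ant oscillates between Conv2D and Pool2D and never reaches the output, which is presumably why the `aco` section pairs a `greediness` of 0.5 with probabilistic, pheromone-weighted selection.

```python
# Hand-copied fragment of the transition table from settings/cifar10.yaml,
# assuming the numbers act as initial edge weights (higher = more attractive)
transitions = {
    'InputNode': {'Conv2DNode': 1.0},
    'Conv2DNode': {'Conv2DNode': 0.8, 'Pool2DNode': 1.2, 'FlattenNode': 1.0},
    'Pool2DNode': {'Conv2DNode': 1.1, 'FlattenNode': 1.0},
    'FlattenNode': {'DenseNode': 1.0, 'OutputNode': 0.8},
    'DenseNode': {'DenseNode': 0.8, 'OutputNode': 1.0},
    'OutputNode': {},
}

node, path = 'InputNode', ['InputNode']
while transitions[node] and len(path) < 8:
    # Always follow the highest-weighted edge
    node = max(transitions[node], key=transitions[node].get)
    path.append(node)

# Prints Input -> Conv2D -> Pool2D -> Conv2D -> Pool2D -> ...: the greedy
# walk cycles and never flattens, so exploration is essential
print(' -> '.join(path))
```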
/settings/default.yaml:
--------------------------------------------------------------------------------
1 | DeepSwarm:
2 | save_folder:
3 | metrics: accuracy
4 | max_depth: 15
5 | reuse_patience: 1
6 |
7 | aco:
8 | pheromone:
9 | start: 0.1
10 | decay: 0.1
11 | evaporation: 0.1
12 | verbose: False
13 | greediness: 0.5
14 | ant_count: 16
15 |
16 | backend:
17 | epochs: 15
18 | batch_size: 64
19 | patience: 5
20 | loss: sparse_categorical_crossentropy
21 | verbose: False
22 |
23 | spatial_nodes: [InputNode, Conv2DNode, DropoutSpatialNode, BatchNormalizationNode, Pool2DNode]
24 | flat_nodes: [FlattenNode, DenseNode, DropoutFlatNode, BatchNormalizationFlatNode]
25 |
26 | Nodes:
27 |
28 | InputNode:
29 | type: Input
30 | attributes:
31 | shape: [!!python/tuple [28, 28, 1]]
32 | transitions:
33 | Conv2DNode: 1.0
34 |
35 | Conv2DNode:
36 | type: Conv2D
37 | attributes:
38 | filter_count: [32, 64, 128]
39 | kernel_size: [1, 3, 5]
40 | activation: [ReLU]
41 | transitions:
42 | Conv2DNode: 0.8
43 | Pool2DNode: 1.2
44 | FlattenNode: 1.0
45 | DropoutSpatialNode: 1.1
46 | BatchNormalizationNode: 1.2
47 |
48 | DropoutSpatialNode:
49 | type: Dropout
50 | attributes:
51 | rate: [0.1, 0.3]
52 | transitions:
53 | Conv2DNode: 1.1
54 | Pool2DNode: 1.0
55 | FlattenNode: 1.0
56 | BatchNormalizationNode: 1.1
57 |
58 | BatchNormalizationNode:
59 | type: BatchNormalization
60 | attributes: {}
61 | transitions:
62 | Conv2DNode: 1.1
63 | Pool2DNode: 1.1
64 | DropoutSpatialNode: 1.0
65 | FlattenNode: 1.0
66 |
67 | Pool2DNode:
68 | type: Pool2D
69 | attributes:
70 | pool_type: [max, average]
71 | pool_size: [2]
72 | stride: [2, 3]
73 | transitions:
74 | Conv2DNode: 1.1
75 | FlattenNode: 1.0
76 | BatchNormalizationNode: 1.1
77 |
78 | FlattenNode:
79 | type: Flatten
80 | attributes: {}
81 | transitions:
82 | DenseNode: 1.0
83 | OutputNode: 0.8
84 | BatchNormalizationFlatNode: 0.9
85 |
86 | DenseNode:
87 | type: Dense
88 | attributes:
89 | output_size: [64, 128]
90 | activation: [ReLU, Sigmoid]
91 | transitions:
92 | DenseNode: 0.8
93 | DropoutFlatNode: 1.2
94 | BatchNormalizationFlatNode: 1.2
95 | OutputNode: 1.0
96 |
97 | DropoutFlatNode:
98 | type: Dropout
99 | attributes:
100 | rate: [0.1, 0.3]
101 | transitions:
102 | DenseNode: 1.0
103 | BatchNormalizationFlatNode: 1.0
104 | OutputNode: 0.9
105 |
106 | BatchNormalizationFlatNode:
107 | type: BatchNormalization
108 | attributes: {}
109 | transitions:
110 | DenseNode: 1.1
111 | DropoutFlatNode: 1.1
112 | OutputNode: 0.9
113 |
114 | OutputNode:
115 | type: Output
116 | attributes:
117 | output_size: [10]
118 | activation: [Softmax]
119 | transitions: {}
120 |
--------------------------------------------------------------------------------
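
One practical note on these files: the `!!python/tuple` tag used for the input shape is a Python-specific YAML extension, so `yaml.safe_load` rejects it, and with pyyaml 5.1 (the version pinned in requirements.txt) a more permissive loader is needed to round-trip the settings. A minimal sketch:

```python
import yaml

# The !!python/tuple tag is not safe-loadable; UnsafeLoader (available since
# pyyaml 5.1) constructs it as a real Python tuple
snippet = 'shape: [!!python/tuple [28, 28, 1]]'
data = yaml.load(snippet, Loader=yaml.UnsafeLoader)
print(data['shape'][0])  # (28, 28, 1)
```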
/settings/fashion-mnist.yaml:
--------------------------------------------------------------------------------
1 | DeepSwarm:
2 | save_folder:
3 | metrics: accuracy
4 | max_depth: 15
5 | reuse_patience: 1
6 |
7 | aco:
8 | pheromone:
9 | start: 0.1
10 | decay: 0.1
11 | evaporation: 0.1
12 | verbose: False
13 | greediness: 0.5
14 | ant_count: 16
15 |
16 | backend:
17 | epochs: 20
18 | batch_size: 64
19 | patience: 5
20 | loss: sparse_categorical_crossentropy
21 | verbose: False
22 |
23 | spatial_nodes: [InputNode, Conv2DNode, DropoutSpatialNode, BatchNormalizationNode, Pool2DNode]
24 | flat_nodes: [FlattenNode, DenseNode, DropoutFlatNode, BatchNormalizationFlatNode]
25 |
26 | Nodes:
27 |
28 | InputNode:
29 | type: Input
30 | attributes:
31 | shape: [!!python/tuple [28, 28, 1]]
32 | transitions:
33 | Conv2DNode: 1.0
34 |
35 | Conv2DNode:
36 | type: Conv2D
37 | attributes:
38 | filter_count: [64, 128, 256]
39 | kernel_size: [1, 3, 5]
40 | activation: [ReLU]
41 | transitions:
42 | Conv2DNode: 0.8
43 | Pool2DNode: 1.2
44 | FlattenNode: 1.0
45 | DropoutSpatialNode: 1.1
46 | BatchNormalizationNode: 1.2
47 |
48 | DropoutSpatialNode:
49 | type: Dropout
50 | attributes:
51 | rate: [0.1, 0.3]
52 | transitions:
53 | Conv2DNode: 1.1
54 | Pool2DNode: 1.0
55 | FlattenNode: 1.0
56 | BatchNormalizationNode: 1.1
57 |
58 | BatchNormalizationNode:
59 | type: BatchNormalization
60 | attributes: {}
61 | transitions:
62 | Conv2DNode: 1.1
63 | Pool2DNode: 1.1
64 | DropoutSpatialNode: 1.0
65 | FlattenNode: 1.0
66 |
67 | Pool2DNode:
68 | type: Pool2D
69 | attributes:
70 | pool_type: [max, average]
71 | pool_size: [2]
72 | stride: [2, 3]
73 | transitions:
74 | Conv2DNode: 1.1
75 | FlattenNode: 1.0
76 | BatchNormalizationNode: 1.1
77 |
78 | FlattenNode:
79 | type: Flatten
80 | attributes: {}
81 | transitions:
82 | DenseNode: 1.0
83 | OutputNode: 0.8
84 | BatchNormalizationFlatNode: 0.9
85 |
86 | DenseNode:
87 | type: Dense
88 | attributes:
89 | output_size: [64, 128]
90 | activation: [ReLU, Sigmoid]
91 | transitions:
92 | DenseNode: 0.8
93 | DropoutFlatNode: 1.2
94 | BatchNormalizationFlatNode: 1.2
95 | OutputNode: 1.0
96 |
97 | DropoutFlatNode:
98 | type: Dropout
99 | attributes:
100 | rate: [0.1, 0.3]
101 | transitions:
102 | DenseNode: 1.0
103 | BatchNormalizationFlatNode: 1.0
104 | OutputNode: 0.9
105 |
106 | BatchNormalizationFlatNode:
107 | type: BatchNormalization
108 | attributes: {}
109 | transitions:
110 | DenseNode: 1.1
111 | DropoutFlatNode: 1.1
112 | OutputNode: 0.9
113 |
114 | OutputNode:
115 | type: Output
116 | attributes:
117 | output_size: [10]
118 | activation: [Softmax]
119 | transitions: {}
120 |
--------------------------------------------------------------------------------
/settings/mnist.yaml:
--------------------------------------------------------------------------------
1 | DeepSwarm:
2 | save_folder:
3 | metrics: accuracy
4 | max_depth: 15
5 | reuse_patience: 1
6 |
7 | aco:
8 | pheromone:
9 | start: 0.1
10 | decay: 0.1
11 | evaporation: 0.1
12 | verbose: False
13 | greediness: 0.5
14 | ant_count: 16
15 |
16 | backend:
17 | epochs: 15
18 | batch_size: 64
19 | patience: 5
20 | loss: sparse_categorical_crossentropy
21 | verbose: False
22 |
23 | spatial_nodes: [InputNode, Conv2DNode, DropoutSpatialNode, BatchNormalizationNode, Pool2DNode]
24 | flat_nodes: [FlattenNode, DenseNode, DropoutFlatNode, BatchNormalizationFlatNode]
25 |
26 | Nodes:
27 |
28 | InputNode:
29 | type: Input
30 | attributes:
31 | shape: [!!python/tuple [28, 28, 1]]
32 | transitions:
33 | Conv2DNode: 1.0
34 |
35 | Conv2DNode:
36 | type: Conv2D
37 | attributes:
38 | filter_count: [32, 64, 128]
39 | kernel_size: [1, 3, 5]
40 | activation: [ReLU]
41 | transitions:
42 | Conv2DNode: 0.8
43 | Pool2DNode: 1.2
44 | FlattenNode: 1.0
45 | DropoutSpatialNode: 1.1
46 | BatchNormalizationNode: 1.2
47 |
48 | DropoutSpatialNode:
49 | type: Dropout
50 | attributes:
51 | rate: [0.1, 0.3]
52 | transitions:
53 | Conv2DNode: 1.1
54 | Pool2DNode: 1.0
55 | FlattenNode: 1.0
56 | BatchNormalizationNode: 1.1
57 |
58 | BatchNormalizationNode:
59 | type: BatchNormalization
60 | attributes: {}
61 | transitions:
62 | Conv2DNode: 1.1
63 | Pool2DNode: 1.1
64 | DropoutSpatialNode: 1.0
65 | FlattenNode: 1.0
66 |
67 | Pool2DNode:
68 | type: Pool2D
69 | attributes:
70 | pool_type: [max, average]
71 | pool_size: [2]
72 | stride: [2, 3]
73 | transitions:
74 | Conv2DNode: 1.1
75 | FlattenNode: 1.0
76 | BatchNormalizationNode: 1.1
77 |
78 | FlattenNode:
79 | type: Flatten
80 | attributes: {}
81 | transitions:
82 | DenseNode: 1.0
83 | OutputNode: 0.8
84 | BatchNormalizationFlatNode: 0.9
85 |
86 | DenseNode:
87 | type: Dense
88 | attributes:
89 | output_size: [64, 128]
90 | activation: [ReLU, Sigmoid]
91 | transitions:
92 | DenseNode: 0.8
93 | DropoutFlatNode: 1.2
94 | BatchNormalizationFlatNode: 1.2
95 | OutputNode: 1.0
96 |
97 | DropoutFlatNode:
98 | type: Dropout
99 | attributes:
100 | rate: [0.1, 0.3]
101 | transitions:
102 | DenseNode: 1.0
103 | BatchNormalizationFlatNode: 1.0
104 | OutputNode: 0.9
105 |
106 | BatchNormalizationFlatNode:
107 | type: BatchNormalization
108 | attributes: {}
109 | transitions:
110 | DenseNode: 1.1
111 | DropoutFlatNode: 1.1
112 | OutputNode: 0.9
113 |
114 | OutputNode:
115 | type: Output
116 | attributes:
117 | output_size: [10]
118 | activation: [Softmax]
119 | transitions: {}
120 |
--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
1 | import setuptools
2 |
3 | with open("README.md", "r") as fh:
4 | long_description = fh.read()
5 |
6 | setuptools.setup(
7 | name="deepswarm",
8 | version="0.0.10",
9 | author="Edvinas Byla",
10 | author_email="edvinasbyla@gmail.com",
11 | description="Neural Architecture Search Powered by Swarm Intelligence",
12 | long_description=long_description,
13 | long_description_content_type="text/markdown",
14 | url="https://github.com/Pattio/DeepSwarm",
15 | packages=setuptools.find_packages(),
16 | package_data={'deepswarm': ['../settings/default.yaml']},
17 | install_requires=[
18 | 'colorama==0.4.1',
19 | 'pyyaml==5.1',
20 | 'scikit-learn==0.20.3',
21 | ],
22 | classifiers=[
23 | "Programming Language :: Python :: 3.6",
24 | "License :: OSI Approved :: MIT License",
25 | "Operating System :: OS Independent",
26 | ],
27 | )
28 |
--------------------------------------------------------------------------------
/tests/test_aco.py:
--------------------------------------------------------------------------------
1 | import math
2 | import unittest
3 |
4 | from deepswarm import cfg
5 | from deepswarm.aco import ACO, Ant
6 |
7 |
8 | class TestACO(unittest.TestCase):
9 |
10 | def setUp(self):
11 | self.aco = ACO(None, None)
12 |
13 | def test_ant_init(self):
14 | # Test if the ant is initialized properly
15 | ant = Ant()
16 | self.assertEqual(ant.loss, math.inf)
17 | self.assertEqual(ant.accuracy, 0.0)
18 | self.assertEqual(ant.path, [])
19 | if cfg['metrics'] == 'loss':
20 | self.assertEqual(ant.cost, ant.loss)
21 | else:
22 | self.assertEqual(ant.cost, ant.accuracy)
23 |
24 | def test_ant_init_with_path(self):
25 | # Test if the ant is initialized properly when a path is given
26 | self.aco.graph.increase_depth()
27 | path = self.aco.graph.generate_path(self.aco.aco_select)
28 | ant = Ant(path)
29 | self.assertEqual(ant.loss, math.inf)
30 | self.assertEqual(ant.accuracy, 0.0)
31 | self.assertEqual(ant.path, path)
32 |
33 | def test_ant_comparison(self):
34 | # Test if ants are compared properly
35 | ant_1 = Ant()
36 | ant_1.accuracy = 0.8
37 | ant_2 = Ant()
38 | ant_2.loss = 0.8
39 | self.assertTrue(ant_2 < ant_1)
40 |
41 | def test_local_update(self):
42 | # Test if local update rule works properly
43 | new_value = self.aco.local_update(11.23, None)
44 | self.assertEqual(new_value, 10.117)
45 |
46 | def test_global_update(self):
47 | # Test if global update rule works properly
48 | new_value = self.aco.global_update(11.23, 13.79)
49 | self.assertEqual(new_value, 11.486)
50 |
51 | def test_pheromone_update(self):
52 | # Test pheromone update
53 | self.aco.graph.increase_depth()
54 | self.aco.graph.increase_depth()
55 | path = self.aco.graph.generate_path(self.aco.aco_select)
56 | ant = Ant(path)
57 | self.aco.update_pheromone(ant, self.aco.local_update)
58 | self.aco.update_pheromone(ant, self.aco.global_update)
59 |
--------------------------------------------------------------------------------
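
The expected values in `test_local_update` and `test_global_update` above are consistent with ACS-style pheromone rules under the defaults in settings/default.yaml (`start: 0.1`, `decay: 0.1`, `evaporation: 0.1`): the local rule blends the pheromone back toward its initial value, while the global rule blends it toward the best solution's value. A quick check of the arithmetic (the exact formulas are an inference from these numbers, not taken from aco.py):

```python
start, decay, evaporation = 0.1, 0.1, 0.1

# Local rule: pull the pheromone back toward its starting value
local = (1 - decay) * 11.23 + decay * start
print(round(local, 3))   # 10.117, matching test_local_update

# Global rule: pull the pheromone toward the best ant's value
best = (1 - evaporation) * 11.23 + evaporation * 13.79
print(round(best, 3))    # 11.486, matching test_global_update
```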
/tests/test_graph.py:
--------------------------------------------------------------------------------
1 | import unittest
2 |
3 | from deepswarm.aco import Graph
4 | from deepswarm.nodes import Node
5 |
6 |
7 | class TestGraph(unittest.TestCase):
8 |
9 | def setUp(self):
10 | self.graph = Graph()
11 |
12 | def test_graph_init(self):
13 | # Test if the newly created graph contains the input node
14 | self.assertEqual(len(self.graph.topology), 1)
15 | self.assertEqual(self.graph.current_depth, 1)
16 | input_node = self.graph.input_node
17 | self.assertIs(self.graph.topology[0][input_node.name], input_node)
18 |
19 | def test_depth_increase(self):
20 | # Test if the depth is increased correctly
21 | self.assertEqual(self.graph.current_depth, 1)
22 | self.graph.increase_depth()
23 | self.assertEqual(self.graph.current_depth, 2)
24 |
25 | def test_path_generation(self):
26 | # Create a rule which selects first available node
27 | def select_rule(neighbours):
28 | return neighbours[0].node
29 |
30 | # Generate the path
31 | path = self.graph.generate_path(select_rule)
32 | # Test if the path is not empty
33 | self.assertNotEqual(path, [])
34 | # Test if the path starts with an input node
35 | self.assertEqual(path[0].type, 'Input')
36 | # Test if path ends with output node
37 | self.assertEqual(path[-1].type, 'Output')
38 |
39 | def test_path_completion(self):
40 | # Create a path containing only the input node
41 | old_path = [self.graph.input_node]
42 | # Complete that path
43 | new_path = self.graph.complete_path(old_path)
44 | # Test if path starts with an input node
45 | self.assertEqual(new_path[0].type, 'Input')
46 | # Test if path ends with output node
47 | self.assertEqual(new_path[-1].type, 'Output')
48 |
49 | def test_node_retrieval(self):
50 | # Test if the newly created graph contains the input node
51 | self.assertEqual(len(self.graph.topology), 1)
52 | # Retrieve first available transition from the input node
53 | available_transition = self.graph.input_node.available_transitions[0]
54 | # Use its name to initialize Node object
55 | available_transition_name = available_transition[0]
56 | available_transition_node = Node(available_transition_name)
57 | self.graph.get_node(available_transition_node, 1)
58 | # Test if graph's depth increased after adding a new node
59 | self.assertEqual(len(self.graph.topology), 2)
60 | # Test if the node was added correctly
61 | self.assertIs(self.graph.topology[1][available_transition_name], available_transition_node)
62 |
63 | def test_node_expansion(self):
64 | # Test if the input node was not expanded yet
65 | input_node = self.graph.input_node
66 | self.assertFalse(input_node.is_expanded)
67 | self.assertEqual(input_node.neighbours, [])
68 | # Try to expand it
69 | has_neighbours = self.graph.has_neighbours(input_node, 0)
70 | # Test if the input node was expanded successfully
71 | self.assertTrue(input_node.is_expanded)
72 | # Test if the input node has neighbours
73 | self.assertTrue(has_neighbours)
74 | self.assertNotEqual(input_node.neighbours, [])
75 | # Test if neighbour node was added to the topology
76 | neighbour_node = input_node.neighbours[0].node
77 | self.assertIs(self.graph.topology[1][neighbour_node.name], neighbour_node)
78 |
--------------------------------------------------------------------------------
/tests/test_nodes.py:
--------------------------------------------------------------------------------
1 | import unittest
2 |
3 | from deepswarm.nodes import Node, NodeAttribute, NeighbourNode
4 |
5 |
6 | class TestNodes(unittest.TestCase):
7 |
8 | def setUp(self):
9 | self.input_node = Node.create_using_type('Input')
10 |
11 | def test_create_using_type(self):
12 | # Test default values
13 | self.assertEqual(self.input_node.neighbours, [])
14 | self.assertEqual(self.input_node.type, 'Input')
15 | self.assertFalse(self.input_node.is_expanded)
16 | self.assertNotEqual(self.input_node.attributes, [])
17 | self.assertNotEqual(self.input_node.available_transitions, [])
18 | # Test if generated description is correct
19 | description = self.input_node.name + '(' + 'shape:' + str(self.input_node.shape) + ')'
20 | self.assertEqual(description, str(self.input_node))
21 |
22 | def test_init(self):
23 | # Test if you can create node just by using its name
24 | input_node_new = Node(self.input_node.name)
25 | self.assertEqual(input_node_new.type, self.input_node.type)
26 |
27 | def test_deepcopy(self):
28 | # Test if the copied object is an instance of Node
29 | input_node_copy = self.input_node.create_deepcopy()
30 | self.assertIsInstance(input_node_copy, Node)
31 | # Test if unnecessary attributes were removed
32 | self.assertNotEqual(input_node_copy.available_transitions, self.input_node.attributes)
33 | # Test if unnecessary attributes are empty arrays
34 | self.assertEqual(input_node_copy.neighbours, [])
35 | self.assertEqual(input_node_copy.available_transitions, [])
36 |
37 | def test_available_transition(self):
38 | # Retrieve first available transition
39 | available_transition = self.input_node.available_transitions[0]
40 | # Use its name to initialize Node object
41 | available_transition_name = available_transition[0]
42 | available_transition_node = Node(available_transition_name)
43 | self.assertIsInstance(available_transition_node, Node)
44 | # Check if the node was properly initialized
45 | self.assertNotEqual(available_transition_node.attributes, [])
46 | self.assertNotEqual(available_transition_node.available_transitions, [])
47 | # Check if available transition contains a heuristic value
48 | self.assertIsInstance(available_transition[1], float)
49 |
50 | def test_custom_attribute_selection(self):
51 | # Initialize node which connects to the input node
52 | node = Node(self.input_node.available_transitions[0][0])
53 | # For each attribute select first available value
54 | node.select_custom_attributes(lambda values: values[0][0])
55 | # Collect selected values
56 | old_attribute_values = [getattr(node, attribute.name) for attribute in node.attributes]
57 | # For each attribute if available select second value
58 | node.select_custom_attributes(lambda values: values[1][0] if len(values) > 1 else values[0][0])
59 | # Collect newly selected values
60 | new_attribute_values = [getattr(node, attribute.name) for attribute in node.attributes]
61 | # Newly selected values should be different from old values
62 | self.assertNotEqual(old_attribute_values, new_attribute_values)
63 |
64 | def test_adding_neighbour_node(self):
65 | # Find first available transition
66 | transition_name, transition_pheromone = self.input_node.available_transitions[0]
67 | # Initialize node object using transition's name
68 | node = Node(transition_name)
69 | # Create NeighbourNode object
70 | neighbour_node = NeighbourNode(node, transition_pheromone)
71 | # Check if NeighbourNode object was created properly
72 | self.assertIsInstance(neighbour_node.node, Node)
73 | self.assertIsInstance(neighbour_node.heuristic, float)
74 | self.assertIsInstance(neighbour_node.pheromone, float)
75 | # Add NeighbourNode object to neighbours list
76 | self.input_node.neighbours.append(neighbour_node)
77 |
78 | def test_node_attributes_init(self):
79 | # Create test attribute
80 | attribute_name = 'filter_count'
81 | attribute_values = [16, 32, 64]
82 | attribute = NodeAttribute(attribute_name, attribute_values)
83 | # Check if attribute name was set correctly
84 | self.assertEqual(attribute.name, attribute_name)
85 | # Check if each attribute value was added to the dictionary
86 | for attribute_value in attribute_values:
87 | self.assertIn(attribute_value, attribute.dict)
88 | # Gather all unique pheromone values
89 | pheromone_values = list(set(attribute.dict.values()))
90 | # Because NodeAttribute object was just initialized and no changes to
91 | # pheromone values were performed, all pheromone values must be the same
92 | # meaning that pheromone_values must contain only 1 element
93 | self.assertEqual(len(pheromone_values), 1)
94 |
--------------------------------------------------------------------------------