├── Augmentation ├── AutoAugment │ ├── LICENSE │ ├── README.md │ └── autoaugment.py ├── README.md ├── architect.py ├── augment.py ├── config.py ├── genotypes.py ├── models │ ├── augment_cells.py │ ├── augment_cnn.py │ ├── ops.py │ ├── search_cells.py │ └── search_cnn.py ├── preproc.py ├── utils.py └── visualize.py ├── CNAS ├── README.md ├── darts │ ├── architecture.py │ ├── geno_types.py │ ├── model.py │ ├── model_search.py │ ├── operations.py │ ├── utils.py │ └── visualize.py ├── experiments │ ├── cell_visualize.py │ ├── distributed-pytorch.py │ ├── food101_val.py │ ├── heat_map.py │ ├── images_view.py │ ├── performance_view.py │ └── train_tiny_imagenet_resnet50.py ├── run │ ├── train_cnn.py │ └── train_search.py └── tests │ ├── test_cell_visualize.py │ ├── test_model_search.py │ ├── test_operation.py │ └── test_utils.py ├── DARTS ├── README.md ├── architect.py ├── augment.py ├── config.py ├── genotypes.py ├── models │ ├── augment_cells.py │ ├── augment_cnn.py │ ├── ops.py │ ├── search_cells.py │ └── search_cnn.py ├── preproc.py ├── search.py ├── utils.py └── visualize.py ├── ENASPytorch ├── README.md ├── data │ └── data.py ├── micro_child.py ├── micro_controller.py ├── train_search.py └── utils.py ├── ENASTF ├── README.md └── src │ ├── cifar10 │ ├── controller.py │ ├── data_utils.py │ ├── general_child.py │ ├── general_controller.py │ ├── image_ops.py │ ├── main.py │ ├── micro_child.py │ ├── micro_controller.py │ └── models.py │ ├── common_ops.py │ ├── controller.py │ └── utils.py ├── LICENSE ├── NAO ├── README.md ├── controller.py ├── decoder.py ├── encoder.py ├── model.py ├── model_search.py ├── operations.py ├── train_cifar.py ├── train_imagenet.py ├── train_search.py └── utils.py ├── NSGANET ├── README.md ├── misc │ ├── flops_counter.py │ └── utils.py ├── models │ ├── macro_decoder.py │ ├── macro_genotypes.py │ ├── macro_models.py │ ├── micro_genotypes.py │ ├── micro_models.py │ └── micro_operations.py ├── search │ ├── cifar10_search.py │ ├── 
evolution_search.py │ ├── macro_encoding.py │ ├── micro_encoding.py │ ├── nsganet.py │ └── train_search.py ├── validation │ ├── test.py │ └── train.py └── visualization │ ├── macro_visualize.py │ └── micro_visualize.py ├── PC-DARTS ├── README.md ├── architect.py ├── genotypes.py ├── model.py ├── model_search.py ├── model_search_random.py ├── operations.py ├── test.py ├── train.py ├── train_search.py ├── utils.py └── visualize.py ├── PDARTS ├── README.md ├── genotypes.py ├── model.py ├── model_search.py ├── operations.py ├── test.py ├── train_cifar.py ├── train_search.py ├── utils.py └── visualize.py ├── PRDARTS ├── README.md ├── genotypes.py ├── gpu_thread.py ├── model.py ├── model_search.py ├── operations.py ├── test.py ├── test_imagenet.py ├── testgenotype.json ├── train_cifar.py ├── train_imagenet.py ├── train_search.py ├── utils.py └── visualize.py ├── Plots ├── PLOTS_Benchmark_SUPPL.ipynb └── data │ ├── cell_sensitivity │ ├── cell_n_12.npy │ ├── cell_n_16.npy │ ├── cell_n_20.npy │ ├── cell_n_24.npy │ ├── cell_n_4.npy │ ├── cell_n_6.npy │ └── cell_n_8.npy │ ├── correlation_cells │ └── 8_to_20_cells.npy │ ├── correlation_seeds │ └── different_seed_24.npy │ ├── modified_search_space │ ├── 214_random_architectures.npy │ ├── 214genotypes.pkl │ └── 56_random_mod_architectures.npy │ └── performance │ ├── Augmentation.csv │ ├── CIFAR10.csv │ ├── CIFAR100.csv │ ├── Flowers102.csv │ ├── MIT67.csv │ └── Sport8.csv ├── README.md └── data ├── MIT67_test.csv ├── MIT67_train1.csv ├── MIT67_train2.csv ├── MIT67_train3.csv ├── MIT67_train4.csv ├── Sport8_test.csv ├── Sport8_train.csv ├── flowers102_test.csv ├── flowers102_train1.csv └── flowers102_train2.csv /Augmentation/AutoAugment/LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2018 Philip Popien 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the 
"Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /Augmentation/AutoAugment/README.md: -------------------------------------------------------------------------------- 1 | # AutoAugment - Learning Augmentation Policies from Data 2 | Unofficial implementation of the ImageNet, CIFAR10 and SVHN Augmentation Policies learned by [AutoAugment](https://arxiv.org/abs/1805.09501v1), described in this [Google AI Blogpost](https://ai.googleblog.com/2018/06/improving-deep-learning-performance.html). 3 | 4 | __Update July 13th, 2018:__ Wrote a [Blogpost](https://towardsdatascience.com/how-to-improve-your-image-classifier-with-googles-autoaugment-77643f0be0c9) about AutoAugment and Double Transfer Learning. Code updates: The fill color after applying translations, rotations and shearing can now be specified with e.g. "policy = ImageNetPolicy(fillcolor=(0, 0, 0))". Current functionality seems to work well. Will update as soon I know more details from the authors. 
5 | 6 | __Update June 18th, 2018:__ Changed order and functionality of many magnitudes. Higher magnitudes now always apply the operation with higher intensity, and the sign is randomly sampled (e.g. rotating by 20 degrees to the left or right). This seems to be more in line with how it was done in the paper (judging from the figures). Have asked the authors for more details and will update as soon as I know more. 7 | 8 | ##### Tested with Python 3.6. Needs pillow>=5.0.0 9 | 10 | ![Examples of the best ImageNet Policy](figures/Figure2_Paper.png) 11 | 12 | 13 | ------------------ 14 | 15 | 16 | ## Example 17 | 18 | ```python 19 | import PIL.Image 20 | from autoaugment import ImageNetPolicy 21 | image = PIL.Image.open(path) 22 | policy = ImageNetPolicy() 23 | transformed = policy(image) 24 | ``` 25 | 26 | To see examples of all operations and magnitudes applied to images, take a look at [AutoAugment_Exploration.ipynb](AutoAugment_Exploration.ipynb). 27 | 28 | ## Example as a PyTorch Transform - ImageNet 29 | 30 | ```python 31 | from autoaugment import ImageNetPolicy 32 | data = ImageFolder(rootdir, transform=transforms.Compose( 33 | [transforms.RandomResizedCrop(224), 34 | transforms.RandomHorizontalFlip(), ImageNetPolicy(), 35 | transforms.ToTensor(), transforms.Normalize(...)])) 36 | loader = DataLoader(data, ...) 37 | ``` 38 | 39 | ## Example as a PyTorch Transform - CIFAR10 40 | 41 | ```python 42 | from autoaugment import CIFAR10Policy 43 | data = ImageFolder(rootdir, transform=transforms.Compose( 44 | [transforms.RandomCrop(32, padding=4, fill=128), # fill parameter needs torchvision installed from source 45 | transforms.RandomHorizontalFlip(), CIFAR10Policy(), 46 | transforms.ToTensor(), 47 | Cutout(n_holes=1, length=16), # (https://github.com/uoguelph-mlrg/Cutout/blob/master/util/cutout.py) 48 | transforms.Normalize(...)])) 49 | loader = DataLoader(data, ...) 
50 | ``` 51 | 52 | ## Example as a PyTorch Transform - SVHN 53 | 54 | ```python 55 | from autoaugment import SVHNPolicy 56 | data = ImageFolder(rootdir, transform=transforms.Compose( 57 | [SVHNPolicy(), 58 | transforms.ToTensor(), 59 | Cutout(n_holes=1, length=20), # (https://github.com/uoguelph-mlrg/Cutout/blob/master/util/cutout.py) 60 | transforms.Normalize(...)])) 61 | loader = DataLoader(data, ...) 62 | ``` 63 | 64 | ------------------ 65 | 66 | 67 | ## Results with AutoAugment 68 | 69 | ### Generalizable Data Augmentations 70 | 71 | > Finally, we show that policies found on one task can generalize well across different models and datasets. 72 | > For example, the policy found on ImageNet leads to significant improvements on a variety of FGVC datasets. Even on datasets for 73 | > which fine-tuning weights pre-trained on ImageNet does not help significantly [26], e.g. Stanford 74 | > Cars [27] and FGVC Aircraft [28], training with the ImageNet policy reduces test set error by 1.16% 75 | > and 1.76%, respectively. __This result suggests that transferring data augmentation policies offers an 76 | > alternative method for transfer learning__. 
77 | 78 | ### CIFAR 10 79 | 80 | ![CIFAR10 Results](figures/CIFAR10_results.png) 81 | 82 | ### CIFAR 100 83 | 84 | ![CIFAR100 Results](figures/CIFAR100_results.png) 85 | 86 | ### ImageNet 87 | 88 | ![ImageNet Results](figures/ImageNet_results.png) 89 | 90 | ### SVHN 91 | 92 | ![SVHN Results](figures/SVHN_results.png) 93 | 94 | ### Fine Grained Visual Classification Datasets 95 | 96 | ![FGVC Results](figures/FGVC_results.png) 97 | -------------------------------------------------------------------------------- /Augmentation/README.md: -------------------------------------------------------------------------------- 1 | # Comparing different augmentation protocols 2 | 3 | ## Generate a Random Architecture 4 | 5 | ``` 6 | from models.search_cnn import SearchCNNController 7 | model = SearchCNNController(3, 16, 10, 20, None, n_nodes=4) 8 | genotype = model.genotype() 9 | ``` 10 | 11 | ## Augmentation 12 | 13 | ``` 14 | python augment.py 15 | --name test 16 | --dataset CIFAR10 17 | --data_path /data # path to data 18 | --epochs 600 # or 1500 depending on experiments 19 | --init_channels 36 # or 50 depending on experiments 20 | --genotype genotype 21 | --cutout_length 16 # or 0 depending on experiments 22 | --drop_path_prob 0.2 # or 0 depending on experiments 23 | --aux_weight 0.4 # or 0 depending on experiments 24 | --seed 2 # or another one depending on experiments 25 | --autoaugment # omit this flag to disable AutoAugment 26 | ``` 27 | -------------------------------------------------------------------------------- /Augmentation/architect.py: -------------------------------------------------------------------------------- 1 | """ Architect controls architecture of cell by computing gradients of alphas """ 2 | import copy 3 | import torch 4 | 5 | 6 | class Architect(): 7 | """ Compute gradients of alphas """ 8 | def __init__(self, net, w_momentum, w_weight_decay): 9 | """ 10 | Args: 11 | net 12 | w_momentum: weights momentum 13 | """ 14 | self.net = net 15 | self.v_net = 
copy.deepcopy(net) 16 | self.w_momentum = w_momentum 17 | self.w_weight_decay = w_weight_decay 18 | 19 | def virtual_step(self, trn_X, trn_y, xi, w_optim): 20 | """ 21 | Compute unrolled weight w' (virtual step) 22 | 23 | Step process: 24 | 1) forward 25 | 2) calc loss 26 | 3) compute gradient (by backprop) 27 | 4) update gradient 28 | 29 | Args: 30 | xi: learning rate for virtual gradient step (same as weights lr) 31 | w_optim: weights optimizer 32 | """ 33 | # forward & calc loss 34 | loss = self.net.loss(trn_X, trn_y) # L_trn(w) 35 | 36 | # compute gradient 37 | gradients = torch.autograd.grad(loss, self.net.weights()) 38 | 39 | # do virtual step (update gradient) 40 | # below operations do not need gradient tracking 41 | with torch.no_grad(): 42 | # dict key is not the value, but the pointer. So original network weight have to 43 | # be iterated also. 44 | for w, vw, g in zip(self.net.weights(), self.v_net.weights(), gradients): 45 | m = w_optim.state[w].get('momentum_buffer', 0.) * self.w_momentum 46 | vw.copy_(w - xi * (m + g + self.w_weight_decay*w)) 47 | 48 | # synchronize alphas 49 | for a, va in zip(self.net.alphas(), self.v_net.alphas()): 50 | va.copy_(a) 51 | 52 | def unrolled_backward(self, trn_X, trn_y, val_X, val_y, xi, w_optim): 53 | """ Compute unrolled loss and backward its gradients 54 | Args: 55 | xi: learning rate for virtual gradient step (same as net lr) 56 | w_optim: weights optimizer - for virtual step 57 | """ 58 | # do virtual step (calc w`) 59 | self.virtual_step(trn_X, trn_y, xi, w_optim) 60 | 61 | # calc unrolled loss 62 | loss = self.v_net.loss(val_X, val_y) # L_val(w`) 63 | 64 | # compute gradient 65 | v_alphas = tuple(self.v_net.alphas()) 66 | v_weights = tuple(self.v_net.weights()) 67 | v_grads = torch.autograd.grad(loss, v_alphas + v_weights) #allow_unused=True 68 | dalpha = v_grads[:len(v_alphas)] 69 | dw = v_grads[len(v_alphas):] 70 | 71 | hessian = self.compute_hessian(dw, trn_X, trn_y) 72 | 73 | # update final gradient = 
dalpha - xi*hessian 74 | with torch.no_grad(): 75 | for alpha, da, h in zip(self.net.alphas(), dalpha, hessian): 76 | alpha.grad = da - xi*h 77 | 78 | def compute_hessian(self, dw, trn_X, trn_y): 79 | """ 80 | dw = dw` { L_val(w`, alpha) } 81 | w+ = w + eps * dw 82 | w- = w - eps * dw 83 | hessian = (dalpha { L_trn(w+, alpha) } - dalpha { L_trn(w-, alpha) }) / (2*eps) 84 | eps = 0.01 / ||dw|| 85 | """ 86 | norm = torch.cat([w.view(-1) for w in dw]).norm() 87 | eps = 0.01 / norm 88 | 89 | # w+ = w + eps*dw` 90 | with torch.no_grad(): 91 | for p, d in zip(self.net.weights(), dw): 92 | p += eps * d 93 | loss = self.net.loss(trn_X, trn_y) 94 | dalpha_pos = torch.autograd.grad(loss, self.net.alphas()) # dalpha { L_trn(w+) } #allow_unused=True 95 | 96 | # w- = w - eps*dw` 97 | with torch.no_grad(): 98 | for p, d in zip(self.net.weights(), dw): 99 | p -= 2. * eps * d 100 | loss = self.net.loss(trn_X, trn_y) 101 | dalpha_neg = torch.autograd.grad(loss, self.net.alphas()) # dalpha { L_trn(w-) } #allow_unused=True 102 | 103 | # recover w 104 | with torch.no_grad(): 105 | for p, d in zip(self.net.weights(), dw): 106 | p += eps * d 107 | 108 | hessian = [(p-n) / (2.*eps) for p, n in zip(dalpha_pos, dalpha_neg)] # central difference: divide by 2*eps, as in the docstring 109 | return hessian 110 | -------------------------------------------------------------------------------- /Augmentation/genotypes.py: -------------------------------------------------------------------------------- 1 | """ Genotypes 2 | - Genotype: normal/reduce gene + normal/reduce cell output connection (concat) 3 | - gene: discrete ops information (w/o output connection) 4 | - dag: real ops (can be mixed or discrete, but Genotype has only discrete information itself) 5 | """ 6 | from collections import namedtuple 7 | import torch 8 | import torch.nn as nn 9 | from models import ops 10 | 11 | 12 | Genotype = namedtuple('Genotype', 'normal normal_concat reduce reduce_concat') 13 | 14 | PRIMITIVES = [ 15 | 'max_pool_3x3', 16 | 'avg_pool_3x3', 17 | 'skip_connect', # 
identity 18 | 'sep_conv_3x3', 19 | 'sep_conv_5x5', 20 | 'dil_conv_3x3', 21 | 'dil_conv_5x5', 22 | 'none' 23 | ] 24 | 25 | 26 | def to_dag(C_in, gene, SSC, reduction): 27 | """ generate discrete ops from gene """ 28 | dag = nn.ModuleList() 29 | for edges in gene: 30 | row = nn.ModuleList() 31 | for op_name, s_idx in edges: 32 | # reduction cell & from input nodes => stride = 2 33 | stride = 2 if reduction and s_idx < 2 else 1 34 | if SSC: 35 | op = ops.OPS_SSC[op_name](C_in, stride, True) 36 | else: 37 | op = ops.OPS_DARTS[op_name](C_in, stride, True) 38 | if not isinstance(op, ops.Identity): # Identity does not use drop path 39 | op = nn.Sequential( 40 | op, 41 | ops.DropPath_() 42 | ) 43 | op.s_idx = s_idx 44 | row.append(op) 45 | dag.append(row) 46 | 47 | return dag 48 | 49 | 50 | def from_str(s): 51 | """ generate genotype from string 52 | e.g. "Genotype( 53 | normal=[[('sep_conv_3x3', 0), ('sep_conv_3x3', 1)], 54 | [('sep_conv_3x3', 1), ('dil_conv_3x3', 2)], 55 | [('sep_conv_3x3', 1), ('sep_conv_3x3', 2)], 56 | [('sep_conv_3x3', 1), ('dil_conv_3x3', 4)]], 57 | normal_concat=range(2, 6), 58 | reduce=[[('max_pool_3x3', 0), ('max_pool_3x3', 1)], 59 | [('max_pool_3x3', 0), ('skip_connect', 2)], 60 | [('max_pool_3x3', 0), ('skip_connect', 2)], 61 | [('max_pool_3x3', 0), ('skip_connect', 2)]], 62 | reduce_concat=range(2, 6))" 63 | """ 64 | 65 | genotype = eval(s) 66 | 67 | return genotype 68 | 69 | 70 | def parse(alpha, k): 71 | """ 72 | parse continuous alpha to discrete gene. 73 | alpha is ParameterList: 74 | ParameterList [ 75 | Parameter(n_edges1, n_ops), 76 | Parameter(n_edges2, n_ops), 77 | ... 78 | ] 79 | 80 | gene is list: 81 | [ 82 | [('node1_ops_1', node_idx), ..., ('node1_ops_k', node_idx)], 83 | [('node2_ops_1', node_idx), ..., ('node2_ops_k', node_idx)], 84 | ... 85 | ] 86 | each node has two edges (k=2) in CNN. 
87 | """ 88 | 89 | gene = [] 90 | #assert PRIMITIVES[-1] == 'none' # assume last PRIMITIVE is 'none' 91 | 92 | # 1) Convert the mixed op to discrete edge (single op) by choosing top-1 weight edge 93 | # 2) Choose top-k edges per node by edge score (top-1 weight in edge) 94 | for edges in alpha: 95 | # edges: Tensor(n_edges, n_ops) 96 | edge_max, primitive_indices = torch.topk(edges[:, :-1], 1) # ignore 'none' 97 | topk_edge_values, topk_edge_indices = torch.topk(edge_max.view(-1), k) 98 | node_gene = [] 99 | for edge_idx in topk_edge_indices: 100 | prim_idx = primitive_indices[edge_idx] 101 | prim = PRIMITIVES[prim_idx] 102 | node_gene.append((prim, edge_idx.item())) 103 | 104 | gene.append(node_gene) 105 | 106 | return gene 107 | -------------------------------------------------------------------------------- /Augmentation/models/augment_cells.py: -------------------------------------------------------------------------------- 1 | """ CNN cell for network augmentation """ 2 | import torch 3 | import torch.nn as nn 4 | from models import ops 5 | import genotypes as gt 6 | 7 | 8 | class AugmentCell(nn.Module): 9 | """ Cell for augmentation 10 | Each edge is discrete. 
11 | """ 12 | def __init__(self, genotype, i, C_pp, C_p, C, reduction_p, reduction, SSC): 13 | super().__init__() 14 | self.reduction = reduction 15 | try: 16 | self.n_nodes = len(genotype.normal) 17 | except: 18 | self.n_nodes = len(genotype["cell_0"]) 19 | if reduction_p: 20 | self.preproc0 = ops.FactorizedReduce(C_pp, C) 21 | else: 22 | self.preproc0 = ops.StdConv(C_pp, C, 1, 1, 0) 23 | self.preproc1 = ops.StdConv(C_p, C, 1, 1, 0) 24 | 25 | # generate dag 26 | if reduction: 27 | try: 28 | gene = genotype.reduce 29 | self.concat = genotype.reduce_concat 30 | except: 31 | gene = genotype["cell_%d" % i] 32 | self.concat = range(2, 2+self.n_nodes) 33 | else: 34 | try: 35 | gene = genotype.normal 36 | self.concat = genotype.normal_concat 37 | except: 38 | gene = genotype["cell_%d" % i] 39 | self.concat = range(2, 2 + self.n_nodes) 40 | self.dag = gt.to_dag(C, gene, SSC, reduction) 41 | 42 | def forward(self, s0, s1): 43 | s0 = self.preproc0(s0) 44 | s1 = self.preproc1(s1) 45 | 46 | states = [s0, s1] 47 | for edges in self.dag: 48 | s_cur = sum(op(states[op.s_idx]) for op in edges) 49 | states.append(s_cur) 50 | 51 | s_out = torch.cat([states[i] for i in self.concat], dim=1) 52 | 53 | return s_out 54 | -------------------------------------------------------------------------------- /Augmentation/models/search_cells.py: -------------------------------------------------------------------------------- 1 | """ CNN cell for architecture search """ 2 | import torch 3 | import torch.nn as nn 4 | from models import ops 5 | 6 | 7 | class SearchCell(nn.Module): 8 | """ Cell for search 9 | Each edge is mixed and continuous relaxed. 
10 | """ 11 | def __init__(self, n_nodes, C_pp, C_p, C, reduction_p, reduction): 12 | """ 13 | Args: 14 | n_nodes: # of intermediate n_nodes 15 | C_pp: C_out[k-2] 16 | C_p : C_out[k-1] 17 | C : C_in[k] (current) 18 | reduction_p: flag for whether the previous cell is reduction cell or not 19 | reduction: flag for whether the current cell is reduction cell or not 20 | """ 21 | super().__init__() 22 | self.reduction = reduction 23 | self.n_nodes = n_nodes 24 | 25 | # If previous cell is reduction cell, current input size does not match with 26 | # output size of cell[k-2]. So the output[k-2] should be reduced by preprocessing. 27 | if reduction_p: 28 | self.preproc0 = ops.FactorizedReduce(C_pp, C, affine=False) 29 | else: 30 | self.preproc0 = ops.StdConv(C_pp, C, 1, 1, 0, affine=False) 31 | self.preproc1 = ops.StdConv(C_p, C, 1, 1, 0, affine=False) 32 | 33 | # generate dag 34 | self.dag = nn.ModuleList() 35 | for i in range(self.n_nodes): 36 | self.dag.append(nn.ModuleList()) 37 | for j in range(2+i): # include 2 input nodes 38 | # reduction should be used only for input node 39 | stride = 2 if reduction and j < 2 else 1 40 | op = ops.MixedOp(C, stride) 41 | self.dag[i].append(op) 42 | 43 | def forward(self, s0, s1, w_dag): 44 | s0 = self.preproc0(s0) 45 | s1 = self.preproc1(s1) 46 | 47 | states = [s0, s1] 48 | for edges, w_list in zip(self.dag, w_dag): 49 | s_cur = sum(edges[i](s, w) for i, (s, w) in enumerate(zip(states, w_list))) 50 | states.append(s_cur) 51 | 52 | s_out = torch.cat(states[2:], dim=1) 53 | return s_out 54 | -------------------------------------------------------------------------------- /Augmentation/preproc.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import numpy as np 4 | import torchvision.transforms as transforms 5 | 6 | import utils 7 | 8 | from AutoAugment.autoaugment import CIFAR10Policy 9 | 10 | class Cutout(object): 11 | def __init__(self, length): 12 
| self.length = length 13 | 14 | def __call__(self, img): 15 | h, w = img.size(1), img.size(2) 16 | mask = np.ones((h, w), np.float32) 17 | y = np.random.randint(h) 18 | x = np.random.randint(w) 19 | 20 | y1 = np.clip(y - self.length // 2, 0, h) 21 | y2 = np.clip(y + self.length // 2, 0, h) 22 | x1 = np.clip(x - self.length // 2, 0, w) 23 | x2 = np.clip(x + self.length // 2, 0, w) 24 | 25 | mask[y1: y2, x1: x2] = 0. 26 | mask = torch.from_numpy(mask) 27 | mask = mask.expand_as(img) 28 | img *= mask 29 | 30 | return img 31 | 32 | 33 | def data_transforms(dataset, cutout_length, autoaugment): 34 | dataset = dataset.lower() 35 | if dataset == 'cifar10' or dataset == 'cifar100': 36 | MEAN = [0.49139968, 0.48215827, 0.44653124] 37 | STD = [0.24703233, 0.24348505, 0.26158768] 38 | if autoaugment: 39 | transf_train = [ 40 | transforms.RandomCrop(32, padding=4), # fill 128 41 | transforms.RandomHorizontalFlip(), 42 | CIFAR10Policy()] 43 | else: 44 | transf_train = [ 45 | transforms.RandomCrop(32, padding=4), 46 | transforms.RandomHorizontalFlip() 47 | ] 48 | transf_val = [] 49 | elif dataset == 'mnist': 50 | MEAN = [0.13066051707548254] 51 | STD = [0.30810780244715075] 52 | transf_train = [ 53 | transforms.RandomAffine(degrees=15, translate=(0.1, 0.1), scale=(0.9, 1.1), shear=0.1) 54 | ] 55 | transf_val = [] 56 | elif dataset == 'fashionmnist': 57 | MEAN = [0.28604063146254594] 58 | STD = [0.35302426207299326] 59 | transf_train = [ 60 | transforms.RandomAffine(degrees=15, translate=(0.1, 0.1), scale=(0.9, 1.1), shear=0.1), 61 | transforms.RandomVerticalFlip() 62 | ] 63 | transf_val = [] 64 | # Same preprocessing for ImageNet, Sport8 and MIT67 65 | elif dataset in utils.LARGE_DATASETS: 66 | MEAN = [0.485, 0.456, 0.406] 67 | STD = [0.229, 0.224, 0.225] 68 | transf_train = [ 69 | transforms.RandomResizedCrop(224), 70 | transforms.RandomHorizontalFlip(), 71 | transforms.ColorJitter( 72 | brightness=0.4, 73 | contrast=0.4, 74 | saturation=0.4, 75 | hue=0.2) 76 | ] 77 | transf_val = 
[ 78 | transforms.Resize(256), 79 | transforms.CenterCrop(224), 80 | ] 81 | else: 82 | raise ValueError('not expected dataset = {}'.format(dataset)) 83 | 84 | normalize = [ 85 | transforms.ToTensor(), 86 | transforms.Normalize(MEAN, STD) 87 | ] 88 | 89 | train_transform = transforms.Compose(transf_train + normalize) 90 | valid_transform = transforms.Compose(transf_val + normalize) # FIXME validation is not set to square proportions, is this an issue? 91 | 92 | if cutout_length > 0: 93 | train_transform.transforms.append(Cutout(cutout_length)) 94 | 95 | return train_transform, valid_transform 96 | -------------------------------------------------------------------------------- /Augmentation/visualize.py: -------------------------------------------------------------------------------- 1 | """ Network architecture visualizer using graphviz """ 2 | import sys 3 | from graphviz import Digraph 4 | import genotypes as gt 5 | 6 | 7 | def plot(genotype, file_path, caption=None): 8 | """ make DAG plot and save to file_path as .png """ 9 | edge_attr = { 10 | 'fontsize': '20', 11 | 'fontname': 'times' 12 | } 13 | node_attr = { 14 | 'style': 'filled', 15 | 'shape': 'rect', 16 | 'align': 'center', 17 | 'fontsize': '20', 18 | 'height': '0.5', 19 | 'width': '0.5', 20 | 'penwidth': '2', 21 | 'fontname': 'times' 22 | } 23 | g = Digraph( 24 | format='png', 25 | edge_attr=edge_attr, 26 | node_attr=node_attr, 27 | engine='dot') 28 | g.body.extend(['rankdir=LR']) 29 | 30 | # input nodes 31 | g.node("c_{k-2}", fillcolor='darkseagreen2') 32 | g.node("c_{k-1}", fillcolor='darkseagreen2') 33 | 34 | # intermediate nodes 35 | n_nodes = len(genotype) 36 | for i in range(n_nodes): 37 | g.node(str(i), fillcolor='lightblue') 38 | 39 | for i, edges in enumerate(genotype): 40 | for op, j in edges: 41 | if j == 0: 42 | u = "c_{k-2}" 43 | elif j == 1: 44 | u = "c_{k-1}" 45 | else: 46 | u = str(j-2) 47 | 48 | v = str(i) 49 | g.edge(u, v, label=op, fillcolor="gray") 50 | 51 | # output node 52 | 
g.node("c_{k}", fillcolor='palegoldenrod') 53 | for i in range(n_nodes): 54 | g.edge(str(i), "c_{k}", fillcolor="gray") 55 | 56 | # add image caption 57 | if caption: 58 | g.attr(label=caption, overlap='false', fontsize='20', fontname='times') 59 | 60 | g.render(file_path, view=False) 61 | 62 | 63 | if __name__ == '__main__': 64 | if len(sys.argv) != 2: 65 | raise ValueError("usage:\n python {} GENOTYPE".format(sys.argv[0])) 66 | 67 | genotype_str = sys.argv[1] 68 | try: 69 | genotype = gt.from_str(genotype_str) 70 | except AttributeError: 71 | raise ValueError("Cannot parse {}".format(genotype_str)) 72 | 73 | plot(genotype.normal, "normal") 74 | plot(genotype.reduce, "reduction") 75 | -------------------------------------------------------------------------------- /CNAS/README.md: -------------------------------------------------------------------------------- 1 | # Automatic Convolutional Neural Architecture Search for Image Classification Under Different Scenes 2 | 3 | ## Generate random architectures 4 | 5 | ``` 6 | import darts.geno_types as geno_types 7 | import darts.model_search as models 8 | model = models.Network(3, 16, 10, 20, None, num_inp_node = 2, num_meta_node = 6) 9 | genotype = model.genotype() 10 | ``` 11 | 12 | ## Search 13 | 14 | ``` 15 | python run/train_search.py 16 | --batch-size 64 17 | --epochs 50 18 | --num-meta-node 6 19 | --cutout 20 | --data /data #path to data 21 | --train-dataset cifar10 # choose between cifar10, cifar100, sport8, mit67 and flowers102 22 | --save test 23 | ``` 24 | 25 | ## Augment 26 | 27 | ``` 28 | python run/train_cnn.py 29 | --epochs 600 30 | --learning-rate 0.025 31 | --batch-size 64 32 | --drop-path-prob 0.25 33 | --cutout 34 | --data /data # path to data 35 | --train-dataset cifar10 # choose between cifar10, cifar100, sport8, mit67 and flowers102 36 | --arch genotype 37 | --save test 38 | ``` 39 | 40 | 41 | 42 | -------------------------------------------------------------------------------- 
/CNAS/darts/geno_types.py: -------------------------------------------------------------------------------- 1 | from collections import namedtuple 2 | 3 | Genotype = namedtuple('Genotype', 'normal normal_concat reduce reduce_concat') 4 | 5 | DARTS_V1 = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('skip_connect', 0), ('sep_conv_3x3', 1), ('skip_connect', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('skip_connect', 2)], normal_concat=[2, 3, 4, 5], 6 | reduce=[('max_pool_3x3', 0), ('max_pool_3x3', 1), ('skip_connect', 2), ('max_pool_3x3', 0), ('max_pool_3x3', 0), ('skip_connect', 2), ('skip_connect', 2), ('avg_pool_3x3', 0)], reduce_concat=[2, 3, 4, 5]) 7 | 8 | #-----------------------Train in cifar 10 9 | ''' 10 | 1. Not consider none in genotypes 11 | ''' 12 | # meta-node = 3; one step 13 | DARTS_MORE_V1 = Genotype(normal=[('cweight_com', 1), ('sep_conv_3x3', 0), ('cweight_com', 0), ('cweight_com', 1), ('cweight_com', 0), ('cweight_com', 1)], normal_concat=range(2, 5), 14 | reduce=[('cweight_com', 0), ('shuffle_conv_3x3', 1), ('shuffle_conv_3x3', 2), ('max_pool_3x3', 0), ('shuffle_conv_3x3', 3), ('shuffle_conv_3x3', 2)], reduce_concat=range(2, 5)) 15 | 16 | # meta-node = 4; one step 17 | DARTS_MORE_V2 = Genotype(normal=[('shuffle_conv_3x3', 0), ('cweight_com', 1), ('cweight_com', 1), ('cweight_com', 0), ('cweight_com', 1), ('cweight_com', 0), ('cweight_com', 0), ('cweight_com', 1)], normal_concat=range(2, 6), 18 | reduce=[('cweight_com', 1), ('avg_pool_3x3', 0), ('avg_pool_3x3', 0), ('cweight_com', 1), ('avg_pool_3x3', 0), ('shuffle_conv_3x3', 2), ('shuffle_conv_3x3', 2), ('shuffle_conv_3x3', 3)], reduce_concat=range(2, 6)) 19 | 20 | # meta-node = 5; one step 21 | DARTS_MORE_V3 = Genotype(normal=[('shuffle_conv_3x3', 0), ('cweight_com', 1), ('cweight_com', 2), ('dil_conv_3x3', 0), ('cweight_com', 2), ('skip_connect', 0), ('cweight_com', 1), ('cweight_com', 0), ('cweight_com', 0), ('cweight_com', 1)], normal_concat=range(2, 7), 22 | 
reduce=[('max_pool_3x3', 0), ('shuffle_conv_3x3', 1), ('max_pool_3x3', 0), ('shuffle_conv_3x3', 2), ('shuffle_conv_3x3', 3), ('shuffle_conv_3x3', 0), ('dil_conv_3x3', 3), ('shuffle_conv_3x3', 2), ('shuffle_conv_3x3', 4), ('shuffle_conv_3x3', 2)], reduce_concat=range(2, 7)) 23 | 24 | # meta-node = 6; one step 25 | DARTS_MORE_V4 = Genotype( 26 | normal=[('sep_conv_3x3', 1), ('shuffle_conv_3x3', 0), ('skip_connect', 0), ('cweight_com', 1), ('cweight_com', 3), 27 | ('cweight_com', 1), ('cweight_com', 4), ('cweight_com', 3), ('cweight_com', 3), ('cweight_com', 4), 28 | ('cweight_com', 4), ('cweight_com', 3)], normal_concat=range(2, 8), 29 | reduce=[('max_pool_3x3', 1), ('cweight_com', 0), ('max_pool_3x3', 1), ('skip_connect', 2), ('skip_connect', 2), 30 | ('max_pool_3x3', 1), ('skip_connect', 3), ('skip_connect', 4), ('skip_connect', 4), ('skip_connect', 3), 31 | ('skip_connect', 4), ('skip_connect', 3)], reduce_concat=range(2, 8)) 32 | 33 | 34 | """ 35 | 2. Consider none in genotypes 36 | """ 37 | 38 | # meta-node 3; one step 39 | DARTS_MORE_NONE_V1 = Genotype(normal=[('shuffle_conv_3x3', 1), ('sep_conv_3x3', 0), ('none', 2), ('none', 1), ('none', 3), ('none', 2)], normal_concat=range(2, 5), 40 | reduce=[('max_pool_3x3', 0), ('shuffle_conv_3x3', 1), ('max_pool_3x3', 0), ('shuffle_conv_3x3', 2), ('dil_conv_3x3', 3), ('cweight_com', 0)], reduce_concat=range(2, 5)) 41 | 42 | DARTS_MORE_NONE_V2 = Genotype(normal=[('cweight_com', 1), ('sep_conv_3x3', 0), ('none', 2), ('cweight_com', 1), ('none', 3), ('none', 2), ('none', 4), ('none', 3)], normal_concat=range(2, 6), 43 | reduce=[('shuffle_conv_3x3', 1), ('shuffle_conv_3x3', 0), ('shuffle_conv_3x3', 2), ('shuffle_conv_3x3', 0), ('sep_conv_3x3', 2), ('avg_pool_3x3', 1), ('none', 4), ('shuffle_conv_3x3', 3)], reduce_concat=range(2, 6)) 44 | 45 | # meta-node 5; one step 46 | DARTS_MORE_NONE_V3 = Genotype(normal=[('shuffle_conv_3x3', 0), ('none', 1), ('none', 2), ('shuffle_conv_3x3', 0), ('none', 3), ('none', 2),('none', 4), 
('none', 3), ('none', 5), ('none', 4)], normal_concat=range(2, 7), 47 | reduce=[('cweight_com', 1), ('shuffle_conv_3x3', 0), ('shuffle_conv_3x3', 2), ('cweight_com', 1), ('cweight_com', 1),('shuffle_conv_3x3', 2), ('dil_conv_3x3', 2), ('avg_pool_3x3', 0), ('shuffle_conv_3x3', 1), ('none', 5)],reduce_concat=range(2, 7)) 48 | 49 | # meta-node 6; one step 50 | DARTS_MORE_NONE_V4 = Genotype(normal=[('sep_conv_3x3', 0), ('none', 1), ('none', 2), ('shuffle_conv_3x3', 0), ('none', 3), ('none', 2), ('none', 4), ('none', 3), ('none', 5), ('none', 4), ('none', 5), ('none', 6)], normal_concat=range(2, 8), 51 | reduce=[('sep_conv_3x3', 0), ('skip_connect', 1), ('max_pool_3x3', 1), ('shuffle_conv_3x3', 2), ('shuffle_conv_3x3', 2), ('avg_pool_3x3', 1), ('cweight_com', 4), ('shuffle_conv_3x3', 2), ('shuffle_conv_3x3', 2), ('shuffle_conv_3x3', 4), ('none', 5), ('none', 6)], reduce_concat=range(2, 8)) 52 | 53 | #-----------------------Train on tiny-imagenet 54 | 55 | # meta-node 4; one step 56 | DARTS_MORE_TY2 = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('shuffle_conv_3x3', 2), ('max_pool_3x3', 0), ('cweight_com', 1), ('sep_conv_3x3', 3), ('cweight_com', 1)], normal_concat=range(2, 6), reduce=[('max_pool_3x3', 0), ('max_pool_3x3', 1), ('cweight_com', 1), ('sep_conv_3x3', 2), ('dil_conv_3x3', 0), ('cweight_com', 1), ('cweight_com', 0), ('shuffle_conv_3x3', 3)], reduce_concat=range(2, 6)) 57 | 58 | 59 | # test 60 | DARTS = DARTS_MORE_V1 61 | -------------------------------------------------------------------------------- /CNAS/darts/visualize.py: -------------------------------------------------------------------------------- 1 | from graphviz import Digraph 2 | 3 | def plot(genotype, filename): 4 | g = Digraph( 5 | format='pdf', 6 | edge_attr = dict(fontsize='20', fontname='times'), 7 | node_attr = dict(style='filled', shape='rect', align='center', 8 | fontsize='20', height='0.5', width='0.5', 9 | penwidth='2', fontname='times'), 10 | 
        engine='dot'
    )
    g.body.extend(['rankdir=LR'])

    g.node('c_{k-2}', fillcolor='darkseagreen2')
    g.node('c_{k-1}', fillcolor='darkseagreen2')
    assert len(genotype) % 2 == 0
    steps = len(genotype) // 2

    for i in range(steps):
        g.node(str(i), fillcolor='lightblue')

    for i in range(steps):
        for k in [2*i, 2*i+1]:
            op, j = genotype[k]
            if j == 0:
                u = 'c_{k-2}'
            elif j == 1:
                u = 'c_{k-1}'
            else:
                u = str(j-2)
            v = str(i)
            g.edge(u, v, label=op, fillcolor='gray')

    g.node('c_{k}', fillcolor='palegoldenrod')
    for i in range(steps):
        g.edge(str(i), 'c_{k}', fillcolor='gray')

    g.render(filename, view=True)



--------------------------------------------------------------------------------
/CNAS/experiments/cell_visualize.py:
--------------------------------------------------------------------------------
import sys
import os
import platform
import time
from darts import geno_types, visualize
from darts.utils import create_exp_dir


def main():

    genotype_name = 'DARTS'
    if len(sys.argv) != 2:
        print('usage:\n python {} ARCH_NAME, Default: DARTS'.format(sys.argv[0]))
    else:
        genotype_name = sys.argv[1]

    store_path = './cell_visualize_pdf/' + 'graph-{}-{}'.format('exp', time.strftime('%Y%m%d-%H%M'))
    create_exp_dir(store_path, scripts_to_save=None)

    if 'Windows' in platform.platform():
        os.environ['PATH'] += os.pathsep + '../3rd_tools/graphviz-2.38/bin/'
    try:
        # look the genotype up by name; a missing name raises the AttributeError handled below
        genotype = getattr(geno_types, genotype_name)
    except AttributeError:
        print('{} is not specified in geno_types.py'.format(genotype_name))
        sys.exit(1)

    visualize.plot(genotype.normal, store_path+'/normal')
    visualize.plot(genotype.reduce, store_path+'/reduction')

if __name__ == '__main__':
    main()
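The edge layout that `plot` in `darts/visualize.py` draws can be checked without Graphviz. The following is a minimal sketch, not part of the repository: `genotype_edges` is a hypothetical helper, and the example cell is made up for illustration. It assumes the same convention as `plot`, namely a flat list of `(op, input_index)` pairs with two incoming edges per intermediate node, where index 0 is `c_{k-2}`, index 1 is `c_{k-1}`, and index `j >= 2` is intermediate node `j-2`.

```python
def genotype_edges(cell):
    """Return (source, target, op) triples for a flat list of (op, input_index) pairs.

    Hypothetical helper: mirrors the indexing used by plot(), but builds a plain
    list instead of a graphviz Digraph.
    """
    if len(cell) % 2 != 0:
        raise ValueError('expected two incoming edges per intermediate node')
    edges = []
    for k, (op, j) in enumerate(cell):
        if j == 0:
            src = 'c_{k-2}'      # output of the cell before the previous one
        elif j == 1:
            src = 'c_{k-1}'      # output of the previous cell
        else:
            src = str(j - 2)     # an earlier intermediate node
        edges.append((src, str(k // 2), op))
    return edges


if __name__ == '__main__':
    # made-up cell with two intermediate nodes
    cell = [('sep_conv_3x3', 1), ('sep_conv_3x3', 0),
            ('skip_connect', 0), ('dil_conv_3x3', 2)]
    for edge in genotype_edges(cell):
        print(edge)
```

Keeping the index mapping in a pure function like this would make it testable on machines where the Graphviz binaries are not installed.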
-------------------------------------------------------------------------------- /CNAS/experiments/food101_val.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | import shutil 4 | 5 | def gendir(): 6 | base = '/train_tiny_data/train_data/food-101/' 7 | labels = base + 'meta/classes.txt' 8 | if not os.path.exists(base + 'val'): 9 | os.mkdir(base + 'val') 10 | with open(labels) as f: 11 | for line in f.readlines(): 12 | dir_name = base + 'val/' + line.strip('\n') 13 | if not os.path.exists(dir_name): 14 | os.mkdir(dir_name) 15 | 16 | # mv images to val 17 | tests = base + 'meta/test.txt' 18 | with open(tests) as f: 19 | for line in f.readlines(): 20 | file_name = line.strip('\n') + '.jpg' 21 | file_src = base + 'images/' + file_name 22 | mv_file_dst = base + 'val/' + file_name 23 | if os.path.exists(file_src): 24 | shutil.move(file_src, mv_file_dst) 25 | # rename images 26 | os.rename(base+'images', base+'train') 27 | 28 | print('process finish!') 29 | 30 | 31 | if __name__ == '__main__': 32 | gendir() 33 | 34 | 35 | 36 | 37 | -------------------------------------------------------------------------------- /CNAS/experiments/heat_map.py: -------------------------------------------------------------------------------- 1 | import time 2 | import numpy as np 3 | import seaborn as sns 4 | from pandas import Series,DataFrame 5 | import pandas as pd 6 | import matplotlib.pyplot as plt 7 | from darts.utils import create_exp_dir 8 | 9 | class HeapMap(object): 10 | def __init__(self): 11 | super(HeapMap, self).__init__() 12 | self.store_path = './heap_map/' + 'graph-{}-{}'.format('exp', time.strftime('%Y%m%d-%H%M%S')) 13 | create_exp_dir(self.store_path, scripts_to_save=None) 14 | self.normal_array = \ 15 | np.array([[0.1718, 0.1675, 0.0404, 0.0415, 0.1407, 0.1185, 0.3196], 16 | [0.2320, 0.2777, 0.0293, 0.0365, 0.1056, 0.1144, 0.2045], 17 | [0.2790, 0.2823, 0.0455, 0.0483, 0.0959, 0.0934, 0.1557], 18 | [0.2479, 
0.2862, 0.0348, 0.0425, 0.0799, 0.1855, 0.1232], 19 | [0.1644, 0.4227, 0.0221, 0.0322, 0.0543, 0.1427, 0.1614], 20 | [0.2697, 0.2799, 0.0431, 0.0439, 0.1305, 0.0710, 0.1620], 21 | [0.2961, 0.3373, 0.0399, 0.0464, 0.0785, 0.0802, 0.1217], 22 | [0.1484, 0.4467, 0.0248, 0.0343, 0.0789, 0.0912, 0.1757], 23 | [0.0534, 0.7706, 0.0154, 0.0177, 0.0249, 0.0454, 0.0725], 24 | [0.2279, 0.3509, 0.0416, 0.0399, 0.0992, 0.1016, 0.1390], 25 | [0.2491, 0.3712, 0.0404, 0.0452, 0.0738, 0.0917, 0.1287], 26 | [0.1132, 0.5628, 0.0213, 0.0275, 0.0497, 0.0732, 0.1523], 27 | [0.0419, 0.8060, 0.0144, 0.0159, 0.0294, 0.0421, 0.0502], 28 | [0.0261, 0.8249, 0.0162, 0.0183, 0.0343, 0.0375, 0.0427]]) 29 | 30 | self.reduce_array = \ 31 | np.array([[0.0878, 0.1235, 0.1910, 0.2339, 0.1080, 0.1146, 0.1412], 32 | [0.1000, 0.1843, 0.1368, 0.1768, 0.1326, 0.1264, 0.1431], 33 | [0.0889, 0.1127, 0.1696, 0.2059, 0.1317, 0.1106, 0.1805], 34 | [0.1395, 0.1231, 0.1485, 0.1959, 0.1331, 0.1359, 0.1239], 35 | [0.1732, 0.1628, 0.0814, 0.0900, 0.1416, 0.1546, 0.1965], 36 | [0.1283, 0.1254, 0.1641, 0.1768, 0.1200, 0.1361, 0.1493], 37 | [0.1115, 0.1263, 0.1467, 0.1745, 0.1582, 0.1249, 0.1579], 38 | [0.1609, 0.1568, 0.0761, 0.0773, 0.1792, 0.1443, 0.2055], 39 | [0.1413, 0.1592, 0.0743, 0.0706, 0.1569, 0.2406, 0.1571], 40 | [0.1275, 0.1215, 0.1779, 0.1909, 0.1146, 0.1279, 0.1397], 41 | [0.1250, 0.1513, 0.1504, 0.1723, 0.1298, 0.1243, 0.1469], 42 | [0.1539, 0.1432, 0.0772, 0.0717, 0.1476, 0.1760, 0.2305], 43 | [0.1856, 0.2151, 0.0833, 0.0713, 0.1028, 0.1746, 0.1674], 44 | [0.1552, 0.2100, 0.0764, 0.0677, 0.1766, 0.1507, 0.1634]]) 45 | index = ['n(0,0)', 'n(0,1)', 46 | 'n(1,0)', 'n(1,1)', 'n(1,2)', 47 | 'n(2,0)', 'n(2,1)', 'n(2,2)', 'n(2,3)', 48 | 'n(3,0)', 'n(3,1)', 'n(3,2)', 'n(3,3)', 'n(3,4)'] 49 | 50 | OPs = [ 'skip_connect','cweight_com','avg_pool_3x3', 51 | 'max_pool_3x3','sep_conv_3x3','dil_conv_3x3', 'shuffle_conv_3x3',] 52 | self.df1 = DataFrame(self.normal_array, index=index, columns=OPs) 53 | self.df2 = 
DataFrame(self.reduce_array, index=index, columns=OPs)


    def draw(self):
        f, ax1= plt.subplots(figsize=(15, 9))
        sns.heatmap(self.df1, annot=True, ax=ax1,
                    annot_kws={'size': 13, 'weight': 'bold'})
        ax1.set_xlabel('Ops without none operation', labelpad=14, fontsize='medium')
        ax1.set_ylabel('Possible Input Index', labelpad=14, fontsize='medium')
        # ax1.set_title('The weights for Ops without none operation in normal cell', pad = 18, fontsize='x-large')

        # f, ax2= plt.subplots(figsize=(15, 9))
        # sns.heatmap(self.df2, annot=True, ax=ax2,
        #             annot_kws={'size': 13, 'weight': 'bold'})
        # ax2.set_xlabel('Ops without none operation', labelpad=14, fontsize='medium')
        # ax2.set_ylabel('Possible predecessors id for each intermediate node', labelpad=14, fontsize='medium')
        # #ax2.set_title('The weights for Ops without none operation in reduction cell', pad = 18, fontsize='x-large')
        plt.savefig(self.store_path+'/normal_hm.pdf', bbox_inches = 'tight', dpi=600)
        # plt.show()


if __name__ == '__main__':
    hm = HeapMap()
    hm.draw()

--------------------------------------------------------------------------------
/CNAS/experiments/images_view.py:
--------------------------------------------------------------------------------
import time
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
from darts.utils import create_exp_dir

class ViewImage1(object):
    """2 x 2 grid layout"""
    def __init__(self, paths):

        super(ViewImage1, self).__init__()
        # use LaTeX for text rendering
        from matplotlib import rc
        rc('font', **{'family': 'sans-serif', 'sans-serif': ['Helvetica']})
        rc('text', usetex=True)
        self.paths = paths
        self.store_path = './image_view/' + 'graph-{}-{}'.format('exp', time.strftime('%Y%m%d-%H%M%S'))
        create_exp_dir(self.store_path, scripts_to_save=None)
        self.pdf = PdfPages(self.store_path+'/figure.pdf')

    def view(self):
        imgs = []
        for path in self.paths:
            img = plt.imread(path)
            imgs.append(img)

        fig, axs = plt.subplots(nrows=2, ncols=2, figsize=(12,16))
        fig.subplots_adjust(hspace=0.1, wspace=0)
        idx = 0
        for i in range(2):
            for j in range(2):
                axs[i,j].xaxis.set_major_locator(plt.NullLocator())
                axs[i,j].yaxis.set_major_locator(plt.NullLocator())
                axs[i,j].imshow(imgs[idx], cmap='bone')
                axs[i,j].set_xlabel(r'$(\alpha_'+str(idx+1) + ')$', fontsize=22)
                plt.tight_layout()
                idx = idx+1
        # save as a high quality image
        self.pdf.savefig(bbox_inches = 'tight', dpi=600)
        # close the PdfPages object so the PDF file is finalized on disk
        self.pdf.close()
        # plt.savefig(bbox_inches = 'tight', format='png', dpi=600)
        # plt.show()

class ViewImage2(object):
    """Vertical layout (4 x 1)"""
    def __init__(self, paths):
        super(ViewImage2, self).__init__()
        # use LaTeX for text rendering
        from matplotlib import rc
        rc('font', **{'family': 'sans-serif', 'sans-serif': ['Helvetica']})
        rc('text', usetex=True)
        self.paths = paths
        self.store_path = './image_view/' + 'graph-{}-{}'.format('exp', time.strftime('%Y%m%d-%H%M%S'))
        create_exp_dir(self.store_path, scripts_to_save=None)
        self.pdf = PdfPages(self.store_path+'/figure.pdf')

    def view(self):
        imgs = []
        for path in self.paths:
            img = plt.imread(path)
            imgs.append(img)

        fig, axs = plt.subplots(nrows=4, ncols=1, figsize=(16,8))
        fig.subplots_adjust(hspace=0.1)
        idx = 0
        for i in range(4):
            axs[i].xaxis.set_major_locator(plt.NullLocator())
            axs[i].yaxis.set_major_locator(plt.NullLocator())
            axs[i].imshow(imgs[idx], cmap='bone')
            axs[i].set_xlabel(r'$(\alpha_'+str(idx+1) + ')$', fontsize=16)
            plt.tight_layout()
            idx = idx+1
        # save as a high quality image
        self.pdf.savefig(bbox_inches = 'tight', dpi=600)
        # close the PdfPages object so the PDF file is finalized on disk
        self.pdf.close()
        # plt.show()

if __name__ == '__main__':
    root = 
'./cell_visualize_rst/' 77 | #name = 'normal.png' 78 | name = 'reduction.png' 79 | ## ----none 80 | # paths = [root+'graph-exp-20181025-145953/'+name, 81 | # root+'graph-exp-20181025-150002/'+name, 82 | # root+'graph-exp-20181025-150007/'+name, 83 | # root+'graph-exp-20181025-150024/'+name] 84 | 85 | ## ---non-none 86 | paths = [root+'graph-exp-20181026-212506/'+name, 87 | root+'graph-exp-20181026-212533/'+name, 88 | root+'graph-exp-20181026-212941/'+name, 89 | root+'graph-exp-20181026-213713/'+name] 90 | vi = ViewImage1(paths) 91 | vi.view() 92 | 93 | 94 | 95 | 96 | 97 | -------------------------------------------------------------------------------- /CNAS/experiments/performance_view.py: -------------------------------------------------------------------------------- 1 | import re 2 | import os 3 | import time 4 | import numpy as np 5 | import matplotlib.pyplot as plt 6 | from darts.utils import create_exp_dir 7 | 8 | class PerformanceView(object): 9 | def __init__(self, file_path): 10 | super(PerformanceView, self).__init__() 11 | self.valid_loss = [] 12 | self.valid_acc = [] 13 | self.train_loss = [] 14 | self.train_acc = [] 15 | self.file_path = file_path 16 | assert os.path.exists(file_path) == True 17 | self._read_log_file1() 18 | self.fig, self.axs = plt.subplots(nrows=1, ncols=2, 19 | sharex=False, figsize=(12,4), dpi=600) 20 | # self.fig.suptitle('train on cifar-10', fontsize=12, fontweight='bold', y=1.0) 21 | self.fig.subplots_adjust(left=0.2, wspace=0.2) 22 | self.fig.tight_layout() 23 | self.store_path = './performance_view/' + 'graph-{}-{}'.format('exp', time.strftime('%Y%m%d-%H%M%S')) 24 | create_exp_dir(self.store_path, scripts_to_save=None) 25 | 26 | def _read_log_file1(self): 27 | with open(self.file_path, 'r') as fp: 28 | for line in fp.readlines(): 29 | if 'valid loss' in line: 30 | line = line.strip().split(',')[1:] 31 | for item in line: 32 | if 'valid loss' in item: 33 | self.valid_loss.append(float(item.split()[-1])) 34 | else: 35 | 
                            self.valid_acc.append(float(item.split()[3]))
                if 'train loss' in line:
                    line = line.strip().split(',')[1:]
                    for item in line:
                        if 'train loss' in item:
                            self.train_loss.append(float(item.split()[-1]))
                        else:
                            self.train_acc.append(float(item.split()[2]))


    def _read_log_file2(self):
        # learning rates parsed from the log; this list is not created in __init__
        self.lrs = []
        with open(self.file_path, 'r') as fp:
            for line in fp.readlines():
                if 'valid_acc' in line:
                    self.valid_acc.append(float(line.strip().split()[-1]))
                if 'train_acc' in line:
                    self.train_acc.append(float(line.strip().split()[-1]))
                if 'lr' in line: # get learning rate
                    lr = float(line.strip().split()[-1])
                    self.lrs.append(lr)

    def draw_loss(self):

        self.axs[0].plot(self.valid_loss,'r', label='valid loss')
        self.axs[0].plot(self.train_loss, 'b', label='train loss')
        self.axs[0].set_xlabel('epoch')
        self.axs[0].set_ylabel('loss')
        self.axs[0].set_title('valid loss vs train loss each epoch')
        self.axs[0].legend()

    def draw_acc(self):
        self.axs[1].plot(self.valid_acc, 'r', label='valid acc')
        self.axs[1].plot(self.train_acc, 'b', label='train acc')
        self.axs[1].set_xlabel('epoch')
        self.axs[1].set_ylabel('accuracy')
        self.axs[1].set_title('valid accuracy vs train accuracy each epoch')
        self.axs[1].legend()

    def show(self):
        plt.savefig(self.store_path+'/v3_loss_acc_cifar100.pdf', bbox_inches = 'tight', dpi=600)
        plt.show()

if __name__ == '__main__':
    #pv = PerformanceView('../logs/eval/DARTS_MORE_NONE_V2/cifar10/eval-EXP-20181021-1427/log.txt')
    #pv = PerformanceView('../logs/eval/DARTS_MORE_NONE_V1/cifar100/eval-EXP-20181021-0200/log.txt')
    #pv = PerformanceView('../logs/eval/DARTS_MORE_NONE_V3/cifar10/eval-EXP-20181021-0206/log.txt')
    #pv = PerformanceView('../logs/eval/DARTS_MORE_NONE_V3/cifar100/eval-EXP-20181025-0243/log.txt')
    pv = PerformanceView('../logs/eval/DARTS_MORE_V3/tiny-imagenet/eval-EXP-20181122-1325/log.txt')
    pv.draw_loss()
    pv.draw_acc()
    pv.show()


--------------------------------------------------------------------------------
/CNAS/tests/test_cell_visualize.py:
--------------------------------------------------------------------------------
import unittest
import os
import platform
from darts import geno_types
from darts.visualize import plot

class TestCellVis(unittest.TestCase):

    def setUp(self):
        self.cell_name = 'DARTS'
        if 'Windows' in platform.platform():
            os.environ['PATH'] += os.pathsep + '../3rd_tools/graphviz-2.38/bin/'

    def test_plot(self):
        # look the genotype up by name instead of eval on an unimported module
        genotype = getattr(geno_types, self.cell_name)
        plot(genotype.normal, 'normal')
        plot(genotype.reduce, 'reduction')
--------------------------------------------------------------------------------
/CNAS/tests/test_model_search.py:
--------------------------------------------------------------------------------
import unittest
import torch
import torch.nn.functional as F
from darts.model_search import *

class TestModelSearch(unittest.TestCase):
    def setUp(self):
        self.total_input = 14
        self.total_operations = 8
        self.input = torch.randn(8, 32, 32, 32)
        alpha = torch.randn(self.total_input, self.total_operations)
        self.weights = F.softmax(alpha, dim=-1)
        self.num_meta_node = 4
        self.multiplier = 4
        self.inp_c = 3
        self.c = 16
        self.num_classes = 10
        self.layers = 4
        self.criterion = torch.nn.CrossEntropyLoss()
        self.num_inp_node = 2

    def test_mixedop(self):
        mop = MixedOp(32, 1)
        rst = mop(self.input, self.weights[0])
        print(rst)

    def test_architecture(self):
        model = Network(self.inp_c, self.c, self.num_classes, self.layers,
                        self.criterion, self.num_inp_node).cuda()
        genotype = model.genotype()

        print('genotype = {}'.format(genotype))
mini_batch_imgs = torch.randn(8, 3, 32, 32).cuda() 34 | logits = model(mini_batch_imgs) 35 | print(logits) 36 | 37 | 38 | 39 | -------------------------------------------------------------------------------- /CNAS/tests/test_operation.py: -------------------------------------------------------------------------------- 1 | import unittest 2 | import time 3 | import torch 4 | import sys 5 | sys.path.append('..') 6 | from darts.operations import * 7 | 8 | class TestOperation(unittest.TestCase): 9 | def setUp(self): 10 | self.input = torch.randn(40,32,32,32).cuda() 11 | self.target = torch.randint(0,10, (40,), dtype=torch.long).cuda() 12 | self.fc = torch.nn.Linear(32, 10).cuda() 13 | self.avgpool = torch.nn.AdaptiveAvgPool2d(1).cuda() 14 | self.criterion = torch.nn.CrossEntropyLoss().cuda() 15 | self.stride = 2 16 | self.c = 32 17 | self.atom = PRIMITIVES 18 | 19 | def classifier(self, layer): 20 | out = layer(self.input) 21 | out = self.avgpool(out) 22 | out = self.fc(out.view(out.size(0), -1)) 23 | loss = self.criterion(out, self.target) 24 | loss.backward() 25 | return loss.item() 26 | 27 | def test_op0(self): 28 | start = time.time() 29 | layer = OPS[self.atom[0]](self.c, self.stride, False).cuda() 30 | loss = self.classifier(layer = layer) 31 | end = time.time() 32 | print('ops0 none: loss: {0}, cost: {1}s'.format(loss, end-start)) 33 | 34 | def test_op1(self): 35 | start = time.time() 36 | layer = OPS[self.atom[1]](self.c, self.stride, False).cuda() 37 | loss = self.classifier(layer = layer) 38 | end = time.time() 39 | print('ops1 skip connection: loss: {0}, cost: {1}s'.format(loss, end-start)) 40 | 41 | def test_op2(self): 42 | start = time.time() 43 | layer = OPS[self.atom[2]](self.c, self.stride, False).cuda() 44 | loss = self.classifier(layer = layer) 45 | end = time.time() 46 | print('ops2 cweight_com: loss: {0}, cost: {1}s'.format(loss, end-start)) 47 | 48 | 49 | def test_op3(self): 50 | start = time.time() 51 | layer = OPS[self.atom[3]](self.c, 
self.stride, False).cuda() 52 | loss = self.classifier(layer = layer) 53 | end = time.time() 54 | print('ops3 avg_pool_3x3: loss: {0}, cost: {1}s'.format(loss, end-start)) 55 | 56 | def test_op4(self): 57 | start = time.time() 58 | layer = OPS[self.atom[4]](self.c, self.stride, False).cuda() 59 | loss = self.classifier(layer = layer) 60 | end = time.time() 61 | print('ops4 max_pool_3x3: loss: {0}, cost: {1}s'.format(loss, end-start)) 62 | 63 | def test_op5(self): 64 | start = time.time() 65 | layer = OPS[self.atom[5]](self.c, self.stride, False).cuda() 66 | loss = self.classifier(layer = layer) 67 | end = time.time() 68 | print('ops5 sep_conv_3x3: loss: {0}, cost: {1}s'.format(loss, end-start)) 69 | 70 | def test_op6(self): 71 | start = time.time() 72 | layer = OPS[self.atom[6]](self.c, self.stride, False).cuda() 73 | loss = self.classifier(layer = layer) 74 | end = time.time() 75 | print('ops6 dil_conv_3x3: loss: {0}, cost: {1}s'.format(loss, end - start)) 76 | 77 | def test_op7(self): 78 | start = time.time() 79 | layer = OPS[self.atom[7]](self.c, self.stride, False).cuda() 80 | loss = self.classifier(layer = layer) 81 | end = time.time() 82 | print('ops7 shuffle_conv_3x3: loss: {0}, cost: {1}s'.format(loss, end - start)) 83 | 84 | if __name__ == "__main__": 85 | unittest.main() -------------------------------------------------------------------------------- /CNAS/tests/test_utils.py: -------------------------------------------------------------------------------- 1 | import unittest 2 | import torch 3 | import matplotlib.pyplot as plt 4 | 5 | from darts.utils import * 6 | 7 | class TestUtils(unittest.TestCase): 8 | 9 | def setUp(self): 10 | self.input = torch.randn(4,3,32,32) 11 | self.img = torch.randn(3, 32, 32) 12 | 13 | def test_cutout(self): 14 | self.fig = plt.figure() 15 | self.left = self.fig.add_subplot(1, 2, 1) 16 | self.left.set_title('Input Image') 17 | plt.imshow(self.img.numpy().transpose(1,2,0)) 18 | self.right = self.fig.add_subplot(1,2,2) 19 | 
        length = 10  # renamed from `len` to avoid shadowing the builtin
        cutout = Cutout(length)
        print('original img size: ', self.img.size())
        cut_img = cutout(self.img)
        print('after cutout img size: ', cut_img.size())

        self.right.set_title('Output Image')
        plt.imshow(cut_img.numpy().transpose(1,2,0))
        plt.show()


    def test_get_freer_gpu(self):
        print(get_freer_gpu())

    def test_calc_parameters_count(self):
        test_conv = torch.nn.Sequential(
            torch.nn.Conv2d(3, 32, 3, 1),
            torch.nn.Conv2d(32, 64, kernel_size=3, stride=1),
            torch.nn.BatchNorm2d(64),
            torch.nn.ReLU(inplace=True)
        )
        test_conv(self.input)

        print('param counts: ', calc_parameters_count(test_conv), ' MB')


if __name__ == '__main__':
    unittest.main()


--------------------------------------------------------------------------------
/DARTS/README.md:
--------------------------------------------------------------------------------
# DARTS: Differentiable Architecture Search

## Generate a Random Architecture

```
from models.search_cnn import SearchCNNController
model = SearchCNNController(3, 16, 10, 20, None, n_nodes=4)
genotype = model.genotype()
```

## Search

```
python search.py
    --name test
    --data_path /data # path to data
    --dataset CIFAR10 # choose between CIFAR10, CIFAR100, Sport8, MIT67 and flowers102
```

## Augment

```
python augment.py
    --name test
    --layers 20 # 20 for CIFAR10 and CIFAR100, 8 for Sport8, MIT67 and flowers102
    --dataset CIFAR10 # choose between CIFAR10, CIFAR100, Sport8, MIT67 and flowers102
    --data_path /data # path to data
    --genotype genotype
```
--------------------------------------------------------------------------------
/DARTS/architect.py:
--------------------------------------------------------------------------------
""" Architect controls architecture of cell by computing 
gradients of alphas """ 2 | import copy 3 | import torch 4 | 5 | 6 | class Architect(): 7 | """ Compute gradients of alphas """ 8 | def __init__(self, net, w_momentum, w_weight_decay): 9 | """ 10 | Args: 11 | net 12 | w_momentum: weights momentum 13 | """ 14 | self.net = net 15 | self.v_net = copy.deepcopy(net) 16 | self.w_momentum = w_momentum 17 | self.w_weight_decay = w_weight_decay 18 | 19 | def virtual_step(self, trn_X, trn_y, xi, w_optim): 20 | """ 21 | Compute unrolled weight w' (virtual step) 22 | 23 | Step process: 24 | 1) forward 25 | 2) calc loss 26 | 3) compute gradient (by backprop) 27 | 4) update gradient 28 | 29 | Args: 30 | xi: learning rate for virtual gradient step (same as weights lr) 31 | w_optim: weights optimizer 32 | """ 33 | # forward & calc loss 34 | loss = self.net.loss(trn_X, trn_y) # L_trn(w) 35 | 36 | # compute gradient 37 | gradients = torch.autograd.grad(loss, self.net.weights()) 38 | 39 | # do virtual step (update gradient) 40 | # below operations do not need gradient tracking 41 | with torch.no_grad(): 42 | # dict key is not the value, but the pointer. So original network weight have to 43 | # be iterated also. 44 | for w, vw, g in zip(self.net.weights(), self.v_net.weights(), gradients): 45 | m = w_optim.state[w].get('momentum_buffer', 0.) 
* self.w_momentum 46 | vw.copy_(w - xi * (m + g + self.w_weight_decay*w)) 47 | 48 | # synchronize alphas 49 | for a, va in zip(self.net.alphas(), self.v_net.alphas()): 50 | va.copy_(a) 51 | 52 | def unrolled_backward(self, trn_X, trn_y, val_X, val_y, xi, w_optim): 53 | """ Compute unrolled loss and backward its gradients 54 | Args: 55 | xi: learning rate for virtual gradient step (same as net lr) 56 | w_optim: weights optimizer - for virtual step 57 | """ 58 | # do virtual step (calc w`) 59 | self.virtual_step(trn_X, trn_y, xi, w_optim) 60 | 61 | # calc unrolled loss 62 | loss = self.v_net.loss(val_X, val_y) # L_val(w`) 63 | 64 | # compute gradient 65 | v_alphas = tuple(self.v_net.alphas()) 66 | v_weights = tuple(self.v_net.weights()) 67 | v_grads = torch.autograd.grad(loss, v_alphas + v_weights) 68 | dalpha = v_grads[:len(v_alphas)] 69 | dw = v_grads[len(v_alphas):] 70 | 71 | hessian = self.compute_hessian(dw, trn_X, trn_y) 72 | 73 | # update final gradient = dalpha - xi*hessian 74 | with torch.no_grad(): 75 | for alpha, da, h in zip(self.net.alphas(), dalpha, hessian): 76 | alpha.grad = da - xi*h 77 | 78 | def compute_hessian(self, dw, trn_X, trn_y): 79 | """ 80 | dw = dw` { L_val(w`, alpha) } 81 | w+ = w + eps * dw 82 | w- = w - eps * dw 83 | hessian = (dalpha { L_trn(w+, alpha) } - dalpha { L_trn(w-, alpha) }) / (2*eps) 84 | eps = 0.01 / ||dw|| 85 | """ 86 | norm = torch.cat([w.view(-1) for w in dw]).norm() 87 | eps = 0.01 / norm 88 | 89 | # w+ = w + eps*dw` 90 | with torch.no_grad(): 91 | for p, d in zip(self.net.weights(), dw): 92 | p += eps * d 93 | loss = self.net.loss(trn_X, trn_y) 94 | dalpha_pos = torch.autograd.grad(loss, self.net.alphas()) # dalpha { L_trn(w+) } 95 | 96 | # w- = w - eps*dw` 97 | with torch.no_grad(): 98 | for p, d in zip(self.net.weights(), dw): 99 | p -= 2. 
* eps * d 100 | loss = self.net.loss(trn_X, trn_y) 101 | dalpha_neg = torch.autograd.grad(loss, self.net.alphas()) # dalpha { L_trn(w-) } 102 | 103 | # recover w 104 | with torch.no_grad(): 105 | for p, d in zip(self.net.weights(), dw): 106 | p += eps * d 107 | 108 | hessian = [(p-n) / (2.*eps) for p, n in zip(dalpha_pos, dalpha_neg)] 109 | return hessian 110 | -------------------------------------------------------------------------------- /DARTS/config.py: -------------------------------------------------------------------------------- 1 | """ Config class for search/augment """ 2 | import argparse 3 | import os 4 | import genotypes as gt 5 | from functools import partial 6 | import torch 7 | 8 | 9 | def get_parser(name): 10 | """ make default formatted parser """ 11 | parser = argparse.ArgumentParser(name, formatter_class=argparse.ArgumentDefaultsHelpFormatter) 12 | # print default value always 13 | parser.add_argument = partial(parser.add_argument, help=' ') 14 | return parser 15 | 16 | 17 | def parse_gpus(gpus): 18 | if gpus == 'all': 19 | return list(range(torch.cuda.device_count())) 20 | else: 21 | return [int(s) for s in gpus.split(',')] 22 | 23 | 24 | class BaseConfig(argparse.Namespace): 25 | def print_params(self, prtf=print): 26 | prtf("") 27 | prtf("Parameters:") 28 | for attr, value in sorted(vars(self).items()): 29 | prtf("{}={}".format(attr.upper(), value)) 30 | prtf("") 31 | 32 | def as_markdown(self): 33 | """ Return configs as markdown format """ 34 | text = "|name|value| \n|-|-| \n" 35 | for attr, value in sorted(vars(self).items()): 36 | text += "|{}|{}| \n".format(attr, value) 37 | 38 | return text 39 | 40 | 41 | class SearchConfig(BaseConfig): 42 | def build_parser(self): 43 | parser = get_parser("Search config") 44 | parser.add_argument('--name', required=True) 45 | parser.add_argument('--dataset', required=True, help='CIFAR10 / CIFAR100 / Sport8 / MIT67') 46 | parser.add_argument('--batch_size', type=int, default=64, help='batch size') 47 | 
parser.add_argument('--w_lr', type=float, default=0.025, help='lr for weights') 48 | parser.add_argument('--w_lr_min', type=float, default=0.001, help='minimum lr for weights') 49 | parser.add_argument('--w_momentum', type=float, default=0.9, help='momentum for weights') 50 | parser.add_argument('--w_weight_decay', type=float, default=3e-4, 51 | help='weight decay for weights') 52 | parser.add_argument('--w_grad_clip', type=float, default=5., 53 | help='gradient clipping for weights') 54 | parser.add_argument('--print_freq', type=int, default=50, help='print frequency') 55 | parser.add_argument('--gpus', default='0', help='gpu device ids separated by comma. ' 56 | '`all` indicates use all gpus.') 57 | parser.add_argument('--epochs', type=int, default=50, help='# of training epochs') 58 | parser.add_argument('--init_channels', type=int, default=16) 59 | parser.add_argument('--layers', type=int, default=8, help='# of layers') 60 | parser.add_argument('--seed', type=int, default=2, help='random seed') 61 | parser.add_argument('--workers', type=int, default=4, help='# of workers') 62 | parser.add_argument('--alpha_lr', type=float, default=3e-4, help='lr for alpha') 63 | parser.add_argument('--alpha_weight_decay', type=float, default=1e-3, 64 | help='weight decay for alpha') 65 | parser.add_argument('--data_path', default="./data", help="Where to look for the data") 66 | parser.add_argument('--layers_augment', type=int, default=20, help="nb of layers for augment") 67 | parser.add_argument('--path', type=str, default='/cache/darts/searchs') 68 | 69 | return parser 70 | 71 | def __init__(self): 72 | parser = self.build_parser() 73 | args = parser.parse_args() 74 | super().__init__(**vars(args)) 75 | 76 | self.path = os.path.join(self.path, self.name) 77 | self.plot_path = os.path.join(self.path, 'plots') 78 | self.gpus = parse_gpus(self.gpus) 79 | 80 | 81 | class AugmentConfig(BaseConfig): 82 | def build_parser(self): 83 | parser = get_parser("Augment config") 84 | 
parser.add_argument('--name', required=True) 85 | parser.add_argument('--dataset', required=True, help='CIFAR10 / CIFAR100 / Sport8 / MIT67') 86 | parser.add_argument('--batch_size', type=int, default=96, help='batch size') 87 | parser.add_argument('--lr', type=float, default=0.025, help='lr for weights') 88 | parser.add_argument('--momentum', type=float, default=0.9, help='momentum') 89 | parser.add_argument('--weight_decay', type=float, default=3e-4, help='weight decay') 90 | parser.add_argument('--grad_clip', type=float, default=5., 91 | help='gradient clipping for weights') 92 | parser.add_argument('--print_freq', type=int, default=200, help='print frequency') 93 | parser.add_argument('--gpus', default='0', help='gpu device ids separated by comma. ' 94 | '`all` indicates use all gpus.') 95 | parser.add_argument('--epochs', type=int, default=600, help='# of training epochs') 96 | parser.add_argument('--init_channels', type=int, default=36) 97 | parser.add_argument('--layers', type=int, default=20, help='# of layers') 98 | parser.add_argument('--seed', type=int, default=2, help='random seed') 99 | parser.add_argument('--workers', type=int, default=4, help='# of workers') 100 | parser.add_argument('--aux_weight', type=float, default=0.4, help='auxiliary loss weight') 101 | parser.add_argument('--cutout_length', type=int, default=16, help='cutout length') 102 | parser.add_argument('--drop_path_prob', type=float, default=0.2, help='drop path prob') 103 | 104 | parser.add_argument('--genotype', required=True, help='Cell genotype') 105 | parser.add_argument('--data_path', default="./data", help="Where to look for the data") 106 | parser.add_argument('--path', default="/cache/darts/augments") 107 | 108 | return parser 109 | 110 | def __init__(self): 111 | parser = self.build_parser() 112 | args = parser.parse_args() 113 | super().__init__(**vars(args)) 114 | 115 | self.path = os.path.join(self.path, self.name) 116 | self.genotype = gt.from_str(self.genotype) 117 | 
self.gpus = parse_gpus(self.gpus) 118 | -------------------------------------------------------------------------------- /DARTS/genotypes.py: -------------------------------------------------------------------------------- 1 | """ Genotypes 2 | - Genotype: normal/reduce gene + normal/reduce cell output connection (concat) 3 | - gene: discrete ops information (w/o output connection) 4 | - dag: real ops (can be mixed or discrete, but Genotype has only discrete information itself) 5 | """ 6 | from collections import namedtuple 7 | import torch 8 | import torch.nn as nn 9 | from models import ops 10 | 11 | 12 | Genotype = namedtuple('Genotype', 'normal normal_concat reduce reduce_concat') 13 | 14 | PRIMITIVES = [ 15 | 'max_pool_3x3', 16 | 'avg_pool_3x3', 17 | 'skip_connect', # identity 18 | 'sep_conv_3x3', 19 | 'sep_conv_5x5', 20 | 'dil_conv_3x3', 21 | 'dil_conv_5x5', 22 | 'none' 23 | ] 24 | 25 | 26 | def to_dag(C_in, gene, reduction): 27 | """ generate discrete ops from gene """ 28 | dag = nn.ModuleList() 29 | for edges in gene: 30 | row = nn.ModuleList() 31 | for op_name, s_idx in edges: 32 | # reduction cell & from input nodes => stride = 2 33 | stride = 2 if reduction and s_idx < 2 else 1 34 | op = ops.OPS[op_name](C_in, stride, True) 35 | if not isinstance(op, ops.Identity): # Identity does not use drop path 36 | op = nn.Sequential( 37 | op, 38 | ops.DropPath_() 39 | ) 40 | op.s_idx = s_idx 41 | row.append(op) 42 | dag.append(row) 43 | 44 | return dag 45 | 46 | 47 | def from_str(s): 48 | """ generate genotype from string 49 | e.g. 
"Genotype( 50 | normal=[[('sep_conv_3x3', 0), ('sep_conv_3x3', 1)], 51 | [('sep_conv_3x3', 1), ('dil_conv_3x3', 2)], 52 | [('sep_conv_3x3', 1), ('sep_conv_3x3', 2)], 53 | [('sep_conv_3x3', 1), ('dil_conv_3x3', 4)]], 54 | normal_concat=range(2, 6), 55 | reduce=[[('max_pool_3x3', 0), ('max_pool_3x3', 1)], 56 | [('max_pool_3x3', 0), ('skip_connect', 2)], 57 | [('max_pool_3x3', 0), ('skip_connect', 2)], 58 | [('max_pool_3x3', 0), ('skip_connect', 2)]], 59 | reduce_concat=range(2, 6))" 60 | """ 61 | 62 | genotype = eval(s) 63 | 64 | return genotype 65 | 66 | 67 | def parse(alpha, k): 68 | """ 69 | parse continuous alpha to discrete gene. 70 | alpha is ParameterList: 71 | ParameterList [ 72 | Parameter(n_edges1, n_ops), 73 | Parameter(n_edges2, n_ops), 74 | ... 75 | ] 76 | 77 | gene is list: 78 | [ 79 | [('node1_ops_1', node_idx), ..., ('node1_ops_k', node_idx)], 80 | [('node2_ops_1', node_idx), ..., ('node2_ops_k', node_idx)], 81 | ... 82 | ] 83 | each node has two edges (k=2) in CNN. 84 | """ 85 | 86 | gene = [] 87 | #assert PRIMITIVES[-1] == 'none' # assume last PRIMITIVE is 'none' 88 | 89 | # 1) Convert the mixed op to discrete edge (single op) by choosing top-1 weight edge 90 | # 2) Choose top-k edges per node by edge score (top-1 weight in edge) 91 | for edges in alpha: 92 | # edges: Tensor(n_edges, n_ops) 93 | edge_max, primitive_indices = torch.topk(edges[:, :-1], 1) # ignore 'none' 94 | topk_edge_values, topk_edge_indices = torch.topk(edge_max.view(-1), k) 95 | node_gene = [] 96 | for edge_idx in topk_edge_indices: 97 | prim_idx = primitive_indices[edge_idx] 98 | prim = PRIMITIVES[prim_idx] 99 | node_gene.append((prim, edge_idx.item())) 100 | 101 | gene.append(node_gene) 102 | 103 | return gene 104 | -------------------------------------------------------------------------------- /DARTS/models/augment_cells.py: -------------------------------------------------------------------------------- 1 | """ CNN cell for network augmentation """ 2 | import torch 3 | 
import torch.nn as nn 4 | from models import ops 5 | import genotypes as gt 6 | 7 | 8 | class AugmentCell(nn.Module): 9 | """ Cell for augmentation 10 | Each edge is discrete. 11 | """ 12 | def __init__(self, genotype, C_pp, C_p, C, reduction_p, reduction): 13 | super().__init__() 14 | self.reduction = reduction 15 | self.n_nodes = len(genotype.normal) 16 | 17 | if reduction_p: 18 | self.preproc0 = ops.FactorizedReduce(C_pp, C) 19 | else: 20 | self.preproc0 = ops.StdConv(C_pp, C, 1, 1, 0) 21 | self.preproc1 = ops.StdConv(C_p, C, 1, 1, 0) 22 | 23 | # generate dag 24 | if reduction: 25 | gene = genotype.reduce 26 | self.concat = genotype.reduce_concat 27 | else: 28 | gene = genotype.normal 29 | self.concat = genotype.normal_concat 30 | 31 | self.dag = gt.to_dag(C, gene, reduction) 32 | 33 | def forward(self, s0, s1): 34 | s0 = self.preproc0(s0) 35 | s1 = self.preproc1(s1) 36 | 37 | states = [s0, s1] 38 | for edges in self.dag: 39 | s_cur = sum(op(states[op.s_idx]) for op in edges) 40 | states.append(s_cur) 41 | 42 | s_out = torch.cat([states[i] for i in self.concat], dim=1) 43 | 44 | return s_out 45 | -------------------------------------------------------------------------------- /DARTS/models/search_cells.py: -------------------------------------------------------------------------------- 1 | """ CNN cell for architecture search """ 2 | import torch 3 | import torch.nn as nn 4 | from models import ops 5 | 6 | 7 | class SearchCell(nn.Module): 8 | """ Cell for search 9 | Each edge is mixed and continuous relaxed. 
10 | """ 11 | def __init__(self, n_nodes, C_pp, C_p, C, reduction_p, reduction): 12 | """ 13 | Args: 14 | n_nodes: # of intermediate n_nodes 15 | C_pp: C_out[k-2] 16 | C_p : C_out[k-1] 17 | C : C_in[k] (current) 18 | reduction_p: flag for whether the previous cell is reduction cell or not 19 | reduction: flag for whether the current cell is reduction cell or not 20 | """ 21 | super().__init__() 22 | self.reduction = reduction 23 | self.n_nodes = n_nodes 24 | 25 | # If previous cell is reduction cell, current input size does not match with 26 | # output size of cell[k-2]. So the output[k-2] should be reduced by preprocessing. 27 | if reduction_p: 28 | self.preproc0 = ops.FactorizedReduce(C_pp, C, affine=False) 29 | else: 30 | self.preproc0 = ops.StdConv(C_pp, C, 1, 1, 0, affine=False) 31 | self.preproc1 = ops.StdConv(C_p, C, 1, 1, 0, affine=False) 32 | 33 | # generate dag 34 | self.dag = nn.ModuleList() 35 | for i in range(self.n_nodes): 36 | self.dag.append(nn.ModuleList()) 37 | for j in range(2+i): # include 2 input nodes 38 | # reduction should be used only for input node 39 | stride = 2 if reduction and j < 2 else 1 40 | op = ops.MixedOp(C, stride) 41 | self.dag[i].append(op) 42 | 43 | def forward(self, s0, s1, w_dag): 44 | s0 = self.preproc0(s0) 45 | s1 = self.preproc1(s1) 46 | 47 | states = [s0, s1] 48 | for edges, w_list in zip(self.dag, w_dag): 49 | s_cur = sum(edges[i](s, w) for i, (s, w) in enumerate(zip(states, w_list))) 50 | states.append(s_cur) 51 | 52 | s_out = torch.cat(states[2:], dim=1) 53 | return s_out 54 | -------------------------------------------------------------------------------- /DARTS/preproc.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import numpy as np 4 | import torchvision.transforms as transforms 5 | 6 | import utils 7 | 8 | class Cutout(object): 9 | def __init__(self, length): 10 | self.length = length 11 | 12 | def __call__(self, img): 13 | h, w 
= img.size(1), img.size(2) 14 | mask = np.ones((h, w), np.float32) 15 | y = np.random.randint(h) 16 | x = np.random.randint(w) 17 | 18 | y1 = np.clip(y - self.length // 2, 0, h) 19 | y2 = np.clip(y + self.length // 2, 0, h) 20 | x1 = np.clip(x - self.length // 2, 0, w) 21 | x2 = np.clip(x + self.length // 2, 0, w) 22 | 23 | mask[y1: y2, x1: x2] = 0. 24 | mask = torch.from_numpy(mask) 25 | mask = mask.expand_as(img) 26 | img *= mask 27 | 28 | return img 29 | 30 | 31 | def data_transforms(dataset, cutout_length): 32 | dataset = dataset.lower() 33 | if dataset == 'cifar10' or dataset == 'cifar100': 34 | MEAN = [0.49139968, 0.48215827, 0.44653124] 35 | STD = [0.24703233, 0.24348505, 0.26158768] 36 | transf_train = [ 37 | transforms.RandomCrop(32, padding=4), 38 | transforms.RandomHorizontalFlip() 39 | ] 40 | transf_val = [] 41 | elif dataset == 'mnist': 42 | MEAN = [0.13066051707548254] 43 | STD = [0.30810780244715075] 44 | transf_train = [ 45 | transforms.RandomAffine(degrees=15, translate=(0.1, 0.1), scale=(0.9, 1.1), shear=0.1) 46 | ] 47 | transf_val=[] 48 | elif dataset == 'fashionmnist': 49 | MEAN = [0.28604063146254594] 50 | STD = [0.35302426207299326] 51 | transf_train = [ 52 | transforms.RandomAffine(degrees=15, translate=(0.1, 0.1), scale=(0.9, 1.1), shear=0.1), 53 | transforms.RandomVerticalFlip() 54 | ] 55 | transf_val = [] 56 | #Same preprocessing for ImageNet, Sport8 and MIT67 57 | elif dataset in utils.LARGE_DATASETS: 58 | MEAN = [0.485, 0.456, 0.406] 59 | STD = [0.229, 0.224, 0.225] 60 | transf_train = [ 61 | transforms.RandomResizedCrop(224), 62 | transforms.RandomHorizontalFlip(), 63 | transforms.ColorJitter( 64 | brightness=0.4, 65 | contrast=0.4, 66 | saturation=0.4, 67 | hue=0.2) 68 | ] 69 | transf_val = [ 70 | transforms.Resize(256), 71 | transforms.CenterCrop(224), 72 | ] 73 | else: 74 | raise ValueError('not expected dataset = {}'.format(dataset)) 75 | 76 | normalize = [ 77 | transforms.ToTensor(), 78 | transforms.Normalize(MEAN, STD) 79 | ] 80 | 
81 | train_transform = transforms.Compose(transf_train + normalize) 82 | valid_transform = transforms.Compose(transf_val + normalize) # FIXME validation is not set to square proportions, is this an issue? 83 | 84 | if cutout_length > 0: 85 | train_transform.transforms.append(Cutout(cutout_length)) 86 | 87 | return train_transform, valid_transform 88 | -------------------------------------------------------------------------------- /DARTS/utils.py: -------------------------------------------------------------------------------- 1 | """ Utilities """ 2 | import os 3 | import logging 4 | import shutil 5 | import torch 6 | import torchvision.datasets as dset 7 | import numpy as np 8 | import preproc 9 | 10 | LARGE_DATASETS = ["imagenet", "mit67", "sport8", "flowers102"] 11 | 12 | def get_data(dataset, data_root, cutout_length, validation): 13 | """ Get torchvision dataset """ 14 | dataset = dataset.lower() 15 | 16 | data_path = data_root 17 | if dataset == 'cifar10': 18 | dset_cls = dset.CIFAR10 19 | n_classes = 10 20 | elif dataset == 'cifar100': 21 | dset_cls = dset.CIFAR100 22 | n_classes = 100 23 | elif dataset == 'mnist': 24 | dset_cls = dset.MNIST 25 | n_classes = 10 26 | elif dataset == 'fashionmnist': 27 | dset_cls = dset.FashionMNIST 28 | n_classes = 10 29 | #New Datasets 30 | elif dataset == 'mit67': 31 | dset_cls = dset.ImageFolder 32 | n_classes = 67 33 | data_path = '%s/MIT67/train' % data_root # 'data/MIT67/train' 34 | val_path = '%s/MIT67/test' % data_root # 'data/MIT67/val' 35 | elif dataset == 'sport8': 36 | dset_cls = dset.ImageFolder 37 | n_classes = 8 38 | data_path = '%s/Sport8/train' % data_root # 'data/Sport8/train' 39 | val_path = '%s/Sport8/test' % data_root # 'data/Sport8/val' 40 | elif dataset == 'flowers102': 41 | dset_cls = dset.ImageFolder 42 | n_classes = 102 43 | data_path = '%s/flowers102/train' % data_root # 'data/flowers102/train' 44 | val_path = '%s/flowers102/test' % data_root # 'data/flowers102/val' 45 | 46 | else: 47 | raise 
ValueError(dataset) 48 | 49 | trn_transform, val_transform = preproc.data_transforms(dataset, cutout_length) 50 | if dataset in LARGE_DATASETS: 51 | trn_data = dset_cls(root=data_path, transform=trn_transform) 52 | shape = trn_data[0][0].unsqueeze(0).shape 53 | print(shape) 54 | assert shape[2] == shape[3], "not expected shape = {}".format(shape) 55 | input_size = shape[2] 56 | else: 57 | trn_data = dset_cls(root=data_path, train=True, download=True, transform=trn_transform) 58 | # assuming shape is NHW or NHWC 59 | try: 60 | shape = trn_data.data.shape 61 | except AttributeError: 62 | shape = trn_data.train_data.shape 63 | assert shape[1] == shape[2], "not expected shape = {}".format(shape) 64 | input_size = shape[1] 65 | 66 | input_channels = 3 if len(shape) == 4 else 1 67 | # print("Number of input channels: ", input_channels) 68 | 69 | ret = [input_size, input_channels, n_classes, trn_data] 70 | if validation: # append validation data 71 | if dataset in LARGE_DATASETS: 72 | ret.append(dset_cls(root=val_path, transform=val_transform)) 73 | else: 74 | ret.append(dset_cls(root=data_path, train=False, download=True, transform=val_transform)) 75 | 76 | return ret 77 | 78 | 79 | def get_logger(file_path): 80 | """ Make python logger """ 81 | # [!] Since tensorboardX use default logger (e.g. 
logging.info()), we should use custom logger 82 | logger = logging.getLogger('darts') 83 | log_format = '%(asctime)s | %(message)s' 84 | formatter = logging.Formatter(log_format, datefmt='%m/%d %I:%M:%S %p') 85 | file_handler = logging.FileHandler(file_path) 86 | file_handler.setFormatter(formatter) 87 | stream_handler = logging.StreamHandler() 88 | stream_handler.setFormatter(formatter) 89 | 90 | logger.addHandler(file_handler) 91 | logger.addHandler(stream_handler) 92 | logger.setLevel(logging.INFO) 93 | 94 | return logger 95 | 96 | 97 | def param_size(model): 98 | """ Compute parameter size in MB """ 99 | n_params = sum( 100 | np.prod(v.size()) for k, v in model.named_parameters() if not k.startswith('aux_head')) 101 | return n_params / 1024. / 1024. 102 | 103 | 104 | class AverageMeter(): 105 | """ Computes and stores the average and current value """ 106 | def __init__(self): 107 | self.reset() 108 | 109 | def reset(self): 110 | """ Reset all statistics """ 111 | self.val = 0 112 | self.avg = 0 113 | self.sum = 0 114 | self.count = 0 115 | 116 | def update(self, val, n=1): 117 | """ Update statistics """ 118 | self.val = val 119 | self.sum += val * n 120 | self.count += n 121 | self.avg = self.sum / self.count 122 | 123 | 124 | def accuracy(output, target, topk=(1,)): 125 | """ Computes the precision@k for the specified values of k """ 126 | maxk = max(topk) 127 | batch_size = target.size(0) 128 | 129 | _, pred = output.topk(maxk, 1, True, True) 130 | pred = pred.t() 131 | # one-hot case 132 | if target.ndimension() > 1: 133 | target = target.max(1)[1] 134 | 135 | correct = pred.eq(target.view(1, -1).expand_as(pred)) 136 | 137 | res = [] 138 | for k in topk: 139 | correct_k = correct[:k].view(-1).float().sum(0) 140 | res.append(correct_k.mul_(1.0 / batch_size)) 141 | 142 | return res 143 | 144 | 145 | def save_checkpoint(state, ckpt_dir, is_best=False): 146 | filename = os.path.join(ckpt_dir, 'checkpoint.pth.tar') 147 | torch.save(state, filename) 148 | if 
is_best: 149 | best_filename = os.path.join(ckpt_dir, 'best.pth.tar') 150 | shutil.copyfile(filename, best_filename) 151 | -------------------------------------------------------------------------------- /DARTS/visualize.py: -------------------------------------------------------------------------------- 1 | """ Network architecture visualizer using graphviz """ 2 | import sys 3 | from graphviz import Digraph 4 | import genotypes as gt 5 | 6 | 7 | def plot(genotype, file_path, caption=None): 8 | """ make DAG plot and save to file_path as .png """ 9 | edge_attr = { 10 | 'fontsize': '20', 11 | 'fontname': 'times' 12 | } 13 | node_attr = { 14 | 'style': 'filled', 15 | 'shape': 'rect', 16 | 'align': 'center', 17 | 'fontsize': '20', 18 | 'height': '0.5', 19 | 'width': '0.5', 20 | 'penwidth': '2', 21 | 'fontname': 'times' 22 | } 23 | g = Digraph( 24 | format='png', 25 | edge_attr=edge_attr, 26 | node_attr=node_attr, 27 | engine='dot') 28 | g.body.extend(['rankdir=LR']) 29 | 30 | # input nodes 31 | g.node("c_{k-2}", fillcolor='darkseagreen2') 32 | g.node("c_{k-1}", fillcolor='darkseagreen2') 33 | 34 | # intermediate nodes 35 | n_nodes = len(genotype) 36 | for i in range(n_nodes): 37 | g.node(str(i), fillcolor='lightblue') 38 | 39 | for i, edges in enumerate(genotype): 40 | for op, j in edges: 41 | if j == 0: 42 | u = "c_{k-2}" 43 | elif j == 1: 44 | u = "c_{k-1}" 45 | else: 46 | u = str(j-2) 47 | 48 | v = str(i) 49 | g.edge(u, v, label=op, fillcolor="gray") 50 | 51 | # output node 52 | g.node("c_{k}", fillcolor='palegoldenrod') 53 | for i in range(n_nodes): 54 | g.edge(str(i), "c_{k}", fillcolor="gray") 55 | 56 | # add image caption 57 | if caption: 58 | g.attr(label=caption, overlap='false', fontsize='20', fontname='times') 59 | 60 | g.render(file_path, view=False) 61 | 62 | 63 | if __name__ == '__main__': 64 | if len(sys.argv) != 2: 65 | raise ValueError("usage:\n python {} GENOTYPE".format(sys.argv[0])) 66 | 67 | genotype_str = sys.argv[1] 68 | try: 69 | genotype = 
gt.from_str(genotype_str) 70 | except AttributeError: 71 | raise ValueError("Cannot parse {}".format(genotype_str)) 72 | 73 | plot(genotype.normal, "normal") 74 | plot(genotype.reduce, "reduction") 75 | -------------------------------------------------------------------------------- /ENASPytorch/README.md: -------------------------------------------------------------------------------- 1 | # ENAS PyTorch 2 | 3 | Used for experiments on Sport8, MIT67 and flowers102 4 | 5 | ## Generate a random architecture 6 | 7 | ``` 8 | import numpy.random as rd 9 | B = 5 10 | ops = rd.randint(0, 5, 4*B) #5 ops, B nodes 11 | links = rd.choice([0, 1], size=4*B, replace=True) 12 | arch_normal = [links[0], ops[0], links[1], ops[1], links[0], ops[2], links[1], ops[3], links[0], ops[4], links[1], ops[5], 13 | links[0], ops[6], links[1], ops[7], links[0], ops[16], links[1], ops[17]] 14 | arch_reduce = [links[0], ops[8], links[1], ops[9], links[0], ops[10], links[1], ops[11], links[0], ops[12], links[1], ops[13], 15 | links[0], ops[14], links[1], ops[15], links[0], ops[18], links[1], ops[19]] 16 | op = {0:"sep_conv_3x3", 1:"sep_conv_5x5", 2:"avg_pool_3x3", 3:"max_pool_3x3", 4:"skip_connect"} 17 | 18 | # Convert from ENAS encoding to DARTS genotype 19 | 20 | genotype = {} 21 | genotype["normal"]=[] 22 | genotype["reduce"]=[] 23 | for i in range(B): 24 | cn=[(op[arch_normal[4*i+1]], arch_normal[4*i]),(op[arch_normal[4*i+3]], arch_normal[4*i+2])] 25 | cnr=[(op[arch_reduce[4*i+1]], arch_reduce[4*i]),(op[arch_reduce[4*i+3]], arch_reduce[4*i+2])] 26 | genotype["normal"].append(cn) 27 | genotype["reduce"].append(cnr) 28 | genotype["normal_concat"]=range(2,2+B) 29 | genotype["reduce_concat"]=range(2,2+B) 30 | ``` 31 | 32 | ## Search 33 | 34 | ``` 35 | python train_search.py 36 | --dataset Sport8 # choose between Sport8, MIT67 and flowers102 37 | --data /data # path to data 38 | --batch_size 16 39 | --save test 40 | ``` 41 | 42 | ## Augment 43 | 44 | Same as DARTS 45 | 
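The ENAS-to-DARTS conversion in the "Generate a random architecture" snippet of this README can also be written as a small standalone function. The sketch below is only an illustration under stated assumptions: the helper name `enas_to_gene` and the constant `OPS` are hypothetical and not part of this repository; the operation table and the layout of four integers per node (`input, op, input, op`) are taken from the snippet above.

```python
# Operation lookup copied from the README snippet (index -> DARTS op name).
OPS = {
    0: "sep_conv_3x3",
    1: "sep_conv_5x5",
    2: "avg_pool_3x3",
    3: "max_pool_3x3",
    4: "skip_connect",
}


def enas_to_gene(arch, n_nodes):
    """Convert a flat ENAS cell encoding into a DARTS-style gene.

    `enas_to_gene` is a hypothetical helper. `arch` is expected to hold
    4 integers per node: [input_a, op_a, input_b, op_b].
    """
    if len(arch) != 4 * n_nodes:
        raise ValueError("expected {} entries, got {}".format(4 * n_nodes, len(arch)))
    gene = []
    for i in range(n_nodes):
        in_a, op_a, in_b, op_b = (int(v) for v in arch[4 * i:4 * i + 4])
        # DARTS stores each edge as (op_name, source_node_index).
        gene.append([(OPS[op_a], in_a), (OPS[op_b], in_b)])
    return gene


# Two-node toy cell: node 0 reads inputs 0 and 1, node 1 reads inputs 1 and 0.
gene = enas_to_gene([0, 0, 1, 4, 1, 2, 0, 3], n_nodes=2)
print(gene)
```

The returned list has the same shape as the `normal` and `reduce` entries built in the README loop, so it could be combined with `range(2, 2 + B)` for the concat fields in the same way.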
-------------------------------------------------------------------------------- /ENASPytorch/data/data.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from torch.utils.data import DataLoader, SubsetRandomSampler 3 | from torchvision.datasets import CIFAR10 4 | from torchvision import transforms 5 | import torchvision.datasets as dset 6 | import random 7 | 8 | def get_loaders(args): 9 | if args.dataset == "cifar10": 10 | MEAN = [0.4914, 0.4822, 0.4465] 11 | STD = [0.2023, 0.1994, 0.2010] 12 | train_transform = transforms.Compose([ 13 | transforms.RandomCrop(32, padding=4), 14 | transforms.RandomHorizontalFlip(), 15 | transforms.ToTensor(), 16 | transforms.Normalize( 17 | mean=MEAN, 18 | std=STD, 19 | ), 20 | ]) 21 | train_dataset = CIFAR10( 22 | root=args.data, 23 | train=True, 24 | download=True, 25 | transform=train_transform, 26 | ) 27 | 28 | indices = list(range(len(train_dataset))) 29 | 30 | train_loader = DataLoader( 31 | train_dataset, 32 | batch_size=args.batch_size, 33 | sampler=SubsetRandomSampler(indices[:-5000]), 34 | pin_memory=True, 35 | num_workers=2, 36 | ) 37 | 38 | reward_loader = DataLoader( 39 | train_dataset, 40 | batch_size=args.batch_size, 41 | sampler=SubsetRandomSampler(indices[-5000:]), 42 | pin_memory=True, 43 | num_workers=2, 44 | ) 45 | 46 | valid_transform = transforms.Compose([ 47 | transforms.ToTensor(), 48 | transforms.Normalize( 49 | mean=MEAN, 50 | std=STD, 51 | ), 52 | ]) 53 | valid_dataset = CIFAR10( 54 | root=args.data, 55 | train=False, 56 | download=False, 57 | transform=valid_transform, 58 | ) 59 | 60 | valid_loader = DataLoader( 61 | valid_dataset, 62 | batch_size=args.batch_size, 63 | shuffle=False, 64 | pin_memory=True, 65 | num_workers=2, 66 | ) 67 | # repeat_train_loader = RepeatedDataLoader(train_loader) 68 | repeat_reward_loader = RepeatedDataLoader(reward_loader) 69 | repeat_valid_loader = RepeatedDataLoader(valid_loader) 70 | 71 | elif args.dataset == 
"Sport8" or args.dataset == "MIT67" or args.dataset == "flowers102": 72 | MEAN = [0.485, 0.456, 0.406] 73 | STD = [0.229, 0.224, 0.225] 74 | transf_train = [ 75 | transforms.RandomResizedCrop(224), 76 | transforms.RandomHorizontalFlip(), 77 | transforms.ColorJitter( 78 | brightness=0.4, 79 | contrast=0.4, 80 | saturation=0.4, 81 | hue=0.2) 82 | ] 83 | transf_val = [ 84 | transforms.Resize(256), 85 | transforms.CenterCrop(224), 86 | ] 87 | normalize = [ 88 | transforms.ToTensor(), 89 | transforms.Normalize(MEAN, STD) 90 | ] 91 | train_transform = transforms.Compose(transf_train + normalize) 92 | valid_transform = transforms.Compose(transf_val + normalize) 93 | train_dataset = dset.ImageFolder(root=args.data + "/" + args.dataset + "/train", transform=train_transform) 94 | valid_dataset = dset.ImageFolder(root=args.data + "/" + args.dataset + "/test", transform=valid_transform) 95 | 96 | n_train = len(train_dataset) 97 | split = n_train // 2 98 | indices = list(range(n_train)) 99 | random.shuffle(indices) 100 | 101 | train_sampler = torch.utils.data.sampler.SubsetRandomSampler(indices[:split]) 102 | valid_sampler = torch.utils.data.sampler.SubsetRandomSampler(indices[split:]) 103 | 104 | train_loader = DataLoader( 105 | train_dataset, 106 | batch_size=args.batch_size, 107 | sampler=train_sampler, 108 | pin_memory=True, 109 | num_workers=2, 110 | ) 111 | 112 | reward_loader = DataLoader( 113 | train_dataset, 114 | batch_size=args.batch_size, 115 | sampler=train_sampler, 116 | pin_memory=True, 117 | num_workers=2, 118 | ) 119 | 120 | valid_loader = DataLoader( 121 | train_dataset, 122 | batch_size=args.batch_size, 123 | sampler = valid_sampler, 124 | pin_memory=True, 125 | num_workers=2, 126 | ) 127 | # repeat_train_loader = RepeatedDataLoader(train_loader) 128 | repeat_reward_loader = RepeatedDataLoader(reward_loader) 129 | repeat_valid_loader = RepeatedDataLoader(valid_loader) 130 | 131 | 132 | return train_loader, repeat_reward_loader, repeat_valid_loader 133 | 134 | 
135 | class RepeatedDataLoader(): 136 | def __init__(self, data_loader): 137 | self.data_loader = data_loader 138 | self.data_iter = self.data_loader.__iter__() 139 | 140 | def __len__(self): 141 | return len(self.data_loader) 142 | 143 | def next_batch(self): 144 | try: 145 | batch = self.data_iter.__next__() 146 | except StopIteration: 147 | self.data_iter = self.data_loader.__iter__() 148 | batch = self.data_iter.__next__() 149 | return batch 150 | 151 | -------------------------------------------------------------------------------- /ENASPytorch/utils.py: -------------------------------------------------------------------------------- 1 | import os 2 | import math 3 | import numpy as np 4 | import torch 5 | import shutil 6 | import torchvision.transforms as transforms 7 | from torch.autograd import Variable 8 | 9 | 10 | class AvgrageMeter(object): 11 | 12 | def __init__(self): 13 | self.reset() 14 | 15 | def reset(self): 16 | self.avg = 0 17 | self.sum = 0 18 | self.cnt = 0 19 | 20 | def update(self, val, n=1): 21 | self.sum += val * n 22 | self.cnt += n 23 | self.avg = self.sum / self.cnt 24 | 25 | 26 | class LRScheduler: 27 | def __init__(self, optimizer, args): 28 | self.last_lr_reset = 0 29 | self.lr_T_0 = args.child_lr_T_0 30 | self.child_lr_T_mul = args.child_lr_T_mul 31 | self.child_lr_min = args.child_lr_min 32 | self.child_lr_max = args.child_lr_max 33 | self.optimizer = optimizer 34 | 35 | def update(self, epoch): 36 | T_curr = epoch - self.last_lr_reset 37 | if T_curr == self.lr_T_0: 38 | self.last_lr_reset = epoch 39 | self.lr_T_0 = self.lr_T_0 * self.child_lr_T_mul 40 | rate = T_curr / self.lr_T_0 * math.pi 41 | lr = self.child_lr_min + 0.5 * (self.child_lr_max - self.child_lr_min) * (1.0 + math.cos(rate)) 42 | for param_group in self.optimizer.param_groups: 43 | param_group['lr'] = lr 44 | return lr 45 | 46 | 47 | def accuracy(output, target, topk=(1,)): 48 | maxk = max(topk) 49 | batch_size = target.size(0) 50 | 51 | _, pred = output.topk(maxk, 
1, True, True) 52 | pred = pred.t() 53 | correct = pred.eq(target.view(1, -1).expand_as(pred)) 54 | 55 | res = [] 56 | for k in topk: 57 | correct_k = correct[:k].reshape(-1).float().sum(0) 58 | res.append(correct_k.mul_(100.0/batch_size)) 59 | return res 60 | 61 | 62 | class Cutout(object): 63 | def __init__(self, length): 64 | self.length = length 65 | 66 | def __call__(self, img): 67 | h, w = img.size(1), img.size(2) 68 | mask = np.ones((h, w), np.float32) 69 | y = np.random.randint(h) 70 | x = np.random.randint(w) 71 | 72 | y1 = np.clip(y - self.length // 2, 0, h) 73 | y2 = np.clip(y + self.length // 2, 0, h) 74 | x1 = np.clip(x - self.length // 2, 0, w) 75 | x2 = np.clip(x + self.length // 2, 0, w) 76 | 77 | mask[y1: y2, x1: x2] = 0. 78 | mask = torch.from_numpy(mask) 79 | mask = mask.expand_as(img) 80 | img *= mask 81 | return img 82 | 83 | def save_checkpoint(state, is_best, save): 84 | filename = os.path.join(save, 'checkpoint.pth.tar') 85 | torch.save(state, filename) 86 | if is_best: 87 | best_filename = os.path.join(save, 'model_best.pth.tar') 88 | shutil.copyfile(filename, best_filename) 89 | 90 | 91 | def save(model, model_path): 92 | torch.save(model.state_dict(), model_path) 93 | 94 | 95 | def load(model, model_path): 96 | model.load_state_dict(torch.load(model_path)) 97 | 98 | 99 | def drop_path(x, drop_prob): 100 | if drop_prob > 0.: 101 | keep_prob = 1.-drop_prob 102 | mask = Variable(torch.cuda.FloatTensor(x.size(0), 1, 1, 1).bernoulli_(keep_prob)) 103 | x.div_(keep_prob) 104 | x.mul_(mask) 105 | return x 106 | 107 | 108 | def create_exp_dir(path, scripts_to_save=None): 109 | if not os.path.exists(path): 110 | os.mkdir(path) 111 | print('Experiment dir : {}'.format(path)) 112 | 113 | if scripts_to_save is not None: 114 | os.makedirs(os.path.join(path, 'scripts'), exist_ok=True) 115 | for script in scripts_to_save: 116 | dst_file = os.path.join(path, 'scripts', os.path.basename(script)) 117 | shutil.copyfile(script, dst_file) 118 | 119 | 
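The `LRScheduler` class in this file implements a cosine schedule with restarts, which is easier to inspect without an optimizer attached. The following is a minimal sketch that mirrors the arithmetic of `LRScheduler.update`; the class name `CosineRestartSchedule` and the method `lr_at` are hypothetical, and the constructor values are borrowed from the `--child_lr_*` flags shown in the ENASTF README rather than from any default in this file.

```python
import math


class CosineRestartSchedule:
    """Optimizer-free mirror of LRScheduler.update (hypothetical name)."""

    def __init__(self, lr_min, lr_max, t_0, t_mul):
        self.lr_min = lr_min
        self.lr_max = lr_max
        self.t_0 = t_0        # length of the current cosine cycle, in epochs
        self.t_mul = t_mul    # factor by which each new cycle is longer
        self.last_reset = 0   # epoch at which the current cycle started

    def lr_at(self, epoch):
        """Return the learning rate for `epoch`; call with increasing epochs."""
        t_curr = epoch - self.last_reset
        if t_curr == self.t_0:
            # Start a new, longer cycle. As in LRScheduler.update, t_curr is
            # not reset at this point, so the restart epoch itself uses
            # rate = pi / t_mul rather than rate = 0.
            self.last_reset = epoch
            self.t_0 *= self.t_mul
        rate = t_curr / self.t_0 * math.pi
        return self.lr_min + 0.5 * (self.lr_max - self.lr_min) * (1.0 + math.cos(rate))


schedule = CosineRestartSchedule(lr_min=0.0005, lr_max=0.05, t_0=10, t_mul=2)
lrs = [schedule.lr_at(epoch) for epoch in range(30)]
for epoch in (0, 5, 9, 10, 11, 29):
    print(epoch, round(lrs[epoch], 6))
```

With these assumed values the first cycle decays from `lr_max` over 10 epochs and the second cycle lasts 20 epochs. Because `t_curr` is computed before the reset, the learning rate at the restart epoch lies between the end of the old cycle and the start of the new one instead of jumping straight back to `lr_max`.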
-------------------------------------------------------------------------------- /ENASTF/README.md: -------------------------------------------------------------------------------- 1 | # Efficient Neural Architecture Search via Parameter Sharing 2 | 3 | Original code used for CIFAR-10 and CIFAR-100 experiments 4 | 5 | ## Generate a random architecture 6 | 7 | ``` 8 | import numpy.random as rd 9 | ops = rd.randint(0, 5, 20) 10 | links = rd.choice([0, 1], size=20, replace=True) 11 | arch_conv = "%i %i %i %i %i %i %i %i %i %i %i %i %i %i %i %i %i %i %i %i " % ( 12 | links[0], ops[0], links[1], ops[1], links[0], ops[2], links[1], ops[3], links[0], ops[4], links[1], ops[5], 13 | links[0], ops[6], links[1], ops[7], links[0], ops[16], links[1], ops[17]) 14 | arch_red = "%i %i %i %i %i %i %i %i %i %i %i %i %i %i %i %i %i %i %i %i" % ( 15 | links[0], ops[8], links[1], ops[9], links[0], ops[10], links[1], ops[11], links[0], ops[12], links[1], ops[13], 16 | links[0], ops[14], links[1], ops[15], links[0], ops[18], links[1], ops[19]) 17 | arch = arch_conv + arch_red 18 | ``` 19 | 20 | ## Search 21 | 22 | ``` 23 | python src/cifar10/main.py 24 | --data_format="NCHW" 25 | --search_for="micro" 26 | --reset_output_dir 27 | --data_path="/data/CIFAR10" #path to data 28 | --dataset="CIFAR10" # choose between CIFAR10 and CIFAR100 29 | --output_dir test 30 | --batch_size=160 31 | --num_epochs=150 32 | --log_every=50 33 | --eval_every_epochs=1 34 | --child_use_aux_heads 35 | --child_num_layers=6 36 | --child_out_filters=20 37 | --child_l2_reg=1e-4 38 | --child_num_branches=5 39 | --child_num_cells=5 40 | --child_keep_prob=0.90 41 | --child_drop_path_keep_prob=0.60 42 | --child_lr_cosine 43 | --child_lr_max=0.05 44 | --child_lr_min=0.0005 45 | --child_lr_T_0=10 46 | --child_lr_T_mul=2 47 | --controller_training 48 | --controller_search_whole_channels 49 | --controller_entropy_weight=0.0001 50 | --controller_train_every=1 51 | --controller_sync_replicas 52 | --controller_num_aggregate=10 
53 | --controller_train_steps=30 54 | --controller_lr=0.0035 55 | --controller_tanh_constant=1.10 56 | --controller_op_tanh_reduce=2.5 57 | ``` 58 | 59 | ## Augment 60 | 61 | ``` 62 | python src/cifar10/main.py 63 | --data_format="NCHW" 64 | --search_for="micro" 65 | --reset_output_dir 66 | --data_path="/data/CIFAR10" #path to data 67 | --output_dir test 68 | --batch_size=144 69 | --num_epochs=630 70 | --log_every=50 71 | --eval_every_epochs=1 72 | --child_fixed_arc=arch 73 | --child_use_aux_heads 74 | --child_num_layers=15 75 | --child_out_filters=36 76 | --child_num_branches=5 77 | --child_num_cells=5 78 | --child_keep_prob=0.80 79 | --child_drop_path_keep_prob=0.60 80 | --child_l2_reg=2e-4 81 | --child_lr_cosine 82 | --child_lr_max=0.05 83 | --child_lr_min=0.0001 84 | --child_lr_T_0=10 85 | --child_lr_T_mul=2 86 | --nocontroller_training 87 | --controller_search_whole_channels 88 | --controller_entropy_weight=0.0001 89 | --controller_train_every=1 90 | --controller_sync_replicas 91 | --controller_num_aggregate=10 92 | --controller_train_steps=50 93 | --controller_lr=0.001 94 | --controller_tanh_constant=1.50 95 | --controller_op_tanh_reduce=2.5 96 | --dataset="CIFAR10" # choose between CIFAR10 and CIFAR100 97 | ``` 98 | -------------------------------------------------------------------------------- /ENASTF/src/cifar10/data_utils.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | import cPickle as pickle 4 | import numpy as np 5 | import tensorflow as tf 6 | import matplotlib.pyplot as plt 7 | 8 | def _read_data(data_path, dataset, train_files): 9 | """Reads CIFAR-10 format data. Always returns NHWC format. 
10 | 11 | Returns: 12 | images: np tensor of size [N, H, W, C] 13 | labels: np tensor of size [N] 14 | """ 15 | images, labels = [], [] 16 | for file_name in train_files: 17 | print file_name 18 | full_name = os.path.join(data_path, file_name) 19 | with open(full_name) as finp: 20 | data = pickle.load(finp) 21 | batch_images = data["data"].astype(np.float32) / 255.0 22 | if dataset == "CIFAR100": 23 | batch_labels = np.array(data["fine_labels"], dtype=np.int32) 24 | else: 25 | batch_labels = np.array(data["labels"], dtype=np.int32) 26 | images.append(batch_images) 27 | labels.append(batch_labels) 28 | images = np.concatenate(images, axis=0) 29 | labels = np.concatenate(labels, axis=0) 30 | images = np.reshape(images, [-1, 3, 32, 32]) 31 | images = np.transpose(images, [0, 2, 3, 1]) 32 | 33 | return images, labels 34 | 35 | def read_data(data_path, dataset="CIFAR10", num_valids=5000): 36 | print "-" * 80 37 | print "Reading data" 38 | 39 | images, labels = {}, {} 40 | 41 | if dataset == "CIFAR100": 42 | train_files = [ 43 | "train" 44 | ] 45 | test_file = [ 46 | "test" 47 | ] 48 | else: 49 | train_files = [ 50 | "data_batch_1", 51 | "data_batch_2", 52 | "data_batch_3", 53 | "data_batch_4", 54 | "data_batch_5", 55 | ] 56 | test_file = [ 57 | "test_batch", 58 | ] 59 | images["train"], labels["train"] = _read_data(data_path, dataset, train_files) 60 | 61 | if num_valids: 62 | images["valid"] = images["train"][-num_valids:] 63 | labels["valid"] = labels["train"][-num_valids:] 64 | 65 | images["train"] = images["train"][:-num_valids] 66 | labels["train"] = labels["train"][:-num_valids] 67 | else: 68 | images["valid"], labels["valid"] = None, None 69 | 70 | images["test"], labels["test"] = _read_data(data_path, dataset, test_file) 71 | 72 | print "Preprocess: [subtract mean], [divide std]" 73 | mean = np.mean(images["train"], axis=(0, 1, 2), keepdims=True) 74 | std = np.std(images["train"], axis=(0, 1, 2), keepdims=True) 75 | 76 | print "mean: {}".format(np.reshape(mean
* 255.0, [-1])) 77 | print "std: {}".format(np.reshape(std * 255.0, [-1])) 78 | 79 | images["train"] = (images["train"] - mean) / std 80 | if num_valids: 81 | images["valid"] = (images["valid"] - mean) / std 82 | images["test"] = (images["test"] - mean) / std 83 | 84 | return images, labels 85 | 86 | -------------------------------------------------------------------------------- /ENASTF/src/common_ops.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import tensorflow as tf 3 | 4 | 5 | def lstm(x, prev_c, prev_h, w): 6 | ifog = tf.matmul(tf.concat([x, prev_h], axis=1), w) 7 | i, f, o, g = tf.split(ifog, 4, axis=1) 8 | i = tf.sigmoid(i) 9 | f = tf.sigmoid(f) 10 | o = tf.sigmoid(o) 11 | g = tf.tanh(g) 12 | next_c = i * g + f * prev_c 13 | next_h = o * tf.tanh(next_c) 14 | return next_c, next_h 15 | 16 | 17 | def stack_lstm(x, prev_c, prev_h, w): 18 | next_c, next_h = [], [] 19 | for layer_id, (_c, _h, _w) in enumerate(zip(prev_c, prev_h, w)): 20 | inputs = x if layer_id == 0 else next_h[-1] 21 | curr_c, curr_h = lstm(inputs, _c, _h, _w) 22 | next_c.append(curr_c) 23 | next_h.append(curr_h) 24 | return next_c, next_h 25 | 26 | 27 | def create_weight(name, shape, initializer=None, trainable=True, seed=None): 28 | if initializer is None: 29 | initializer = tf.contrib.keras.initializers.he_normal(seed=seed) 30 | return tf.get_variable(name, shape, initializer=initializer, trainable=trainable) 31 | 32 | 33 | def create_bias(name, shape, initializer=None): 34 | if initializer is None: 35 | initializer = tf.constant_initializer(0.0, dtype=tf.float32) 36 | return tf.get_variable(name, shape, initializer=initializer) 37 | 38 | -------------------------------------------------------------------------------- /ENASTF/src/controller.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | 3 | class Controller(object): 4 | def __init__(self, *args, **kwargs): 
5 | raise NotImplementedError("Abstract method.") 6 | 7 | def _build_sample(self): 8 | raise NotImplementedError("Abstract method.") 9 | 10 | def _build_greedy(self): 11 | raise NotImplementedError("Abstract method.") 12 | 13 | def _build_trainer(self): 14 | raise NotImplementedError("Abstract method.") 15 | -------------------------------------------------------------------------------- /NAO/README.md: -------------------------------------------------------------------------------- 1 | # Neural Architecture Optimization 2 | 3 | ## Generate random architectures 4 | 5 | ``` 6 | from utils import generate_arch 7 | arch = generate_arch(1, 5, num_ops=11)[0] 8 | ``` 9 | 10 | ## Search 11 | 12 | ``` 13 | python train_search.py 14 | --dataset CIFAR10 # choose between CIFAR10, CIFAR100, Sport8, MIT67 and flowers102 15 | --child_batch_size 64 16 | --child_eval_batch_size 500 # 500 for CIFAR10 and CIFAR100, 128 for Sport8, MIT67 and flowers102 17 | --child_layers 3 # 3 for CIFAR10 and CIFAR100, 2 for Sport8, MIT67 and flowers102 18 | --child_epochs 200 19 | --child_eval_epochs 50 20 | --controller_expand 8 21 | --child_keep_prob 1.0 22 | --child_drop_path_keep_prob 0.9 23 | --child_sample_policy "params" 24 | --data /data # path to data 25 | --output_dir test 26 | ``` 27 | 28 | ## Augment 29 | 30 | For CIFAR datasets: 31 | 32 | ``` 33 | python train_cifar.py 34 | --use_aux_head 35 | --keep_prob 0.6 36 | --drop_path_keep_prob 0.8 37 | --cutout_size 16 38 | --l2_reg 3e-4 39 | --arch arch 40 | --channels 36 41 | --batch_size 128 42 | --output_dir test 43 | --data /data # path to data 44 | --dataset CIFAR10 # choose between CIFAR10 and CIFAR100 45 | ``` 46 | 47 | For other datasets: 48 | 49 | ``` 50 | python train_imagenet.py 51 | --use_aux_head 52 | --keep_prob 0.6 53 | --drop_path_keep_prob 0.8 54 | --cutout_size 16 55 | --l2_reg 3e-4 56 | --arch arch 57 | --channels 36 58 | --batch_size 96 59 | --layers 2 60 | --epochs 600 61 | --output_dir test 62 | --data /data # path to
data 63 | --dataset sport8 # choose between sport8, mit67 and flowers102 64 | ``` 65 | 66 | 67 | -------------------------------------------------------------------------------- /NAO/controller.py: -------------------------------------------------------------------------------- 1 | import os 2 | import logging 3 | import numpy as np 4 | import torch 5 | import torch.nn as nn 6 | import torch.nn.functional as F 7 | from encoder import Encoder 8 | from decoder import Decoder 9 | 10 | 11 | SOS_ID = 0 12 | EOS_ID = 0 13 | INITRANGE=0.04 14 | 15 | class NAO(nn.Module): 16 | def __init__(self, 17 | encoder_layers, 18 | encoder_vocab_size, 19 | encoder_hidden_size, 20 | encoder_dropout, 21 | encoder_length, 22 | source_length, 23 | encoder_emb_size, 24 | mlp_layers, 25 | mlp_hidden_size, 26 | mlp_dropout, 27 | decoder_layers, 28 | decoder_vocab_size, 29 | decoder_hidden_size, 30 | decoder_dropout, 31 | decoder_length, 32 | ): 33 | super(NAO, self).__init__() 34 | self.encoder = Encoder( 35 | encoder_layers, 36 | encoder_vocab_size, 37 | encoder_hidden_size, 38 | encoder_dropout, 39 | encoder_length, 40 | source_length, 41 | encoder_emb_size, 42 | mlp_layers, 43 | mlp_hidden_size, 44 | mlp_dropout, 45 | ) 46 | self.decoder = Decoder( 47 | decoder_layers, 48 | decoder_vocab_size, 49 | decoder_hidden_size, 50 | decoder_dropout, 51 | decoder_length, 52 | encoder_length 53 | ) 54 | 55 | self.init_parameters() 56 | self.flatten_parameters() 57 | 58 | def init_parameters(self): 59 | for w in self.parameters(): 60 | if w.data.dim() >= 2: 61 | nn.init.uniform_(w.data, -INITRANGE, INITRANGE) 62 | 63 | def flatten_parameters(self): 64 | self.encoder.rnn.flatten_parameters() 65 | self.decoder.rnn.flatten_parameters() 66 | 67 | def forward(self, input_variable, target_variable=None): 68 | encoder_outputs, encoder_hidden, arch_emb, predict_value = self.encoder(input_variable) 69 | decoder_hidden = (arch_emb.unsqueeze(0), arch_emb.unsqueeze(0)) 70 | decoder_outputs, decoder_hidden, ret 
= self.decoder(target_variable, decoder_hidden, encoder_outputs) 71 | decoder_outputs = torch.stack(decoder_outputs, 0).permute(1, 0, 2) 72 | arch = torch.stack(ret['sequence'], 0).permute(1, 0, 2) 73 | return predict_value, decoder_outputs, arch 74 | 75 | def generate_new_arch(self, input_variable, predict_lambda=1, direction='-'): 76 | encoder_outputs, encoder_hidden, arch_emb, predict_value, new_encoder_outputs, new_arch_emb = self.encoder.infer( 77 | input_variable, predict_lambda, direction=direction) 78 | new_encoder_hidden = (new_arch_emb.unsqueeze(0), new_arch_emb.unsqueeze(0)) 79 | decoder_outputs, decoder_hidden, ret = self.decoder(None, new_encoder_hidden, new_encoder_outputs) 80 | new_arch = torch.stack(ret['sequence'], 0).permute(1, 0, 2) 81 | return new_arch 82 | -------------------------------------------------------------------------------- /NAO/encoder.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import 2 | from __future__ import division 3 | from __future__ import print_function 4 | 5 | import logging 6 | import torch 7 | import torch.nn as nn 8 | import torch.nn.functional as F 9 | 10 | INITRANGE = 0.04 11 | 12 | 13 | class Encoder(nn.Module): 14 | def __init__(self, 15 | layers, 16 | vocab_size, 17 | hidden_size, 18 | dropout, 19 | length, 20 | source_length, 21 | emb_size, 22 | mlp_layers, 23 | mlp_hidden_size, 24 | mlp_dropout, 25 | ): 26 | super(Encoder, self).__init__() 27 | self.layers = layers 28 | self.vocab_size = vocab_size 29 | self.emb_size = emb_size 30 | self.hidden_size = hidden_size 31 | self.length = length 32 | self.source_length = source_length 33 | self.mlp_layers = mlp_layers 34 | self.mlp_hidden_size = mlp_hidden_size 35 | 36 | self.embedding = nn.Embedding(self.vocab_size, self.emb_size) 37 | self.rnn = nn.LSTM(self.hidden_size, self.hidden_size, self.layers, batch_first=True, dropout=dropout) 38 | self.mlp = nn.Sequential() 39 | for i in 
range(self.mlp_layers): 40 | if i == 0: 41 | self.mlp.add_module('layer_{}'.format(i), nn.Sequential( 42 | nn.Linear(self.hidden_size, self.mlp_hidden_size), 43 | nn.ReLU(inplace=False), 44 | nn.Dropout(p=mlp_dropout))) 45 | else: 46 | self.mlp.add_module('layer_{}'.format(i), nn.Sequential( 47 | nn.Linear(self.mlp_hidden_size, self.mlp_hidden_size), 48 | nn.ReLU(inplace=False), 49 | nn.Dropout(p=mlp_dropout))) 50 | self.regressor = nn.Linear(self.hidden_size if self.mlp_layers == 0 else self.mlp_hidden_size, 1) 51 | 52 | def forward(self, x): 53 | embedded = self.embedding(x) 54 | if self.source_length != self.length: 55 | assert self.source_length % self.length == 0 56 | ratio = self.source_length // self.length 57 | embedded = embedded.view(-1, self.source_length // ratio, ratio * self.emb_size) 58 | out, hidden = self.rnn(embedded) 59 | out = F.normalize(out, 2, dim=-1) 60 | encoder_outputs = out 61 | encoder_hidden = hidden 62 | 63 | out = torch.mean(out, dim=1) 64 | out = F.normalize(out, 2, dim=-1) 65 | arch_emb = out 66 | 67 | out = self.mlp(out) 68 | out = self.regressor(out) 69 | predict_value = torch.sigmoid(out) 70 | return encoder_outputs, encoder_hidden, arch_emb, predict_value 71 | 72 | def infer(self, x, predict_lambda, direction='-'): 73 | encoder_outputs, encoder_hidden, arch_emb, predict_value = self(x) 74 | grads_on_outputs = torch.autograd.grad(predict_value, encoder_outputs, torch.ones_like(predict_value))[0] 75 | if direction == '+': 76 | new_encoder_outputs = encoder_outputs + predict_lambda * grads_on_outputs 77 | elif direction == '-': 78 | new_encoder_outputs = encoder_outputs - predict_lambda * grads_on_outputs 79 | else: 80 | raise ValueError('Direction must be + or -, got {} instead'.format(direction)) 81 | new_encoder_outputs = F.normalize(new_encoder_outputs, 2, dim=-1) 82 | new_arch_emb = torch.mean(new_encoder_outputs, dim=1) 83 | new_arch_emb = F.normalize(new_arch_emb, 2, dim=-1) 84 | return encoder_outputs, encoder_hidden, 
arch_emb, predict_value, new_encoder_outputs, new_arch_emb -------------------------------------------------------------------------------- /NSGANET/README.md: -------------------------------------------------------------------------------- 1 | # NSGA-Net: Neural Architecture Search using Multi-Objective Genetic Algorithm 2 | 3 | ## Generate random architectures 4 | ``` 5 | from search.micro_encoding import decode 6 | import numpy.random as rd 7 | 8 | ops=rd.randint(0,8,16) 9 | links0=rd.choice([0,1], size=2, replace=False) 10 | links1=rd.choice([0,1,2], size=2, replace=False) 11 | links2=rd.choice([0,1,2,3], size=2, replace=False) 12 | links3=rd.choice([0,1,2,3,4], size=2, replace=False) 13 | links0r=rd.choice([0,1], size=2, replace=False) 14 | links1r=rd.choice([0,1,2], size=2, replace=False) 15 | links2r=rd.choice([0,1,2,3], size=2, replace=False) 16 | links3r=rd.choice([0,1,2,3,4], size=2, replace=False) 17 | genome = [[[[ops[0], links0[0]], [ops[1], links0[1]]], [[ops[2], links1[0]], [ops[3], links1[1]]], [[ops[4], links2[0]], [ops[5], links2[1]]], [[ops[6], links3[0]], [ops[7], links3[1]]]], 18 | [[[ops[8], links0r[0]], [ops[9], links0r[1]]], [[ops[10], links1r[0]], [ops[11], links1r[1]]], [[ops[12], links2r[0]], [ops[13], links2r[1]]], [[ops[14], links3r[0]], [ops[15], links3r[1]]]]] 19 | genotype = decode(genome) 20 | ``` 21 | 22 | ## Search 23 | 24 | ``` 25 | python search/evolution_search.py 26 | --init_channels 16 27 | --layers 8 28 | --epochs 20 29 | --n_offspring 20 30 | --n_gens 30 31 | --search_space micro 32 | --save test 33 | --data_path /data # path to data 34 | --dataset CIFAR10 # choose between CIFAR10, CIFAR100, Sport8, MIT67 and flowers102 35 | ``` 36 | 37 | ## Augment 38 | 39 | ``` 40 | python validation/train.py 41 | --dataset CIFAR10 # choose between CIFAR10, CIFAR100, Sport8, MIT67 and flowers102 42 | --net_type micro 43 | --layers 20 # 20 for CIFAR10 and CIFAR100, 8 for Sport8, MIT67 and flowers102 44 | --init_channels 34 45 | 
--filter_increment 4 46 | --cutout 47 | --auxiliary 48 | --batch_size 96 49 | --droprate 0.2 50 | --SE 51 | --epochs 600 52 | --genotype genotype 53 | --data datapath 54 | --save test 55 | --path test 56 | ``` 57 | -------------------------------------------------------------------------------- /NSGANET/misc/utils.py: -------------------------------------------------------------------------------- 1 | import os 2 | import numpy as np 3 | import torch 4 | import shutil 5 | import torchvision.transforms as transforms 6 | from torch.autograd import Variable 7 | 8 | 9 | class AvgrageMeter(object): 10 | def __init__(self): 11 | self.reset() 12 | 13 | def reset(self): 14 | self.avg = 0 15 | self.sum = 0 16 | self.cnt = 0 17 | 18 | def update(self, val, n=1): 19 | self.sum += val * n 20 | self.cnt += n 21 | self.avg = self.sum / self.cnt 22 | 23 | 24 | def accuracy(output, target, topk=(1,)): 25 | maxk = max(topk) 26 | batch_size = target.size(0) 27 | 28 | _, pred = output.topk(maxk, 1, True, True) 29 | pred = pred.t() 30 | correct = pred.eq(target.view(1, -1).expand_as(pred)) 31 | 32 | res = [] 33 | for k in topk: 34 | correct_k = correct[:k].reshape(-1).float().sum(0) 35 | res.append(correct_k.mul_(100.0 / batch_size)) 36 | return res 37 | 38 | 39 | class Cutout(object): 40 | def __init__(self, length): 41 | self.length = length 42 | 43 | def __call__(self, img): 44 | h, w = img.size(1), img.size(2) 45 | mask = np.ones((h, w), np.float32) 46 | y = np.random.randint(h) 47 | x = np.random.randint(w) 48 | 49 | y1 = np.clip(y - self.length // 2, 0, h) 50 | y2 = np.clip(y + self.length // 2, 0, h) 51 | x1 = np.clip(x - self.length // 2, 0, w) 52 | x2 = np.clip(x + self.length // 2, 0, w) 53 | 54 | mask[y1: y2, x1: x2] = 0.
55 | mask = torch.from_numpy(mask) 56 | mask = mask.expand_as(img) 57 | img *= mask 58 | return img 59 | 60 | 61 | def _data_transforms_cifar10(args): 62 | CIFAR_MEAN = [0.49139968, 0.48215827, 0.44653124] 63 | CIFAR_STD = [0.24703233, 0.24348505, 0.26158768] 64 | 65 | train_transform = transforms.Compose([ 66 | transforms.RandomCrop(32, padding=4), 67 | transforms.RandomHorizontalFlip(), 68 | transforms.ToTensor() 69 | ]) 70 | 71 | if args.cutout: 72 | train_transform.transforms.append(Cutout(args.cutout_length)) 73 | 74 | train_transform.transforms.append(transforms.Normalize(CIFAR_MEAN, CIFAR_STD)) 75 | 76 | valid_transform = transforms.Compose([ 77 | transforms.ToTensor(), 78 | transforms.Normalize(CIFAR_MEAN, CIFAR_STD), 79 | ]) 80 | return train_transform, valid_transform 81 | 82 | def _data_transforms_large(args): 83 | MEAN = [0.485, 0.456, 0.406] 84 | STD = [0.229, 0.224, 0.225] 85 | transf_train = [ 86 | transforms.RandomResizedCrop(224), 87 | transforms.RandomHorizontalFlip(), 88 | transforms.ColorJitter( 89 | brightness=0.4, 90 | contrast=0.4, 91 | saturation=0.4, 92 | hue=0.2) 93 | ] 94 | transf_val = [ 95 | transforms.Resize(256), 96 | transforms.CenterCrop(224), 97 | ] 98 | normalize = [ 99 | transforms.ToTensor(), 100 | transforms.Normalize(MEAN, STD) 101 | ] 102 | 103 | train_transform = transforms.Compose(transf_train + normalize) 104 | valid_transform = transforms.Compose(transf_val + normalize) 105 | if args.cutout: 106 | train_transform.transforms.append(Cutout(args.cutout_length)) 107 | return train_transform, valid_transform 108 | 109 | def count_parameters_in_MB(model): 110 | # Trainable parameters in millions, excluding the auxiliary head. 111 | n_params_from_auxiliary_head = sum(v.numel() for name, v in model.named_parameters()) - \ 112 | sum(v.numel() for name, v in model.named_parameters() 113 | if "auxiliary" not in name) 114 | n_params_trainable = sum(p.numel() for p in model.parameters() if p.requires_grad) 115 | return (n_params_trainable - n_params_from_auxiliary_head) / 1e6
116 | 117 | 118 | def save_checkpoint(state, is_best, save): 119 | filename = os.path.join(save, 'checkpoint.pth.tar') 120 | torch.save(state, filename) 121 | if is_best: 122 | best_filename = os.path.join(save, 'model_best.pth.tar') 123 | shutil.copyfile(filename, best_filename) 124 | 125 | 126 | def save(model, model_path): 127 | torch.save(model.state_dict(), model_path) 128 | 129 | 130 | def load(model, model_path): 131 | model.load_state_dict(torch.load(model_path)) 132 | 133 | 134 | def drop_path(x, drop_prob): 135 | if drop_prob > 0.: 136 | keep_prob = 1. - drop_prob 137 | mask = Variable(torch.cuda.FloatTensor(x.size(0), 1, 1, 1).bernoulli_(keep_prob)) 138 | x.div_(keep_prob) 139 | x.mul_(mask) 140 | return x 141 | 142 | 143 | def create_exp_dir(path, scripts_to_save=None): 144 | if not os.path.exists(path): 145 | os.mkdir(path) 146 | print('Experiment dir : {}'.format(path)) 147 | 148 | if scripts_to_save is not None: 149 | os.mkdir(os.path.join(path, 'scripts')) 150 | for script in scripts_to_save: 151 | dst_file = os.path.join(path, 'scripts', os.path.basename(script)) 152 | shutil.copyfile(script, dst_file) 153 | -------------------------------------------------------------------------------- /NSGANET/models/macro_genotypes.py: -------------------------------------------------------------------------------- 1 | NSGANet = [[[1], [0, 0], [0, 1, 0], [0, 1, 1, 1], [1, 0, 0, 1, 1], [0]], 2 | [[0], [0, 0], [0, 1, 0], [0, 1, 0, 1], [1, 1, 1, 1, 1], [0]], 3 | [[0], [0, 1], [1, 0, 1], [1, 0, 1, 1], [1, 0, 0, 1, 1], [0]]] 4 | 5 | -------------------------------------------------------------------------------- /NSGANET/models/macro_models.py: -------------------------------------------------------------------------------- 1 | # evonetwork.py 2 | # no auxiliary head classifier should be used with this 3 | 4 | import torch 5 | import torch.nn as nn 6 | from models.macro_decoder import ResidualGenomeDecoder, VariableGenomeDecoder, DenseGenomeDecoder 7 | 8 | 9 | def 
get_decoder(decoder_str, genome, channels, repeats=None): 10 | """ 11 | Construct the appropriate decoder. 12 | :param decoder_str: string, refers to what genome scheme we're using. 13 | :param genome: list, list of genomes. 14 | :param channels: list, list of channel sizes. 15 | :param repeats: None | list, how many times to repeat each phase. 16 | :return: evolution.ChannelBasedDecoder 17 | """ 18 | if decoder_str == "residual": 19 | return ResidualGenomeDecoder(genome, channels, repeats=repeats) 20 | 21 | if decoder_str == "swapped-residual": 22 | return ResidualGenomeDecoder(genome, channels, preact=True, repeats=repeats) 23 | 24 | if decoder_str == "dense": 25 | return DenseGenomeDecoder(genome, channels, repeats=repeats) 26 | 27 | if decoder_str == "variable": 28 | return VariableGenomeDecoder(genome, channels, repeats=repeats) 29 | 30 | raise NotImplementedError("Decoder {} not implemented.".format(decoder_str)) 31 | 32 | 33 | class EvoNetwork(nn.Module): 34 | """ 35 | Entire network. 36 | Made up of Phases. 37 | """ 38 | def __init__(self, genome, channels, out_features, data_shape, decoder="residual", repeats=None): 39 | """ 40 | Network constructor. 41 | :param genome: depends on decoder scheme, for most this is a list. 42 | :param channels: list of desired channel tuples. 43 | :param out_features: number of output features. 44 | :param decoder: string, what kind of decoding scheme to use. 45 | """ 46 | super(EvoNetwork, self).__init__() 47 | 48 | assert len(channels) == len(genome), "Need to supply as many channel tuples as genes." 49 | if repeats is not None: 50 | assert len(repeats) == len(genome), "Need to supply repetition information for each phase." 51 | 52 | self.model = get_decoder(decoder, genome, channels, repeats).get_model() 53 | 54 | # 55 | # After the evolved part of the network, we would like to do global average pooling and a linear layer. 
56 | # However, we don't know the output size so we do some forward passes and observe the output sizes. 57 | # 58 | 59 | out = self.model(torch.autograd.Variable(torch.zeros(1, channels[0][0], *data_shape))) 60 | shape = out.data.shape 61 | 62 | self.gap = nn.AvgPool2d(kernel_size=(shape[-2], shape[-1]), stride=1) 63 | 64 | shape = self.gap(out).data.shape 65 | 66 | self.linear = nn.Linear(shape[1] * shape[2] * shape[3], out_features) 67 | 68 | # We accumulated some unwanted gradient information data with those forward passes. 69 | self.model.zero_grad() 70 | 71 | def forward(self, x): 72 | """ 73 | Forward propagation. 74 | :param x: Variable, input to network. 75 | :return: tuple of (logits Variable, None). 76 | """ 77 | x = self.gap(self.model(x)) 78 | 79 | x = x.view(x.size(0), -1) 80 | 81 | return self.linear(x), None 82 | 83 | 84 | def demo(): 85 | """ 86 | Demo creating a network. 87 | """ 88 | from misc import utils 89 | genome = [[[1], [0, 0], [0, 1, 0], [0, 1, 1, 1], [1, 0, 0, 1, 1], [0]], 90 | [[0], [0, 0], [0, 1, 0], [0, 1, 0, 1], [1, 1, 1, 1, 1], [0]], 91 | [[0], [0, 1], [1, 0, 1], [1, 0, 1, 1], [1, 0, 0, 1, 1], [0]]] 92 | 93 | channels = [(3, 128), (128, 128), (128, 128)] 94 | 95 | out_features = 10 96 | data = torch.randn(16, 3, 32, 32) 97 | net = EvoNetwork(genome, channels, out_features, (32, 32), decoder='dense') 98 | print("param size = {}MB".format(utils.count_parameters_in_MB(net))) 99 | output = net(torch.autograd.Variable(data)) 100 | 101 | print(output) 102 | 103 | 104 | if __name__ == "__main__": 105 | demo() 106 | -------------------------------------------------------------------------------- /NSGANET/models/micro_genotypes.py: -------------------------------------------------------------------------------- 1 | from collections import namedtuple 2 | 3 | Genotype = namedtuple('Genotype', 'normal normal_concat reduce reduce_concat') 4 | 5 | PRIMITIVES = [ 6 | 'none', 7 | 'max_pool_3x3', 8 | 'avg_pool_3x3', 9 | 'skip_connect', 10 | 'sep_conv_3x3', 11 |
'sep_conv_5x5', 12 | 'dil_conv_3x3', 13 | 'dil_conv_5x5' 14 | ] 15 | 16 | NASNet = Genotype( 17 | normal=[ 18 | ('sep_conv_5x5', 1), 19 | ('sep_conv_3x3', 0), 20 | ('sep_conv_5x5', 0), 21 | ('sep_conv_3x3', 0), 22 | ('avg_pool_3x3', 1), 23 | ('skip_connect', 0), 24 | ('avg_pool_3x3', 0), 25 | ('avg_pool_3x3', 0), 26 | ('sep_conv_3x3', 1), 27 | ('skip_connect', 1), 28 | ], 29 | normal_concat=[2, 3, 4, 5, 6], 30 | reduce=[ 31 | ('sep_conv_5x5', 1), 32 | ('sep_conv_7x7', 0), 33 | ('max_pool_3x3', 1), 34 | ('sep_conv_7x7', 0), 35 | ('avg_pool_3x3', 1), 36 | ('sep_conv_5x5', 0), 37 | ('skip_connect', 3), 38 | ('avg_pool_3x3', 2), 39 | ('sep_conv_3x3', 2), 40 | ('max_pool_3x3', 1), 41 | ], 42 | reduce_concat=[4, 5, 6], 43 | ) 44 | 45 | AmoebaNet = Genotype( 46 | normal=[ 47 | ('avg_pool_3x3', 0), 48 | ('max_pool_3x3', 1), 49 | ('sep_conv_3x3', 0), 50 | ('sep_conv_5x5', 2), 51 | ('sep_conv_3x3', 0), 52 | ('avg_pool_3x3', 3), 53 | ('sep_conv_3x3', 1), 54 | ('skip_connect', 1), 55 | ('skip_connect', 0), 56 | ('avg_pool_3x3', 1), 57 | ], 58 | normal_concat=[4, 5, 6], 59 | reduce=[ 60 | ('avg_pool_3x3', 0), 61 | ('sep_conv_3x3', 1), 62 | ('max_pool_3x3', 0), 63 | ('sep_conv_7x7', 2), 64 | ('sep_conv_7x7', 0), 65 | ('avg_pool_3x3', 1), 66 | ('max_pool_3x3', 0), 67 | ('max_pool_3x3', 1), 68 | ('conv_7x1_1x7', 0), 69 | ('sep_conv_3x3', 5), 70 | ], 71 | reduce_concat=[3, 4, 6] 72 | ) 73 | 74 | DARTS = Genotype( 75 | normal=[ 76 | ('sep_conv_3x3', 0), 77 | ('sep_conv_3x3', 1), 78 | ('sep_conv_3x3', 0), 79 | ('sep_conv_3x3', 1), 80 | ('sep_conv_3x3', 1), 81 | ('skip_connect', 0), 82 | ('skip_connect', 0), 83 | ('dil_conv_3x3', 2) 84 | ], 85 | normal_concat=[2, 3, 4, 5], 86 | reduce=[ 87 | ('max_pool_3x3', 0), 88 | ('max_pool_3x3', 1), 89 | ('skip_connect', 2), 90 | ('max_pool_3x3', 1), 91 | ('max_pool_3x3', 0), 92 | ('skip_connect', 2), 93 | ('skip_connect', 2), 94 | ('max_pool_3x3', 1) 95 | ], 96 | reduce_concat=[2, 3, 4, 5] 97 | ) 98 | 99 | ENAS = Genotype( 100 | normal=[ 101 | 
('sep_conv_3x3', 1), 102 | ('skip_connect', 1), 103 | ('sep_conv_5x5', 1), 104 | ('skip_connect', 0), 105 | ('avg_pool_3x3', 0), 106 | ('sep_conv_3x3', 1), 107 | ('sep_conv_3x3', 0), 108 | ('avg_pool_3x3', 0), 109 | ('sep_conv_5x5', 1), 110 | ('avg_pool_3x3', 0) 111 | ], 112 | normal_concat=[2, 3, 4, 5, 6], 113 | reduce=[ 114 | ('sep_conv_5x5', 0), 115 | ('avg_pool_3x3', 1), 116 | ('sep_conv_3x3', 1), 117 | ('avg_pool_3x3', 1), 118 | ('avg_pool_3x3', 1), 119 | ('sep_conv_3x3', 1), 120 | ('sep_conv_5x5', 4), 121 | ('avg_pool_3x3', 1), 122 | ('sep_conv_3x3', 5), 123 | ('sep_conv_5x5', 0) 124 | ], 125 | reduce_concat=[2, 3, 6] 126 | ) 127 | 128 | NSGANet = Genotype( 129 | normal=[ 130 | ('skip_connect', 0), 131 | ('max_pool_3x3', 0), 132 | ('dil_conv_5x5', 0), 133 | ('max_pool_3x3', 0), 134 | ('dil_conv_5x5', 1), 135 | ('sep_conv_3x3', 3), 136 | ('max_pool_3x3', 1), 137 | ('sep_conv_5x5', 3), 138 | ('sep_conv_3x3', 1), 139 | ('sep_conv_3x3', 0) 140 | ], 141 | normal_concat=[2, 4, 5, 6], 142 | reduce=[ 143 | ('avg_pool_3x3', 0), 144 | ('sep_conv_3x3', 1), 145 | ('dil_conv_3x3', 1), 146 | ('max_pool_3x3', 0), 147 | ('skip_connect', 2), 148 | ('dil_conv_5x5', 1), 149 | ('skip_connect', 2), 150 | ('avg_pool_3x3', 1), 151 | ('dil_conv_5x5', 1), 152 | ('dil_conv_3x3', 1) 153 | ], 154 | reduce_concat=[3, 4, 5, 6] 155 | ) 156 | 157 | 158 | 159 | -------------------------------------------------------------------------------- /NSGANET/models/micro_operations.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | 4 | OPS = { 5 | 'none': lambda C, stride, affine: Zero(stride), 6 | 'avg_pool_3x3': lambda C, stride, affine: nn.AvgPool2d(3, stride=stride, padding=1, count_include_pad=False), 7 | 'max_pool_3x3': lambda C, stride, affine: nn.MaxPool2d(3, stride=stride, padding=1), 8 | 'skip_connect': lambda C, stride, affine: Identity() if stride == 1 else FactorizedReduce(C, C, affine=affine), 9 | 
'sep_conv_3x3': lambda C, stride, affine: SepConv(C, C, 3, stride, 1, affine=affine), 10 | 'sep_conv_5x5': lambda C, stride, affine: SepConv(C, C, 5, stride, 2, affine=affine), 11 | 'sep_conv_7x7': lambda C, stride, affine: SepConv(C, C, 7, stride, 3, affine=affine), 12 | 'dil_conv_3x3': lambda C, stride, affine: DilConv(C, C, 3, stride, 2, 2, affine=affine), 13 | 'dil_conv_5x5': lambda C, stride, affine: DilConv(C, C, 5, stride, 4, 2, affine=affine), 14 | 'conv_7x1_1x7': lambda C, stride, affine: nn.Sequential( 15 | nn.ReLU(inplace=False), 16 | nn.Conv2d(C, C, (1, 7), stride=(1, stride), padding=(0, 3), bias=False), 17 | nn.Conv2d(C, C, (7, 1), stride=(stride, 1), padding=(3, 0), bias=False), 18 | nn.BatchNorm2d(C, affine=affine) 19 | ), 20 | } 21 | 22 | 23 | class ReLUConvBN(nn.Module): 24 | 25 | def __init__(self, C_in, C_out, kernel_size, stride, padding, affine=True): 26 | super(ReLUConvBN, self).__init__() 27 | self.op = nn.Sequential( 28 | nn.ReLU(inplace=False), 29 | nn.Conv2d(C_in, C_out, kernel_size, stride=stride, padding=padding, bias=False), 30 | nn.BatchNorm2d(C_out, affine=affine) 31 | ) 32 | 33 | def forward(self, x): 34 | return self.op(x) 35 | 36 | 37 | class DilConv(nn.Module): 38 | 39 | def __init__(self, C_in, C_out, kernel_size, stride, padding, dilation, affine=True): 40 | super(DilConv, self).__init__() 41 | self.op = nn.Sequential( 42 | nn.ReLU(inplace=False), 43 | nn.Conv2d(C_in, C_in, kernel_size=kernel_size, stride=stride, padding=padding, dilation=dilation, 44 | groups=C_in, bias=False), 45 | nn.Conv2d(C_in, C_out, kernel_size=1, padding=0, bias=False), 46 | nn.BatchNorm2d(C_out, affine=affine), 47 | ) 48 | 49 | def forward(self, x): 50 | return self.op(x) 51 | 52 | 53 | class SepConv(nn.Module): 54 | 55 | def __init__(self, C_in, C_out, kernel_size, stride, padding, affine=True): 56 | super(SepConv, self).__init__() 57 | self.op = nn.Sequential( 58 | nn.ReLU(inplace=False), 59 | nn.Conv2d(C_in, C_in, kernel_size=kernel_size, 
stride=stride, padding=padding, groups=C_in, bias=False), 60 | nn.Conv2d(C_in, C_in, kernel_size=1, padding=0, bias=False), 61 | nn.BatchNorm2d(C_in, affine=affine), 62 | nn.ReLU(inplace=False), 63 | nn.Conv2d(C_in, C_in, kernel_size=kernel_size, stride=1, padding=padding, groups=C_in, bias=False), 64 | nn.Conv2d(C_in, C_out, kernel_size=1, padding=0, bias=False), 65 | nn.BatchNorm2d(C_out, affine=affine), 66 | ) 67 | 68 | def forward(self, x): 69 | return self.op(x) 70 | 71 | 72 | class Identity(nn.Module): 73 | 74 | def __init__(self): 75 | super(Identity, self).__init__() 76 | 77 | def forward(self, x): 78 | return x 79 | 80 | 81 | class Zero(nn.Module): 82 | 83 | def __init__(self, stride): 84 | super(Zero, self).__init__() 85 | self.stride = stride 86 | 87 | def forward(self, x): 88 | if self.stride == 1: 89 | return x.mul(0.) 90 | return x[:, :, ::self.stride, ::self.stride].mul(0.) 91 | 92 | 93 | class FactorizedReduce(nn.Module): 94 | 95 | def __init__(self, C_in, C_out, affine=True): 96 | super(FactorizedReduce, self).__init__() 97 | assert C_out % 2 == 0 98 | self.relu = nn.ReLU(inplace=False) 99 | self.conv_1 = nn.Conv2d(C_in, C_out // 2, 1, stride=2, padding=0, bias=False) 100 | self.conv_2 = nn.Conv2d(C_in, C_out // 2, 1, stride=2, padding=0, bias=False) 101 | self.bn = nn.BatchNorm2d(C_out, affine=affine) 102 | 103 | def forward(self, x): 104 | x = self.relu(x) 105 | out = torch.cat([self.conv_1(x), self.conv_2(x[:, :, 1:, 1:])], dim=1) 106 | out = self.bn(out) 107 | return out 108 | 109 | 110 | class SELayer(nn.Module): 111 | def __init__(self, channel, reduction=16): 112 | super(SELayer, self).__init__() 113 | self.avg_pool = nn.AdaptiveAvgPool2d(1) 114 | self.fc = nn.Sequential( 115 | nn.Linear(channel, channel // reduction, bias=False), 116 | nn.ReLU(inplace=True), 117 | nn.Linear(channel // reduction, channel, bias=False), 118 | nn.Sigmoid() 119 | ) 120 | 121 | def forward(self, x): 122 | b, c, _, _ = x.size() 123 | y = self.avg_pool(x).view(b, 
c) 124 | y = self.fc(y).view(b, c, 1, 1) 125 | return x * y.expand_as(x) 126 | 127 | -------------------------------------------------------------------------------- /NSGANET/search/macro_encoding.py: -------------------------------------------------------------------------------- 1 | # similar encoding to Genetic CNN paper 2 | # we add one more skip connection bit 3 | # L. Xie and A. Yuille, "Genetic CNN," 4 | # 2017 IEEE International Conference on Computer Vision (ICCV) 5 | import numpy as np 6 | 7 | 8 | def phase_dencode(phase_bit_string): 9 | n = int(np.sqrt(2 * len(phase_bit_string) - 7/4) - 1/2) 10 | genome = [] 11 | for i in range(n): 12 | operator = [] 13 | for j in range(i + 1): 14 | operator.append(phase_bit_string[int(i * (i + 1) / 2 + j)]) 15 | genome.append(operator) 16 | genome.append([phase_bit_string[-1]]) 17 | return genome 18 | 19 | 20 | def convert(bit_string, n_phases=3): 21 | # assumes bit_string is a np array 22 | assert bit_string.shape[0] % n_phases == 0 23 | phase_length = bit_string.shape[0] // n_phases 24 | genome = [] 25 | for i in range(0, bit_string.shape[0], phase_length): 26 | genome.append((bit_string[i:i+phase_length]).tolist()) 27 | 28 | return genome 29 | 30 | 31 | def decode(genome): 32 | genotype = [] 33 | for gene in genome: 34 | genotype.append(phase_dencode(gene)) 35 | 36 | return genotype 37 | 38 | 39 | if __name__ == "__main__": 40 | n_phases = 3 41 | # bit_string = np.random.randint(0, 2, size=21) 42 | # print(bit_string) 43 | # genome = decode(convert(bit_string, n_phases)) 44 | # print(genome) 45 | # 46 | # channels = [(3, 128), (128, 128), (128, 128)] 47 | # 48 | # out_features = 10 49 | # 50 | # import torch 51 | # from models.macro_models import EvoNetwork 52 | # from misc import utils 53 | # 54 | # data = torch.randn(1, 3, 32, 32) 55 | # net = EvoNetwork(genome, channels, out_features, (32, 32), decoder='dense') 56 | # print("param size = {}MB".format(utils.count_parameters_in_MB(net))) 57 | # output = 
net(torch.autograd.Variable(data)) 58 | # 59 | # print(output) -------------------------------------------------------------------------------- /NSGANET/search/micro_encoding.py: -------------------------------------------------------------------------------- 1 | # NASNet Search Space https://arxiv.org/pdf/1707.07012.pdf 2 | # code modified from DARTS https://github.com/quark0/darts 3 | import numpy as np 4 | from collections import namedtuple 5 | 6 | import torch 7 | from models.micro_models import NetworkCIFAR as Network 8 | 9 | Genotype = namedtuple('Genotype', 'normal normal_concat reduce reduce_concat') 10 | Genotype_norm = namedtuple('Genotype', 'normal normal_concat') 11 | Genotype_redu = namedtuple('Genotype', 'reduce reduce_concat') 12 | 13 | # what you want to search should be defined here and in micro_operations 14 | PRIMITIVES = [ 15 | 'max_pool_3x3', 16 | 'avg_pool_3x3', 17 | 'skip_connect', 18 | 'sep_conv_3x3', 19 | 'sep_conv_5x5', 20 | 'dil_conv_3x3', 21 | 'dil_conv_5x5', 22 | 'sep_conv_7x7', 23 | 'conv_7x1_1x7', 24 | ] 25 | 26 | 27 | def convert_cell(cell_bit_string): 28 | # convert cell bit-string to genome 29 | tmp = [cell_bit_string[i:i + 2] for i in range(0, len(cell_bit_string), 2)] 30 | return [tmp[i:i + 2] for i in range(0, len(tmp), 2)] 31 | 32 | 33 | def convert(bit_string): 34 | # convert network bit-string (norm_cell + redu_cell) to genome 35 | norm_gene = convert_cell(bit_string[:len(bit_string)//2]) 36 | redu_gene = convert_cell(bit_string[len(bit_string)//2:]) 37 | return [norm_gene, redu_gene] 38 | 39 | 40 | def decode_cell(genome, norm=True): 41 | 42 | cell, cell_concat = [], list(range(2, len(genome)+2)) 43 | for block in genome: 44 | for unit in block: 45 | cell.append((PRIMITIVES[unit[0]], unit[1])) 46 | if unit[1] in cell_concat: 47 | cell_concat.remove(unit[1]) 48 | 49 | if norm: 50 | return Genotype_norm(normal=cell, normal_concat=cell_concat) 51 | else: 52 | return Genotype_redu(reduce=cell, reduce_concat=cell_concat) 53 | 54 
| 55 | def decode(genome): 56 | # decodes genome to architecture 57 | normal_cell = genome[0] 58 | reduce_cell = genome[1] 59 | 60 | normal, normal_concat = [], list(range(2, len(normal_cell)+2)) 61 | reduce, reduce_concat = [], list(range(2, len(reduce_cell)+2)) 62 | 63 | for block in normal_cell: 64 | for unit in block: 65 | normal.append((PRIMITIVES[unit[0]], unit[1])) 66 | if unit[1] in normal_concat: 67 | normal_concat.remove(unit[1]) 68 | 69 | for block in reduce_cell: 70 | for unit in block: 71 | reduce.append((PRIMITIVES[unit[0]], unit[1])) 72 | if unit[1] in reduce_concat: 73 | reduce_concat.remove(unit[1]) 74 | 75 | return Genotype( 76 | normal=normal, normal_concat=normal_concat, 77 | reduce=reduce, reduce_concat=reduce_concat 78 | ) 79 | 80 | 81 | def compare_cell(cell_string1, cell_string2): 82 | cell_genome1 = convert_cell(cell_string1) 83 | cell_genome2 = convert_cell(cell_string2) 84 | cell1, cell2 = cell_genome1[:], cell_genome2[:] 85 | 86 | for block1 in cell1: 87 | for block2 in cell2: 88 | if block1 == block2 or block1 == block2[::-1]: 89 | cell2.remove(block2) 90 | break 91 | if len(cell2) > 0: 92 | return False 93 | else: 94 | return True 95 | 96 | 97 | def compare(string1, string2): 98 | 99 | if compare_cell(string1[:len(string1)//2], 100 | string2[:len(string2)//2]): 101 | if compare_cell(string1[len(string1)//2:], 102 | string2[len(string2)//2:]): 103 | return True 104 | 105 | return False 106 | 107 | 108 | def debug(): 109 | # design to debug the encoding scheme 110 | seed = 0 111 | np.random.seed(seed) 112 | budget = 2000 113 | B, n_ops, n_cell = 5, 7, 2 114 | networks = [] 115 | design_id = 1 116 | while len(networks) < budget: 117 | bit_string = [] 118 | for c in range(n_cell): 119 | for b in range(B): 120 | bit_string += [np.random.randint(n_ops), 121 | np.random.randint(b + 2), 122 | np.random.randint(n_ops), 123 | np.random.randint(b + 2) 124 | ] 125 | 126 | genome = convert(bit_string) 127 | # check against evaluated networks in 
case of duplicates 128 | doTrain = True 129 | for network in networks: 130 | if compare(genome, network): 131 | doTrain = False 132 | break 133 | 134 | if doTrain: 135 | genotype = decode(genome) 136 | model = Network(16, 10, 8, False, genotype) 137 | model.drop_path_prob = 0.0 138 | data = torch.randn(1, 3, 32, 32) 139 | output, output_aux = model(torch.autograd.Variable(data)) 140 | networks.append(genome) 141 | design_id += 1 142 | print(design_id) 143 | 144 | 145 | if __name__ == "__main__": 146 | # debug() 147 | # genome1 = [[[[3, 0], [3, 1]], [[3, 0], [3, 1]], 148 | # [[3, 1], [2, 0]], [[2, 0], [5, 2]]], 149 | # [[[0, 0], [0, 1]], [[2, 2], [0, 1]], 150 | # [[0, 0], [2, 2]], [[2, 2], [0, 1]]]] 151 | # genome2 = [[[[3, 1], [3, 0]], [[3, 1], [3, 0]], 152 | # [[3, 1], [2, 0]], [[2, 0], [5, 2]]], 153 | # [[[0, 1], [0, 0]], [[2, 2], [0, 1]], 154 | # [[0, 0], [2, 2]], [[2, 2], [0, 0]]]] 155 | # 156 | # print(compare(genome1, genome2)) 157 | # print(genome1) 158 | # print(genome2) 159 | # bit_string1 = [3,1,3,0,3,1,3,0,3,1,2,0,2,0,5,2,0,0,0,1,2,2,0,1,0,0,2,2,2,2,0,1] 160 | # bit_string2 = [3, 0, 3, 1, 3, 0, 3, 1, 3, 1, 2, 0, 2, 0, 5, 2, 161 | # 0, 0, 0, 1, 2, 2, 0, 1, 0, 0, 2, 2, 2, 2, 0, 1] 162 | # # print(convert(bit_string1)) 163 | # print(compare(bit_string1, bit_string2)) 164 | # print(decode(convert(bit_string))) 165 | 166 | cell_bit_string = [3, 0, 3, 1, 3, 0, 3, 1, 3, 1, 2, 0, 2, 0, 5, 2] 167 | print(decode_cell(convert_cell(cell_bit_string), norm=False)) 168 | -------------------------------------------------------------------------------- /NSGANET/validation/test.py: -------------------------------------------------------------------------------- 1 | '''Inference CIFAR10 with PyTorch.''' 2 | from __future__ import print_function 3 | 4 | import sys 5 | 6 | import torch 7 | import torch.nn as nn 8 | import torch.backends.cudnn as cudnn 9 | import torchvision 10 | 11 | import os 12 | import sys 13 | import time 14 | import logging 15 | import argparse 16 | 
import numpy as np 17 | 18 | parser = argparse.ArgumentParser(description='PyTorch CIFAR10 Testing') 19 | parser.add_argument('--seed', type=int, default=0, help='random seed') 20 | parser.add_argument('--data', type=str, default='../data', help='location of the data corpus') 21 | parser.add_argument('--batch_size', type=int, default=128, help='batch size') 22 | parser.add_argument('--report_freq', type=float, default=50, help='report frequency') 23 | parser.add_argument('--save', type=str, default='EXP', help='experiment name') 24 | parser.add_argument('--cutout', action='store_true', default=False, help='use cutout') 25 | parser.add_argument('--cutout_length', type=int, default=16, help='cutout length') 26 | parser.add_argument('--auxiliary', action='store_true', default=False, help='use auxiliary tower') 27 | parser.add_argument('--auxiliary_weight', type=float, default=0.4, help='weight for auxiliary loss') 28 | parser.add_argument('--layers', default=20, type=int, help='total number of layers (default: 20)') 29 | parser.add_argument('--droprate', default=0, type=float, help='dropout probability (default: 0.0)') 30 | parser.add_argument('--init_channels', type=int, default=32, help='num of init channels') 31 | parser.add_argument('--arch', type=str, default='NSGANet', help='which architecture to use') 32 | parser.add_argument('--filter_increment', default=4, type=int, help='# of filter increment') 33 | parser.add_argument('--SE', action='store_true', default=False, help='use Squeeze-and-Excitation') 34 | parser.add_argument('--model_path', type=str, default='EXP/model.pt', help='path of pretrained model') 35 | parser.add_argument('--net_type', type=str, default='micro', help='(options)micro, macro') 36 | parser.add_argument('--path', type=str, default="/cache/NSGANET") 37 | 38 | args = parser.parse_args() 39 | 40 | # update your project root path before running 41 | sys.path.insert(0, args.path) 42 | 43 | from misc import utils 44 | 45 | # model imports 46 | 
from models import macro_genotypes 47 | from models.macro_models import EvoNetwork 48 | import models.micro_genotypes as genotypes 49 | from models.micro_models import PyramidNetworkCIFAR as PyrmNASNet 50 | 51 | args.save = 'infer-{}-{}'.format(args.save, time.strftime("%Y%m%d-%H%M%S")) 52 | utils.create_exp_dir(args.save) 53 | 54 | device = 'cuda' 55 | 56 | log_format = '%(asctime)s %(message)s' 57 | logging.basicConfig(stream=sys.stdout, level=logging.INFO, 58 | format=log_format, datefmt='%m/%d %I:%M:%S %p') 59 | fh = logging.FileHandler(os.path.join(args.save, 'log.txt')) 60 | fh.setFormatter(logging.Formatter(log_format)) 61 | logging.getLogger().addHandler(fh) 62 | 63 | 64 | def main(): 65 | if not torch.cuda.is_available(): 66 | logging.info('no gpu device available') 67 | sys.exit(1) 68 | 69 | if args.auxiliary and args.net_type == 'macro': 70 | logging.info('auxiliary head classifier not supported for macro search space models') 71 | sys.exit(1) 72 | 73 | logging.info("args = %s", args) 74 | 75 | cudnn.enabled = True 76 | cudnn.benchmark = True 77 | np.random.seed(args.seed) 78 | torch.manual_seed(args.seed) 79 | torch.cuda.manual_seed(args.seed) 80 | 81 | # Data 82 | _, valid_transform = utils._data_transforms_cifar10(args) 83 | 84 | valid_data = torchvision.datasets.CIFAR10(root=args.data, train=False, download=True, transform=valid_transform) 85 | valid_queue = torch.utils.data.DataLoader( 86 | valid_data, batch_size=args.batch_size, shuffle=False, pin_memory=True, num_workers=2) 87 | 88 | # classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck') 89 | 90 | # Model 91 | if args.net_type == 'micro': 92 | logging.info("==> Building micro search space encoded architectures") 93 | genotype = eval("genotypes.%s" % args.arch) 94 | net = PyrmNASNet(args.init_channels, num_classes=10, layers=args.layers, 95 | auxiliary=args.auxiliary, genotype=genotype, 96 | increment=args.filter_increment, SE=args.SE) 97 | elif args.net_type 
== 'macro': 98 | genome = eval("macro_genotypes.%s" % args.arch) 99 | channels = [(3, 128), (128, 128), (128, 128)] 100 | net = EvoNetwork(genome, channels, 10, (32, 32), decoder='dense') 101 | else: 102 | raise NameError('Unknown network type, please only use supported network type') 103 | 104 | # logging.info("{}".format(net)) 105 | logging.info("param size = %fMB", utils.count_parameters_in_MB(net)) 106 | 107 | net = net.to(device) 108 | # no drop path during inference 109 | net.droprate = 0.0 110 | utils.load(net, args.model_path) 111 | 112 | criterion = nn.CrossEntropyLoss() 113 | criterion.to(device) 114 | 115 | # inference on original CIFAR-10 test images 116 | infer(valid_queue, net, criterion) 117 | 118 | 119 | def infer(valid_queue, net, criterion): 120 | net.eval() 121 | test_loss = 0 122 | correct = 0 123 | total = 0 124 | 125 | with torch.no_grad(): 126 | for step, (inputs, targets) in enumerate(valid_queue): 127 | inputs, targets = inputs.to(device), targets.to(device) 128 | outputs, _ = net(inputs) 129 | loss = criterion(outputs, targets) 130 | 131 | test_loss += loss.item() 132 | _, predicted = outputs.max(1) 133 | total += targets.size(0) 134 | correct += predicted.eq(targets).sum().item() 135 | 136 | if step % args.report_freq == 0: 137 | logging.info('valid %03d %e %f', step, test_loss/total, 100.*correct/total) 138 | 139 | acc = 100.*correct/total 140 | logging.info('valid acc %f', acc) 141 | 142 | return test_loss/total, acc 143 | 144 | 145 | if __name__ == '__main__': 146 | main() 147 | 148 | -------------------------------------------------------------------------------- /NSGANET/visualization/micro_visualize.py: -------------------------------------------------------------------------------- 1 | import sys 2 | # sys.path.insert(0, '/path/to/nsga-net') 3 | sys.path.insert(0, '/Users/zhichao.lu/Dropbox/2019/github/nsga-net') 4 | 5 | import os 6 | from models import micro_genotypes as genotypes 7 | from graphviz import Digraph 8 | 9 | # 
op_labels = { 10 | # 'avg_pool_3x3': 'avg\n3x3', 11 | # 'max_pool_3x3': 'max\n3x3', 12 | # 'skip_connect': 'iden\ntity', 13 | # 'sep_conv_3x3': 'sep\n3x3', 14 | # 'sep_conv_5x5': 'sep\n5x5', 15 | # 'sep_conv_7x7': 'sep\n7x7', 16 | # 'dil_conv_3x3': 'dil\n3x3', 17 | # 'dil_conv_5x5': 'dil\n5x5', 18 | # 'conv_7x1_1x7': '7x1\n1x7', 19 | # } 20 | 21 | op_labels = { 22 | 'avg_pool_3x3': 'avg 3x3', 23 | 'max_pool_3x3': 'max 3x3', 24 | 'skip_connect': 'identity', 25 | 'sep_conv_3x3': 'sep 3x3', 26 | 'sep_conv_5x5': 'sep 5x5', 27 | 'sep_conv_7x7': 'sep 7x7', 28 | 'dil_conv_3x3': 'dil 3x3', 29 | 'dil_conv_5x5': 'dil 5x5', 30 | 'conv_7x1_1x7': '7x1 1x7', 31 | } 32 | 33 | 34 | def plot(genotype_tup, filename, file_type='pdf', view=True): 35 | genotype = genotype_tup[0] 36 | concat = genotype_tup[1] 37 | g = Digraph( 38 | format=file_type, 39 | # graph_attr=dict(margin='0.2', nodesep='0.1', ranksep='0.3'), 40 | edge_attr=dict(fontsize='20', fontname="times"), 41 | node_attr=dict(style='filled', shape='rect', align='center', fontsize='20', height='0.5', width='0.5', 42 | penwidth='2', fontname="times"), 43 | engine='dot', 44 | ) 45 | g.body.extend(['rankdir=LR']) 46 | 47 | g.node("h[i-1]", fillcolor='darkseagreen2') 48 | g.node("h[i]", fillcolor='darkseagreen2') 49 | 50 | assert len(genotype) % 2 == 0 51 | steps = len(genotype) // 2 52 | 53 | for i in range(steps): 54 | g.node(str(i), label="add", fillcolor='lightblue') 55 | 56 | for i in range(steps): 57 | for k in [2 * i, 2 * i + 1]: 58 | op, j = genotype[k] 59 | # g.node(str(steps+k+1), label=op_labels[op], fillcolor='yellow') 60 | if j == 0: 61 | u = "h[i-1]" 62 | elif j == 1: 63 | u = "h[i]" 64 | else: 65 | u = str(j - 2) 66 | v = str(i) 67 | # g.edge(u, str(steps+k+1), fillcolor="gray") 68 | # g.edge(str(steps+k+1), v, fillcolor="gray") 69 | g.edge(u, v, label=op_labels[op], fillcolor="gray") 70 | 71 | g.node("h[i+1]", label="concat", fillcolor='lightpink') 72 | 73 | for i in range(steps): 74 | if int(i + 2) in concat: 75 
| g.edge(str(i), "h[i+1]", fillcolor="gray") 76 | 77 | g.node("output", label="h[i+1]", fillcolor='palegoldenrod') 78 | g.edge("h[i+1]", "output", fillcolor="gray") 79 | 80 | # g.attr(rank='same') 81 | 82 | g.render(filename, view=view) 83 | 84 | os.remove(filename) 85 | 86 | 87 | if __name__ == '__main__': 88 | # if len(sys.argv) != 2: 89 | # print("usage:\n python {} ARCH_NAME".format(sys.argv[0])) 90 | # sys.exit(1) 91 | 92 | genotype_name = sys.argv[1] 93 | try: 94 | genotype = eval('genotypes.{}'.format(genotype_name)) 95 | except AttributeError: 96 | print("{} is not specified in genotypes.py".format(genotype_name)) 97 | sys.exit(1) 98 | 99 | plot([genotype.normal, genotype.normal_concat], "normal") 100 | plot([genotype.reduce, genotype.reduce_concat], "reduction") 101 | 102 | 103 | -------------------------------------------------------------------------------- /PC-DARTS/README.md: -------------------------------------------------------------------------------- 1 | # PC-DARTS 2 | 3 | ## Generate a Random Architecture 4 | 5 | ``` 6 | from model_search import Network 7 | from genotypes import PRIMITIVES 8 | from genotypes import Genotype 9 | 10 | import copy 11 | import random 12 | import torch.nn.functional as F 13 | 14 | n_ops = 8 15 | n_nodes = 4 16 | S = 0 17 | for i in range(n_nodes): 18 | S = S + 2 + i 19 | 20 | switches = [] 21 | for i in range(S): 22 | switches.append([True for j in range(n_ops)]) 23 | switches_normal = copy.deepcopy(switches) 24 | switches_reduce = copy.deepcopy(switches) 25 | for i in range(S): 26 | # exclude the zero ('none') operation 27 | switches_normal[i][0] = False 28 | switches_reduce[i][0] = False 29 | idxs = [1 + i for i in range(n_ops - 1)] 30 | # randomly drop 6 of the 7 remaining operations, keeping one per edge 31 | drop_normal = random.sample(idxs, n_ops - 2) 32 | drop_reduce = random.sample(idxs, n_ops - 2) 33 | for idx in drop_normal: 34 | switches_normal[i][idx] = False 35 | for idx in drop_reduce: 36 | switches_reduce[i][idx] = False 37 | model =
Network(16, 10, 20, None) 38 | model = model.cuda() 39 | 40 | # Generate architecture 41 | sm_dim = -1 42 | arch_param = model.arch_parameters() 43 | normal_prob = F.softmax(arch_param[0], dim=sm_dim).data.cpu().numpy() 44 | reduce_prob = F.softmax(arch_param[1], dim=sm_dim).data.cpu().numpy() 45 | normal_final = [0 for idx in range(S)] 46 | reduce_final = [0 for idx in range(S)] 47 | keep_normal = [0, 1] 48 | keep_reduce = [0, 1] 49 | n = 3 50 | start = 2 51 | for i in range(3): 52 | end = start + n 53 | tbsn = normal_final[start:end] 54 | tbsr = reduce_final[start:end] 55 | edge_n = random.sample([k for k in range(n)], 2) 56 | keep_normal.append(edge_n[-1] + start) 57 | keep_normal.append(edge_n[-2] + start) 58 | edge_r = random.sample([k for k in range(n)], 2) 59 | keep_reduce.append(edge_r[-1] + start) 60 | keep_reduce.append(edge_r[-2] + start) 61 | start = end 62 | n = n + 1 63 | for i in range(S): 64 | if not i in keep_normal: 65 | for j in range(n_ops): 66 | switches_normal[i][j] = False 67 | if not i in keep_reduce: 68 | for j in range(n_ops): 69 | switches_reduce[i][j] = False 70 | 71 | 72 | def parse_network(switches_normal, switches_reduce): 73 | 74 | def _parse_switches(switches): 75 | n = 2 76 | start = 0 77 | gene = [] 78 | step = 4 79 | for i in range(step): 80 | end = start + n 81 | for j in range(start, end): 82 | for k in range(len(switches[j])): 83 | if switches[j][k]: 84 | gene.append((PRIMITIVES[k], j - start)) 85 | start = end 86 | n = n + 1 87 | return gene 88 | 89 | gene_normal = _parse_switches(switches_normal) 90 | gene_reduce = _parse_switches(switches_reduce) 91 | 92 | concat = range(2, 6) 93 | 94 | genotype = Genotype( 95 | normal=gene_normal, normal_concat=concat, 96 | reduce=gene_reduce, reduce_concat=concat 97 | ) 98 | 99 | return genotype 100 | 101 | genotype = parse_network(switches_normal, switches_reduce) 102 | ``` 103 | 104 | ## Search 105 | 106 | ``` 107 | python train_search.py 108 | --dataset CIFAR10 # choose between 
CIFAR10, CIFAR100, Sport8, MIT67 and flowers102 109 | --datapath /data # path to your data 110 | --save test 111 | ``` 112 | 113 | ## Augment 114 | 115 | ``` 116 | python train.py 117 | --dataset CIFAR10 # choose between CIFAR10, CIFAR100, Sport8, MIT67 and flowers102 118 | --datapath /data # path to your data 119 | --save test 120 | --auxiliary 121 | --cutout 122 | --arch genotype 123 | --layers 20 # 20 for CIFAR datasets, 8 for Sport8, MIT67 and flowers102 124 | ``` 125 | -------------------------------------------------------------------------------- /PC-DARTS/architect.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import numpy as np 3 | import torch.nn as nn 4 | from torch.autograd import Variable 5 | 6 | 7 | def _concat(xs): 8 | return torch.cat([x.view(-1) for x in xs]) 9 | 10 | 11 | class Architect(object): 12 | 13 | def __init__(self, model, args): 14 | self.network_momentum = args.momentum 15 | self.network_weight_decay = args.weight_decay 16 | self.model = model 17 | self.optimizer = torch.optim.Adam(self.model.arch_parameters(), 18 | lr=args.arch_learning_rate, betas=(0.5, 0.999), weight_decay=args.arch_weight_decay) 19 | 20 | def _compute_unrolled_model(self, input, target, eta, network_optimizer): 21 | loss = self.model._loss(input, target) 22 | theta = _concat(self.model.parameters()).data 23 | try: 24 | moment = _concat(network_optimizer.state[v]['momentum_buffer'] for v in self.model.parameters()).mul_(self.network_momentum) 25 | except: 26 | moment = torch.zeros_like(theta) 27 | dtheta = _concat(torch.autograd.grad(loss, self.model.parameters())).data + self.network_weight_decay*theta 28 | unrolled_model = self._construct_model_from_theta(theta.sub(eta, moment+dtheta)) 29 | return unrolled_model 30 | 31 | def step(self, input_train, target_train, input_valid, target_valid, eta, network_optimizer, unrolled): 32 | self.optimizer.zero_grad() 33 | if unrolled: 34 |
self._backward_step_unrolled(input_train, target_train, input_valid, target_valid, eta, network_optimizer) 35 | else: 36 | self._backward_step(input_valid, target_valid) 37 | self.optimizer.step() 38 | 39 | def _backward_step(self, input_valid, target_valid): 40 | loss = self.model._loss(input_valid, target_valid) 41 | loss.backward() 42 | 43 | def _backward_step_unrolled(self, input_train, target_train, input_valid, target_valid, eta, network_optimizer): 44 | unrolled_model = self._compute_unrolled_model(input_train, target_train, eta, network_optimizer) 45 | unrolled_loss = unrolled_model._loss(input_valid, target_valid) 46 | 47 | unrolled_loss.backward() 48 | dalpha = [v.grad for v in unrolled_model.arch_parameters()] 49 | vector = [v.grad.data for v in unrolled_model.parameters()] 50 | implicit_grads = self._hessian_vector_product(vector, input_train, target_train) 51 | 52 | for g, ig in zip(dalpha, implicit_grads): 53 | g.data.sub_(eta, ig.data) 54 | 55 | for v, g in zip(self.model.arch_parameters(), dalpha): 56 | if v.grad is None: 57 | v.grad = Variable(g.data) 58 | else: 59 | v.grad.data.copy_(g.data) 60 | 61 | def _construct_model_from_theta(self, theta): 62 | model_new = self.model.new() 63 | model_dict = self.model.state_dict() 64 | 65 | params, offset = {}, 0 66 | for k, v in self.model.named_parameters(): 67 | v_length = np.prod(v.size()) 68 | params[k] = theta[offset: offset+v_length].view(v.size()) 69 | offset += v_length 70 | 71 | assert offset == len(theta) 72 | model_dict.update(params) 73 | model_new.load_state_dict(model_dict) 74 | return model_new.cuda() 75 | 76 | def _hessian_vector_product(self, vector, input, target, r=1e-2): 77 | R = r / _concat(vector).norm() 78 | for p, v in zip(self.model.parameters(), vector): 79 | p.data.add_(R, v) 80 | loss = self.model._loss(input, target) 81 | grads_p = torch.autograd.grad(loss, self.model.arch_parameters()) 82 | 83 | for p, v in zip(self.model.parameters(), vector): 84 | p.data.sub_(2*R, v) 85 | 
loss = self.model._loss(input, target) 86 | grads_n = torch.autograd.grad(loss, self.model.arch_parameters()) 87 | 88 | for p, v in zip(self.model.parameters(), vector): 89 | p.data.add_(R, v) 90 | 91 | return [(x-y).div_(2*R) for x, y in zip(grads_p, grads_n)] 92 | 93 | -------------------------------------------------------------------------------- /PC-DARTS/genotypes.py: -------------------------------------------------------------------------------- 1 | from collections import namedtuple 2 | 3 | Genotype = namedtuple('Genotype', 'normal normal_concat reduce reduce_concat') 4 | 5 | PRIMITIVES = [ 6 | 'none', 7 | 'max_pool_3x3', 8 | 'avg_pool_3x3', 9 | 'skip_connect', 10 | 'sep_conv_3x3', 11 | 'sep_conv_5x5', 12 | 'dil_conv_3x3', 13 | 'dil_conv_5x5' 14 | ] 15 | 16 | NASNet = Genotype( 17 | normal = [ 18 | ('sep_conv_5x5', 1), 19 | ('sep_conv_3x3', 0), 20 | ('sep_conv_5x5', 0), 21 | ('sep_conv_3x3', 0), 22 | ('avg_pool_3x3', 1), 23 | ('skip_connect', 0), 24 | ('avg_pool_3x3', 0), 25 | ('avg_pool_3x3', 0), 26 | ('sep_conv_3x3', 1), 27 | ('skip_connect', 1), 28 | ], 29 | normal_concat = [2, 3, 4, 5, 6], 30 | reduce = [ 31 | ('sep_conv_5x5', 1), 32 | ('sep_conv_7x7', 0), 33 | ('max_pool_3x3', 1), 34 | ('sep_conv_7x7', 0), 35 | ('avg_pool_3x3', 1), 36 | ('sep_conv_5x5', 0), 37 | ('skip_connect', 3), 38 | ('avg_pool_3x3', 2), 39 | ('sep_conv_3x3', 2), 40 | ('max_pool_3x3', 1), 41 | ], 42 | reduce_concat = [4, 5, 6], 43 | ) 44 | 45 | AmoebaNet = Genotype( 46 | normal = [ 47 | ('avg_pool_3x3', 0), 48 | ('max_pool_3x3', 1), 49 | ('sep_conv_3x3', 0), 50 | ('sep_conv_5x5', 2), 51 | ('sep_conv_3x3', 0), 52 | ('avg_pool_3x3', 3), 53 | ('sep_conv_3x3', 1), 54 | ('skip_connect', 1), 55 | ('skip_connect', 0), 56 | ('avg_pool_3x3', 1), 57 | ], 58 | normal_concat = [4, 5, 6], 59 | reduce = [ 60 | ('avg_pool_3x3', 0), 61 | ('sep_conv_3x3', 1), 62 | ('max_pool_3x3', 0), 63 | ('sep_conv_7x7', 2), 64 | ('sep_conv_7x7', 0), 65 | ('avg_pool_3x3', 1), 66 | ('max_pool_3x3', 0), 67 | 
('max_pool_3x3', 1), 68 | ('conv_7x1_1x7', 0), 69 | ('sep_conv_3x3', 5), 70 | ], 71 | reduce_concat = [3, 4, 6] 72 | ) 73 | 74 | DARTS_V1 = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('skip_connect', 0), ('sep_conv_3x3', 1), ('skip_connect', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('skip_connect', 2)], normal_concat=[2, 3, 4, 5], reduce=[('max_pool_3x3', 0), ('max_pool_3x3', 1), ('skip_connect', 2), ('max_pool_3x3', 0), ('max_pool_3x3', 0), ('skip_connect', 2), ('skip_connect', 2), ('avg_pool_3x3', 0)], reduce_concat=[2, 3, 4, 5]) 75 | DARTS_V2 = Genotype(normal=[('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 1), ('skip_connect', 0), ('skip_connect', 0), ('dil_conv_3x3', 2)], normal_concat=[2, 3, 4, 5], reduce=[('max_pool_3x3', 0), ('max_pool_3x3', 1), ('skip_connect', 2), ('max_pool_3x3', 1), ('max_pool_3x3', 0), ('skip_connect', 2), ('skip_connect', 2), ('max_pool_3x3', 1)], reduce_concat=[2, 3, 4, 5]) 76 | 77 | 78 | PC_DARTS_cifar = Genotype(normal=[('sep_conv_3x3', 1), ('skip_connect', 0), ('sep_conv_3x3', 0), ('dil_conv_3x3', 1), ('sep_conv_5x5', 0), ('sep_conv_3x3', 1), ('avg_pool_3x3', 0), ('dil_conv_3x3', 1)], normal_concat=range(2, 6), reduce=[('sep_conv_5x5', 1), ('max_pool_3x3', 0), ('sep_conv_5x5', 1), ('sep_conv_5x5', 2), ('sep_conv_3x3', 0), ('sep_conv_3x3', 3), ('sep_conv_3x3', 1), ('sep_conv_3x3', 2)], reduce_concat=range(2, 6)) 79 | PC_DARTS_image = Genotype(normal=[('skip_connect', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 0), ('skip_connect', 1), ('sep_conv_3x3', 1), ('sep_conv_3x3', 3), ('sep_conv_3x3', 1), ('dil_conv_5x5', 4)], normal_concat=range(2, 6), reduce=[('sep_conv_3x3', 0), ('skip_connect', 1), ('dil_conv_5x5', 2), ('max_pool_3x3', 1), ('sep_conv_3x3', 2), ('sep_conv_3x3', 1), ('sep_conv_5x5', 0), ('sep_conv_3x3', 3)], reduce_concat=range(2, 6)) 80 | 81 | 82 | PCDARTS = PC_DARTS_cifar 83 | 84 | 
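The genotypes above store each cell as a flat list of `(operation, input_index)` pairs, two per intermediate node. Indices 0 and 1 are the two cell inputs, and an index `j >= 2` refers to intermediate node `j - 2`, which is how `visualize.py` reads them. Below is a minimal sketch of a consistency check under that reading. `check_cell` is a hypothetical helper, not part of the repository, and the rule that a node may only read earlier states is assumed from the usual DARTS cell layout.

```python
def check_cell(cell, concat):
    """Return the number of intermediate nodes, raising if the cell is malformed."""
    if len(cell) % 2 != 0:
        raise ValueError("expected two (op, input) pairs per node")
    steps = len(cell) // 2
    for k, (op, j) in enumerate(cell):
        node = k // 2
        # Node `node` has state index node + 2, so it may only read states 0 .. node + 1.
        if not 0 <= j < node + 2:
            raise ValueError("pair %d (%s) reads state %d too early" % (k, op, j))
    # Only intermediate nodes (state indices 2 .. steps + 1) can be concatenated.
    if any(not 2 <= c < steps + 2 for c in concat):
        raise ValueError("concat refers to a state outside the cell")
    return steps


# Normal cell of DARTS_V2, copied from the definitions above.
darts_v2_normal = [
    ('sep_conv_3x3', 0), ('sep_conv_3x3', 1),
    ('sep_conv_3x3', 0), ('sep_conv_3x3', 1),
    ('sep_conv_3x3', 1), ('skip_connect', 0),
    ('skip_connect', 0), ('dil_conv_3x3', 2),
]
print(check_cell(darts_v2_normal, [2, 3, 4, 5]))
```

Because `concat` is only iterated, the check accepts both the list form used by `DARTS_V2` and the `range(2, 6)` form used by `PC_DARTS_cifar`.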
-------------------------------------------------------------------------------- /PC-DARTS/operations.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | 4 | OPS = { 5 | 'none' : lambda C, stride, affine: Zero(stride), 6 | 'avg_pool_3x3' : lambda C, stride, affine: nn.AvgPool2d(3, stride=stride, padding=1, count_include_pad=False), 7 | 'max_pool_3x3' : lambda C, stride, affine: nn.MaxPool2d(3, stride=stride, padding=1), 8 | 'skip_connect' : lambda C, stride, affine: Identity() if stride == 1 else FactorizedReduce(C, C, affine=affine), 9 | 'sep_conv_3x3' : lambda C, stride, affine: SepConv(C, C, 3, stride, 1, affine=affine), 10 | 'sep_conv_5x5' : lambda C, stride, affine: SepConv(C, C, 5, stride, 2, affine=affine), 11 | 'sep_conv_7x7' : lambda C, stride, affine: SepConv(C, C, 7, stride, 3, affine=affine), 12 | 'dil_conv_3x3' : lambda C, stride, affine: DilConv(C, C, 3, stride, 2, 2, affine=affine), 13 | 'dil_conv_5x5' : lambda C, stride, affine: DilConv(C, C, 5, stride, 4, 2, affine=affine), 14 | 'conv_7x1_1x7' : lambda C, stride, affine: nn.Sequential( 15 | nn.ReLU(inplace=False), 16 | nn.Conv2d(C, C, (1,7), stride=(1, stride), padding=(0, 3), bias=False), 17 | nn.Conv2d(C, C, (7,1), stride=(stride, 1), padding=(3, 0), bias=False), 18 | nn.BatchNorm2d(C, affine=affine) 19 | ), 20 | } 21 | 22 | class ReLUConvBN(nn.Module): 23 | 24 | def __init__(self, C_in, C_out, kernel_size, stride, padding, affine=True): 25 | super(ReLUConvBN, self).__init__() 26 | self.op = nn.Sequential( 27 | nn.ReLU(inplace=False), 28 | nn.Conv2d(C_in, C_out, kernel_size, stride=stride, padding=padding, bias=False), 29 | nn.BatchNorm2d(C_out, affine=affine) 30 | ) 31 | 32 | def forward(self, x): 33 | return self.op(x) 34 | 35 | class DilConv(nn.Module): 36 | 37 | def __init__(self, C_in, C_out, kernel_size, stride, padding, dilation, affine=True): 38 | super(DilConv, self).__init__() 39 | self.op = nn.Sequential( 
40 | nn.ReLU(inplace=False), 41 | nn.Conv2d(C_in, C_in, kernel_size=kernel_size, stride=stride, padding=padding, dilation=dilation, groups=C_in, bias=False), 42 | nn.Conv2d(C_in, C_out, kernel_size=1, padding=0, bias=False), 43 | nn.BatchNorm2d(C_out, affine=affine), 44 | ) 45 | 46 | def forward(self, x): 47 | return self.op(x) 48 | 49 | 50 | class SepConv(nn.Module): 51 | 52 | def __init__(self, C_in, C_out, kernel_size, stride, padding, affine=True): 53 | super(SepConv, self).__init__() 54 | self.op = nn.Sequential( 55 | nn.ReLU(inplace=False), 56 | nn.Conv2d(C_in, C_in, kernel_size=kernel_size, stride=stride, padding=padding, groups=C_in, bias=False), 57 | nn.Conv2d(C_in, C_in, kernel_size=1, padding=0, bias=False), 58 | nn.BatchNorm2d(C_in, affine=affine), 59 | nn.ReLU(inplace=False), 60 | nn.Conv2d(C_in, C_in, kernel_size=kernel_size, stride=1, padding=padding, groups=C_in, bias=False), 61 | nn.Conv2d(C_in, C_out, kernel_size=1, padding=0, bias=False), 62 | nn.BatchNorm2d(C_out, affine=affine), 63 | ) 64 | 65 | def forward(self, x): 66 | return self.op(x) 67 | 68 | 69 | class Identity(nn.Module): 70 | 71 | def __init__(self): 72 | super(Identity, self).__init__() 73 | 74 | def forward(self, x): 75 | return x 76 | 77 | 78 | class Zero(nn.Module): 79 | 80 | def __init__(self, stride): 81 | super(Zero, self).__init__() 82 | self.stride = stride 83 | 84 | def forward(self, x): 85 | if self.stride == 1: 86 | return x.mul(0.) 87 | return x[:,:,::self.stride,::self.stride].mul(0.) 
88 | 89 | 90 | class FactorizedReduce(nn.Module): 91 | 92 | def __init__(self, C_in, C_out, affine=True): 93 | super(FactorizedReduce, self).__init__() 94 | assert C_out % 2 == 0 95 | self.relu = nn.ReLU(inplace=False) 96 | self.conv_1 = nn.Conv2d(C_in, C_out // 2, 1, stride=2, padding=0, bias=False) 97 | self.conv_2 = nn.Conv2d(C_in, C_out // 2, 1, stride=2, padding=0, bias=False) 98 | self.bn = nn.BatchNorm2d(C_out, affine=affine) 99 | 100 | def forward(self, x): 101 | x = self.relu(x) 102 | out = torch.cat([self.conv_1(x), self.conv_2(x[:,:,1:,1:])], dim=1) 103 | out = self.bn(out) 104 | return out 105 | 106 | -------------------------------------------------------------------------------- /PC-DARTS/test.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | import glob 4 | import numpy as np 5 | import torch 6 | import utils 7 | import logging 8 | import argparse 9 | import torch.nn as nn 10 | import genotypes 11 | import torch.utils 12 | import torchvision.datasets as dset 13 | import torch.backends.cudnn as cudnn 14 | 15 | from torch.autograd import Variable 16 | from model import NetworkCIFAR as Network 17 | 18 | 19 | parser = argparse.ArgumentParser("cifar") 20 | parser.add_argument('--data', type=str, default='../data', help='location of the data corpus') 21 | parser.add_argument('--batch_size', type=int, default=96, help='batch size') 22 | parser.add_argument('--report_freq', type=float, default=50, help='report frequency') 23 | parser.add_argument('--gpu', type=int, default=0, help='gpu device id') 24 | parser.add_argument('--init_channels', type=int, default=36, help='num of init channels') 25 | parser.add_argument('--layers', type=int, default=20, help='total number of layers') 26 | parser.add_argument('--model_path', type=str, default='EXP/model.pt', help='path of pretrained model') 27 | parser.add_argument('--auxiliary', action='store_true', default=False, help='use auxiliary tower') 28 | 
parser.add_argument('--cutout', action='store_true', default=False, help='use cutout') 29 | parser.add_argument('--cutout_length', type=int, default=16, help='cutout length') 30 | parser.add_argument('--drop_path_prob', type=float, default=0.2, help='drop path probability') 31 | parser.add_argument('--seed', type=int, default=0, help='random seed') 32 | parser.add_argument('--arch', type=str, default='PCDARTS', help='which architecture to use (a name defined in genotypes.py)') 33 | args = parser.parse_args() 34 | 35 | log_format = '%(asctime)s %(message)s' 36 | logging.basicConfig(stream=sys.stdout, level=logging.INFO, 37 | format=log_format, datefmt='%m/%d %I:%M:%S %p') 38 | 39 | CIFAR_CLASSES = 10 40 | 41 | 42 | def main(): 43 | if not torch.cuda.is_available(): 44 | logging.info('no gpu device available') 45 | sys.exit(1) 46 | 47 | np.random.seed(args.seed) 48 | torch.cuda.set_device(args.gpu) 49 | cudnn.benchmark = True 50 | torch.manual_seed(args.seed) 51 | cudnn.enabled=True 52 | torch.cuda.manual_seed(args.seed) 53 | logging.info('gpu device = %d' % args.gpu) 54 | logging.info("args = %s", args) 55 | 56 | genotype = getattr(genotypes, args.arch) 57 | model = Network(args.init_channels, CIFAR_CLASSES, args.layers, args.auxiliary, genotype) 58 | model = model.cuda() 59 | utils.load(model, args.model_path) 60 | 61 | logging.info("param size = %fMB", utils.count_parameters_in_MB(model)) 62 | 63 | criterion = nn.CrossEntropyLoss() 64 | criterion = criterion.cuda() 65 | 66 | _, test_transform = utils._data_transforms_cifar10(args) 67 | test_data = dset.CIFAR10(root=args.data, train=False, download=True, transform=test_transform) 68 | 69 | test_queue = torch.utils.data.DataLoader( 70 | test_data, batch_size=args.batch_size, shuffle=False, pin_memory=True, num_workers=2) 71 | 72 | model.drop_path_prob = args.drop_path_prob 73 | test_acc, test_obj = infer(test_queue, model, criterion) 74 | logging.info('test_acc %f', test_acc) 75 | 76 | 77 | def infer(test_queue, model, criterion): 78 | objs =
utils.AvgrageMeter() 79 | top1 = utils.AvgrageMeter() 80 | top5 = utils.AvgrageMeter() 81 | model.eval() 82 | 83 | for step, (input, target) in enumerate(test_queue): 84 | input = input.cuda() 85 | target = target.cuda(non_blocking=True) # `async` is a reserved word since Python 3.7 86 | with torch.no_grad(): # replaces the removed `volatile=True` flag 87 | logits, _ = model(input) 88 | loss = criterion(logits, target) 89 | 90 | prec1, prec5 = utils.accuracy(logits, target, topk=(1, 5)) 91 | n = input.size(0) 92 | objs.update(loss.item(), n) 93 | top1.update(prec1.item(), n) 94 | top5.update(prec5.item(), n) 95 | 96 | if step % args.report_freq == 0: 97 | logging.info('test %03d %e %f %f', step, objs.avg, top1.avg, top5.avg) 98 | 99 | return top1.avg, objs.avg 100 | 101 | 102 | if __name__ == '__main__': 103 | main() 104 | 105 | -------------------------------------------------------------------------------- /PC-DARTS/visualize.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import genotypes 3 | from graphviz import Digraph 4 | 5 | 6 | def plot(genotype, filename): 7 | g = Digraph( 8 | format='pdf', 9 | edge_attr=dict(fontsize='20', fontname="times"), 10 | node_attr=dict(style='filled', shape='rect', align='center', fontsize='20', height='0.5', width='0.5', penwidth='2', fontname="times"), 11 | engine='dot') 12 | g.body.extend(['rankdir=LR']) 13 | 14 | g.node("c_{k-2}", fillcolor='darkseagreen2') 15 | g.node("c_{k-1}", fillcolor='darkseagreen2') 16 | assert len(genotype) % 2 == 0 17 | steps = len(genotype) // 2 18 | 19 | for i in range(steps): 20 | g.node(str(i), fillcolor='lightblue') 21 | 22 | for i in range(steps): 23 | for k in [2*i, 2*i + 1]: 24 | op, j = genotype[k] 25 | if j == 0: 26 | u = "c_{k-2}" 27 | elif j == 1: 28 | u = "c_{k-1}" 29 | else: 30 | u = str(j-2) 31 | v = str(i) 32 | g.edge(u, v, label=op, fillcolor="gray") 33 | 34 | g.node("c_{k}", fillcolor='palegoldenrod') 35 | for i in range(steps): 36 | g.edge(str(i), "c_{k}", fillcolor="gray") 37 | 38 |
g.render(filename, view=True) 39 | 40 | 41 | if __name__ == '__main__': 42 | if len(sys.argv) != 2: 43 | print("usage:\n python {} ARCH_NAME".format(sys.argv[0])) 44 | sys.exit(1) 45 | 46 | genotype_name = sys.argv[1] 47 | try: 48 | genotype = eval('genotypes.{}'.format(genotype_name)) 49 | except AttributeError: 50 | print("{} is not specified in genotypes.py".format(genotype_name)) 51 | sys.exit(1) 52 | 53 | plot(genotype.normal, "normal") 54 | plot(genotype.reduce, "reduction") 55 | 56 | -------------------------------------------------------------------------------- /PDARTS/README.md: -------------------------------------------------------------------------------- 1 | # Progressive Differentiable Architecture Search 2 | 3 | ## Generate a Random Architecture 4 | 5 | Same as DARTS, with the restriction of 2 skip-connections maximum for a given architecture. 6 | 7 | ## Search 8 | 9 | ``` 10 | python train_search.py 11 | --save test 12 | --tmp_data_dir "/data" # path to data 13 | --dataset cifar10 # choose between cifar10, cifar100, sport8, mit67 and flowers102 14 | --layers 5 # 5 for cifar10 and cifar100, 8 for sport8, mit67 and flowers102 15 | --add_layers 6 # 6 for cifar10 and cifar100, 0 for sport8, mit67 and flowers102 16 | --add_layers 12 # 12 for cifar10 and cifar100, 0 for sport8, mit67 and flowers102 17 | --dropout_rate 0.0 # 0.0 for cifar10, sport8, mit67 and flowers102, 0.1 for cifar100 18 | --dropout_rate 0.4 # 0.4 for cifar10, sport8, mit67 and flowers102, 0.2 for cifar100 19 | --dropout_rate 0.7 # 0.7 for cifar10, sport8, mit67 and flowers102, 0.3 for cifar100 20 | ``` 21 | 22 | ## Augment 23 | 24 | ``` 25 | python train_cifar.py 26 | --save test 27 | --tmp_data_dir "/data" # path to data 28 | --dataset cifar10 # choose between cifar10, cifar100, sport8, mit67 and flowers102 29 | --layers 20 # 20 for cifar10 and cifar100, 8 for sport8, mit67 and flowers102 30 | --arch genotype 31 | --auxiliary 32 | --cutout 33 | ``` 34 | 35 | 
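## Sketch: Random Architecture with a Skip-Connection Limit

The restriction described under "Generate a Random Architecture" can be sketched as plain rejection sampling. This is a hedged illustration rather than the script used for the experiments: `random_cell`, `count_skips` and `random_genotype` are hypothetical helpers, the operation list mirrors `PRIMITIVES` without `'none'`, and the limit of 2 is assumed to apply to the normal and reduction cells combined.

```python
import random
from collections import namedtuple

Genotype = namedtuple('Genotype', 'normal normal_concat reduce reduce_concat')

# Candidate operations, i.e. PRIMITIVES without the 'none' operation.
OPS = ['max_pool_3x3', 'avg_pool_3x3', 'skip_connect', 'sep_conv_3x3',
       'sep_conv_5x5', 'dil_conv_3x3', 'dil_conv_5x5']


def random_cell(rng, n_nodes=4):
    """Pick two distinct input states per intermediate node, each with a random op."""
    cell = []
    for node in range(n_nodes):
        # Node `node` may read from the 2 cell inputs and all earlier nodes.
        for src in rng.sample(range(node + 2), 2):
            cell.append((rng.choice(OPS), src))
    return cell


def count_skips(cell):
    """Number of skip connections in one cell."""
    return sum(op == 'skip_connect' for op, _ in cell)


def random_genotype(max_skips=2, seed=None, n_nodes=4):
    """Resample both cells until the skip-connection budget is respected."""
    rng = random.Random(seed)
    concat = list(range(2, 2 + n_nodes))
    while True:
        normal = random_cell(rng, n_nodes)
        reduce = random_cell(rng, n_nodes)
        # Assumption: the limit covers the whole architecture (both cells).
        if count_skips(normal) + count_skips(reduce) <= max_skips:
            return Genotype(normal, concat, reduce, concat)


print(random_genotype(seed=0))
```

The returned tuple has the same fields as the entries in `genotypes.py`, so it could be added there under a new name and selected with `--arch`. If the limit is meant per cell instead, the acceptance test would check each cell separately.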
-------------------------------------------------------------------------------- /PDARTS/genotypes.py: -------------------------------------------------------------------------------- 1 | from collections import namedtuple 2 | 3 | Genotype = namedtuple('Genotype', 'normal normal_concat reduce reduce_concat') 4 | 5 | PRIMITIVES = [ 6 | 'none', 7 | 'max_pool_3x3', 8 | 'avg_pool_3x3', 9 | 'skip_connect', 10 | 'sep_conv_3x3', 11 | 'sep_conv_5x5', 12 | 'dil_conv_3x3', 13 | 'dil_conv_5x5' 14 | ] 15 | 16 | NASNet = Genotype( 17 | normal = [ 18 | ('sep_conv_5x5', 1), 19 | ('sep_conv_3x3', 0), 20 | ('sep_conv_5x5', 0), 21 | ('sep_conv_3x3', 0), 22 | ('avg_pool_3x3', 1), 23 | ('skip_connect', 0), 24 | ('avg_pool_3x3', 0), 25 | ('avg_pool_3x3', 0), 26 | ('sep_conv_3x3', 1), 27 | ('skip_connect', 1), 28 | ], 29 | normal_concat = [2, 3, 4, 5, 6], 30 | reduce = [ 31 | ('sep_conv_5x5', 1), 32 | ('sep_conv_7x7', 0), 33 | ('max_pool_3x3', 1), 34 | ('sep_conv_7x7', 0), 35 | ('avg_pool_3x3', 1), 36 | ('sep_conv_5x5', 0), 37 | ('skip_connect', 3), 38 | ('avg_pool_3x3', 2), 39 | ('sep_conv_3x3', 2), 40 | ('max_pool_3x3', 1), 41 | ], 42 | reduce_concat = [4, 5, 6], 43 | ) 44 | 45 | AmoebaNet = Genotype( 46 | normal = [ 47 | ('avg_pool_3x3', 0), 48 | ('max_pool_3x3', 1), 49 | ('sep_conv_3x3', 0), 50 | ('sep_conv_5x5', 2), 51 | ('sep_conv_3x3', 0), 52 | ('avg_pool_3x3', 3), 53 | ('sep_conv_3x3', 1), 54 | ('skip_connect', 1), 55 | ('skip_connect', 0), 56 | ('avg_pool_3x3', 1), 57 | ], 58 | normal_concat = [4, 5, 6], 59 | reduce = [ 60 | ('avg_pool_3x3', 0), 61 | ('sep_conv_3x3', 1), 62 | ('max_pool_3x3', 0), 63 | ('sep_conv_7x7', 2), 64 | ('sep_conv_7x7', 0), 65 | ('avg_pool_3x3', 1), 66 | ('max_pool_3x3', 0), 67 | ('max_pool_3x3', 1), 68 | ('conv_7x1_1x7', 0), 69 | ('sep_conv_3x3', 5), 70 | ], 71 | reduce_concat = [3, 4, 6] 72 | ) 73 | 74 | DARTS_V1 = Genotype(normal=[('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('skip_connect', 0), ('sep_conv_3x3', 1), ('skip_connect', 0), ('sep_conv_3x3', 
1), ('sep_conv_3x3', 0), ('skip_connect', 2)], normal_concat=[2, 3, 4, 5], reduce=[('max_pool_3x3', 0), ('max_pool_3x3', 1), ('skip_connect', 2), ('max_pool_3x3', 0), ('max_pool_3x3', 0), ('skip_connect', 2), ('skip_connect', 2), ('avg_pool_3x3', 0)], reduce_concat=[2, 3, 4, 5]) 75 | DARTS_V2 = Genotype(normal=[('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 0), ('sep_conv_3x3', 1), ('sep_conv_3x3', 1), ('skip_connect', 0), ('skip_connect', 0), ('dil_conv_3x3', 2)], normal_concat=[2, 3, 4, 5], reduce=[('max_pool_3x3', 0), ('max_pool_3x3', 1), ('skip_connect', 2), ('max_pool_3x3', 1), ('max_pool_3x3', 0), ('skip_connect', 2), ('skip_connect', 2), ('max_pool_3x3', 1)], reduce_concat=[2, 3, 4, 5]) 76 | 77 | PDARTS = Genotype(normal=[('skip_connect', 0), ('dil_conv_3x3', 1), ('skip_connect', 0),('sep_conv_3x3', 1), ('sep_conv_3x3', 1), ('sep_conv_3x3', 3), ('sep_conv_3x3',0), ('dil_conv_5x5', 4)], normal_concat=range(2, 6), reduce=[('avg_pool_3x3', 0), ('sep_conv_5x5', 1), ('sep_conv_3x3', 0), ('dil_conv_5x5', 2), ('max_pool_3x3', 0), ('dil_conv_3x3', 1), ('dil_conv_3x3', 1), ('dil_conv_5x5', 3)], reduce_concat=range(2, 6)) 78 | 79 | 80 | -------------------------------------------------------------------------------- /PDARTS/operations.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | 4 | OPS = { 5 | 'none' : lambda C, stride, affine: Zero(stride), 6 | 'avg_pool_3x3' : lambda C, stride, affine: nn.AvgPool2d(3, stride=stride, padding=1, count_include_pad=False), 7 | 'max_pool_3x3' : lambda C, stride, affine: nn.MaxPool2d(3, stride=stride, padding=1), 8 | 'skip_connect' : lambda C, stride, affine: Identity() if stride == 1 else FactorizedReduce(C, C, affine=affine), 9 | 'sep_conv_3x3' : lambda C, stride, affine: SepConv(C, C, 3, stride, 1, affine=affine), 10 | 'sep_conv_5x5' : lambda C, stride, affine: SepConv(C, C, 5, stride, 2, affine=affine), 11 | 'sep_conv_7x7' : lambda C, 
stride, affine: SepConv(C, C, 7, stride, 3, affine=affine), 12 | 'dil_conv_3x3' : lambda C, stride, affine: DilConv(C, C, 3, stride, 2, 2, affine=affine), 13 | 'dil_conv_5x5' : lambda C, stride, affine: DilConv(C, C, 5, stride, 4, 2, affine=affine), 14 | 'conv_7x1_1x7' : lambda C, stride, affine: nn.Sequential( 15 | nn.ReLU(inplace=False), 16 | nn.Conv2d(C, C, (1,7), stride=(1, stride), padding=(0, 3), bias=False), 17 | nn.Conv2d(C, C, (7,1), stride=(stride, 1), padding=(3, 0), bias=False), 18 | nn.BatchNorm2d(C, affine=affine) 19 | ), 20 | } 21 | 22 | class ReLUConvBN(nn.Module): 23 | 24 | def __init__(self, C_in, C_out, kernel_size, stride, padding, affine=True): 25 | super(ReLUConvBN, self).__init__() 26 | self.op = nn.Sequential( 27 | nn.ReLU(inplace=False), 28 | nn.Conv2d(C_in, C_out, kernel_size, stride=stride, padding=padding, bias=False), 29 | nn.BatchNorm2d(C_out, affine=affine) 30 | ) 31 | 32 | def forward(self, x): 33 | return self.op(x) 34 | 35 | class DilConv(nn.Module): 36 | 37 | def __init__(self, C_in, C_out, kernel_size, stride, padding, dilation, affine=True): 38 | super(DilConv, self).__init__() 39 | self.op = nn.Sequential( 40 | nn.ReLU(inplace=False), 41 | nn.Conv2d(C_in, C_in, kernel_size=kernel_size, stride=stride, padding=padding, dilation=dilation, groups=C_in, bias=False), 42 | nn.Conv2d(C_in, C_out, kernel_size=1, padding=0, bias=False), 43 | nn.BatchNorm2d(C_out, affine=affine), 44 | ) 45 | 46 | def forward(self, x): 47 | return self.op(x) 48 | 49 | 50 | class SepConv(nn.Module): 51 | 52 | def __init__(self, C_in, C_out, kernel_size, stride, padding, affine=True): 53 | super(SepConv, self).__init__() 54 | self.op = nn.Sequential( 55 | nn.ReLU(inplace=False), 56 | nn.Conv2d(C_in, C_in, kernel_size=kernel_size, stride=stride, padding=padding, groups=C_in, bias=False), 57 | nn.Conv2d(C_in, C_in, kernel_size=1, padding=0, bias=False), 58 | nn.BatchNorm2d(C_in, affine=affine), 59 | nn.ReLU(inplace=False), 60 | nn.Conv2d(C_in, C_in, 
kernel_size=kernel_size, stride=1, padding=padding, groups=C_in, bias=False), 61 | nn.Conv2d(C_in, C_out, kernel_size=1, padding=0, bias=False), 62 | nn.BatchNorm2d(C_out, affine=affine), 63 | ) 64 | 65 | def forward(self, x): 66 | return self.op(x) 67 | 68 | 69 | class Identity(nn.Module): 70 | 71 | def __init__(self): 72 | super(Identity, self).__init__() 73 | 74 | def forward(self, x): 75 | return x 76 | 77 | ''' 78 | class Zero(nn.Module): 79 | 80 | def __init__(self, stride): 81 | super(Zero, self).__init__() 82 | self.stride = stride 83 | 84 | def forward(self, x): 85 | if self.stride == 1: 86 | return x.mul(0.) 87 | return x[:,:,::self.stride,::self.stride].mul(0.) 88 | ''' 89 | 90 | class Zero(nn.Module): 91 | 92 | def __init__(self, stride): 93 | super(Zero, self).__init__() 94 | self.stride = stride 95 | def forward(self, x): 96 | n, c, h, w = x.size() 97 | h //= self.stride 98 | w //= self.stride 99 | if x.is_cuda: 100 | with torch.cuda.device(x.get_device()): 101 | padding = torch.cuda.FloatTensor(n, c, h, w).fill_(0) 102 | else: 103 | padding = torch.FloatTensor(n, c, h, w).fill_(0) 104 | return padding 105 | 106 | class FactorizedReduce(nn.Module): 107 | 108 | def __init__(self, C_in, C_out, affine=True): 109 | super(FactorizedReduce, self).__init__() 110 | assert C_out % 2 == 0 111 | self.relu = nn.ReLU(inplace=False) 112 | self.conv_1 = nn.Conv2d(C_in, C_out // 2, 1, stride=2, padding=0, bias=False) 113 | self.conv_2 = nn.Conv2d(C_in, C_out // 2, 1, stride=2, padding=0, bias=False) 114 | self.bn = nn.BatchNorm2d(C_out, affine=affine) 115 | 116 | def forward(self, x): 117 | x = self.relu(x) 118 | out = torch.cat([self.conv_1(x), self.conv_2(x[:,:,1:,1:])], dim=1) 119 | out = self.bn(out) 120 | return out 121 | 122 | -------------------------------------------------------------------------------- /PDARTS/test.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | import glob 4 | import 
numpy as np 5 | import torch 6 | import utils 7 | import logging 8 | import argparse 9 | import torch.nn as nn 10 | import genotypes 11 | import torch.utils 12 | import torchvision.datasets as dset 13 | import torch.backends.cudnn as cudnn 14 | 15 | from model import NetworkCIFAR as Network 16 | from model import NetworkImageNet as NetworkLarge 17 | 18 | parser = argparse.ArgumentParser("cifar") 19 | parser.add_argument('--dataset', default="CIFAR10", help='cifar10/mit67/sport8/cifar100') 20 | parser.add_argument('--data', type=str, default='../data', help='location of the data corpus') 21 | parser.add_argument('--batch_size', type=int, default=128, help='batch size') 22 | parser.add_argument('--report_freq', type=float, default=50, help='report frequency') 23 | parser.add_argument('--gpu', type=int, default=0, help='gpu device id') 24 | parser.add_argument('--init_channels', type=int, default=36, help='num of init channels') 25 | parser.add_argument('--layers', type=int, default=20, help='total number of layers') 26 | parser.add_argument('--model_path', type=str, default='CIFAR10.pt', help='path of pretrained model') 27 | parser.add_argument('--auxiliary', action='store_true', default=True, help='use auxiliary tower') 28 | parser.add_argument('--cutout', action='store_true', default=True, help='use cutout') 29 | parser.add_argument('--cutout_length', type=int, default=16, help='cutout length') 30 | parser.add_argument('--arch', type=str, default='PDARTS', help='which architecture to use') 31 | parser.add_argument('--tmp_data_dir', type=str, default='/tmp/cache/', help='temp data dir') 32 | args = parser.parse_args() 33 | 34 | log_format = '%(asctime)s %(message)s' 35 | logging.basicConfig(stream=sys.stdout, level=logging.INFO, 36 | format=log_format, datefmt='%m/%d %I:%M:%S %p') 37 | 38 | if args.dataset=="CIFAR100": 39 | CLASSES = 100 40 | elif args.dataset=="CIFAR10": 41 | CLASSES = 10 42 | elif args.dataset == 'mit67': 43 | CLASSES = 67 44 | elif args.dataset 
== 'sport8': 45 | CLASSES = 8 46 | elif args.dataset == 'flowers102': 47 | CLASSES = 102 48 | 49 | def main(): 50 | if not torch.cuda.is_available(): 51 | logging.info('no gpu device available') 52 | sys.exit(1) 53 | 54 | torch.cuda.set_device(args.gpu) 55 | cudnn.enabled=True 56 | logging.info("args = %s", args) 57 | 58 | genotype = eval("genotypes.%s" % args.arch) 59 | if args.dataset in utils.LARGE_DATASETS: 60 | model = NetworkLarge(args.init_channels, CLASSES, args.layers, args.auxiliary, genotype) 61 | else: 62 | model = Network(args.init_channels, CLASSES, args.layers, args.auxiliary, genotype) 63 | model = model.cuda() 64 | utils.load(model, args.model_path) 65 | 66 | logging.info("param size = %fMB", utils.count_parameters_in_MB(model)) 67 | 68 | criterion = nn.CrossEntropyLoss() 69 | criterion = criterion.cuda() 70 | 71 | _, test_transform = utils.data_transforms(args.dataset,args.cutout,args.cutout_length) 72 | if args.dataset=="CIFAR100": 73 | test_data = dset.CIFAR100(root=args.data, train=False, download=True, transform=test_transform) 74 | elif args.dataset=="CIFAR10": 75 | test_data = dset.CIFAR10(root=args.data, train=False, download=True, transform=test_transform) 76 | elif args.dataset=="sport8": 77 | dset_cls = dset.ImageFolder 78 | val_path = '%s/Sport8/test' %args.data 79 | test_data = dset_cls(root=val_path, transform=test_transform) 80 | elif args.dataset=="mit67": 81 | dset_cls = dset.ImageFolder 82 | val_path = '%s/MIT67/test' %args.data 83 | test_data = dset_cls(root=val_path, transform=test_transform) 84 | elif args.dataset == "flowers102": 85 | dset_cls = dset.ImageFolder 86 | val_path = '%s/flowers102/test' % args.tmp_data_dir 87 | test_data = dset_cls(root=val_path, transform=test_transform) 88 | test_queue = torch.utils.data.DataLoader( 89 | test_data, batch_size=args.batch_size, shuffle=False, pin_memory=False, num_workers=2) 90 | 91 | model.drop_path_prob = 0.0 92 | test_acc, test_obj = infer(test_queue, model, criterion) 93 | 
logging.info('Test_acc %f', test_acc) 94 | 95 | 96 | def infer(test_queue, model, criterion): 97 | objs = utils.AvgrageMeter() 98 | top1 = utils.AvgrageMeter() 99 | top5 = utils.AvgrageMeter() 100 | model.eval() 101 | 102 | for step, (input, target) in enumerate(test_queue): 103 | input = input.cuda() 104 | target = target.cuda() 105 | with torch.no_grad(): 106 | logits, _ = model(input) 107 | loss = criterion(logits, target) 108 | 109 | prec1, prec5 = utils.accuracy(logits, target, topk=(1, 5)) 110 | n = input.size(0) 111 | objs.update(loss.data.item(), n) 112 | top1.update(prec1.data.item(), n) 113 | top5.update(prec5.data.item(), n) 114 | 115 | if step % args.report_freq == 0: 116 | logging.info('test %03d %e %f %f', step, objs.avg, top1.avg, top5.avg) 117 | 118 | return top1.avg, objs.avg 119 | 120 | 121 | if __name__ == '__main__': 122 | main() 123 | 124 | -------------------------------------------------------------------------------- /PDARTS/utils.py: -------------------------------------------------------------------------------- 1 | import os 2 | import numpy as np 3 | import torch 4 | import shutil 5 | import torchvision.transforms as transforms 6 | from torch.autograd import Variable 7 | 8 | LARGE_DATASETS = ["mit67", "sport8", "flowers102"] 9 | 10 | class AvgrageMeter(object): 11 | 12 | def __init__(self): 13 | self.reset() 14 | 15 | def reset(self): 16 | self.avg = 0 17 | self.sum = 0 18 | self.cnt = 0 19 | 20 | def update(self, val, n=1): 21 | self.sum += val * n 22 | self.cnt += n 23 | self.avg = self.sum / self.cnt 24 | 25 | 26 | def accuracy(output, target, topk=(1,)): 27 | maxk = max(topk) 28 | batch_size = target.size(0) 29 | 30 | _, pred = output.topk(maxk, 1, True, True) 31 | pred = pred.t() 32 | correct = pred.eq(target.view(1, -1).expand_as(pred)) 33 | 34 | res = [] 35 | for k in topk: 36 | correct_k = correct[:k].view(-1).float().sum(0) 37 | res.append(correct_k.mul_(100.0/batch_size)) 38 | return res 39 | 40 | 41 | class Cutout(object): 42 
| def __init__(self, length): 43 | self.length = length 44 | 45 | def __call__(self, img): 46 | h, w = img.size(1), img.size(2) 47 | mask = np.ones((h, w), np.float32) 48 | y = np.random.randint(h) 49 | x = np.random.randint(w) 50 | 51 | y1 = np.clip(y - self.length // 2, 0, h) 52 | y2 = np.clip(y + self.length // 2, 0, h) 53 | x1 = np.clip(x - self.length // 2, 0, w) 54 | x2 = np.clip(x + self.length // 2, 0, w) 55 | 56 | mask[y1: y2, x1: x2] = 0. 57 | mask = torch.from_numpy(mask) 58 | mask = mask.expand_as(img) 59 | img *= mask 60 | return img 61 | 62 | 63 | def data_transforms(dataset,cutout, cutout_length): 64 | if dataset in LARGE_DATASETS: 65 | MEAN = [0.485, 0.456, 0.406] 66 | STD = [0.229, 0.224, 0.225] 67 | transf_train = [ 68 | transforms.RandomResizedCrop(224), 69 | transforms.RandomHorizontalFlip(), 70 | transforms.ColorJitter( 71 | brightness=0.4, 72 | contrast=0.4, 73 | saturation=0.4, 74 | hue=0.2) 75 | ] 76 | transf_val = [ 77 | transforms.Resize(256), 78 | transforms.CenterCrop(224), 79 | ] 80 | normalize = [ 81 | transforms.ToTensor(), 82 | transforms.Normalize(MEAN, STD) 83 | ] 84 | train_transform = transforms.Compose(transf_train + normalize) 85 | if cutout: 86 | train_transform.transforms.append(Cutout(cutout_length)) 87 | valid_transform = transforms.Compose(transf_val + normalize) 88 | return train_transform, valid_transform 89 | if dataset == "CIFAR10": 90 | CIFAR_MEAN = [0.49139968, 0.48215827, 0.44653124] 91 | CIFAR_STD = [0.24703233, 0.24348505, 0.26158768] 92 | 93 | train_transform = transforms.Compose([ 94 | transforms.RandomCrop(32, padding=4), 95 | transforms.RandomHorizontalFlip(), 96 | transforms.ToTensor(), 97 | transforms.Normalize(CIFAR_MEAN, CIFAR_STD), 98 | ]) 99 | if cutout: 100 | train_transform.transforms.append(Cutout(cutout_length)) 101 | 102 | valid_transform = transforms.Compose([ 103 | transforms.ToTensor(), 104 | transforms.Normalize(CIFAR_MEAN, CIFAR_STD), 105 | ]) 106 | return train_transform, valid_transform 107 | 
if dataset == "CIFAR100": 108 | CIFAR_MEAN = [0.5071, 0.4867, 0.4408] 109 | CIFAR_STD = [0.2675, 0.2565, 0.2761] 110 | 111 | train_transform = transforms.Compose([ 112 | transforms.RandomCrop(32, padding=4), 113 | transforms.RandomHorizontalFlip(), 114 | transforms.ToTensor(), 115 | transforms.Normalize(CIFAR_MEAN, CIFAR_STD), 116 | ]) 117 | if cutout: 118 | train_transform.transforms.append(Cutout(cutout_length)) 119 | 120 | valid_transform = transforms.Compose([ 121 | transforms.ToTensor(), 122 | transforms.Normalize(CIFAR_MEAN, CIFAR_STD), 123 | ]) 124 | return train_transform, valid_transform 125 | 126 | def count_parameters_in_MB(model): 127 | return np.sum(np.prod(v.size()) for name, v in model.named_parameters() if "auxiliary" not in name)/1e6 128 | 129 | 130 | def save_checkpoint(state, is_best, save): 131 | filename = os.path.join(save, 'checkpoint.pth.tar') 132 | torch.save(state, filename) 133 | if is_best: 134 | best_filename = os.path.join(save, 'model_best.pth.tar') 135 | shutil.copyfile(filename, best_filename) 136 | 137 | 138 | def save(model, model_path): 139 | torch.save(model.state_dict(), model_path) 140 | 141 | 142 | def load(model, model_path): 143 | model.load_state_dict(torch.load(model_path)) 144 | 145 | 146 | def drop_path(x, drop_prob): 147 | if drop_prob > 0.: 148 | keep_prob = 1.-drop_prob 149 | mask = Variable(torch.cuda.FloatTensor(x.size(0), 1, 1, 1).bernoulli_(keep_prob)) 150 | x.div_(keep_prob) 151 | x.mul_(mask) 152 | return x 153 | 154 | 155 | def create_exp_dir(path, scripts_to_save=None): 156 | if not os.path.exists(path): 157 | os.mkdir(path) 158 | print('Experiment dir : {}'.format(path)) 159 | 160 | if scripts_to_save is not None: 161 | os.mkdir(os.path.join(path, 'scripts')) 162 | for script in scripts_to_save: 163 | dst_file = os.path.join(path, 'scripts', os.path.basename(script)) 164 | shutil.copyfile(script, dst_file) 165 | 166 | -------------------------------------------------------------------------------- 
/PDARTS/visualize.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import genotypes 3 | from graphviz import Digraph 4 | 5 | 6 | def plot(genotype, filename): 7 | g = Digraph( 8 | format='pdf', 9 | edge_attr=dict(fontsize='20', fontname="times"), 10 | node_attr=dict(style='filled', shape='rect', align='center', fontsize='20', height='0.5', width='0.5', penwidth='2', fontname="times"), 11 | engine='dot') 12 | g.body.extend(['rankdir=LR']) 13 | 14 | g.node("c_{k-2}", fillcolor='darkseagreen2') 15 | g.node("c_{k-1}", fillcolor='darkseagreen2') 16 | assert len(genotype) % 2 == 0 17 | steps = len(genotype) // 2 18 | 19 | for i in range(steps): 20 | g.node(str(i), fillcolor='lightblue') 21 | 22 | for i in range(steps): 23 | for k in [2*i, 2*i + 1]: 24 | op, j = genotype[k] 25 | if j == 0: 26 | u = "c_{k-2}" 27 | elif j == 1: 28 | u = "c_{k-1}" 29 | else: 30 | u = str(j-2) 31 | v = str(i) 32 | g.edge(u, v, label=op, fillcolor="gray") 33 | 34 | g.node("c_{k}", fillcolor='palegoldenrod') 35 | for i in range(steps): 36 | g.edge(str(i), "c_{k}", fillcolor="gray") 37 | 38 | g.render(filename, view=True) 39 | 40 | 41 | if __name__ == '__main__': 42 | if len(sys.argv) != 2: 43 | print("usage:\n python {} ARCH_NAME".format(sys.argv[0])) 44 | sys.exit(1) 45 | 46 | genotype_name = sys.argv[1] 47 | try: 48 | genotype = eval('genotypes.{}'.format(genotype_name)) 49 | except AttributeError: 50 | print("{} is not specified in genotypes.py".format(genotype_name)) 51 | sys.exit(1) 52 | 53 | plot(genotype.normal, "normal") 54 | plot(genotype.reduce, "reduction") 55 | 56 | -------------------------------------------------------------------------------- /PRDARTS/README.md: -------------------------------------------------------------------------------- 1 | # Prune and Replace NAS 2 | 3 | ## Generate a Random Architecture 4 | 5 | ``` 6 | morph_restrictions='r_depth' 7 | if morph_restrictions=='darts_like': 8 | conv_min_k=3 9 | 
conv_max_k=7 10 | conv_max_dil=2 11 | conv_max_c_mul_len=1 12 | conv_max_c_mul_num=1 13 | conv_max_c_mul_width=1 14 | elif morph_restrictions=='unrestricted': 15 | conv_min_k=1 16 | conv_max_k=7 17 | conv_max_dil=2 18 | conv_max_c_mul_len=999 19 | conv_max_c_mul_num=999 20 | conv_max_c_mul_width=999 21 | elif morph_restrictions=='r_depth': 22 | conv_min_k = 1 23 | conv_max_k = 7 24 | conv_max_dil = 2 25 | conv_max_c_mul_len = 1 26 | conv_max_c_mul_num = 5 27 | conv_max_c_mul_width = 5 28 | c_muls = [[]] 29 | for l in range(1, conv_max_c_mul_width + 1): #To be corrected for unrestricted search space 30 | cmul = [] 31 | cmul.append(l) 32 | c_muls.append(cmul) 33 | CONV_PRIMITIVES = [('conv', {'dil': i, 'c_mul': cmul, 'k': k}) for i in range(1, conv_max_dil + 1) for k in 34 | range(conv_min_k, conv_max_k + 1) for cmul in c_muls] 35 | PRIMITIVES = [('skip', {}), ('pool', {'k': 3, 'type_': 'max'}), 36 | ('pool', {'k': 3, 'type_': 'avg'})] + CONV_PRIMITIVES 37 | from genotypes import Genotype 38 | import random 39 | 40 | normal = [] 41 | reduce = [] 42 | ops_normal = {} 43 | ops_reduce = {} 44 | links_normal = {} 45 | links_reduce = {} 46 | for i in range(4): 47 | #Giving fixed chances for skip and pooling operations as PRDARTS never searches for all convolutions of the space at the same time 48 | ops_normal[i] = random.choices([j for j in range(len(PRIMITIVES))], 49 | weights=[1 / 8, 1 / 8, 1 / 8] + len(CONV_PRIMITIVES) * [ 50 | 1 / len(CONV_PRIMITIVES)], k=2) 51 | ops_reduce[i] = random.choices([j for j in range(len(PRIMITIVES))], 52 | weights=[1 / 8, 1 / 8, 1 / 8] + len(CONV_PRIMITIVES) * [ 53 | 1 / len(CONV_PRIMITIVES)], k=2) 54 | links_normal[i] = random.sample([j for j in range(i + 2)], 2) 55 | links_reduce[i] = random.sample([j for j in range(i + 2)], 2) 56 | normal.append((PRIMITIVES[ops_normal[i][0]][0], PRIMITIVES[ops_normal[i][0]][1], links_normal[i][0])) 57 | reduce.append((PRIMITIVES[ops_reduce[i][0]][0], PRIMITIVES[ops_reduce[i][0]][1], 
links_reduce[i][0])) 58 | normal.append((PRIMITIVES[ops_normal[i][1]][0], PRIMITIVES[ops_normal[i][1]][1], links_normal[i][1])) 59 | reduce.append((PRIMITIVES[ops_reduce[i][1]][0], PRIMITIVES[ops_reduce[i][1]][1], links_reduce[i][1])) 60 | reduce_concat = [2, 3, 4, 5] 61 | normal_concat = [2, 3, 4, 5] 62 | genotype = Genotype(normal, normal_concat, reduce, reduce_concat) 63 | ``` 64 | 65 | ## Search 66 | 67 | ``` 68 | python train_search.py 69 | --dataset CIFAR10 # choose between CIFAR10, CIFAR100, Sport8, MIT67 and flowers102 70 | --tmp_data_dir /data # path to your data 71 | --save test 72 | --batch_size_min 24 73 | --num_morphs "3, 3, 3, 3, 3, 0, 0, 0, 0" 74 | --grace_epochs "5, 5, 3, 3, 3, 3, 3, 3, 3" 75 | --num_to_keep="6, 6, 6, 6, 6, 4, 3, 2, 1" 76 | --epochs "15, 15, 10, 10, 10, 10, 10, 10, 10" 77 | --try_load "true" 78 | --report_freq 50 79 | --test "false" 80 | --batch_multiples 8 81 | --batch_size_max 128 82 | --primitives 'prdarts4' 83 | --dropout_rate="0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0" 84 | --add_layers "3, 3, 3, 3, 3, 3, 3, 3, 3" 85 | --init_channels 16 86 | --add_width "0, 0, 0, 0, 0, 0, 0, 0, 0" 87 | --seed 0 88 | --morph_restrictions 'r_depth' 89 | --learning_rate_min 0.01 90 | ``` 91 | 92 | ## Augment 93 | 94 | ``` 95 | python train_cifar.py 96 | --dataset CIFAR10 # choose between CIFAR10, CIFAR100, Sport8, MIT67 and flowers102 97 | --tmp_data_dir /data # path to your data 98 | --save test 99 | --learning_rate 0.025 100 | --batch_size_min 32 101 | --epochs 600 102 | --auxiliary "true" 103 | --batch_size_max=128 104 | --workers 4 105 | --batch_multiples 8 106 | --seed 0 107 | --arch genotype 108 | --drop_path_prob 0.3 109 | --cutout "true" 110 | --init_channels 36 111 | --layers 20 #20 for CIFAR datasets, 8 for Sport8, MIT67 and flowers102 112 | ``` 113 | -------------------------------------------------------------------------------- /PRDARTS/gpu_thread.py: -------------------------------------------------------------------------------- 
1 | import GPUtil 2 | import threading 3 | 4 | 5 | class AbortableSleep(): 6 | """ 7 | A class that enables sleeping with interrupts 8 | see https://stackoverflow.com/questions/28478291/abortable-sleep-in-python 9 | """ 10 | 11 | def __init__(self): 12 | self._condition = threading.Condition() 13 | self._aborted = False 14 | 15 | def __call__(self, secs): 16 | with self._condition: 17 | self._aborted = False 18 | self._condition.wait(timeout=secs) 19 | return not self._aborted 20 | 21 | def abort(self): 22 | with self._condition: 23 | self._condition.notify() 24 | self._aborted = True 25 | 26 | 27 | class GpuLogThread(threading.Thread): 28 | """ 29 | simple thread to log gpu util to tensorboard and keep track of util spikes 30 | not multithreading safe 31 | """ 32 | 33 | def __init__(self, gpu_ids: list, writer, seconds=10, rs=0.5): 34 | super(GpuLogThread, self).__init__() 35 | self.gpu_ids = gpu_ids 36 | self.seconds = seconds 37 | self.rs = rs 38 | self.writer = writer 39 | self.step_writer = 0 40 | self.step_recent = 0 41 | self.keep_running = True 42 | self.max_recent_util = 0.0 43 | self._abortable_sleep = AbortableSleep() 44 | self.daemon = True 45 | 46 | def __log_gpus(self): 47 | for i, gpu in enumerate(GPUtil.getGPUs()): 48 | if i in self.gpu_ids: 49 | # self.writer.add_scalar('gpus/%d/%s' % (gpu.id, 'memoryTotal'), gpu.memoryTotal, step) 50 | # self.writer.add_scalar('gpus/%d/%s' % (gpu.id, 'memoryUsed'), gpu.memoryUsed, step) 51 | # self.writer.add_scalar('gpus/%d/%s' % (gpu.id, 'memoryFree'), gpu.memoryFree, step) 52 | self.writer.add_scalar('gpus/%d/%s' % (gpu.id, 'memoryUtil'), gpu.memoryUtil, self.step_writer) 53 | self.writer.add_scalar('gpus/recentMaxUtil', self.max_recent_util, self.step_writer) 54 | self.step_writer += 1 55 | 56 | def __update_recent(self): 57 | for i, gpu in enumerate(GPUtil.getGPUs()): 58 | if i in self.gpu_ids: 59 | self.max_recent_util = max(self.max_recent_util, gpu.memoryUtil) 60 | self.step_recent += 1 61 | 62 | def 
wakeup(self): 63 | self.__update_recent() 64 | self._abortable_sleep.abort() 65 | 66 | def run(self): 67 | while self.keep_running: 68 | for i in range(int(self.seconds / self.rs)): 69 | self.__update_recent() 70 | self._abortable_sleep(self.rs) 71 | self.__log_gpus() 72 | 73 | def stop(self): 74 | self.keep_running = False 75 | 76 | def get_highest_recent(self): 77 | usage = self.max_recent_util 78 | if usage <= 0: 79 | return 1.0 80 | return usage 81 | 82 | def reset_recent(self): 83 | self.max_recent_util = 0.0 84 | -------------------------------------------------------------------------------- /PRDARTS/test.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | import glob 4 | import numpy as np 5 | import torch 6 | import utils 7 | import logging 8 | import argparse 9 | import torch.nn as nn 10 | import genotypes 11 | import torch.utils 12 | import torchvision.datasets as dset 13 | import torch.backends.cudnn as cudnn 14 | 15 | from torch.autograd import Variable 16 | from model import NetworkCIFAR as Network 17 | 18 | 19 | parser = argparse.ArgumentParser("cifar") 20 | parser.add_argument('--data', type=str, default='../data', help='location of the data corpus') 21 | parser.add_argument('--batch_size', type=int, default=96, help='batch size') 22 | parser.add_argument('--report_freq', type=float, default=50, help='report frequency') 23 | parser.add_argument('--gpu', type=int, default=0, help='gpu device id') 24 | parser.add_argument('--init_channels', type=int, default=36, help='num of init channels') 25 | parser.add_argument('--layers', type=int, default=20, help='total number of layers') 26 | parser.add_argument('--model_path', type=str, default='EXP/model.pt', help='path of pretrained model') 27 | parser.add_argument('--auxiliary', action='store_true', default=False, help='use auxiliary tower') 28 | parser.add_argument('--cutout', action='store_true', default=False, help='use cutout') 29 | 
parser.add_argument('--cutout_length', type=int, default=16, help='cutout length') 30 | parser.add_argument('--drop_path_prob', type=float, default=0.2, help='drop path probability') 31 | parser.add_argument('--seed', type=int, default=0, help='random seed') 32 | parser.add_argument('--arch', type=str, default='DARTS', help='which architecture to use') 33 | args = parser.parse_args() 34 | 35 | log_format = '%(asctime)s %(message)s' 36 | logging.basicConfig(stream=sys.stdout, level=logging.INFO, 37 | format=log_format, datefmt='%m/%d %I:%M:%S %p') 38 | 39 | CIFAR_CLASSES = 10 40 | 41 | 42 | def main(): 43 | if not torch.cuda.is_available(): 44 | logging.info('no gpu device available') 45 | sys.exit(1) 46 | 47 | np.random.seed(args.seed) 48 | torch.cuda.set_device(args.gpu) 49 | cudnn.benchmark = True 50 | torch.manual_seed(args.seed) 51 | cudnn.enabled=True 52 | torch.cuda.manual_seed(args.seed) 53 | logging.info('gpu device = %d' % args.gpu) 54 | logging.info("args = %s", args) 55 | 56 | genotype = eval("genotypes.%s" % args.arch) 57 | model = Network(args.init_channels, CIFAR_CLASSES, args.layers, args.auxiliary, genotype) 58 | model = model.cuda() 59 | utils.load(model, args.model_path) 60 | 61 | logging.info("param size = %fMB", utils.count_parameters_in_MB(model)) 62 | 63 | criterion = nn.CrossEntropyLoss() 64 | criterion = criterion.cuda() 65 | 66 | _, test_transform = utils.data_transforms_cifar10(args) 67 | test_data = dset.CIFAR10(root=args.data, train=False, download=True, transform=test_transform) 68 | 69 | test_queue = torch.utils.data.DataLoader( 70 | test_data, batch_size=args.batch_size, shuffle=False, pin_memory=True, num_workers=2) 71 | 72 | model.drop_path_prob = args.drop_path_prob 73 | test_acc, test_obj = infer(test_queue, model, criterion) 74 | logging.info('test_acc %f', test_acc) 75 | 76 | 77 | def infer(test_queue, model, criterion): 78 | objs = utils.AvgrageMeter() 79 | top1 = utils.AvgrageMeter() 80 | top5 = utils.AvgrageMeter() 81 | 
model.eval() 82 | 83 | for step, (input, target) in enumerate(test_queue): 84 | input = input.cuda() 85 | target = target.cuda(non_blocking=True) 86 | with torch.no_grad(): 87 | logits, _ = model(input) 88 | loss = criterion(logits, target) 89 | 90 | prec1, prec5 = utils.accuracy(logits, target, topk=(1, 5)) 91 | n = input.size(0) 92 | objs.update(loss.item(), n) 93 | top1.update(prec1.item(), n) 94 | top5.update(prec5.item(), n) 95 | 96 | if step % args.report_freq == 0: 97 | logging.info('test %03d %e %f %f', step, objs.avg, top1.avg, top5.avg) 98 | 99 | return top1.avg, objs.avg 100 | 101 | 102 | if __name__ == '__main__': 103 | main() 104 | 105 | -------------------------------------------------------------------------------- /PRDARTS/test_imagenet.py: -------------------------------------------------------------------------------- 1 | import os 2 | import sys 3 | import numpy as np 4 | import torch 5 | import utils 6 | import glob 7 | import random 8 | import logging 9 | import argparse 10 | import torch.nn as nn 11 | import genotypes 12 | import torch.utils 13 | import torchvision.datasets as dset 14 | import torchvision.transforms as transforms 15 | import torch.backends.cudnn as cudnn 16 | 17 | from torch.autograd import Variable 18 | from model import NetworkImageNet as Network 19 | 20 | 21 | parser = argparse.ArgumentParser("imagenet") 22 | parser.add_argument('--data', type=str, default='../data/imagenet/', help='location of the data corpus') 23 | parser.add_argument('--batch_size', type=int, default=128, help='batch size') 24 | parser.add_argument('--report_freq', type=float, default=100, help='report frequency') 25 | parser.add_argument('--gpu', type=int, default=0, help='gpu device id') 26 | parser.add_argument('--init_channels', type=int, default=48, help='num of init channels') 27 | parser.add_argument('--layers', type=int, default=14, help='total number of layers') 28 | parser.add_argument('--model_path', type=str, 
default='EXP/model.pt', help='path of pretrained model') 29 | parser.add_argument('--auxiliary', action='store_true', default=False, help='use auxiliary tower') 30 | parser.add_argument('--drop_path_prob', type=float, default=0, help='drop path probability') 31 | parser.add_argument('--seed', type=int, default=0, help='random seed') 32 | parser.add_argument('--arch', type=str, default='DARTS', help='which architecture to use') 33 | args = parser.parse_args() 34 | 35 | log_format = '%(asctime)s %(message)s' 36 | logging.basicConfig(stream=sys.stdout, level=logging.INFO, 37 | format=log_format, datefmt='%m/%d %I:%M:%S %p') 38 | 39 | CLASSES = 1000 40 | 41 | 42 | def main(): 43 | if not torch.cuda.is_available(): 44 | logging.info('no gpu device available') 45 | sys.exit(1) 46 | 47 | np.random.seed(args.seed) 48 | torch.cuda.set_device(args.gpu) 49 | cudnn.benchmark = True 50 | torch.manual_seed(args.seed) 51 | cudnn.enabled=True 52 | torch.cuda.manual_seed(args.seed) 53 | logging.info('gpu device = %d' % args.gpu) 54 | logging.info("args = %s", args) 55 | 56 | genotype = eval("genotypes.%s" % args.arch) 57 | model = Network(args.init_channels, CLASSES, args.layers, args.auxiliary, genotype) 58 | model = model.cuda() 59 | model.load_state_dict(torch.load(args.model_path)['state_dict']) 60 | 61 | logging.info("param size = %fMB", utils.count_parameters_in_MB(model)) 62 | 63 | criterion = nn.CrossEntropyLoss() 64 | criterion = criterion.cuda() 65 | 66 | validdir = os.path.join(args.data, 'val') 67 | normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]) 68 | valid_data = dset.ImageFolder( 69 | validdir, 70 | transforms.Compose([ 71 | transforms.Resize(256), 72 | transforms.CenterCrop(224), 73 | transforms.ToTensor(), 74 | normalize, 75 | ])) 76 | 77 | valid_queue = torch.utils.data.DataLoader( 78 | valid_data, batch_size=args.batch_size, shuffle=False, pin_memory=True, num_workers=4) 79 | 80 | model.drop_path_prob = args.drop_path_prob 
    valid_acc_top1, valid_acc_top5, valid_obj = infer(valid_queue, model, criterion)
    logging.info('valid_acc_top1 %f', valid_acc_top1)
    logging.info('valid_acc_top5 %f', valid_acc_top5)


def infer(valid_queue, model, criterion):
    objs = utils.AvgrageMeter()
    top1 = utils.AvgrageMeter()
    top5 = utils.AvgrageMeter()
    model.eval()

    # torch.no_grad() replaces the removed Variable(..., volatile=True) idiom.
    with torch.no_grad():
        for step, (input, target) in enumerate(valid_queue):
            input = input.cuda()
            # `async` is a reserved word since Python 3.7; use non_blocking instead.
            target = target.cuda(non_blocking=True)

            logits, _ = model(input)
            loss = criterion(logits, target)

            prec1, prec5 = utils.accuracy(logits, target, topk=(1, 5))
            n = input.size(0)
            # .item() replaces the old .data[0] way of reading a scalar tensor.
            objs.update(loss.item(), n)
            top1.update(prec1.item(), n)
            top5.update(prec5.item(), n)

            if step % args.report_freq == 0:
                logging.info('valid %03d %e %f %f', step, objs.avg, top1.avg, top5.avg)

    return top1.avg, top5.avg, objs.avg


if __name__ == '__main__':
    main()

--------------------------------------------------------------------------------
/PRDARTS/testgenotype.json:
--------------------------------------------------------------------------------
[
  [
    [
      "pool",
      {
        "k": 3,
        "type_": "avg"
      },
      0
    ],
    [
      "conv",
      {
        "c_mul": [],
        "k": 3,
        "dil": 2
      },
      1
    ],
    [
      "pool",
      {
        "k": 3,
        "type_": "avg"
      },
      1
    ],
    [
      "skip",
      {},
      2
    ],
    [
      "conv",
      {
        "c_mul": [],
        "k": 1,
        "dil": 1
      },
      0
    ],
    [
      "conv",
      {
        "c_mul": [],
        "k": 3,
        "dil": 1
      },
      3
    ],
    [
      "conv",
      {
        "c_mul": [
          1
        ],
        "k": 3,
        "dil": 1
      },
      0
    ],
    [
      "skip",
      {},
      2
    ]
  ],
  [
    2,
    3,
    4,
    5
  ],
  [
    [
      "conv",
      {
        "c_mul":
          [],
        "k": 3,
        "dil": 2
      },
      0
    ],
    [
      "conv",
      {
        "c_mul": [],
        "k": 3,
        "dil": 1
      },
      1
    ],
    [
      "pool",
      {
        "k": 3,
        "type_": "max"
      },
      0
    ],
    [
      "skip",
      {},
      2
    ],
    [
      "conv",
      {
        "c_mul": [],
        "k": 3,
        "dil": 1
      },
      0
    ],
    [
      "pool",
      {
        "k": 3,
        "type_": "avg"
      },
      1
    ],
    [
      "conv",
      {
        "c_mul": [],
        "k": 3,
        "dil": 1
      },
      3
    ],
    [
      "conv",
      {
        "c_mul": [],
        "k": 1,
        "dil": 1
      },
      4
    ]
  ],
  [
    2,
    3,
    4,
    5
  ]
]
--------------------------------------------------------------------------------
/PRDARTS/visualize.py:
--------------------------------------------------------------------------------
import sys
import genotypes
from graphviz import Digraph
from operations import OPS


def plot(genotype, filename):
    g = Digraph(
        format='pdf',
        edge_attr=dict(fontsize='20', fontname="times"),
        node_attr=dict(style='filled', shape='rect', align='center', fontsize='20', height='0.5', width='0.5', penwidth='2', fontname="times"),
        engine='dot')
    g.body.extend(['rankdir=LR'])

    g.node("c_{k-2}", fillcolor='darkseagreen2')
    g.node("c_{k-1}", fillcolor='darkseagreen2')
    assert len(genotype) % 2 == 0
    steps = len(genotype) // 2

    for i in range(steps):
        g.node(str(i), fillcolor='lightblue')

    for i in range(steps):
        for k in [2*i, 2*i + 1]:
            op, op_kwargs, j = genotype[k]
            if j == 0:
                u = "c_{k-2}"
            elif j == 1:
                u = "c_{k-1}"
            else:
                u = str(j-2)
            v = str(i)
            g.edge(u, v, label=OPS.get(op).label_str(**op_kwargs), fillcolor="gray")

    g.node("c_{k}", fillcolor='palegoldenrod')
    for i in range(steps):
        g.edge(str(i), "c_{k}", fillcolor="gray")

    g.render(filename, view=True)


if __name__ == '__main__':
    import argparse
    p = argparse.ArgumentParser(description="Visualize")
    p.add_argument('--genome', type=str, default='PDARTS')  # PDARTS
    args = p.parse_args()

    try:
        genotype = eval('genotypes.{}'.format(args.genome))
    except AttributeError:
        print("{} is not specified in genotypes.py".format(args.genome))
        sys.exit(1)

    plot(genotype.normal, "normal")
    plot(genotype.reduce, "reduction")

--------------------------------------------------------------------------------
/Plots/data/cell_sensitivity/cell_n_12.npy:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/antoyang/NAS-Benchmark/c2758da73192d408aad458768f26d3537dd78c40/Plots/data/cell_sensitivity/cell_n_12.npy

--------------------------------------------------------------------------------
/Plots/data/cell_sensitivity/cell_n_16.npy:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/antoyang/NAS-Benchmark/c2758da73192d408aad458768f26d3537dd78c40/Plots/data/cell_sensitivity/cell_n_16.npy

--------------------------------------------------------------------------------
/Plots/data/cell_sensitivity/cell_n_20.npy:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/antoyang/NAS-Benchmark/c2758da73192d408aad458768f26d3537dd78c40/Plots/data/cell_sensitivity/cell_n_20.npy

--------------------------------------------------------------------------------
/Plots/data/cell_sensitivity/cell_n_24.npy:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/antoyang/NAS-Benchmark/c2758da73192d408aad458768f26d3537dd78c40/Plots/data/cell_sensitivity/cell_n_24.npy
--------------------------------------------------------------------------------
/Plots/data/cell_sensitivity/cell_n_4.npy:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/antoyang/NAS-Benchmark/c2758da73192d408aad458768f26d3537dd78c40/Plots/data/cell_sensitivity/cell_n_4.npy

--------------------------------------------------------------------------------
/Plots/data/cell_sensitivity/cell_n_6.npy:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/antoyang/NAS-Benchmark/c2758da73192d408aad458768f26d3537dd78c40/Plots/data/cell_sensitivity/cell_n_6.npy

--------------------------------------------------------------------------------
/Plots/data/cell_sensitivity/cell_n_8.npy:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/antoyang/NAS-Benchmark/c2758da73192d408aad458768f26d3537dd78c40/Plots/data/cell_sensitivity/cell_n_8.npy

--------------------------------------------------------------------------------
/Plots/data/correlation_cells/8_to_20_cells.npy:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/antoyang/NAS-Benchmark/c2758da73192d408aad458768f26d3537dd78c40/Plots/data/correlation_cells/8_to_20_cells.npy

--------------------------------------------------------------------------------
/Plots/data/correlation_seeds/different_seed_24.npy:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/antoyang/NAS-Benchmark/c2758da73192d408aad458768f26d3537dd78c40/Plots/data/correlation_seeds/different_seed_24.npy

--------------------------------------------------------------------------------
/Plots/data/modified_search_space/214_random_architectures.npy:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/antoyang/NAS-Benchmark/c2758da73192d408aad458768f26d3537dd78c40/Plots/data/modified_search_space/214_random_architectures.npy

--------------------------------------------------------------------------------
/Plots/data/modified_search_space/214genotypes.pkl:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/antoyang/NAS-Benchmark/c2758da73192d408aad458768f26d3537dd78c40/Plots/data/modified_search_space/214genotypes.pkl

--------------------------------------------------------------------------------
/Plots/data/modified_search_space/56_random_mod_architectures.npy:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/antoyang/NAS-Benchmark/c2758da73192d408aad458768f26d3537dd78c40/Plots/data/modified_search_space/56_random_mod_architectures.npy

--------------------------------------------------------------------------------
/Plots/data/performance/Augmentation.csv:
--------------------------------------------------------------------------------
Base,Auxiliary,DropPath ,Cutout,CD,CDA,CDA 50 Channels,CDA AutoAugment,CDAAA 1500 Epochs,CDAAA1500E50C
95.14,95.35,96.14,96.68,97.06,97.15,97.32,97.57,97.82,98.05
94.88,94.96,95.76,96.22,96.69,96.93,97.16,97.2,97.58,97.79
94.67,94.43,95.81,96.04,96.44,96.77,97.08,96.8,97.39,97.45
94.55,95.25,95.73,96.32,96.99,97.16,97.21,97.39,97.69,97.95
94.86,94.77,95.52,96.27,96.73,96.66,97.07,97.12,97.36,97.71
94.82,95,96.07,96.39,96.75,97.04,97.31,97.6,97.76,97.95
95.23,95.16,96.02,96.54,96.91,97.11,97.34,97.16,97.57,97.92
95.5,95.37,96.42,96.67,97.19,97.17,97.4,97.63,98.01,98.15

--------------------------------------------------------------------------------
/Plots/data/performance/CIFAR10.csv:
--------------------------------------------------------------------------------
Random DARTS,DARTS,Random PDARTS,PDARTS,Random NSGANET,NSGANET,Random ENAS,ENAS,Random CNAS,CNAS,Random MANAS,MANAS-LS,Random StacNAS,StacNAS,Random NAO,NAO
97.05,97.47,96.45,97.14,97.07,95.61,95.78,95.64,93.81,94.79,96.77,97.07,97.12,97.35,97.009995,96.889999
96.85,97.15,96.69,96.89,96.78,96.29,95.64,96.1,95.21,94.15,96.83,97.1,97.07,97.4,92.769997,96.939995
96.57,97.41,96.7,96.99,96.49,95.39,96.08,95.18,89.56,94.6,96.91,97.48,97.3,97.35,96.759995,96.970001
97.09,97.05,96.89,97.26,97.04,95.67,95.12,95.84,94.32,93.05,97.09,97.18,97.24,97.44,97.239998,96.790001
96.76,96.93,96.9,97.23,97.08,95.52,96.05,96.32,94.43,95.09,97.28,97.06,97,97.24,97.229996,96.82
97.1,97.13,96.65,97.22,95.8,96.67,95.99,95.95,94.87,94.93,97.05,97.14,97,97.59,96.970001,96.829994
96.94,97.4,95.96,97.07,96.28,97.18,96.09,95.86,94.25,94.66,96.99,97.12,96.21,97.29,96.939995,96.939995
97.02,97.34,96.73,97.2,96.6,97.11,96,95.94,94.11,94.82,97.17,97.31,96.87,97.47,96.769997,96.869995

--------------------------------------------------------------------------------
/Plots/data/performance/CIFAR100.csv:
--------------------------------------------------------------------------------
Random DARTS,DARTS,Random PDARTS,PDARTS,Random NSGANET,NSGANET,Random ENAS,ENAS,Random CNAS,CNAS,Random MANAS,MANAS,Random StacNAS,StacNAS,Random NAO,NAO
82.19,82.27,81.7,83,73.92,78.14,81.27,77.3,76.1,77.27,82.27,81.76,80.31,,80.860001,80.540001
82.95,81.15,80.44,83.13,78.19,74.65,81.15,78.05,74.37,75.91,81.65,82.12,81.78,,82.229996,81.369995
82.04,83.24,79.43,82.55,74.79,76.22,80.1,77.6,76.38,76.91,82.44,81.95,83.48,,82.419998,81.869995
81.41,82.97,81.15,82.63,75.45,78.36,81.06,79.03,76.22,71.05,81.92,82.38,80.06,,82.650002,82.009995
81.77,81.7,80.1,80.98,77.62,79.41,80.78,76.07,77.15,76.02,82.48,82.21,81.59,,81.129997,81.619995
82.1,82.66,81.35,81.93,73.53,76.89,81.14,79.75,73.7,74.54,82.55,82.42,79.96,,82.279999,81.639999
82.17,82.42,80.78,82.29,78.69,77.56,78.74,78.05,80.21,76.01,82.9,81.61,82.06,,80.829994,81.900002
82.82,82.55,81.78,83.14,77.96,77.26,82.13,78.29,76.61,77.57,81.68,82.12,80.84,,81.199997,82.599998

--------------------------------------------------------------------------------
/Plots/data/performance/Flowers102.csv:
--------------------------------------------------------------------------------
Random PDARTS,PDARTS ,Random DARTS,DARTS,Random NSGANET,NSGANET,Random ENAS,ENAS,Random CNAS,CNAS,Random MANAS,MANAS,Random StacNAS,StacNAS,Random NAO,NAO
95.184304,94.411415,94.887,96.2545,90.3,94.82,94.887,96.4923,93.222354,90.963139,96.4328,96.4923,96.6112,96.1356,91.200951,93.460167
95.778835,95.48157,96.2545,96.6706,95.95,95.42,96.2545,96.0761,92.865636,91.141498,95.8977,96.6706,97.0868,96.0166,93.995239,94.470863
94.054697,94.589774,96.3139,95.3032,91.49,94.64,96.3139,96.6112,92.211653,91.022592,94.9465,96.3734,96.7301,96.4328,93.868141,93.103447
94.887039,94.411415,96.9084,96.9084,94.47,95.69,96.9084,96.3139,93.162901,93.757432,96.3139,96.9679,95.7194,96.2545,93.103447,94.173599
94.589774,95.30321,96.0166,96.4328,95.54,94.94,96.0166,96.6112,93.816885,92.330559,96.1356,96.1356,96.6706,96.5517,94.292511,92.568367
93.935791,95.48157,95.3627,96.195,94.29,95.77,95.3627,96.1356,93.460166,89.001189,95.7194,96.9084,96.4328,96.3734,94.649223,95.005943
94.827586,95.659929,96.9679,96.5517,95.48,96.07,96.9679,97.0868,93.876338,87.931034,95.9572,96.0761,95.8977,96.4754,95.303207,92.687279
95.541023,95.362663,96.1356,96.0761,93.87,95.06,96.1356,97.1463,94.233056,92.1522,95.8383,96.9679,96.6112,96.2578,94.233055,94.173599

--------------------------------------------------------------------------------
/Plots/data/performance/MIT67.csv:
--------------------------------------------------------------------------------
Random DARTS,DARTS,Random PDARTS,PDARTS,Random NSGANET,NSGANET,Random ENAS,ENAS,Random CNAS,CNAS,Random MANAS,MANAS,Random StacNAS,StacNAS,Random NAO,NAO
71.3606,69.648,68.886774,70.980019,68.252458,69.394228,71.0752,71.424,69.933397,69.108785,70.314,71.04,71.3606,70.7263,64.763718,65.461464
70.3457,71.3289,69.965113,70.980019,70.187123,70.409134,69.87,71.7412,68.506185,69.362512,69.8065,71.52,70.9483,69.5845,66.508087,65.620049
69.87,71.0435,69.584523,70.884871,71.075167,71.233746,70.9483,70.7263,71.741199,69.330796,70.5043,71.77,71.7729,72.1218,66.159218,64.192833
71.0752,70.7897,70.535997,70.916587,69.077069,70.091976,69.87,70.8214,69.045354,70.472566,70.2506,71.39,71.6143,72.0901,67.142403,64.192833
71.424,70.6946,70.599429,71.392325,67.713289,68.728195,71.1069,70.4091,71.233746,69.425944,69.9017,71.74,70.3457,71.5509,66.159218,63.336506
69.2039,70.7897,69.774818,69.996828,67.332699,70.06026,71.4558,71.9315,69.774818,68.411037,70.758,71.23,70.536,71.0752,66.95211,64.129402
70.98,70.758,69.108785,70.789724,68.189026,70.82144,71.3606,70.536,69.235649,69.869965,70.1237,71.52,71.1703,70.6629,66.127495,65.683479
71.0117,70.758,70.472566,69.647954,68.728195,70.853156,71.9632,70.8214,69.67967,69.489375,70.8849,70.66,70.98,71.202,65.017441,68.125595

--------------------------------------------------------------------------------
/Plots/data/performance/Sport8.csv:
--------------------------------------------------------------------------------
Random DARTS,DARTS,Random PDARTS,PDARTS,Random NSGANET,NSGANET,Random ENAS,ENAS,Random CNAS,CNAS,Random MANAS,MANAS,Random StacNAS,StacNAS,Random NAO,NAO
93.3962,93.7107,91.8239,91.194968,92.138365,91.823899,93.7107,94.6541,89.622642,88.050315,94.34,94.3396,95.283,93.7107,88.364777,88.050308
95.283,93.0818,91.823899,90.566038,93.081761,92.45283,93.7107,94.0252,89.622642,91.509434,93.71,94.9686,94.3396,95.5975,88.679245,82.07547
94.6541,94.6541,92.138365,91.823899,92.138365,93.710692,93.0818,94.6541,90.566038,88.993711,94.34,94.0252,93.7107,94.3396,86.477982,84.905655
93.7107,94.3396,91.823899,93.396226,92.45283,92.45283,95.283,95.283,84.591195,88.364779,94.65,94.3396,93.0818,94.0252,87.421379,84.591194
93.7107,94.6541,91.194969,92.138365,92.767296,93.081761,93.3962,94.9686,91.194969,88.993711,95.65,94.0252,94.9686,94.0252,87.735847,88.050308
94.3396,94.0252,92.45283,92.45283,91.823899,92.45283,94.0252,93.0818,90.251573,85.534592,94.03,95.283,93.7107,94.0252,88.679245,87.106918
93.0818,92.7673,91.509434,94.654088,90.566038,91.509434,93.7107,94.9686,89.308177,90.880504,93.08,94.3396,93.7107,94.9686,89.937103,87.106918
93.7107,93.7107,93.081761,93.396226,92.138365,92.767296,94.3396,94.6541,88.679246,83.962264,93.4,94.3396,94.3396,95.283,87.106918,88.050308

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# NAS-Benchmark

This repository includes the code used to evaluate NAS methods on 5 different datasets, as well as the code used to augment architectures with different protocols, as described in our ICLR 2020 paper (https://arxiv.org/abs/1912.12522). Example scripts are provided in each folder.

## ICLR 2020 video poster presentation
The video from our ICLR 2020 poster presentation is available at https://iclr.cc/virtual_2020/poster_HygrdpVKvr.html.

## Plots
All code used to generate the plots of the paper can be found in the "Plots" folder.

## Randomly Sampled Architectures
You can find all sampled architectures and corresponding training logs in Plots/data/modified_search_space.

## Data

In the data folder, you will find the data splits for Sport-8, MIT-67 and Flowers-102 in .csv files.
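
As a quick sanity check for the folder layout this section asks for (`dataset/train/<class>/<image>` for training and `dataset/test/<class>/<image>` for testing), the standard-library sketch below lists the class folders of each split and reports any class that appears in only one of them. The helper names `list_classes` and `check_layout` are hypothetical and not part of this repository, and the check assumes both splits are meant to share the same class folders.

```python
import os


def list_classes(root, split):
    """Return the sorted class-folder names found under root/split."""
    split_dir = os.path.join(root, split)
    return sorted(
        name for name in os.listdir(split_dir)
        if os.path.isdir(os.path.join(split_dir, name))
    )


def check_layout(root):
    """Compare the class folders of root/train and root/test.

    Returns (train_classes, mismatched), where `mismatched` holds the
    class names present in only one of the two splits.
    """
    train_classes = list_classes(root, "train")
    test_classes = list_classes(root, "test")
    mismatched = sorted(set(train_classes) ^ set(test_classes))
    return train_classes, mismatched
```

Calling `check_layout("path/to/dataset")` on a prepared dataset should return an empty second element when the two splits agree; a missing `train` or `test` folder raises `FileNotFoundError`.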

You can download these datasets from the following websites:

Sport-8: http://vision.stanford.edu/lijiali/event_dataset/

MIT-67: http://web.mit.edu/torralba/www/indoor.html

Flowers-102: http://www.robots.ox.ac.uk/~vgg/data/flowers/102/

The data must be organized as follows: dataset/train/classes/images for the training set and dataset/test/classes/images for the test set.

We used the following repositories:

## DARTS
Paper: Liu, Hanxiao, Karen Simonyan, and Yiming Yang. "Darts: Differentiable architecture search." arXiv preprint arXiv:1806.09055 (2018).

Unofficial updated implementation: https://github.com/khanrc/pt.darts

## P-DARTS
Paper: Xin Chen, Lingxi Xie, Jun Wu, Qi Tian. "Progressive Differentiable Architecture Search: Bridging the Depth Gap between Search and Evaluation." ICCV, 2019.

Official implementation: https://github.com/chenxin061/pdarts

## CNAS
Paper: Weng, Yu, et al. "Automatic Convolutional Neural Architecture Search for Image Classification Under Different Scenes." IEEE Access 7 (2019): 38495-38506.

Official implementation: https://github.com/tianbaochou/CNAS

## StacNAS
Paper: Guilin Li et al. "StacNAS: Towards Stable and Consistent Differentiable Neural Architecture Search." arXiv preprint arXiv:1909.11926 (2019).

Implementation: provided by the authors.

## ENAS
Paper: Pham, Hieu, et al. "Efficient neural architecture search via parameter sharing." arXiv preprint arXiv:1802.03268 (2018).

Official Tensorflow implementation: https://github.com/melodyguan/enas

Unofficial Pytorch implementation: https://github.com/MengTianjian/enas-pytorch

## MANAS
Paper: Carlucci, Fabio Maria, et al. "MANAS: Multi-Agent Neural Architecture Search." arXiv preprint arXiv:1909.01051 (2019).

Implementation: provided by the authors.

## NSGA-NET
Paper: Lu, Zhichao, et al. "NSGA-NET: a multi-objective genetic algorithm for neural architecture search." arXiv preprint arXiv:1810.03522 (2018).

Official implementation: https://github.com/ianwhale/nsga-net

## NAO
Paper: Luo, Renqian, et al. "Neural architecture optimization." Advances in neural information processing systems. 2018.

Official Pytorch implementation: https://github.com/renqianluo/NAO_pytorch


For the following two methods, we have not yet run consistent experiments, so they are not included in the paper. Nonetheless, we provide runnable code that could yield insights into these methods similar to those reported in the paper for the other methods.

## PC-DARTS
Paper: Xu, Yuhui, et al. "PC-DARTS: Partial Channel Connections for Memory-Efficient Differentiable Architecture Search." arXiv preprint arXiv:1907.05737 (2019).

Official implementation: https://github.com/yuhuixu1993/PC-DARTS

## PRDARTS
Paper: Laube, Kevin Alexander, and Andreas Zell. "Prune and Replace NAS." arXiv preprint arXiv:1906.07528 (2019).

Official implementation: https://github.com/cogsys-tuebingen/prdarts

## AutoAugment
Paper: Cubuk, Ekin D., et al. "Autoaugment: Learning augmentation policies from data." arXiv preprint arXiv:1805.09501 (2018).

Unofficial Pytorch implementation: https://github.com/DeepVoltaire/AutoAugment

## Citation

If you find this work useful, please consider citing us:

```
@inproceedings{yang2020nasefh,
  title={NAS evaluation is frustratingly hard},
  author={Antoine Yang and Pedro M. Esperança and Fabio M. Carlucci},
  booktitle={ICLR},
  year={2020}}
```

--------------------------------------------------------------------------------