├── LICENSE
├── README.md
├── checkpoints
│   ├── tf
│   │   └── .gitkeep
│   └── torch
│       └── .gitkeep
├── eval_mobilenet_tf.py
├── eval_mobilenet_torch.py
├── logs
│   ├── 0.5flops_run1.txt
│   ├── 0.5flops_run2.txt
│   ├── 0.5flops_run3.txt
│   ├── 0.5time_run1.txt
│   ├── 0.5time_run2.txt
│   ├── 0.5time_run3.txt
│   ├── 0.75mobilenet_run1.txt
│   ├── 0.75mobilenet_run2.txt
│   ├── 0.75mobilenet_run3.txt
│   ├── mobilenet_run1.txt
│   ├── mobilenet_run2.txt
│   └── mobilenet_run3.txt
├── models
│   ├── mobilenet_v1.py
│   ├── mobilenet_v1_tf.py
│   └── mobilenet_v2.py
└── utils.py
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 | 
3 | Copyright (c) 2021 MIT HAN Lab
4 | 
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # AMC Compressed Models
2 | 
3 | This repo contains some of the compressed models from the paper **AMC: AutoML for Model Compression and Acceleration on Mobile Devices (ECCV18)**.
4 | 
5 | ## Reference
6 | 
7 | If you find the models useful, please kindly cite our paper:
8 | 
9 | ```
10 | @inproceedings{he2018amc,
11 | title={AMC: AutoML for Model Compression and Acceleration on Mobile Devices},
12 | author={He, Yihui and Lin, Ji and Liu, Zhijian and Wang, Hanrui and Li, Li-Jia and Han, Song},
13 | booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
14 | pages={784--800},
15 | year={2018}
16 | }
17 | ```
18 | 
19 | ## Download the Pretrained Models
20 | 
21 | First, download the pretrained models from [here](https://drive.google.com/drive/folders/1w_1OyKuoj8JciwKPFNvztSmAqEEwPFlA?usp=sharing) and put them in `./checkpoints`.
22 | 
23 | ## Models
24 | 
25 | ### Compressed MobileNets
26 | 
27 | We provide MobileNetV1 compressed to **50% FLOPs** and to **50% inference time**, as well as MobileNetV2 compressed to **70% FLOPs**, in PyTorch.
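For reference, loading one of the compressed checkpoints by hand looks roughly like the following sketch. It mirrors `get_model()` in `eval_mobilenet_torch.py` and assumes the downloaded checkpoints sit under `./checkpoints/torch` and that you run from the repo root:

```
import torch

from models import mobilenet_v1
from utils import process_state_dict

# Build the pruned architecture, then load the matching checkpoint
# (file name from the Google Drive folder above).
net = mobilenet_v1.MobileNet(1000, profile='0.5flops')
ckpt = torch.load('./checkpoints/torch/mobilenet_imagenet_0.5flops_70.5.pth.tar',
                  map_location='cpu')  # map_location only matters on CPU-only machines
if 'state_dict' in ckpt:
    ckpt = ckpt['state_dict']
net.load_state_dict(process_state_dict(ckpt))  # process_state_dict strips the 'module.' prefix
net.eval()
```

`eval_mobilenet_torch.py` wraps these same steps with the ImageNet data loader and accuracy meters.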
The comparison with the vanilla models is as follows:
28 | 
29 | | Models | Top1 Acc (%) | Top5 Acc (%) | Latency (ms) | MACs (M) |
30 | | ------------------------ | ------------ | ------------ | ------------ | -------- |
31 | | MobileNetV1 | 70.9 | 89.5 | 123 | 569 |
32 | | MobileNetV1-width*0.75 | 68.4 | 88.2 | 72.5 | 325 |
33 | | **MobileNetV1-50%FLOPs** | **70.5** | **89.3** | 68.9 | 285 |
34 | | **MobileNetV1-50%Time** | **70.2** | **89.4** | 63.2 | 272 |
35 | | MobileNetV2-width*0.75 | 69.8 | 89.6 | - | 300 |
36 | | **MobileNetV2-70%FLOPs** | **70.9** | **89.9** | - | 210 |
37 | 
38 | To test the model, run:
39 | 
40 | ```
41 | python eval_mobilenet_torch.py --model={mobilenet_0.5flops, mobilenet_0.5time, mobilenetv2_0.7flops}
42 | ```
43 | 
44 | 
45 | 
46 | ### Converted TensorFlow Models
47 | 
48 | We converted the **50% FLOPs** and **50% time** compressed MobileNetV1 models to TensorFlow. We offer both the normal checkpoint format and the TF-Lite format; the TF-Lite format is what we used to benchmark speed on mobile.
49 | 
50 | To replicate the PyTorch results, we wrote a new preprocessing function and adapted some hyper-parameters from the original TF [MobileNetV1](https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet_v1.md). To verify the performance, run the following script:
51 | 
52 | ```
53 | python eval_mobilenet_tf.py --profile={0.5flops, 0.5time}
54 | ```
55 | 
56 | The resulting accuracy is:
57 | 
58 | | Models | Top1 Acc (%) | Top5 Acc (%) |
59 | | --------- | ------------ | ------------ |
60 | | 50% FLOPs | 70.424 | 89.28 |
61 | | 50% Time | 70.214 | 89.244 |
62 | 
63 | ## Timing Logs
64 | 
65 | We provide timing logs measured on a Google Pixel 1 using **TensorFlow Lite** in the `./logs` directory. We benchmarked the original MobileNetV1 (mobilenet), MobileNetV1 with a 0.75 width multiplier (0.75mobilenet), the 50% FLOPs pruned MobileNetV1 (0.5flops), and the 50% time pruned MobileNetV1 (0.5time). Each model is benchmarked for 200 iterations after 100 warm-up iterations, repeated for 3 runs.
66 | 
67 | ## AMC
68 | 
69 | You can also find our PyTorch implementation of AMC [here](https://github.com/mit-han-lab/amc-release.git).
70 | 
71 | ## Contact
72 | 
73 | To contact the authors:
74 | 
75 | Ji Lin, jilin@mit.edu
76 | 
77 | Song Han, songhan@mit.edu
78 | 
--------------------------------------------------------------------------------
/checkpoints/tf/.gitkeep:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mit-han-lab/amc-models/49f83edfca6533a7e5464bbd2d5025835811c690/checkpoints/tf/.gitkeep
--------------------------------------------------------------------------------
/checkpoints/torch/.gitkeep:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mit-han-lab/amc-models/49f83edfca6533a7e5464bbd2d5025835811c690/checkpoints/torch/.gitkeep
--------------------------------------------------------------------------------
/eval_mobilenet_tf.py:
--------------------------------------------------------------------------------
1 | # Copyright 2016 The TensorFlow Authors. All Rights Reserved.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | # ============================================================================== 15 | """Generic evaluation script that evaluates a model using a given dataset.""" 16 | 17 | from __future__ import absolute_import 18 | from __future__ import division 19 | from __future__ import print_function 20 | 21 | import math 22 | import tensorflow as tf 23 | 24 | import os 25 | import sys 26 | # add slim to PATH 27 | home = os.getenv("HOME") 28 | sys.path.insert(0, os.path.join(home, 'models/research/slim/')) 29 | sys.path.insert(0, '..') 30 | 31 | from datasets import dataset_factory 32 | 33 | slim = tf.contrib.slim 34 | 35 | tf.app.flags.DEFINE_string( 36 | 'profile', '0.5time', 'The profile to use for MobileNetV1 (1.0/0.75/0.5flops/0.5time).') 37 | 38 | tf.app.flags.DEFINE_integer( 39 | 'batch_size', 100, 'The number of samples in each batch.') 40 | 41 | tf.app.flags.DEFINE_integer( 42 | 'max_num_batches', None, 43 | 'Max number of batches to evaluate by default use all.') 44 | 45 | tf.app.flags.DEFINE_string( 46 | 'master', '', 'The address of the TensorFlow master to use.') 47 | 48 | # tf.app.flags.DEFINE_string( 49 | # 'eval_dir', '/tmp/tfmodel/', 'Directory where the results are saved to.') 50 | 51 | tf.app.flags.DEFINE_integer( 52 | 'num_preprocessing_threads', 16, 53 | 'The number of threads used to create the batches.') 54 | 55 | tf.app.flags.DEFINE_string( 56 | 'dataset_name', 'imagenet', 'The name of the dataset to load.') 57 | 58 | tf.app.flags.DEFINE_string( 59 | 'dataset_split_name', 'validation', 'The name of the train/test split.') 60 | 61 | tf.app.flags.DEFINE_string( 62 | 'dataset_dir', '/ssd/dataset/tf-imagenet', 'The directory where the dataset files are stored.') 63 | 64 | tf.app.flags.DEFINE_integer( 65 | 'labels_offset', 1, 66 | 'An offset for the labels in the dataset. This flag is primarily used to ' 67 | 'evaluate the VGG and ResNet architectures which do not use a background ' 68 | 'class for the ImageNet dataset.') 69 | 70 | 71 | tf.app.flags.DEFINE_float( 72 | 'moving_average_decay', None, 73 | 'The decay to use for the moving average.' 
74 | 'If left as None, then moving averages are not used.') 75 | 76 | tf.app.flags.DEFINE_integer( 77 | 'eval_image_size', 224, 'Eval image size') 78 | 79 | FLAGS = tf.app.flags.FLAGS 80 | 81 | 82 | from models import mobilenet_v1_tf as mobilenet_v1 83 | from models.mobilenet_v1_tf import Conv, DepthSepConv 84 | 85 | profiles = { 86 | # '1.0': [32, 64, (128, 2), 128, (256, 2), 256, (512, 2), 512, 512, 512, 512, 512, (1024, 2), 1024], 87 | # '0.75': [24, 48, (96, 2), 96, (192, 2), 192, (384, 2), 384, 384, 384, 384, 384, (768, 2), 768], 88 | '0.5flops': [24, 48, (96, 2), 80, (192, 2), 200, (328, 2), 352, 368, 360, 328, 400, (736, 2), 752], 89 | '0.5time': [16, 48, (88, 2), 80, (192, 2), 168, (336, 2), 360, 360, 352, 352, 368, (768, 2), 680], 90 | } 91 | 92 | checkpoints = { 93 | '0.5flops': './checkpoints/tf/0.5flops/', 94 | '0.5time': './checkpoints/tf/0.5time/', 95 | } 96 | 97 | 98 | def build_conv_defs(): 99 | profile = profiles[FLAGS.profile] 100 | conv_defs = [] 101 | conv_defs.append(Conv(kernel=[3, 3], stride=2, depth=profile[0])) 102 | for p in profile[1:]: 103 | if type(p) == tuple: 104 | conv_defs.append(DepthSepConv(kernel=[3, 3], stride=p[1], depth=p[0])) 105 | else: 106 | conv_defs.append(DepthSepConv(kernel=[3, 3], stride=1, depth=p)) 107 | dm = 1 108 | return conv_defs, dm 109 | 110 | 111 | def main(_): 112 | if not FLAGS.dataset_dir: 113 | raise ValueError('You must supply the dataset directory with --dataset_dir') 114 | 115 | tf.logging.set_verbosity(tf.logging.INFO) 116 | with tf.Graph().as_default(): 117 | tf_global_step = slim.get_or_create_global_step() 118 | 119 | ###################### 120 | # Select the dataset # 121 | ###################### 122 | dataset = dataset_factory.get_dataset( 123 | FLAGS.dataset_name, FLAGS.dataset_split_name, FLAGS.dataset_dir) 124 | 125 | 126 | ############################################################## 127 | # Create a dataset provider that loads data from the dataset # 128 | ############################################################## 129 | provider = slim.dataset_data_provider.DatasetDataProvider( 130 | dataset, 131 | shuffle=False, 132 | common_queue_capacity=2 * FLAGS.batch_size, 133 | common_queue_min=FLAGS.batch_size) 134 | [image, label] = provider.get(['image', 'label']) 135 | label -= FLAGS.labels_offset 136 | 137 | ##################################### 138 | # Select the preprocessing function # 139 | ##################################### 140 | def preprocess_for_eval(image, height=224, width=224, 141 | central_fraction=0.875, scope=None): 142 | assert height == width # square input for imagenet 143 | 144 | def proc_func(x): 145 | x = (x * 255).astype('uint8') 146 | 147 | import PIL 148 | import numpy as np 149 | x = PIL.Image.fromarray(x) 150 | 151 | size = int(height / central_fraction) 152 | interpolation = PIL.Image.BILINEAR 153 | w, h = x.size 154 | if (w <= h and w == size) or (h <= w and h == size): 155 | pass 156 | if w < h: 157 | ow = size 158 | oh = int(size * h / w) 159 | x = x.resize((ow, oh), interpolation) 160 | else: 161 | oh = size 162 | ow = int(size * w / h) 163 | x = x.resize((ow, oh), interpolation) 164 | 165 | w, h = x.size 166 | th, tw = height, width 167 | i = int(round((h - th) / 2.)) 168 | j = int(round((w - tw) / 2.)) 169 | 170 | x = x.crop((j, i, j+tw, i+th)) 171 | 172 | return np.array(x).astype('float32') / 255 173 | 174 | with tf.name_scope(scope, 'eval_image', [image, height, width]): 175 | if image.dtype != tf.float32: 176 | image = tf.image.convert_image_dtype(image, dtype=tf.float32) 177 | 
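# proc_func above reproduces the torchvision evaluation pipeline used for the PyTorch models: resize the shorter side to int(224 / 0.875) = 256 with bilinear interpolation, center-crop to 224x224, and rescale to [0, 1].
# It is wrapped with tf.py_func below; set_shape() restores the static shape that tf.py_func discards, and the per-channel mean/std normalization that follows uses the same ImageNet statistics as the PyTorch transforms.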
178 | image = tf.py_func(proc_func, [image], tf.float32) 179 | image.set_shape([224, 224, 3]) 180 | 181 | means = [0.485, 0.456, 0.406] 182 | stds = [0.229, 0.224, 0.225] 183 | channels = tf.split(axis=2, num_or_size_splits=3, value=image) 184 | for i in range(3): 185 | channels[i] -= means[i] 186 | channels[i] /= stds[i] 187 | return tf.concat(axis=2, values=channels) 188 | 189 | eval_image_size = FLAGS.eval_image_size 190 | 191 | image = preprocess_for_eval(image, eval_image_size, eval_image_size) 192 | 193 | images, labels = tf.train.batch( 194 | [image, label], 195 | batch_size=FLAGS.batch_size, 196 | num_threads=FLAGS.num_preprocessing_threads, 197 | capacity=5 * FLAGS.batch_size) 198 | 199 | #################### 200 | # Define the model # 201 | #################### 202 | conv_defs, dm = build_conv_defs() 203 | with slim.arg_scope(mobilenet_v1.mobilenet_v1_arg_scope()): 204 | logits, _ = mobilenet_v1.mobilenet_v1(images, 1000, dropout_keep_prob=0.8, is_training=False, 205 | conv_defs=conv_defs, depth_multiplier=dm) 206 | 207 | if FLAGS.moving_average_decay: 208 | variable_averages = tf.train.ExponentialMovingAverage( 209 | FLAGS.moving_average_decay, tf_global_step) 210 | variables_to_restore = variable_averages.variables_to_restore( 211 | slim.get_model_variables()) 212 | variables_to_restore[tf_global_step.op.name] = tf_global_step 213 | else: 214 | variables_to_restore = slim.get_variables_to_restore() 215 | 216 | predictions = tf.argmax(logits, 1) 217 | labels = tf.squeeze(labels) 218 | 219 | # Define the metrics: 220 | names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({ 221 | 'Accuracy': slim.metrics.streaming_accuracy(predictions, labels), 222 | 'Recall_5': slim.metrics.streaming_recall_at_k( 223 | logits, labels, 5), 224 | }) 225 | 226 | # Print the summaries to screen. 227 | for name, value in names_to_values.items(): 228 | summary_name = 'eval/%s' % name 229 | op = tf.summary.scalar(summary_name, value, collections=[]) 230 | op = tf.Print(op, [value], summary_name) 231 | tf.add_to_collection(tf.GraphKeys.SUMMARIES, op) 232 | 233 | if FLAGS.max_num_batches: 234 | num_batches = FLAGS.max_num_batches 235 | else: 236 | # This ensures that we make a single pass over all of the data. 
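# (With the default batch_size of 100, the 50,000-image ImageNet validation split gives ceil(50000 / 100) = 500 evaluation batches.)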
237 | num_batches = math.ceil(dataset.num_samples / float(FLAGS.batch_size)) 238 | 239 | tf.logging.info('Evaluating %s' % checkpoints[FLAGS.profile]) 240 | 241 | slim.evaluation.evaluation_loop( 242 | master=FLAGS.master, 243 | checkpoint_dir=checkpoints[FLAGS.profile], 244 | logdir=os.path.join('./eval', checkpoints[FLAGS.profile]), 245 | num_evals=num_batches, 246 | eval_op=list(names_to_updates.values()), 247 | variables_to_restore=variables_to_restore, 248 | max_number_of_evaluations=1, 249 | ) 250 | 251 | 252 | if __name__ == '__main__': 253 | tf.app.run() 254 | -------------------------------------------------------------------------------- /eval_mobilenet_torch.py: -------------------------------------------------------------------------------- 1 | '''Train CIFAR10 with PyTorch.''' 2 | from __future__ import print_function 3 | 4 | import os 5 | import time 6 | import torch 7 | import torch.nn as nn 8 | import torch.backends.cudnn as cudnn 9 | 10 | import argparse 11 | 12 | from torch.autograd import Variable 13 | 14 | from models import mobilenet_v1, mobilenet_v2 15 | from utils import measure_model, AverageMeter, progress_bar, accuracy, process_state_dict 16 | 17 | parser = argparse.ArgumentParser(description='PyTorch CIFAR10 Training') 18 | parser.add_argument('--model', default='mobilenet_0.5flops', type=str, help='name of the model to test') 19 | parser.add_argument('--imagenet_path', default=None, type=str, help='Directory of ImageNet') 20 | parser.add_argument('--n_gpu', default=1, type=int, help='name of the job') 21 | parser.add_argument('--batch_size', default=100, type=int, help='batch size') 22 | parser.add_argument('--n_worker', default=32, type=int, help='number of data loader worker') 23 | 24 | args = parser.parse_args() 25 | 26 | use_cuda = torch.cuda.is_available() 27 | 28 | 29 | def get_dataset(): 30 | # lazy import 31 | import torchvision.datasets as datasets 32 | import torchvision.transforms as transforms 33 | if not args.imagenet_path: 34 | raise Exception('Please provide valid ImageNet path!') 35 | print('=> Preparing data..') 36 | valdir = os.path.join(args.imagenet_path, 'val') 37 | normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], 38 | std=[0.229, 0.224, 0.225]) 39 | 40 | input_size = 224 41 | val_loader = torch.utils.data.DataLoader( 42 | datasets.ImageFolder(valdir, transforms.Compose([ 43 | transforms.Resize(int(input_size / 0.875)), 44 | transforms.CenterCrop(input_size), 45 | transforms.ToTensor(), 46 | normalize, 47 | ])), 48 | batch_size=args.batch_size, shuffle=False, 49 | num_workers=args.n_worker, pin_memory=True) 50 | n_class = 1000 51 | return val_loader, n_class 52 | 53 | 54 | def get_model(n_class): 55 | print('=> Building model {}...'.format(args.model)) 56 | if args.model == 'mobilenet_0.5flops': 57 | net = mobilenet_v1.MobileNet(n_class, profile='0.5flops') 58 | checkpoint_path = './checkpoints/torch/mobilenet_imagenet_0.5flops_70.5.pth.tar' 59 | elif args.model == 'mobilenet_0.5time': 60 | net = mobilenet_v1.MobileNet(n_class, profile='0.5time') 61 | checkpoint_path = './checkpoints/torch/mobilenet_imagenet_0.5time_70.2.pth.tar' 62 | elif args.model == 'mobilenetv2': 63 | net = mobilenet_v2.MobileNetV2(n_class, profile='normal') 64 | checkpoint_path = './checkpoints/torch/mobilenetv2_imagenet_71.814.pth.tar' 65 | elif args.model == 'mobilenetv2_0.7flops': 66 | net = mobilenet_v2.MobileNetV2(n_class, profile='0.7flops') 67 | checkpoint_path = './checkpoints/torch/mobilenetv2_imagenet_0.7amc_70.854.pth.tar' 68 | else: 69 | raise 
NotImplementedError 70 | 71 | print('=> Loading checkpoints..') 72 | checkpoint = torch.load(checkpoint_path) 73 | if 'state_dict' in checkpoint: 74 | checkpoint = checkpoint['state_dict'] # get state_dict 75 | net.load_state_dict(process_state_dict(checkpoint)) # remove .module 76 | 77 | return net 78 | 79 | 80 | def evaluate(): 81 | # build dataset 82 | val_loader, n_class = get_dataset() 83 | # build model 84 | net = get_model(n_class) # for measure 85 | n_flops, n_params = measure_model(net, 224, 224) 86 | print('=> Model Parameter: {:.3f} M, FLOPs: {:.3f}M'.format(n_params / 1e6, n_flops / 1e6)) 87 | del net 88 | net = get_model(n_class) 89 | 90 | criterion = nn.CrossEntropyLoss() 91 | 92 | if use_cuda: 93 | net = net.cuda() 94 | net = torch.nn.DataParallel(net, list(range(args.n_gpu))) 95 | cudnn.benchmark = True 96 | 97 | # begin eval 98 | net.eval() 99 | 100 | batch_time = AverageMeter() 101 | losses = AverageMeter() 102 | top1 = AverageMeter() 103 | top5 = AverageMeter() 104 | end = time.time() 105 | 106 | with torch.no_grad(): 107 | for batch_idx, (inputs, targets) in enumerate(val_loader): 108 | if use_cuda: 109 | inputs, targets = inputs.cuda(), targets.cuda() 110 | inputs, targets = Variable(inputs), Variable(targets) 111 | outputs = net(inputs) 112 | loss = criterion(outputs, targets) 113 | 114 | # measure accuracy and record loss 115 | prec1, prec5 = accuracy(outputs.data, targets.data, topk=(1, 5)) 116 | losses.update(loss.item(), inputs.size(0)) 117 | top1.update(prec1.item(), inputs.size(0)) 118 | top5.update(prec5.item(), inputs.size(0)) 119 | # timing 120 | batch_time.update(time.time() - end) 121 | end = time.time() 122 | 123 | progress_bar(batch_idx, len(val_loader), 'Loss: {:.3f} | Acc1: {:.3f}% | Acc5: {:.3f}%' 124 | .format(losses.avg, top1.avg, top5.avg)) 125 | 126 | 127 | if __name__ == '__main__': 128 | evaluate() 129 | -------------------------------------------------------------------------------- /logs/0.5flops_run1.txt: -------------------------------------------------------------------------------- 1 | native : commandlineflags.cc:1511 Ignoring RegisterValidateFunction() for flag pointer 0x5b04b76330: no flag found at that address 2 | native : benchmark_tflite_model.cc:480 STARTING! 
3 | native : benchmark_tflite_model.cc:521 Graph: [/data/local/tmp/0.5amc_round8.tflite] 4 | native : benchmark_tflite_model.cc:522 Input layers: [pruned/input] 5 | native : benchmark_tflite_model.cc:523 Input shapes: [1,224,224,3] 6 | native : benchmark_tflite_model.cc:524 Input types: [float] 7 | native : benchmark_tflite_model.cc:525 Output layers: [pruned/MobilenetV1/Logits/SpatialSqueeze] 8 | native : benchmark_tflite_model.cc:526 Num runs: [200] 9 | native : benchmark_tflite_model.cc:527 Inter-run delay (seconds): [-1.0] 10 | native : benchmark_tflite_model.cc:528 Num threads: [1] 11 | native : benchmark_tflite_model.cc:529 Benchmark name: [] 12 | native : benchmark_tflite_model.cc:530 Output prefix: [] 13 | native : benchmark_tflite_model.cc:531 Warmup runs: [100] 14 | native : benchmark_tflite_model.cc:532 Use nnapi : [0] 15 | native : benchmark_tflite_model.cc:237 Loaded model /data/local/tmp/0.5amc_round8.tflite 16 | native : benchmark_tflite_model.cc:239 resolved reporter 17 | native : benchmark_tflite_model.cc:560 Initialized session in 0.026466s 18 | native : benchmark_tflite_model.cc:412 Running benchmark for 100 iterations with detailed stat logging: 19 | native : benchmark_tflite_model.cc:440 count=100 first=74794 curr=67812 min=66944 max=74794 avg=68459.8 std=1061 20 | 21 | native : benchmark_tflite_model.cc:412 Running benchmark for 200 iterations with detailed stat logging: 22 | native : benchmark_tflite_model.cc:440 count=200 first=67444 curr=68656 min=66997 max=71141 avg=68307.7 std=758 23 | 24 | native : benchmark_tflite_model.cc:590 Average inference timings in us: Warmup: 68459.8, no stats: 68307 25 | native : stat_summarizer.cc:358 Number of nodes executed: 30 26 | native : benchmark_tflite_model.cc:108 ============================== Run Order ============================== 27 | [node type] [start] [first] [avg ms] [%] [cdf%] [mem KB] [times called] [Name] 28 | CONV_2D 0.000 5.228 3.367 4.928% 4.928% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_0/Relu6] 29 | DEPTHWISE_CONV_2D 3.369 2.739 2.571 3.763% 8.691% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_1_depthwise/Relu6] 30 | CONV_2D 5.941 3.960 3.784 5.538% 14.229% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_1_pointwise/Relu6] 31 | DEPTHWISE_CONV_2D 9.726 1.607 1.588 2.325% 16.554% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_2_depthwise/Relu6] 32 | CONV_2D 11.316 2.948 2.971 4.349% 20.903% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_2_pointwise/Relu6] 33 | DEPTHWISE_CONV_2D 14.289 1.585 1.602 2.346% 23.248% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_3_depthwise/Relu6] 34 | CONV_2D 15.892 4.290 4.286 6.273% 29.522% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_3_pointwise/Relu6] 35 | DEPTHWISE_CONV_2D 20.180 0.583 0.589 0.862% 30.383% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_4_depthwise/Relu6] 36 | CONV_2D 20.769 2.215 2.199 3.218% 33.601% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_4_pointwise/Relu6] 37 | DEPTHWISE_CONV_2D 22.969 0.837 0.784 1.148% 34.749% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_5_depthwise/Relu6] 38 | CONV_2D 23.754 5.460 5.158 7.549% 42.298% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_5_pointwise/Relu6] 39 | DEPTHWISE_CONV_2D 28.913 0.273 0.285 0.417% 42.716% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_6_depthwise/Relu6] 40 | CONV_2D 29.199 2.696 2.565 3.755% 46.470% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_6_pointwise/Relu6] 41 | DEPTHWISE_CONV_2D 31.765 0.320 0.335 0.490% 46.961% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_7_depthwise/Relu6] 42 | 
CONV_2D 32.101 4.599 4.309 6.307% 53.268% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_7_pointwise/Relu6] 43 | DEPTHWISE_CONV_2D 36.411 0.336 0.348 0.510% 53.778% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_8_depthwise/Relu6] 44 | CONV_2D 36.760 4.849 4.637 6.788% 60.566% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_8_pointwise/Relu6] 45 | DEPTHWISE_CONV_2D 41.399 0.368 0.375 0.549% 61.114% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_9_depthwise/Relu6] 46 | CONV_2D 41.774 5.309 4.980 7.290% 68.404% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_9_pointwise/Relu6] 47 | DEPTHWISE_CONV_2D 46.756 0.357 0.361 0.528% 68.932% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_10_depthwise/Relu6] 48 | CONV_2D 47.117 4.841 4.522 6.619% 75.551% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_10_pointwise/Relu6] 49 | DEPTHWISE_CONV_2D 51.641 0.333 0.333 0.487% 76.038% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_11_depthwise/Relu6] 50 | CONV_2D 51.974 5.218 4.944 7.237% 83.275% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_11_pointwise/Relu6] 51 | DEPTHWISE_CONV_2D 56.919 0.138 0.122 0.178% 83.453% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_12_depthwise/Relu6] 52 | CONV_2D 57.041 4.418 3.952 5.784% 89.237% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_12_pointwise/Relu6] 53 | DEPTHWISE_CONV_2D 60.995 0.209 0.201 0.294% 89.531% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_13_depthwise/Relu6] 54 | CONV_2D 61.197 7.583 6.722 9.838% 99.369% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_13_pointwise/Relu6] 55 | AVERAGE_POOL_2D 67.920 0.047 0.035 0.051% 99.420% 0.000 1 [pruned/MobilenetV1/Logits/AvgPool_1a/AvgPool] 56 | CONV_2D 67.956 1.412 0.394 0.577% 99.997% 0.000 1 [pruned/MobilenetV1/Logits/Conv2d_1c_1x1/BiasAdd] 57 | RESHAPE 68.351 0.003 0.002 0.003% 100.000% 0.000 1 [pruned/MobilenetV1/Logits/SpatialSqueeze] 58 | 59 | ============================== Top by Computation Time ============================== 60 | [node type] [start] [first] [avg ms] [%] [cdf%] [mem KB] [times called] [Name] 61 | CONV_2D 61.197 7.583 6.722 9.838% 9.838% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_13_pointwise/Relu6] 62 | CONV_2D 23.754 5.460 5.158 7.549% 17.388% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_5_pointwise/Relu6] 63 | CONV_2D 41.774 5.309 4.980 7.290% 24.677% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_9_pointwise/Relu6] 64 | CONV_2D 51.974 5.218 4.944 7.237% 31.914% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_11_pointwise/Relu6] 65 | CONV_2D 36.760 4.849 4.637 6.788% 38.702% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_8_pointwise/Relu6] 66 | CONV_2D 47.117 4.841 4.522 6.619% 45.321% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_10_pointwise/Relu6] 67 | CONV_2D 32.101 4.599 4.309 6.307% 51.628% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_7_pointwise/Relu6] 68 | CONV_2D 15.892 4.290 4.286 6.273% 57.901% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_3_pointwise/Relu6] 69 | CONV_2D 57.041 4.418 3.952 5.784% 63.686% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_12_pointwise/Relu6] 70 | CONV_2D 5.941 3.960 3.784 5.538% 69.224% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_1_pointwise/Relu6] 71 | 72 | ============================== Summary by node type ============================== 73 | [Node type] [count] [avg ms] [avg %] [cdf %] [mem KB] [times called] 74 | CONV_2D 15 58.783 86.060% 86.060% 0.000 15 75 | DEPTHWISE_CONV_2D 13 9.486 13.888% 99.947% 0.000 13 76 | AVERAGE_POOL_2D 1 0.034 0.050% 99.997% 0.000 1 77 | RESHAPE 1 0.002 0.003% 100.000% 0.000 1 78 | 79 | Timings 
(microseconds): count=300 first=74761 curr=68623 min=66908 max=74761 avg=68320.7 std=871 80 | Memory (bytes): count=300 curr=0(all same) 81 | 30 nodes observed 82 | 83 | 84 | -------------------------------------------------------------------------------- /logs/0.5flops_run2.txt: -------------------------------------------------------------------------------- 1 | native : commandlineflags.cc:1511 Ignoring RegisterValidateFunction() for flag pointer 0x57ff42b330: no flag found at that address 2 | native : benchmark_tflite_model.cc:480 STARTING! 3 | native : benchmark_tflite_model.cc:521 Graph: [/data/local/tmp/0.5amc_round8.tflite] 4 | native : benchmark_tflite_model.cc:522 Input layers: [pruned/input] 5 | native : benchmark_tflite_model.cc:523 Input shapes: [1,224,224,3] 6 | native : benchmark_tflite_model.cc:524 Input types: [float] 7 | native : benchmark_tflite_model.cc:525 Output layers: [pruned/MobilenetV1/Logits/SpatialSqueeze] 8 | native : benchmark_tflite_model.cc:526 Num runs: [200] 9 | native : benchmark_tflite_model.cc:527 Inter-run delay (seconds): [-1.0] 10 | native : benchmark_tflite_model.cc:528 Num threads: [1] 11 | native : benchmark_tflite_model.cc:529 Benchmark name: [] 12 | native : benchmark_tflite_model.cc:530 Output prefix: [] 13 | native : benchmark_tflite_model.cc:531 Warmup runs: [100] 14 | native : benchmark_tflite_model.cc:532 Use nnapi : [0] 15 | native : benchmark_tflite_model.cc:237 Loaded model /data/local/tmp/0.5amc_round8.tflite 16 | native : benchmark_tflite_model.cc:239 resolved reporter 17 | native : benchmark_tflite_model.cc:560 Initialized session in 0.025957s 18 | native : benchmark_tflite_model.cc:412 Running benchmark for 100 iterations with detailed stat logging: 19 | native : benchmark_tflite_model.cc:440 count=100 first=75287 curr=67611 min=67509 max=75287 avg=69003.6 std=1164 20 | 21 | native : benchmark_tflite_model.cc:412 Running benchmark for 200 iterations with detailed stat logging: 22 | native : benchmark_tflite_model.cc:440 count=200 first=68817 curr=68198 min=67197 max=71494 avg=68445.8 std=682 23 | 24 | native : benchmark_tflite_model.cc:590 Average inference timings in us: Warmup: 69003.6, no stats: 68445 25 | native : stat_summarizer.cc:358 Number of nodes executed: 30 26 | native : benchmark_tflite_model.cc:108 ============================== Run Order ============================== 27 | [node type] [start] [first] [avg ms] [%] [cdf%] [mem KB] [times called] [Name] 28 | CONV_2D 0.000 5.782 3.418 4.982% 4.982% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_0/Relu6] 29 | DEPTHWISE_CONV_2D 3.419 2.657 2.713 3.955% 8.937% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_1_depthwise/Relu6] 30 | CONV_2D 6.133 4.147 3.801 5.540% 14.477% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_1_pointwise/Relu6] 31 | DEPTHWISE_CONV_2D 9.935 1.746 1.718 2.505% 16.982% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_2_depthwise/Relu6] 32 | CONV_2D 11.655 2.952 2.958 4.313% 21.295% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_2_pointwise/Relu6] 33 | DEPTHWISE_CONV_2D 14.615 1.667 1.611 2.348% 23.643% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_3_depthwise/Relu6] 34 | CONV_2D 16.227 4.314 4.293 6.258% 29.901% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_3_pointwise/Relu6] 35 | DEPTHWISE_CONV_2D 20.521 0.542 0.554 0.807% 30.708% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_4_depthwise/Relu6] 36 | CONV_2D 21.075 2.254 2.195 3.200% 33.908% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_4_pointwise/Relu6] 37 | DEPTHWISE_CONV_2D 23.272 0.786 0.787 1.148% 
35.056% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_5_depthwise/Relu6] 38 | CONV_2D 24.060 5.085 5.131 7.480% 42.536% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_5_pointwise/Relu6] 39 | DEPTHWISE_CONV_2D 29.192 0.378 0.281 0.409% 42.945% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_6_depthwise/Relu6] 40 | CONV_2D 29.473 2.582 2.571 3.748% 46.693% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_6_pointwise/Relu6] 41 | DEPTHWISE_CONV_2D 32.045 0.343 0.341 0.497% 47.190% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_7_depthwise/Relu6] 42 | CONV_2D 32.386 4.473 4.304 6.275% 53.464% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_7_pointwise/Relu6] 43 | DEPTHWISE_CONV_2D 36.691 0.355 0.348 0.507% 53.971% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_8_depthwise/Relu6] 44 | CONV_2D 37.040 4.774 4.622 6.738% 60.709% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_8_pointwise/Relu6] 45 | DEPTHWISE_CONV_2D 41.663 0.381 0.376 0.548% 61.258% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_9_depthwise/Relu6] 46 | CONV_2D 42.040 5.137 4.969 7.244% 68.501% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_9_pointwise/Relu6] 47 | DEPTHWISE_CONV_2D 47.010 0.372 0.366 0.534% 69.035% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_10_depthwise/Relu6] 48 | CONV_2D 47.377 4.725 4.523 6.593% 75.629% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_10_pointwise/Relu6] 49 | DEPTHWISE_CONV_2D 51.901 0.400 0.344 0.501% 76.130% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_11_depthwise/Relu6] 50 | CONV_2D 52.245 5.151 4.944 7.207% 83.337% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_11_pointwise/Relu6] 51 | DEPTHWISE_CONV_2D 57.190 0.136 0.116 0.169% 83.507% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_12_depthwise/Relu6] 52 | CONV_2D 57.307 4.649 3.951 5.760% 89.266% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_12_pointwise/Relu6] 53 | DEPTHWISE_CONV_2D 61.259 0.218 0.200 0.291% 89.557% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_13_depthwise/Relu6] 54 | CONV_2D 61.459 7.719 6.740 9.826% 99.383% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_13_pointwise/Relu6] 55 | AVERAGE_POOL_2D 68.201 0.048 0.034 0.050% 99.433% 0.000 1 [pruned/MobilenetV1/Logits/AvgPool_1a/AvgPool] 56 | CONV_2D 68.235 1.474 0.387 0.564% 99.996% 0.000 1 [pruned/MobilenetV1/Logits/Conv2d_1c_1x1/BiasAdd] 57 | RESHAPE 68.623 0.003 0.003 0.004% 100.000% 0.000 1 [pruned/MobilenetV1/Logits/SpatialSqueeze] 58 | 59 | ============================== Top by Computation Time ============================== 60 | [node type] [start] [first] [avg ms] [%] [cdf%] [mem KB] [times called] [Name] 61 | CONV_2D 61.459 7.719 6.740 9.826% 9.826% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_13_pointwise/Relu6] 62 | CONV_2D 24.060 5.085 5.131 7.480% 17.306% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_5_pointwise/Relu6] 63 | CONV_2D 42.040 5.137 4.969 7.244% 24.549% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_9_pointwise/Relu6] 64 | CONV_2D 52.245 5.151 4.944 7.207% 31.757% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_11_pointwise/Relu6] 65 | CONV_2D 37.040 4.774 4.622 6.738% 38.495% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_8_pointwise/Relu6] 66 | CONV_2D 47.377 4.725 4.523 6.593% 45.088% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_10_pointwise/Relu6] 67 | CONV_2D 32.386 4.473 4.304 6.275% 51.363% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_7_pointwise/Relu6] 68 | CONV_2D 16.227 4.314 4.293 6.258% 57.621% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_3_pointwise/Relu6] 69 | CONV_2D 57.307 4.649 3.951 5.760% 63.380% 0.000 1 
[pruned/MobilenetV1/MobilenetV1/Conv2d_12_pointwise/Relu6] 70 | CONV_2D 6.133 4.147 3.801 5.540% 68.921% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_1_pointwise/Relu6] 71 | 72 | ============================== Summary by node type ============================== 73 | [Node type] [count] [avg ms] [avg %] [cdf %] [mem KB] [times called] 74 | CONV_2D 15 58.799 85.735% 85.735% 0.000 15 75 | DEPTHWISE_CONV_2D 13 9.747 14.212% 99.948% 0.000 13 76 | AVERAGE_POOL_2D 1 0.034 0.050% 99.997% 0.000 1 77 | RESHAPE 1 0.002 0.003% 100.000% 0.000 1 78 | 79 | Timings (microseconds): count=300 first=75250 curr=68161 min=67161 max=75250 avg=68596.9 std=911 80 | Memory (bytes): count=300 curr=0(all same) 81 | 30 nodes observed 82 | 83 | 84 | -------------------------------------------------------------------------------- /logs/0.5flops_run3.txt: -------------------------------------------------------------------------------- 1 | native : commandlineflags.cc:1511 Ignoring RegisterValidateFunction() for flag pointer 0x5e9728f330: no flag found at that address 2 | native : benchmark_tflite_model.cc:480 STARTING! 3 | native : benchmark_tflite_model.cc:521 Graph: [/data/local/tmp/0.5amc_round8.tflite] 4 | native : benchmark_tflite_model.cc:522 Input layers: [pruned/input] 5 | native : benchmark_tflite_model.cc:523 Input shapes: [1,224,224,3] 6 | native : benchmark_tflite_model.cc:524 Input types: [float] 7 | native : benchmark_tflite_model.cc:525 Output layers: [pruned/MobilenetV1/Logits/SpatialSqueeze] 8 | native : benchmark_tflite_model.cc:526 Num runs: [200] 9 | native : benchmark_tflite_model.cc:527 Inter-run delay (seconds): [-1.0] 10 | native : benchmark_tflite_model.cc:528 Num threads: [1] 11 | native : benchmark_tflite_model.cc:529 Benchmark name: [] 12 | native : benchmark_tflite_model.cc:530 Output prefix: [] 13 | native : benchmark_tflite_model.cc:531 Warmup runs: [100] 14 | native : benchmark_tflite_model.cc:532 Use nnapi : [0] 15 | native : benchmark_tflite_model.cc:237 Loaded model /data/local/tmp/0.5amc_round8.tflite 16 | native : benchmark_tflite_model.cc:239 resolved reporter 17 | native : benchmark_tflite_model.cc:560 Initialized session in 0.027871s 18 | native : benchmark_tflite_model.cc:412 Running benchmark for 100 iterations with detailed stat logging: 19 | native : benchmark_tflite_model.cc:440 count=100 first=76717 curr=69204 min=67622 max=82359 avg=70324.9 std=2204 20 | 21 | native : benchmark_tflite_model.cc:412 Running benchmark for 200 iterations with detailed stat logging: 22 | native : benchmark_tflite_model.cc:440 count=200 first=68054 curr=68749 min=67502 max=74259 avg=69797.8 std=1592 23 | 24 | native : benchmark_tflite_model.cc:590 Average inference timings in us: Warmup: 70324.9, no stats: 69797 25 | native : stat_summarizer.cc:358 Number of nodes executed: 30 26 | native : benchmark_tflite_model.cc:108 ============================== Run Order ============================== 27 | [node type] [start] [first] [avg ms] [%] [cdf%] [mem KB] [times called] [Name] 28 | CONV_2D 0.000 6.939 3.468 4.959% 4.959% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_0/Relu6] 29 | DEPTHWISE_CONV_2D 3.470 2.938 2.620 3.746% 8.704% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_1_depthwise/Relu6] 30 | CONV_2D 6.091 4.250 3.882 5.551% 14.256% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_1_pointwise/Relu6] 31 | DEPTHWISE_CONV_2D 9.975 1.632 1.662 2.377% 16.633% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_2_depthwise/Relu6] 32 | CONV_2D 11.638 2.971 3.033 4.336% 20.969% 0.000 1 
[pruned/MobilenetV1/MobilenetV1/Conv2d_2_pointwise/Relu6] 33 | DEPTHWISE_CONV_2D 14.672 1.667 1.661 2.375% 23.344% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_3_depthwise/Relu6] 34 | CONV_2D 16.334 4.267 4.410 6.306% 29.650% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_3_pointwise/Relu6] 35 | DEPTHWISE_CONV_2D 20.746 0.558 0.592 0.847% 30.497% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_4_depthwise/Relu6] 36 | CONV_2D 21.339 2.279 2.218 3.171% 33.668% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_4_pointwise/Relu6] 37 | DEPTHWISE_CONV_2D 23.558 0.788 0.801 1.145% 34.812% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_5_depthwise/Relu6] 38 | CONV_2D 24.359 5.318 5.265 7.528% 42.341% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_5_pointwise/Relu6] 39 | DEPTHWISE_CONV_2D 29.627 0.292 0.305 0.436% 42.777% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_6_depthwise/Relu6] 40 | CONV_2D 29.932 2.678 2.637 3.771% 46.548% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_6_pointwise/Relu6] 41 | DEPTHWISE_CONV_2D 32.571 0.351 0.352 0.503% 47.051% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_7_depthwise/Relu6] 42 | CONV_2D 32.923 4.632 4.380 6.263% 53.313% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_7_pointwise/Relu6] 43 | DEPTHWISE_CONV_2D 37.304 0.365 0.359 0.514% 53.827% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_8_depthwise/Relu6] 44 | CONV_2D 37.664 4.784 4.736 6.772% 60.599% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_8_pointwise/Relu6] 45 | DEPTHWISE_CONV_2D 42.402 0.480 0.402 0.575% 61.174% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_9_depthwise/Relu6] 46 | CONV_2D 42.805 5.104 5.063 7.240% 68.414% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_9_pointwise/Relu6] 47 | DEPTHWISE_CONV_2D 47.869 0.381 0.374 0.534% 68.948% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_10_depthwise/Relu6] 48 | CONV_2D 48.243 4.688 4.613 6.597% 75.545% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_10_pointwise/Relu6] 49 | DEPTHWISE_CONV_2D 52.858 0.348 0.341 0.488% 76.033% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_11_depthwise/Relu6] 50 | CONV_2D 53.199 5.065 5.069 7.249% 83.281% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_11_pointwise/Relu6] 51 | DEPTHWISE_CONV_2D 58.270 0.190 0.123 0.176% 83.457% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_12_depthwise/Relu6] 52 | CONV_2D 58.393 4.404 4.026 5.756% 89.213% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_12_pointwise/Relu6] 53 | DEPTHWISE_CONV_2D 62.420 0.218 0.211 0.302% 89.515% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_13_depthwise/Relu6] 54 | CONV_2D 62.632 7.558 6.894 9.858% 99.373% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_13_pointwise/Relu6] 55 | AVERAGE_POOL_2D 69.527 0.049 0.034 0.049% 99.422% 0.000 1 [pruned/MobilenetV1/Logits/AvgPool_1a/AvgPool] 56 | CONV_2D 69.561 1.485 0.402 0.575% 99.997% 0.000 1 [pruned/MobilenetV1/Logits/Conv2d_1c_1x1/BiasAdd] 57 | RESHAPE 69.965 0.002 0.002 0.003% 100.000% 0.000 1 [pruned/MobilenetV1/Logits/SpatialSqueeze] 58 | 59 | ============================== Top by Computation Time ============================== 60 | [node type] [start] [first] [avg ms] [%] [cdf%] [mem KB] [times called] [Name] 61 | CONV_2D 62.632 7.558 6.894 9.858% 9.858% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_13_pointwise/Relu6] 62 | CONV_2D 24.359 5.318 5.265 7.528% 17.386% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_5_pointwise/Relu6] 63 | CONV_2D 53.199 5.065 5.069 7.249% 24.635% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_11_pointwise/Relu6] 64 | CONV_2D 42.805 5.104 5.063 7.240% 31.874% 0.000 1 
[pruned/MobilenetV1/MobilenetV1/Conv2d_9_pointwise/Relu6] 65 | CONV_2D 37.664 4.784 4.736 6.772% 38.646% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_8_pointwise/Relu6] 66 | CONV_2D 48.243 4.688 4.613 6.597% 45.243% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_10_pointwise/Relu6] 67 | CONV_2D 16.334 4.267 4.410 6.306% 51.549% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_3_pointwise/Relu6] 68 | CONV_2D 32.923 4.632 4.380 6.263% 57.812% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_7_pointwise/Relu6] 69 | CONV_2D 58.393 4.404 4.026 5.756% 63.568% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_12_pointwise/Relu6] 70 | CONV_2D 6.091 4.250 3.882 5.551% 69.120% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_1_pointwise/Relu6] 71 | 72 | ============================== Summary by node type ============================== 73 | [Node type] [count] [avg ms] [avg %] [cdf %] [mem KB] [times called] 74 | CONV_2D 15 60.091 85.938% 85.938% 0.000 15 75 | DEPTHWISE_CONV_2D 13 9.797 14.011% 99.949% 0.000 13 76 | AVERAGE_POOL_2D 1 0.034 0.049% 99.997% 0.000 1 77 | RESHAPE 1 0.002 0.003% 100.000% 0.000 1 78 | 79 | Timings (microseconds): count=300 first=76681 curr=68714 min=67466 max=82327 avg=69936.6 std=1835 80 | Memory (bytes): count=300 curr=0(all same) 81 | 30 nodes observed 82 | 83 | 84 | -------------------------------------------------------------------------------- /logs/0.5time_run1.txt: -------------------------------------------------------------------------------- 1 | native : commandlineflags.cc:1511 Ignoring RegisterValidateFunction() for flag pointer 0x619b047330: no flag found at that address 2 | native : benchmark_tflite_model.cc:480 STARTING! 3 | native : benchmark_tflite_model.cc:521 Graph: [/data/local/tmp/0.5time_round8.tflite] 4 | native : benchmark_tflite_model.cc:522 Input layers: [pruned/input] 5 | native : benchmark_tflite_model.cc:523 Input shapes: [1,224,224,3] 6 | native : benchmark_tflite_model.cc:524 Input types: [float] 7 | native : benchmark_tflite_model.cc:525 Output layers: [pruned/MobilenetV1/Logits/SpatialSqueeze] 8 | native : benchmark_tflite_model.cc:526 Num runs: [200] 9 | native : benchmark_tflite_model.cc:527 Inter-run delay (seconds): [-1.0] 10 | native : benchmark_tflite_model.cc:528 Num threads: [1] 11 | native : benchmark_tflite_model.cc:529 Benchmark name: [] 12 | native : benchmark_tflite_model.cc:530 Output prefix: [] 13 | native : benchmark_tflite_model.cc:531 Warmup runs: [100] 14 | native : benchmark_tflite_model.cc:532 Use nnapi : [0] 15 | native : benchmark_tflite_model.cc:237 Loaded model /data/local/tmp/0.5time_round8.tflite 16 | native : benchmark_tflite_model.cc:239 resolved reporter 17 | native : benchmark_tflite_model.cc:560 Initialized session in 0.02682s 18 | native : benchmark_tflite_model.cc:412 Running benchmark for 100 iterations with detailed stat logging: 19 | native : benchmark_tflite_model.cc:440 count=100 first=68597 curr=64739 min=62066 max=68597 avg=63348.6 std=890 20 | 21 | native : benchmark_tflite_model.cc:412 Running benchmark for 200 iterations with detailed stat logging: 22 | native : benchmark_tflite_model.cc:440 count=200 first=63390 curr=63099 min=62136 max=66562 avg=63424.6 std=731 23 | 24 | native : benchmark_tflite_model.cc:590 Average inference timings in us: Warmup: 63348.6, no stats: 63424 25 | native : stat_summarizer.cc:358 Number of nodes executed: 30 26 | native : benchmark_tflite_model.cc:108 ============================== Run Order ============================== 27 | [node type] [start] [first] [avg ms] [%] 
[cdf%] [mem KB] [times called] [Name] 28 | CONV_2D 0.000 4.294 2.677 4.225% 4.225% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_0/Relu6] 29 | DEPTHWISE_CONV_2D 2.679 1.297 1.223 1.931% 6.156% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_1_depthwise/Relu6] 30 | CONV_2D 3.903 3.561 3.055 4.821% 10.977% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_1_pointwise/Relu6] 31 | DEPTHWISE_CONV_2D 6.960 1.692 1.605 2.534% 13.511% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_2_depthwise/Relu6] 32 | CONV_2D 8.567 2.701 2.675 4.222% 17.733% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_2_pointwise/Relu6] 33 | DEPTHWISE_CONV_2D 11.243 1.809 1.778 2.806% 20.539% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_3_depthwise/Relu6] 34 | CONV_2D 13.022 4.065 4.097 6.466% 27.005% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_3_pointwise/Relu6] 35 | DEPTHWISE_CONV_2D 17.121 0.587 0.569 0.898% 27.902% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_4_depthwise/Relu6] 36 | CONV_2D 17.691 2.269 2.190 3.456% 31.359% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_4_pointwise/Relu6] 37 | DEPTHWISE_CONV_2D 19.882 0.797 0.792 1.249% 32.608% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_5_depthwise/Relu6] 38 | CONV_2D 20.674 4.369 4.343 6.854% 39.462% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_5_pointwise/Relu6] 39 | DEPTHWISE_CONV_2D 25.018 0.229 0.238 0.375% 39.837% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_6_depthwise/Relu6] 40 | CONV_2D 25.257 2.287 2.191 3.458% 43.295% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_6_pointwise/Relu6] 41 | DEPTHWISE_CONV_2D 27.448 0.342 0.338 0.533% 43.828% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_7_depthwise/Relu6] 42 | CONV_2D 27.787 4.772 4.565 7.205% 51.033% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_7_pointwise/Relu6] 43 | DEPTHWISE_CONV_2D 32.353 0.383 0.375 0.591% 51.624% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_8_depthwise/Relu6] 44 | CONV_2D 32.728 5.054 4.979 7.857% 59.481% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_8_pointwise/Relu6] 45 | DEPTHWISE_CONV_2D 37.708 0.367 0.365 0.576% 60.057% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_9_depthwise/Relu6] 46 | CONV_2D 38.074 4.861 4.682 7.389% 67.446% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_9_pointwise/Relu6] 47 | DEPTHWISE_CONV_2D 42.757 0.351 0.344 0.544% 67.990% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_10_depthwise/Relu6] 48 | CONV_2D 43.102 4.471 4.357 6.876% 74.866% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_10_pointwise/Relu6] 49 | DEPTHWISE_CONV_2D 47.460 0.348 0.343 0.541% 75.407% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_11_depthwise/Relu6] 50 | CONV_2D 47.804 4.704 4.621 7.292% 82.700% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_11_pointwise/Relu6] 51 | DEPTHWISE_CONV_2D 52.426 0.120 0.112 0.177% 82.877% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_12_depthwise/Relu6] 52 | CONV_2D 52.538 4.234 3.814 6.019% 88.895% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_12_pointwise/Relu6] 53 | DEPTHWISE_CONV_2D 56.353 0.221 0.191 0.302% 89.197% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_13_depthwise/Relu6] 54 | CONV_2D 56.545 6.890 6.433 10.153% 99.350% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_13_pointwise/Relu6] 55 | AVERAGE_POOL_2D 62.980 0.045 0.032 0.050% 99.400% 0.000 1 [pruned/MobilenetV1/Logits/AvgPool_1a/AvgPool] 56 | CONV_2D 63.012 1.443 0.378 0.597% 99.997% 0.000 1 [pruned/MobilenetV1/Logits/Conv2d_1c_1x1/BiasAdd] 57 | RESHAPE 63.391 0.002 0.002 0.003% 100.000% 0.000 1 [pruned/MobilenetV1/Logits/SpatialSqueeze] 58 | 59 | 
============================== Top by Computation Time ============================== 60 | [node type] [start] [first] [avg ms] [%] [cdf%] [mem KB] [times called] [Name] 61 | CONV_2D 56.545 6.890 6.433 10.153% 10.153% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_13_pointwise/Relu6] 62 | CONV_2D 32.728 5.054 4.979 7.857% 18.010% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_8_pointwise/Relu6] 63 | CONV_2D 38.074 4.861 4.682 7.389% 25.399% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_9_pointwise/Relu6] 64 | CONV_2D 47.804 4.704 4.621 7.292% 32.692% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_11_pointwise/Relu6] 65 | CONV_2D 27.787 4.772 4.565 7.205% 39.896% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_7_pointwise/Relu6] 66 | CONV_2D 43.102 4.471 4.357 6.876% 46.772% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_10_pointwise/Relu6] 67 | CONV_2D 20.674 4.369 4.343 6.854% 53.626% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_5_pointwise/Relu6] 68 | CONV_2D 13.022 4.065 4.097 6.466% 60.092% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_3_pointwise/Relu6] 69 | CONV_2D 52.538 4.234 3.814 6.019% 66.111% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_12_pointwise/Relu6] 70 | CONV_2D 3.903 3.561 3.055 4.821% 70.933% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_1_pointwise/Relu6] 71 | 72 | ============================== Summary by node type ============================== 73 | [Node type] [count] [avg ms] [avg %] [cdf %] [mem KB] [times called] 74 | CONV_2D 15 55.047 86.898% 86.898% 0.000 15 75 | DEPTHWISE_CONV_2D 13 8.267 13.050% 99.948% 0.000 13 76 | AVERAGE_POOL_2D 1 0.031 0.049% 99.997% 0.000 1 77 | RESHAPE 1 0.002 0.003% 100.000% 0.000 1 78 | 79 | Timings (microseconds): count=300 first=68565 curr=63066 min=62034 max=68565 avg=63363.2 std=789 80 | Memory (bytes): count=300 curr=0(all same) 81 | 30 nodes observed 82 | 83 | 84 | -------------------------------------------------------------------------------- /logs/0.5time_run2.txt: -------------------------------------------------------------------------------- 1 | native : commandlineflags.cc:1511 Ignoring RegisterValidateFunction() for flag pointer 0x6170569330: no flag found at that address 2 | native : benchmark_tflite_model.cc:480 STARTING! 
3 | native : benchmark_tflite_model.cc:521 Graph: [/data/local/tmp/0.5time_round8.tflite] 4 | native : benchmark_tflite_model.cc:522 Input layers: [pruned/input] 5 | native : benchmark_tflite_model.cc:523 Input shapes: [1,224,224,3] 6 | native : benchmark_tflite_model.cc:524 Input types: [float] 7 | native : benchmark_tflite_model.cc:525 Output layers: [pruned/MobilenetV1/Logits/SpatialSqueeze] 8 | native : benchmark_tflite_model.cc:526 Num runs: [200] 9 | native : benchmark_tflite_model.cc:527 Inter-run delay (seconds): [-1.0] 10 | native : benchmark_tflite_model.cc:528 Num threads: [1] 11 | native : benchmark_tflite_model.cc:529 Benchmark name: [] 12 | native : benchmark_tflite_model.cc:530 Output prefix: [] 13 | native : benchmark_tflite_model.cc:531 Warmup runs: [100] 14 | native : benchmark_tflite_model.cc:532 Use nnapi : [0] 15 | native : benchmark_tflite_model.cc:237 Loaded model /data/local/tmp/0.5time_round8.tflite 16 | native : benchmark_tflite_model.cc:239 resolved reporter 17 | native : benchmark_tflite_model.cc:560 Initialized session in 0.025855s 18 | native : benchmark_tflite_model.cc:412 Running benchmark for 100 iterations with detailed stat logging: 19 | native : benchmark_tflite_model.cc:440 count=100 first=72266 curr=61899 min=61682 max=72266 avg=63298 std=1253 20 | 21 | native : benchmark_tflite_model.cc:412 Running benchmark for 200 iterations with detailed stat logging: 22 | native : benchmark_tflite_model.cc:440 count=200 first=63790 curr=62587 min=61528 max=65246 avg=63016.7 std=705 23 | 24 | native : benchmark_tflite_model.cc:590 Average inference timings in us: Warmup: 63298, no stats: 63016 25 | native : stat_summarizer.cc:358 Number of nodes executed: 30 26 | native : benchmark_tflite_model.cc:108 ============================== Run Order ============================== 27 | [node type] [start] [first] [avg ms] [%] [cdf%] [mem KB] [times called] [Name] 28 | CONV_2D 0.000 6.282 2.687 4.259% 4.259% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_0/Relu6] 29 | DEPTHWISE_CONV_2D 2.688 1.222 1.199 1.900% 6.160% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_1_depthwise/Relu6] 30 | CONV_2D 3.888 3.647 3.077 4.878% 11.038% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_1_pointwise/Relu6] 31 | DEPTHWISE_CONV_2D 6.966 1.648 1.552 2.460% 13.498% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_2_depthwise/Relu6] 32 | CONV_2D 8.519 2.688 2.671 4.234% 17.732% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_2_pointwise/Relu6] 33 | DEPTHWISE_CONV_2D 11.192 1.663 1.714 2.718% 20.450% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_3_depthwise/Relu6] 34 | CONV_2D 12.907 4.171 4.110 6.516% 26.965% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_3_pointwise/Relu6] 35 | DEPTHWISE_CONV_2D 17.018 0.618 0.568 0.901% 27.866% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_4_depthwise/Relu6] 36 | CONV_2D 17.587 2.210 2.202 3.491% 31.357% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_4_pointwise/Relu6] 37 | DEPTHWISE_CONV_2D 19.790 0.760 0.750 1.190% 32.547% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_5_depthwise/Relu6] 38 | CONV_2D 20.541 4.524 4.340 6.880% 39.426% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_5_pointwise/Relu6] 39 | DEPTHWISE_CONV_2D 24.882 0.217 0.222 0.353% 39.779% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_6_depthwise/Relu6] 40 | CONV_2D 25.105 2.285 2.181 3.458% 43.237% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_6_pointwise/Relu6] 41 | DEPTHWISE_CONV_2D 27.287 0.321 0.319 0.506% 43.743% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_7_depthwise/Relu6] 42 | 
CONV_2D 27.607 4.895 4.560 7.230% 50.973% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_7_pointwise/Relu6] 43 | DEPTHWISE_CONV_2D 32.169 0.367 0.348 0.552% 51.525% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_8_depthwise/Relu6] 44 | CONV_2D 32.517 5.403 4.973 7.884% 59.410% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_8_pointwise/Relu6] 45 | DEPTHWISE_CONV_2D 37.492 0.369 0.351 0.557% 59.966% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_9_depthwise/Relu6] 46 | CONV_2D 37.843 4.883 4.683 7.424% 67.391% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_9_pointwise/Relu6] 47 | DEPTHWISE_CONV_2D 42.528 0.409 0.330 0.523% 67.913% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_10_depthwise/Relu6] 48 | CONV_2D 42.858 4.568 4.349 6.895% 74.808% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_10_pointwise/Relu6] 49 | DEPTHWISE_CONV_2D 47.208 0.333 0.326 0.517% 75.325% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_11_depthwise/Relu6] 50 | CONV_2D 47.534 4.829 4.608 7.306% 82.630% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_11_pointwise/Relu6] 51 | DEPTHWISE_CONV_2D 52.144 0.187 0.106 0.169% 82.799% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_12_depthwise/Relu6] 52 | CONV_2D 52.251 4.455 3.837 6.082% 88.881% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_12_pointwise/Relu6] 53 | DEPTHWISE_CONV_2D 56.088 0.213 0.186 0.295% 89.177% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_13_depthwise/Relu6] 54 | CONV_2D 56.275 7.583 6.428 10.190% 99.367% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_13_pointwise/Relu6] 55 | AVERAGE_POOL_2D 62.704 0.045 0.032 0.051% 99.418% 0.000 1 [pruned/MobilenetV1/Logits/AvgPool_1a/AvgPool] 56 | CONV_2D 62.736 1.432 0.365 0.579% 99.997% 0.000 1 [pruned/MobilenetV1/Logits/Conv2d_1c_1x1/BiasAdd] 57 | RESHAPE 63.103 0.003 0.002 0.003% 100.000% 0.000 1 [pruned/MobilenetV1/Logits/SpatialSqueeze] 58 | 59 | ============================== Top by Computation Time ============================== 60 | [node type] [start] [first] [avg ms] [%] [cdf%] [mem KB] [times called] [Name] 61 | CONV_2D 56.275 7.583 6.428 10.190% 10.190% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_13_pointwise/Relu6] 62 | CONV_2D 32.517 5.403 4.973 7.884% 18.074% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_8_pointwise/Relu6] 63 | CONV_2D 37.843 4.883 4.683 7.424% 25.499% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_9_pointwise/Relu6] 64 | CONV_2D 47.534 4.829 4.608 7.306% 32.804% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_11_pointwise/Relu6] 65 | CONV_2D 27.607 4.895 4.560 7.230% 40.034% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_7_pointwise/Relu6] 66 | CONV_2D 42.858 4.568 4.349 6.895% 46.929% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_10_pointwise/Relu6] 67 | CONV_2D 20.541 4.524 4.340 6.880% 53.809% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_5_pointwise/Relu6] 68 | CONV_2D 12.907 4.171 4.110 6.516% 60.324% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_3_pointwise/Relu6] 69 | CONV_2D 52.251 4.455 3.837 6.082% 66.406% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_12_pointwise/Relu6] 70 | CONV_2D 3.888 3.647 3.077 4.878% 71.285% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_1_pointwise/Relu6] 71 | 72 | ============================== Summary by node type ============================== 73 | [Node type] [count] [avg ms] [avg %] [cdf %] [mem KB] [times called] 74 | CONV_2D 15 55.062 87.311% 87.311% 0.000 15 75 | DEPTHWISE_CONV_2D 13 7.968 12.635% 99.946% 0.000 13 76 | AVERAGE_POOL_2D 1 0.032 0.051% 99.997% 0.000 1 77 | RESHAPE 1 0.002 0.003% 100.000% 0.000 1 78 | 79 | Timings 
(microseconds): count=300 first=72230 curr=62557 min=61495 max=72230 avg=63077.3 std=934 80 | Memory (bytes): count=300 curr=0(all same) 81 | 30 nodes observed 82 | 83 | 84 | -------------------------------------------------------------------------------- /logs/0.5time_run3.txt: -------------------------------------------------------------------------------- 1 | native : commandlineflags.cc:1511 Ignoring RegisterValidateFunction() for flag pointer 0x565e772330: no flag found at that address 2 | native : benchmark_tflite_model.cc:480 STARTING! 3 | native : benchmark_tflite_model.cc:521 Graph: [/data/local/tmp/0.5time_round8.tflite] 4 | native : benchmark_tflite_model.cc:522 Input layers: [pruned/input] 5 | native : benchmark_tflite_model.cc:523 Input shapes: [1,224,224,3] 6 | native : benchmark_tflite_model.cc:524 Input types: [float] 7 | native : benchmark_tflite_model.cc:525 Output layers: [pruned/MobilenetV1/Logits/SpatialSqueeze] 8 | native : benchmark_tflite_model.cc:526 Num runs: [200] 9 | native : benchmark_tflite_model.cc:527 Inter-run delay (seconds): [-1.0] 10 | native : benchmark_tflite_model.cc:528 Num threads: [1] 11 | native : benchmark_tflite_model.cc:529 Benchmark name: [] 12 | native : benchmark_tflite_model.cc:530 Output prefix: [] 13 | native : benchmark_tflite_model.cc:531 Warmup runs: [100] 14 | native : benchmark_tflite_model.cc:532 Use nnapi : [0] 15 | native : benchmark_tflite_model.cc:237 Loaded model /data/local/tmp/0.5time_round8.tflite 16 | native : benchmark_tflite_model.cc:239 resolved reporter 17 | native : benchmark_tflite_model.cc:560 Initialized session in 0.030002s 18 | native : benchmark_tflite_model.cc:412 Running benchmark for 100 iterations with detailed stat logging: 19 | native : benchmark_tflite_model.cc:440 count=100 first=71676 curr=62141 min=62073 max=71676 avg=63261.5 std=1194 20 | 21 | native : benchmark_tflite_model.cc:412 Running benchmark for 200 iterations with detailed stat logging: 22 | native : benchmark_tflite_model.cc:440 count=200 first=63370 curr=64291 min=61765 max=66495 avg=63181.6 std=681 23 | 24 | native : benchmark_tflite_model.cc:590 Average inference timings in us: Warmup: 63261.5, no stats: 63181 25 | native : stat_summarizer.cc:358 Number of nodes executed: 30 26 | native : benchmark_tflite_model.cc:108 ============================== Run Order ============================== 27 | [node type] [start] [first] [avg ms] [%] [cdf%] [mem KB] [times called] [Name] 28 | CONV_2D 0.000 5.606 2.705 4.282% 4.282% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_0/Relu6] 29 | DEPTHWISE_CONV_2D 2.707 1.278 1.206 1.909% 6.191% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_1_depthwise/Relu6] 30 | CONV_2D 3.914 3.480 3.081 4.877% 11.068% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_1_pointwise/Relu6] 31 | DEPTHWISE_CONV_2D 6.996 1.541 1.580 2.501% 13.569% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_2_depthwise/Relu6] 32 | CONV_2D 8.578 2.786 2.660 4.211% 17.780% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_2_pointwise/Relu6] 33 | DEPTHWISE_CONV_2D 11.239 1.696 1.702 2.694% 20.474% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_3_depthwise/Relu6] 34 | CONV_2D 12.942 4.343 4.118 6.519% 26.992% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_3_pointwise/Relu6] 35 | DEPTHWISE_CONV_2D 17.061 0.582 0.562 0.890% 27.882% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_4_depthwise/Relu6] 36 | CONV_2D 17.624 2.360 2.200 3.482% 31.364% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_4_pointwise/Relu6] 37 | DEPTHWISE_CONV_2D 19.825 0.749 0.764 1.210% 
32.573% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_5_depthwise/Relu6] 38 | CONV_2D 20.590 4.625 4.338 6.867% 39.441% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_5_pointwise/Relu6] 39 | DEPTHWISE_CONV_2D 24.929 0.242 0.236 0.374% 39.815% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_6_depthwise/Relu6] 40 | CONV_2D 25.166 2.343 2.185 3.458% 43.273% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_6_pointwise/Relu6] 41 | DEPTHWISE_CONV_2D 27.352 0.335 0.326 0.517% 43.789% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_7_depthwise/Relu6] 42 | CONV_2D 27.679 5.002 4.555 7.211% 51.000% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_7_pointwise/Relu6] 43 | DEPTHWISE_CONV_2D 32.235 0.361 0.361 0.572% 51.572% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_8_depthwise/Relu6] 44 | CONV_2D 32.597 5.453 4.975 7.874% 59.446% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_8_pointwise/Relu6] 45 | DEPTHWISE_CONV_2D 37.573 0.363 0.359 0.568% 60.014% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_9_depthwise/Relu6] 46 | CONV_2D 37.932 5.033 4.676 7.401% 67.416% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_9_pointwise/Relu6] 47 | DEPTHWISE_CONV_2D 42.609 0.332 0.336 0.532% 67.947% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_10_depthwise/Relu6] 48 | CONV_2D 42.946 4.556 4.368 6.914% 74.862% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_10_pointwise/Relu6] 49 | DEPTHWISE_CONV_2D 47.315 0.387 0.341 0.539% 75.401% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_11_depthwise/Relu6] 50 | CONV_2D 47.657 5.025 4.609 7.296% 82.697% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_11_pointwise/Relu6] 51 | DEPTHWISE_CONV_2D 52.267 0.114 0.109 0.172% 82.869% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_12_depthwise/Relu6] 52 | CONV_2D 52.376 4.174 3.818 6.043% 88.912% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_12_pointwise/Relu6] 53 | DEPTHWISE_CONV_2D 56.195 0.195 0.187 0.296% 89.208% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_13_depthwise/Relu6] 54 | CONV_2D 56.382 7.287 6.428 10.175% 99.383% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_13_pointwise/Relu6] 55 | AVERAGE_POOL_2D 62.812 0.045 0.032 0.051% 99.434% 0.000 1 [pruned/MobilenetV1/Logits/AvgPool_1a/AvgPool] 56 | CONV_2D 62.844 1.346 0.355 0.562% 99.996% 0.000 1 [pruned/MobilenetV1/Logits/Conv2d_1c_1x1/BiasAdd] 57 | RESHAPE 63.201 0.003 0.002 0.004% 100.000% 0.000 1 [pruned/MobilenetV1/Logits/SpatialSqueeze] 58 | 59 | ============================== Top by Computation Time ============================== 60 | [node type] [start] [first] [avg ms] [%] [cdf%] [mem KB] [times called] [Name] 61 | CONV_2D 56.382 7.287 6.428 10.175% 10.175% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_13_pointwise/Relu6] 62 | CONV_2D 32.597 5.453 4.975 7.874% 18.050% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_8_pointwise/Relu6] 63 | CONV_2D 37.932 5.033 4.676 7.401% 25.451% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_9_pointwise/Relu6] 64 | CONV_2D 47.657 5.025 4.609 7.296% 32.747% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_11_pointwise/Relu6] 65 | CONV_2D 27.679 5.002 4.555 7.211% 39.958% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_7_pointwise/Relu6] 66 | CONV_2D 42.946 4.556 4.368 6.914% 46.872% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_10_pointwise/Relu6] 67 | CONV_2D 20.590 4.625 4.338 6.867% 53.740% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_5_pointwise/Relu6] 68 | CONV_2D 12.942 4.343 4.118 6.519% 60.258% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_3_pointwise/Relu6] 69 | CONV_2D 52.376 4.174 3.818 6.043% 66.301% 0.000 1 
[pruned/MobilenetV1/MobilenetV1/Conv2d_12_pointwise/Relu6] 70 | CONV_2D 3.914 3.480 3.081 4.877% 71.179% 0.000 1 [pruned/MobilenetV1/MobilenetV1/Conv2d_1_pointwise/Relu6] 71 | 72 | ============================== Summary by node type ============================== 73 | [Node type] [count] [avg ms] [avg %] [cdf %] [mem KB] [times called] 74 | CONV_2D 15 55.066 87.182% 87.182% 0.000 15 75 | DEPTHWISE_CONV_2D 13 8.062 12.764% 99.946% 0.000 13 76 | AVERAGE_POOL_2D 1 0.032 0.051% 99.997% 0.000 1 77 | RESHAPE 1 0.002 0.003% 100.000% 0.000 1 78 | 79 | Timings (microseconds): count=300 first=71642 curr=64256 min=61734 max=71642 avg=63174.4 std=886 80 | Memory (bytes): count=300 curr=0(all same) 81 | 30 nodes observed 82 | 83 | 84 | -------------------------------------------------------------------------------- /logs/0.75mobilenet_run1.txt: -------------------------------------------------------------------------------- 1 | native : commandlineflags.cc:1511 Ignoring RegisterValidateFunction() for flag pointer 0x58d3a6e330: no flag found at that address 2 | native : benchmark_tflite_model.cc:480 STARTING! 3 | native : benchmark_tflite_model.cc:521 Graph: [/data/local/tmp/mobilenet_v1_0.75_224.tflite] 4 | native : benchmark_tflite_model.cc:522 Input layers: [input] 5 | native : benchmark_tflite_model.cc:523 Input shapes: [1,224,224,3] 6 | native : benchmark_tflite_model.cc:524 Input types: [float] 7 | native : benchmark_tflite_model.cc:525 Output layers: [MobilenetV1/Logits/SpatialSqueeze] 8 | native : benchmark_tflite_model.cc:526 Num runs: [200] 9 | native : benchmark_tflite_model.cc:527 Inter-run delay (seconds): [-1.0] 10 | native : benchmark_tflite_model.cc:528 Num threads: [1] 11 | native : benchmark_tflite_model.cc:529 Benchmark name: [] 12 | native : benchmark_tflite_model.cc:530 Output prefix: [] 13 | native : benchmark_tflite_model.cc:531 Warmup runs: [100] 14 | native : benchmark_tflite_model.cc:532 Use nnapi : [0] 15 | native : benchmark_tflite_model.cc:237 Loaded model /data/local/tmp/mobilenet_v1_0.75_224.tflite 16 | native : benchmark_tflite_model.cc:239 resolved reporter 17 | native : benchmark_tflite_model.cc:560 Initialized session in 0.031199s 18 | native : benchmark_tflite_model.cc:412 Running benchmark for 100 iterations with detailed stat logging: 19 | native : benchmark_tflite_model.cc:440 count=100 first=111747 curr=73555 min=70826 max=111747 avg=73515.1 std=4157 20 | 21 | native : benchmark_tflite_model.cc:412 Running benchmark for 200 iterations with detailed stat logging: 22 | native : benchmark_tflite_model.cc:440 count=200 first=74083 curr=72225 min=70325 max=148718 avg=73083.3 std=5558 23 | 24 | native : benchmark_tflite_model.cc:590 Average inference timings in us: Warmup: 73515.1, no stats: 73083 25 | native : stat_summarizer.cc:358 Number of nodes executed: 31 26 | native : benchmark_tflite_model.cc:108 ============================== Run Order ============================== 27 | [node type] [start] [first] [avg ms] [%] [cdf%] [mem KB] [times called] [Name] 28 | CONV_2D 0.000 5.933 3.486 4.763% 4.763% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_0/Relu6] 29 | DEPTHWISE_CONV_2D 3.488 2.552 2.551 3.485% 8.248% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_1_depthwise/Relu6] 30 | CONV_2D 6.040 3.915 3.770 5.150% 13.398% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_1_pointwise/Relu6] 31 | DEPTHWISE_CONV_2D 9.811 1.590 1.599 2.184% 15.583% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_2_depthwise/Relu6] 32 | CONV_2D 11.411 2.964 2.995 4.093% 19.675% 0.000 1 
[MobilenetV1/MobilenetV1/Conv2d_2_pointwise/Relu6] 33 | DEPTHWISE_CONV_2D 14.408 1.607 1.585 2.166% 21.841% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_3_depthwise/Relu6] 34 | CONV_2D 15.994 5.038 5.094 6.960% 28.801% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_3_pointwise/Relu6] 35 | DEPTHWISE_CONV_2D 21.090 0.868 0.891 1.218% 30.019% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_4_depthwise/Relu6] 36 | CONV_2D 21.982 2.638 2.613 3.571% 33.590% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_4_pointwise/Relu6] 37 | DEPTHWISE_CONV_2D 24.597 0.807 0.764 1.044% 34.634% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_5_depthwise/Relu6] 38 | CONV_2D 25.362 4.765 4.777 6.527% 41.161% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_5_pointwise/Relu6] 39 | DEPTHWISE_CONV_2D 30.140 0.247 0.238 0.325% 41.486% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_6_depthwise/Relu6] 40 | CONV_2D 30.379 2.846 2.748 3.754% 45.240% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_6_pointwise/Relu6] 41 | DEPTHWISE_CONV_2D 33.128 0.377 0.379 0.518% 45.758% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_7_depthwise/Relu6] 42 | CONV_2D 33.507 10.203 5.225 7.139% 52.897% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_7_pointwise/Relu6] 43 | DEPTHWISE_CONV_2D 38.734 0.374 0.373 0.509% 53.406% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_8_depthwise/Relu6] 44 | CONV_2D 39.107 6.084 5.199 7.103% 60.509% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_8_pointwise/Relu6] 45 | DEPTHWISE_CONV_2D 44.306 0.384 0.380 0.519% 61.028% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_9_depthwise/Relu6] 46 | CONV_2D 44.687 6.780 5.230 7.145% 68.173% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_9_pointwise/Relu6] 47 | DEPTHWISE_CONV_2D 49.918 0.381 0.376 0.514% 68.687% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_10_depthwise/Relu6] 48 | CONV_2D 50.294 5.945 5.287 7.224% 75.910% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_10_pointwise/Relu6] 49 | DEPTHWISE_CONV_2D 55.583 0.390 0.385 0.525% 76.436% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_11_depthwise/Relu6] 50 | CONV_2D 55.968 10.180 5.273 7.204% 83.640% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_11_pointwise/Relu6] 51 | DEPTHWISE_CONV_2D 61.242 0.126 0.109 0.149% 83.789% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_12_depthwise/Relu6] 52 | CONV_2D 61.351 7.039 3.789 5.177% 88.965% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_12_pointwise/Relu6] 53 | DEPTHWISE_CONV_2D 65.141 0.232 0.213 0.291% 89.256% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_13_depthwise/Relu6] 54 | CONV_2D 65.354 16.040 7.343 10.033% 99.289% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_13_pointwise/Relu6] 55 | AVERAGE_POOL_2D 72.699 0.059 0.036 0.050% 99.338% 0.000 1 [MobilenetV1/Logits/AvgPool_1a/AvgPool] 56 | CONV_2D 72.735 11.288 0.456 0.622% 99.961% 0.000 1 [MobilenetV1/Logits/Conv2d_1c_1x1/BiasAdd] 57 | SQUEEZE 73.192 0.004 0.002 0.003% 99.964% 0.000 1 [MobilenetV1/Logits/SpatialSqueeze] 58 | SOFTMAX 73.195 0.051 0.026 0.036% 100.000% 0.000 1 [MobilenetV1/Predictions/Reshape_1] 59 | 60 | ============================== Top by Computation Time ============================== 61 | [node type] [start] [first] [avg ms] [%] [cdf%] [mem KB] [times called] [Name] 62 | CONV_2D 65.354 16.040 7.343 10.033% 10.033% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_13_pointwise/Relu6] 63 | CONV_2D 50.294 5.945 5.287 7.224% 17.257% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_10_pointwise/Relu6] 64 | CONV_2D 55.968 10.180 5.273 7.204% 24.461% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_11_pointwise/Relu6] 65 | CONV_2D 44.687 6.780 5.230 7.145% 31.606% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_9_pointwise/Relu6] 66 | CONV_2D 33.507 10.203 5.225 7.139% 38.745% 0.000 
1 [MobilenetV1/MobilenetV1/Conv2d_7_pointwise/Relu6] 67 | CONV_2D 39.107 6.084 5.199 7.103% 45.848% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_8_pointwise/Relu6] 68 | CONV_2D 15.994 5.038 5.094 6.960% 52.808% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_3_pointwise/Relu6] 69 | CONV_2D 25.362 4.765 4.777 6.527% 59.335% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_5_pointwise/Relu6] 70 | CONV_2D 61.351 7.039 3.789 5.177% 64.511% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_12_pointwise/Relu6] 71 | CONV_2D 6.040 3.915 3.770 5.150% 69.662% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_1_pointwise/Relu6] 72 | 73 | ============================== Summary by node type ============================== 74 | [Node type] [count] [avg ms] [avg %] [cdf %] [mem KB] [times called] 75 | CONV_2D 15 63.276 86.472% 86.472% 0.000 15 76 | DEPTHWISE_CONV_2D 13 9.835 13.440% 99.913% 0.000 13 77 | AVERAGE_POOL_2D 1 0.036 0.049% 99.962% 0.000 1 78 | SOFTMAX 1 0.026 0.036% 99.997% 0.000 1 79 | SQUEEZE 1 0.002 0.003% 100.000% 0.000 1 80 | 81 | Timings (microseconds): count=300 first=111707 curr=72194 min=70288 max=148667 avg=73192.1 std=5137 82 | Memory (bytes): count=300 curr=0(all same) 83 | 31 nodes observed 84 | 85 | 86 | -------------------------------------------------------------------------------- /logs/0.75mobilenet_run2.txt: -------------------------------------------------------------------------------- 1 | native : commandlineflags.cc:1511 Ignoring RegisterValidateFunction() for flag pointer 0x5979573330: no flag found at that address 2 | native : benchmark_tflite_model.cc:480 STARTING! 3 | native : benchmark_tflite_model.cc:521 Graph: [/data/local/tmp/mobilenet_v1_0.75_224.tflite] 4 | native : benchmark_tflite_model.cc:522 Input layers: [input] 5 | native : benchmark_tflite_model.cc:523 Input shapes: [1,224,224,3] 6 | native : benchmark_tflite_model.cc:524 Input types: [float] 7 | native : benchmark_tflite_model.cc:525 Output layers: [MobilenetV1/Logits/SpatialSqueeze] 8 | native : benchmark_tflite_model.cc:526 Num runs: [200] 9 | native : benchmark_tflite_model.cc:527 Inter-run delay (seconds): [-1.0] 10 | native : benchmark_tflite_model.cc:528 Num threads: [1] 11 | native : benchmark_tflite_model.cc:529 Benchmark name: [] 12 | native : benchmark_tflite_model.cc:530 Output prefix: [] 13 | native : benchmark_tflite_model.cc:531 Warmup runs: [100] 14 | native : benchmark_tflite_model.cc:532 Use nnapi : [0] 15 | native : benchmark_tflite_model.cc:237 Loaded model /data/local/tmp/mobilenet_v1_0.75_224.tflite 16 | native : benchmark_tflite_model.cc:239 resolved reporter 17 | native : benchmark_tflite_model.cc:560 Initialized session in 0.029303s 18 | native : benchmark_tflite_model.cc:412 Running benchmark for 100 iterations with detailed stat logging: 19 | native : benchmark_tflite_model.cc:440 count=100 first=81042 curr=71927 min=70288 max=81042 avg=72228.3 std=1490 20 | 21 | native : benchmark_tflite_model.cc:412 Running benchmark for 200 iterations with detailed stat logging: 22 | native : benchmark_tflite_model.cc:440 count=200 first=70717 curr=73215 min=70355 max=83401 avg=72131.2 std=1473 23 | 24 | native : benchmark_tflite_model.cc:590 Average inference timings in us: Warmup: 72228.3, no stats: 72131 25 | native : stat_summarizer.cc:358 Number of nodes executed: 31 26 | native : benchmark_tflite_model.cc:108 ============================== Run Order ============================== 27 | [node type] [start] [first] [avg ms] [%] [cdf%] [mem KB] [times called] [Name] 28 | CONV_2D 0.000 7.507 3.458 4.794% 4.794% 0.000 1 
[MobilenetV1/MobilenetV1/Conv2d_0/Relu6] 29 | DEPTHWISE_CONV_2D 3.460 2.422 2.517 3.489% 8.283% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_1_depthwise/Relu6] 30 | CONV_2D 5.977 4.050 3.765 5.219% 13.502% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_1_pointwise/Relu6] 31 | DEPTHWISE_CONV_2D 9.744 1.729 1.601 2.219% 15.722% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_2_depthwise/Relu6] 32 | CONV_2D 11.345 3.005 2.959 4.102% 19.824% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_2_pointwise/Relu6] 33 | DEPTHWISE_CONV_2D 14.306 1.515 1.542 2.138% 21.962% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_3_depthwise/Relu6] 34 | CONV_2D 15.849 5.075 5.116 7.093% 29.055% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_3_pointwise/Relu6] 35 | DEPTHWISE_CONV_2D 20.967 0.920 0.934 1.295% 30.350% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_4_depthwise/Relu6] 36 | CONV_2D 21.902 2.679 2.586 3.585% 33.935% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_4_pointwise/Relu6] 37 | DEPTHWISE_CONV_2D 24.489 1.156 0.756 1.049% 34.983% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_5_depthwise/Relu6] 38 | CONV_2D 25.246 5.263 4.798 6.652% 41.635% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_5_pointwise/Relu6] 39 | DEPTHWISE_CONV_2D 30.045 0.226 0.234 0.324% 41.959% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_6_depthwise/Relu6] 40 | CONV_2D 30.279 2.896 2.689 3.729% 45.688% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_6_pointwise/Relu6] 41 | DEPTHWISE_CONV_2D 32.970 0.369 0.375 0.519% 46.207% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_7_depthwise/Relu6] 42 | CONV_2D 33.345 5.446 5.118 7.096% 53.303% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_7_pointwise/Relu6] 43 | DEPTHWISE_CONV_2D 38.464 0.467 0.366 0.508% 53.811% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_8_depthwise/Relu6] 44 | CONV_2D 38.831 5.230 5.108 7.082% 60.893% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_8_pointwise/Relu6] 45 | DEPTHWISE_CONV_2D 43.941 0.378 0.376 0.521% 61.414% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_9_depthwise/Relu6] 46 | CONV_2D 44.317 5.304 5.125 7.105% 68.519% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_9_pointwise/Relu6] 47 | DEPTHWISE_CONV_2D 49.443 0.370 0.364 0.505% 69.024% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_10_depthwise/Relu6] 48 | CONV_2D 49.807 5.312 5.129 7.112% 76.136% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_10_pointwise/Relu6] 49 | DEPTHWISE_CONV_2D 54.938 0.378 0.373 0.518% 76.653% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_11_depthwise/Relu6] 50 | CONV_2D 55.312 5.299 5.142 7.130% 83.783% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_11_pointwise/Relu6] 51 | DEPTHWISE_CONV_2D 60.455 0.121 0.109 0.151% 83.933% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_12_depthwise/Relu6] 52 | CONV_2D 60.565 4.149 3.668 5.086% 89.019% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_12_pointwise/Relu6] 53 | DEPTHWISE_CONV_2D 64.234 0.225 0.211 0.292% 89.311% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_13_depthwise/Relu6] 54 | CONV_2D 64.445 7.839 7.209 9.996% 99.307% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_13_pointwise/Relu6] 55 | AVERAGE_POOL_2D 71.656 0.050 0.037 0.051% 99.358% 0.000 1 [MobilenetV1/Logits/AvgPool_1a/AvgPool] 56 | CONV_2D 71.693 1.588 0.436 0.604% 99.962% 0.000 1 [MobilenetV1/Logits/Conv2d_1c_1x1/BiasAdd] 57 | SQUEEZE 72.130 0.002 0.002 0.003% 99.965% 0.000 1 [MobilenetV1/Logits/SpatialSqueeze] 58 | SOFTMAX 72.133 0.035 0.025 0.035% 100.000% 0.000 1 [MobilenetV1/Predictions/Reshape_1] 59 | 60 | ============================== Top by Computation Time ============================== 61 | [node type] [start] [first] [avg ms] [%] [cdf%] [mem KB] [times called] [Name] 62 | CONV_2D 64.445 7.839 7.209 9.996% 9.996% 0.000 1 
[MobilenetV1/MobilenetV1/Conv2d_13_pointwise/Relu6] 63 | CONV_2D 55.312 5.299 5.142 7.130% 17.125% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_11_pointwise/Relu6] 64 | CONV_2D 49.807 5.312 5.129 7.112% 24.237% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_10_pointwise/Relu6] 65 | CONV_2D 44.317 5.304 5.125 7.105% 31.342% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_9_pointwise/Relu6] 66 | CONV_2D 33.345 5.446 5.118 7.096% 38.438% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_7_pointwise/Relu6] 67 | CONV_2D 15.849 5.075 5.116 7.093% 45.531% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_3_pointwise/Relu6] 68 | CONV_2D 38.831 5.230 5.108 7.082% 52.613% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_8_pointwise/Relu6] 69 | CONV_2D 25.246 5.263 4.798 6.652% 59.265% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_5_pointwise/Relu6] 70 | CONV_2D 5.977 4.050 3.765 5.219% 64.485% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_1_pointwise/Relu6] 71 | CONV_2D 60.565 4.149 3.668 5.086% 69.571% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_12_pointwise/Relu6] 72 | 73 | ============================== Summary by node type ============================== 74 | [Node type] [count] [avg ms] [avg %] [cdf %] [mem KB] [times called] 75 | CONV_2D 15 62.298 86.392% 86.392% 0.000 15 76 | DEPTHWISE_CONV_2D 13 9.750 13.521% 99.913% 0.000 13 77 | AVERAGE_POOL_2D 1 0.036 0.050% 99.963% 0.000 1 78 | SOFTMAX 1 0.025 0.035% 99.997% 0.000 1 79 | SQUEEZE 1 0.002 0.003% 100.000% 0.000 1 80 | 81 | Timings (microseconds): count=300 first=81005 curr=73179 min=70251 max=83358 avg=72127 std=1480 82 | Memory (bytes): count=300 curr=0(all same) 83 | 31 nodes observed 84 | 85 | 86 | -------------------------------------------------------------------------------- /logs/0.75mobilenet_run3.txt: -------------------------------------------------------------------------------- 1 | native : commandlineflags.cc:1511 Ignoring RegisterValidateFunction() for flag pointer 0x62939e3330: no flag found at that address 2 | native : benchmark_tflite_model.cc:480 STARTING! 
3 | native : benchmark_tflite_model.cc:521 Graph: [/data/local/tmp/mobilenet_v1_0.75_224.tflite] 4 | native : benchmark_tflite_model.cc:522 Input layers: [input] 5 | native : benchmark_tflite_model.cc:523 Input shapes: [1,224,224,3] 6 | native : benchmark_tflite_model.cc:524 Input types: [float] 7 | native : benchmark_tflite_model.cc:525 Output layers: [MobilenetV1/Logits/SpatialSqueeze] 8 | native : benchmark_tflite_model.cc:526 Num runs: [200] 9 | native : benchmark_tflite_model.cc:527 Inter-run delay (seconds): [-1.0] 10 | native : benchmark_tflite_model.cc:528 Num threads: [1] 11 | native : benchmark_tflite_model.cc:529 Benchmark name: [] 12 | native : benchmark_tflite_model.cc:530 Output prefix: [] 13 | native : benchmark_tflite_model.cc:531 Warmup runs: [100] 14 | native : benchmark_tflite_model.cc:532 Use nnapi : [0] 15 | native : benchmark_tflite_model.cc:237 Loaded model /data/local/tmp/mobilenet_v1_0.75_224.tflite 16 | native : benchmark_tflite_model.cc:239 resolved reporter 17 | native : benchmark_tflite_model.cc:560 Initialized session in 0.030714s 18 | native : benchmark_tflite_model.cc:412 Running benchmark for 100 iterations with detailed stat logging: 19 | native : benchmark_tflite_model.cc:440 count=100 first=82902 curr=72893 min=70306 max=82902 avg=72580.2 std=1843 20 | 21 | native : benchmark_tflite_model.cc:412 Running benchmark for 200 iterations with detailed stat logging: 22 | native : benchmark_tflite_model.cc:440 count=200 first=72953 curr=72577 min=70583 max=75696 avg=72164.2 std=1221 23 | 24 | native : benchmark_tflite_model.cc:590 Average inference timings in us: Warmup: 72580.2, no stats: 72164 25 | native : stat_summarizer.cc:358 Number of nodes executed: 31 26 | native : benchmark_tflite_model.cc:108 ============================== Run Order ============================== 27 | [node type] [start] [first] [avg ms] [%] [cdf%] [mem KB] [times called] [Name] 28 | CONV_2D 0.000 6.411 3.478 4.813% 4.813% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_0/Relu6] 29 | DEPTHWISE_CONV_2D 3.480 2.462 2.517 3.483% 8.296% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_1_depthwise/Relu6] 30 | CONV_2D 5.998 3.977 3.797 5.254% 13.550% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_1_pointwise/Relu6] 31 | DEPTHWISE_CONV_2D 9.797 1.649 1.608 2.225% 15.775% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_2_depthwise/Relu6] 32 | CONV_2D 11.406 5.613 2.995 4.144% 19.919% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_2_pointwise/Relu6] 33 | DEPTHWISE_CONV_2D 14.402 1.503 1.564 2.164% 22.083% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_3_depthwise/Relu6] 34 | CONV_2D 15.968 5.291 5.112 7.074% 29.157% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_3_pointwise/Relu6] 35 | DEPTHWISE_CONV_2D 21.082 0.879 0.904 1.251% 30.408% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_4_depthwise/Relu6] 36 | CONV_2D 21.987 2.707 2.610 3.612% 34.020% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_4_pointwise/Relu6] 37 | DEPTHWISE_CONV_2D 24.598 0.780 0.736 1.019% 35.039% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_5_depthwise/Relu6] 38 | CONV_2D 25.335 4.775 4.776 6.609% 41.648% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_5_pointwise/Relu6] 39 | DEPTHWISE_CONV_2D 30.112 0.233 0.223 0.309% 41.957% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_6_depthwise/Relu6] 40 | CONV_2D 30.336 3.069 2.709 3.749% 45.706% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_6_pointwise/Relu6] 41 | DEPTHWISE_CONV_2D 33.046 0.373 0.369 0.511% 46.217% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_7_depthwise/Relu6] 42 | CONV_2D 33.416 5.452 5.141 7.114% 53.331% 0.000 1 
[MobilenetV1/MobilenetV1/Conv2d_7_pointwise/Relu6] 43 | DEPTHWISE_CONV_2D 38.558 0.356 0.361 0.499% 53.830% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_8_depthwise/Relu6] 44 | CONV_2D 38.920 5.499 5.105 7.065% 60.895% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_8_pointwise/Relu6] 45 | DEPTHWISE_CONV_2D 44.026 0.357 0.360 0.498% 61.393% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_9_depthwise/Relu6] 46 | CONV_2D 44.386 5.470 5.146 7.120% 68.513% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_9_pointwise/Relu6] 47 | DEPTHWISE_CONV_2D 49.533 0.352 0.356 0.492% 69.005% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_10_depthwise/Relu6] 48 | CONV_2D 49.889 5.522 5.133 7.103% 76.108% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_10_pointwise/Relu6] 49 | DEPTHWISE_CONV_2D 55.024 0.368 0.366 0.507% 76.615% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_11_depthwise/Relu6] 50 | CONV_2D 55.390 5.452 5.135 7.106% 83.721% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_11_pointwise/Relu6] 51 | DEPTHWISE_CONV_2D 60.527 0.114 0.106 0.147% 83.868% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_12_depthwise/Relu6] 52 | CONV_2D 60.634 4.260 3.730 5.161% 89.029% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_12_pointwise/Relu6] 53 | DEPTHWISE_CONV_2D 64.365 0.220 0.209 0.289% 89.318% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_13_depthwise/Relu6] 54 | CONV_2D 64.574 7.991 7.214 9.983% 99.301% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_13_pointwise/Relu6] 55 | AVERAGE_POOL_2D 71.790 0.053 0.037 0.051% 99.352% 0.000 1 [MobilenetV1/Logits/AvgPool_1a/AvgPool] 56 | CONV_2D 71.827 1.643 0.440 0.608% 99.961% 0.000 1 [MobilenetV1/Logits/Conv2d_1c_1x1/BiasAdd] 57 | SQUEEZE 72.268 0.003 0.003 0.004% 99.964% 0.000 1 [MobilenetV1/Logits/SpatialSqueeze] 58 | SOFTMAX 72.271 0.036 0.026 0.036% 100.000% 0.000 1 [MobilenetV1/Predictions/Reshape_1] 59 | 60 | ============================== Top by Computation Time ============================== 61 | [node type] [start] [first] [avg ms] [%] [cdf%] [mem KB] [times called] [Name] 62 | CONV_2D 64.574 7.991 7.214 9.983% 9.983% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_13_pointwise/Relu6] 63 | CONV_2D 44.386 5.470 5.146 7.120% 17.103% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_9_pointwise/Relu6] 64 | CONV_2D 33.416 5.452 5.141 7.114% 24.218% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_7_pointwise/Relu6] 65 | CONV_2D 55.390 5.452 5.135 7.106% 31.323% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_11_pointwise/Relu6] 66 | CONV_2D 49.889 5.522 5.133 7.103% 38.426% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_10_pointwise/Relu6] 67 | CONV_2D 15.968 5.291 5.112 7.074% 45.500% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_3_pointwise/Relu6] 68 | CONV_2D 38.920 5.499 5.105 7.065% 52.565% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_8_pointwise/Relu6] 69 | CONV_2D 25.335 4.775 4.776 6.609% 59.174% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_5_pointwise/Relu6] 70 | CONV_2D 5.998 3.977 3.797 5.254% 64.428% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_1_pointwise/Relu6] 71 | CONV_2D 60.634 4.260 3.730 5.161% 69.589% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_12_pointwise/Relu6] 72 | 73 | ============================== Summary by node type ============================== 74 | [Node type] [count] [avg ms] [avg %] [cdf %] [mem KB] [times called] 75 | CONV_2D 15 62.513 86.523% 86.523% 0.000 15 76 | DEPTHWISE_CONV_2D 13 9.674 13.390% 99.913% 0.000 13 77 | AVERAGE_POOL_2D 1 0.036 0.050% 99.963% 0.000 1 78 | SOFTMAX 1 0.025 0.035% 99.997% 0.000 1 79 | SQUEEZE 1 0.002 0.003% 100.000% 0.000 1 80 | 81 | Timings (microseconds): count=300 first=82870 curr=72544 min=70271 max=82870 avg=72265.6 std=1470 82 | Memory 
(bytes): count=300 curr=0(all same) 83 | 31 nodes observed 84 | 85 | 86 | -------------------------------------------------------------------------------- /logs/mobilenet_run1.txt: -------------------------------------------------------------------------------- 1 | native : commandlineflags.cc:1511 Ignoring RegisterValidateFunction() for flag pointer 0x59d8297330: no flag found at that address 2 | native : benchmark_tflite_model.cc:480 STARTING! 3 | native : benchmark_tflite_model.cc:521 Graph: [/data/local/tmp/mobilenet_v1_1.0_224.tflite] 4 | native : benchmark_tflite_model.cc:522 Input layers: [input] 5 | native : benchmark_tflite_model.cc:523 Input shapes: [1,224,224,3] 6 | native : benchmark_tflite_model.cc:524 Input types: [float] 7 | native : benchmark_tflite_model.cc:525 Output layers: [MobilenetV1/Logits/SpatialSqueeze] 8 | native : benchmark_tflite_model.cc:526 Num runs: [200] 9 | native : benchmark_tflite_model.cc:527 Inter-run delay (seconds): [-1.0] 10 | native : benchmark_tflite_model.cc:528 Num threads: [1] 11 | native : benchmark_tflite_model.cc:529 Benchmark name: [] 12 | native : benchmark_tflite_model.cc:530 Output prefix: [] 13 | native : benchmark_tflite_model.cc:531 Warmup runs: [100] 14 | native : benchmark_tflite_model.cc:532 Use nnapi : [0] 15 | native : benchmark_tflite_model.cc:237 Loaded model /data/local/tmp/mobilenet_v1_1.0_224.tflite 16 | native : benchmark_tflite_model.cc:239 resolved reporter 17 | native : benchmark_tflite_model.cc:560 Initialized session in 0.026762s 18 | native : benchmark_tflite_model.cc:412 Running benchmark for 100 iterations with detailed stat logging: 19 | native : benchmark_tflite_model.cc:440 count=100 first=135716 curr=122388 min=117861 max=163370 avg=123679 std=5429 20 | 21 | native : benchmark_tflite_model.cc:412 Running benchmark for 200 iterations with detailed stat logging: 22 | native : benchmark_tflite_model.cc:440 count=200 first=129190 curr=123555 min=117812 max=152764 avg=123162 std=3455 23 | 24 | native : benchmark_tflite_model.cc:590 Average inference timings in us: Warmup: 123679, no stats: 123161 25 | native : stat_summarizer.cc:358 Number of nodes executed: 31 26 | native : benchmark_tflite_model.cc:108 ============================== Run Order ============================== 27 | [node type] [start] [first] [avg ms] [%] [cdf%] [mem KB] [times called] [Name] 28 | CONV_2D 0.000 6.743 4.245 3.443% 3.443% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_0/Relu6] 29 | DEPTHWISE_CONV_2D 4.247 2.288 2.336 1.894% 5.337% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_1_depthwise/Relu6] 30 | CONV_2D 6.584 6.314 5.974 4.845% 10.182% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_1_pointwise/Relu6] 31 | DEPTHWISE_CONV_2D 12.560 2.800 2.852 2.313% 12.496% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_2_depthwise/Relu6] 32 | CONV_2D 15.413 4.843 4.891 3.967% 16.463% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_2_pointwise/Relu6] 33 | DEPTHWISE_CONV_2D 20.306 2.083 2.142 1.738% 18.200% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_3_depthwise/Relu6] 34 | CONV_2D 22.450 8.514 8.747 7.094% 25.294% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_3_pointwise/Relu6] 35 | DEPTHWISE_CONV_2D 31.198 0.989 0.969 0.786% 26.081% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_4_depthwise/Relu6] 36 | CONV_2D 32.169 4.388 4.503 3.652% 29.733% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_4_pointwise/Relu6] 37 | DEPTHWISE_CONV_2D 36.672 1.174 1.062 0.861% 30.594% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_5_depthwise/Relu6] 38 | CONV_2D 37.735 8.732 8.766 7.110% 37.704% 0.000 1 
[MobilenetV1/MobilenetV1/Conv2d_5_pointwise/Relu6] 39 | DEPTHWISE_CONV_2D 46.503 0.396 0.384 0.311% 38.015% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_6_depthwise/Relu6] 40 | CONV_2D 46.888 5.019 4.967 4.028% 42.043% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_6_pointwise/Relu6] 41 | DEPTHWISE_CONV_2D 51.856 0.525 0.530 0.429% 42.473% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_7_depthwise/Relu6] 42 | CONV_2D 52.386 12.734 9.284 7.530% 50.003% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_7_pointwise/Relu6] 43 | DEPTHWISE_CONV_2D 61.671 0.529 0.507 0.411% 50.414% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_8_depthwise/Relu6] 44 | CONV_2D 62.179 9.692 9.249 7.501% 57.915% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_8_pointwise/Relu6] 45 | DEPTHWISE_CONV_2D 71.429 0.506 0.496 0.402% 58.318% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_9_depthwise/Relu6] 46 | CONV_2D 71.925 9.360 9.270 7.518% 65.836% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_9_pointwise/Relu6] 47 | DEPTHWISE_CONV_2D 81.196 0.519 0.526 0.427% 66.263% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_10_depthwise/Relu6] 48 | CONV_2D 81.723 9.741 9.378 7.606% 73.869% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_10_pointwise/Relu6] 49 | DEPTHWISE_CONV_2D 91.103 0.500 0.536 0.435% 74.304% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_11_depthwise/Relu6] 50 | CONV_2D 91.639 9.674 9.429 7.648% 81.951% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_11_pointwise/Relu6] 51 | DEPTHWISE_CONV_2D 101.070 0.166 0.158 0.128% 82.080% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_12_depthwise/Relu6] 52 | CONV_2D 101.228 7.433 7.111 5.767% 87.847% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_12_pointwise/Relu6] 53 | DEPTHWISE_CONV_2D 108.340 0.296 0.290 0.236% 88.083% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_13_depthwise/Relu6] 54 | CONV_2D 108.631 15.173 13.989 11.346% 99.428% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_13_pointwise/Relu6] 55 | AVERAGE_POOL_2D 122.621 0.056 0.045 0.036% 99.465% 0.000 1 [MobilenetV1/Logits/AvgPool_1a/AvgPool] 56 | CONV_2D 122.666 2.151 0.623 0.505% 99.970% 0.000 1 [MobilenetV1/Logits/Conv2d_1c_1x1/BiasAdd] 57 | SQUEEZE 123.291 0.002 0.003 0.002% 99.972% 0.000 1 [MobilenetV1/Logits/SpatialSqueeze] 58 | SOFTMAX 123.294 2.337 0.034 0.028% 100.000% 0.000 1 [MobilenetV1/Predictions/Reshape_1] 59 | 60 | ============================== Top by Computation Time ============================== 61 | [node type] [start] [first] [avg ms] [%] [cdf%] [mem KB] [times called] [Name] 62 | CONV_2D 108.631 15.173 13.989 11.346% 11.346% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_13_pointwise/Relu6] 63 | CONV_2D 91.639 9.674 9.429 7.648% 18.993% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_11_pointwise/Relu6] 64 | CONV_2D 81.723 9.741 9.378 7.606% 26.600% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_10_pointwise/Relu6] 65 | CONV_2D 52.386 12.734 9.284 7.530% 34.129% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_7_pointwise/Relu6] 66 | CONV_2D 71.925 9.360 9.270 7.518% 41.648% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_9_pointwise/Relu6] 67 | CONV_2D 62.179 9.692 9.249 7.501% 49.149% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_8_pointwise/Relu6] 68 | CONV_2D 37.735 8.732 8.766 7.110% 56.259% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_5_pointwise/Relu6] 69 | CONV_2D 22.450 8.514 8.747 7.094% 63.353% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_3_pointwise/Relu6] 70 | CONV_2D 101.228 7.433 7.111 5.767% 69.121% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_12_pointwise/Relu6] 71 | CONV_2D 6.584 6.314 5.974 4.845% 73.966% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_1_pointwise/Relu6] 72 | 73 | ============================== Summary by node type 
============================== 74 | [Node type] [count] [avg ms] [avg %] [cdf %] [mem KB] [times called] 75 | CONV_2D 15 110.416 89.566% 89.566% 0.000 15 76 | DEPTHWISE_CONV_2D 13 12.783 10.369% 99.935% 0.000 13 77 | AVERAGE_POOL_2D 1 0.044 0.036% 99.971% 0.000 1 78 | SOFTMAX 1 0.034 0.028% 99.998% 0.000 1 79 | SQUEEZE 1 0.002 0.002% 100.000% 0.000 1 80 | 81 | Timings (microseconds): count=300 first=135677 curr=123515 min=117771 max=163324 avg=123295 std=4222 82 | Memory (bytes): count=300 curr=0(all same) 83 | 31 nodes observed 84 | 85 | 86 | -------------------------------------------------------------------------------- /logs/mobilenet_run2.txt: -------------------------------------------------------------------------------- 1 | native : commandlineflags.cc:1511 Ignoring RegisterValidateFunction() for flag pointer 0x55d9227330: no flag found at that address 2 | native : benchmark_tflite_model.cc:480 STARTING! 3 | native : benchmark_tflite_model.cc:521 Graph: [/data/local/tmp/mobilenet_v1_1.0_224.tflite] 4 | native : benchmark_tflite_model.cc:522 Input layers: [input] 5 | native : benchmark_tflite_model.cc:523 Input shapes: [1,224,224,3] 6 | native : benchmark_tflite_model.cc:524 Input types: [float] 7 | native : benchmark_tflite_model.cc:525 Output layers: [MobilenetV1/Logits/SpatialSqueeze] 8 | native : benchmark_tflite_model.cc:526 Num runs: [200] 9 | native : benchmark_tflite_model.cc:527 Inter-run delay (seconds): [-1.0] 10 | native : benchmark_tflite_model.cc:528 Num threads: [1] 11 | native : benchmark_tflite_model.cc:529 Benchmark name: [] 12 | native : benchmark_tflite_model.cc:530 Output prefix: [] 13 | native : benchmark_tflite_model.cc:531 Warmup runs: [100] 14 | native : benchmark_tflite_model.cc:532 Use nnapi : [0] 15 | native : benchmark_tflite_model.cc:237 Loaded model /data/local/tmp/mobilenet_v1_1.0_224.tflite 16 | native : benchmark_tflite_model.cc:239 resolved reporter 17 | native : benchmark_tflite_model.cc:560 Initialized session in 0.026589s 18 | native : benchmark_tflite_model.cc:412 Running benchmark for 100 iterations with detailed stat logging: 19 | native : benchmark_tflite_model.cc:440 count=100 first=135303 curr=122071 min=118505 max=135303 avg=122219 std=2238 20 | 21 | native : benchmark_tflite_model.cc:412 Running benchmark for 200 iterations with detailed stat logging: 22 | native : benchmark_tflite_model.cc:440 count=200 first=121404 curr=120166 min=118621 max=199349 avg=123034 std=5962 23 | 24 | native : benchmark_tflite_model.cc:590 Average inference timings in us: Warmup: 122219, no stats: 123033 25 | native : stat_summarizer.cc:358 Number of nodes executed: 31 26 | native : benchmark_tflite_model.cc:108 ============================== Run Order ============================== 27 | [node type] [start] [first] [avg ms] [%] [cdf%] [mem KB] [times called] [Name] 28 | CONV_2D 0.000 7.306 4.203 3.425% 3.425% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_0/Relu6] 29 | DEPTHWISE_CONV_2D 4.205 2.262 2.288 1.865% 5.290% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_1_depthwise/Relu6] 30 | CONV_2D 6.495 6.583 5.974 4.868% 10.158% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_1_pointwise/Relu6] 31 | DEPTHWISE_CONV_2D 12.471 2.821 2.784 2.268% 12.426% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_2_depthwise/Relu6] 32 | CONV_2D 15.257 5.019 4.875 3.972% 16.398% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_2_pointwise/Relu6] 33 | DEPTHWISE_CONV_2D 20.133 2.163 2.186 1.781% 18.180% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_3_depthwise/Relu6] 34 | CONV_2D 22.321 8.636 8.659 7.056% 25.235% 0.000 1 
[MobilenetV1/MobilenetV1/Conv2d_3_pointwise/Relu6] 35 | DEPTHWISE_CONV_2D 30.981 1.013 1.014 0.826% 26.061% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_4_depthwise/Relu6] 36 | CONV_2D 31.996 4.440 4.461 3.635% 29.696% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_4_pointwise/Relu6] 37 | DEPTHWISE_CONV_2D 36.459 1.036 1.073 0.874% 30.571% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_5_depthwise/Relu6] 38 | CONV_2D 37.533 8.727 8.708 7.096% 37.666% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_5_pointwise/Relu6] 39 | DEPTHWISE_CONV_2D 46.242 0.389 0.398 0.325% 37.991% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_6_depthwise/Relu6] 40 | CONV_2D 46.641 5.184 4.910 4.001% 41.992% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_6_pointwise/Relu6] 41 | DEPTHWISE_CONV_2D 51.552 0.525 0.524 0.427% 42.419% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_7_depthwise/Relu6] 42 | CONV_2D 52.077 10.955 9.180 7.480% 49.900% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_7_pointwise/Relu6] 43 | DEPTHWISE_CONV_2D 61.259 0.569 0.514 0.419% 50.319% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_8_depthwise/Relu6] 44 | CONV_2D 61.773 9.633 9.246 7.534% 57.853% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_8_pointwise/Relu6] 45 | DEPTHWISE_CONV_2D 71.020 0.495 0.504 0.410% 58.263% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_9_depthwise/Relu6] 46 | CONV_2D 71.525 9.506 9.271 7.554% 65.817% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_9_pointwise/Relu6] 47 | DEPTHWISE_CONV_2D 80.797 0.513 0.525 0.427% 66.245% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_10_depthwise/Relu6] 48 | CONV_2D 81.322 9.944 9.328 7.601% 73.845% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_10_pointwise/Relu6] 49 | DEPTHWISE_CONV_2D 90.651 0.571 0.524 0.427% 74.272% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_11_depthwise/Relu6] 50 | CONV_2D 91.175 9.569 9.370 7.635% 81.908% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_11_pointwise/Relu6] 51 | DEPTHWISE_CONV_2D 100.547 0.259 0.159 0.129% 82.037% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_12_depthwise/Relu6] 52 | CONV_2D 100.706 8.043 7.109 5.793% 87.830% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_12_pointwise/Relu6] 53 | DEPTHWISE_CONV_2D 107.816 0.308 0.276 0.225% 88.055% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_13_depthwise/Relu6] 54 | CONV_2D 108.093 16.489 13.992 11.401% 99.456% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_13_pointwise/Relu6] 55 | AVERAGE_POOL_2D 122.086 0.059 0.044 0.036% 99.492% 0.000 1 [MobilenetV1/Logits/AvgPool_1a/AvgPool] 56 | CONV_2D 122.130 2.211 0.595 0.485% 99.977% 0.000 1 [MobilenetV1/Logits/Conv2d_1c_1x1/BiasAdd] 57 | SQUEEZE 122.727 0.002 0.003 0.002% 99.979% 0.000 1 [MobilenetV1/Logits/SpatialSqueeze] 58 | SOFTMAX 122.730 0.036 0.026 0.021% 100.000% 0.000 1 [MobilenetV1/Predictions/Reshape_1] 59 | 60 | ============================== Top by Computation Time ============================== 61 | [node type] [start] [first] [avg ms] [%] [cdf%] [mem KB] [times called] [Name] 62 | CONV_2D 108.093 16.489 13.992 11.401% 11.401% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_13_pointwise/Relu6] 63 | CONV_2D 91.175 9.569 9.370 7.635% 19.037% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_11_pointwise/Relu6] 64 | CONV_2D 81.322 9.944 9.328 7.601% 26.637% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_10_pointwise/Relu6] 65 | CONV_2D 71.525 9.506 9.271 7.554% 34.192% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_9_pointwise/Relu6] 66 | CONV_2D 61.773 9.633 9.246 7.534% 41.726% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_8_pointwise/Relu6] 67 | CONV_2D 52.077 10.955 9.180 7.480% 49.206% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_7_pointwise/Relu6] 68 | CONV_2D 37.533 8.727 8.708 7.096% 56.302% 0.000 1 
[MobilenetV1/MobilenetV1/Conv2d_5_pointwise/Relu6] 69 | CONV_2D 22.321 8.636 8.659 7.056% 63.358% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_3_pointwise/Relu6] 70 | CONV_2D 100.706 8.043 7.109 5.793% 69.150% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_12_pointwise/Relu6] 71 | CONV_2D 6.495 6.583 5.974 4.868% 74.018% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_1_pointwise/Relu6] 72 | 73 | ============================== Summary by node type ============================== 74 | [Node type] [count] [avg ms] [avg %] [cdf %] [mem KB] [times called] 75 | CONV_2D 15 109.872 89.542% 89.542% 0.000 15 76 | DEPTHWISE_CONV_2D 13 12.762 10.401% 99.943% 0.000 13 77 | AVERAGE_POOL_2D 1 0.043 0.035% 99.978% 0.000 1 78 | SOFTMAX 1 0.025 0.020% 99.998% 0.000 1 79 | SQUEEZE 1 0.002 0.002% 100.000% 0.000 1 80 | 81 | Timings (microseconds): count=300 first=135266 curr=120119 min=118473 max=199289 avg=122721 std=5050 82 | Memory (bytes): count=300 curr=0(all same) 83 | 31 nodes observed 84 | 85 | 86 | -------------------------------------------------------------------------------- /logs/mobilenet_run3.txt: -------------------------------------------------------------------------------- 1 | native : commandlineflags.cc:1511 Ignoring RegisterValidateFunction() for flag pointer 0x577bec6330: no flag found at that address 2 | native : benchmark_tflite_model.cc:480 STARTING! 3 | native : benchmark_tflite_model.cc:521 Graph: [/data/local/tmp/mobilenet_v1_1.0_224.tflite] 4 | native : benchmark_tflite_model.cc:522 Input layers: [input] 5 | native : benchmark_tflite_model.cc:523 Input shapes: [1,224,224,3] 6 | native : benchmark_tflite_model.cc:524 Input types: [float] 7 | native : benchmark_tflite_model.cc:525 Output layers: [MobilenetV1/Logits/SpatialSqueeze] 8 | native : benchmark_tflite_model.cc:526 Num runs: [200] 9 | native : benchmark_tflite_model.cc:527 Inter-run delay (seconds): [-1.0] 10 | native : benchmark_tflite_model.cc:528 Num threads: [1] 11 | native : benchmark_tflite_model.cc:529 Benchmark name: [] 12 | native : benchmark_tflite_model.cc:530 Output prefix: [] 13 | native : benchmark_tflite_model.cc:531 Warmup runs: [100] 14 | native : benchmark_tflite_model.cc:532 Use nnapi : [0] 15 | native : benchmark_tflite_model.cc:237 Loaded model /data/local/tmp/mobilenet_v1_1.0_224.tflite 16 | native : benchmark_tflite_model.cc:239 resolved reporter 17 | native : benchmark_tflite_model.cc:560 Initialized session in 0.028279s 18 | native : benchmark_tflite_model.cc:412 Running benchmark for 100 iterations with detailed stat logging: 19 | native : benchmark_tflite_model.cc:440 count=100 first=133623 curr=126794 min=118587 max=133623 avg=123052 std=2545 20 | 21 | native : benchmark_tflite_model.cc:412 Running benchmark for 200 iterations with detailed stat logging: 22 | native : benchmark_tflite_model.cc:440 count=200 first=155451 curr=125038 min=118880 max=187975 avg=127212 std=7549 23 | 24 | native : benchmark_tflite_model.cc:590 Average inference timings in us: Warmup: 123052, no stats: 127212 25 | native : stat_summarizer.cc:358 Number of nodes executed: 31 26 | native : benchmark_tflite_model.cc:108 ============================== Run Order ============================== 27 | [node type] [start] [first] [avg ms] [%] [cdf%] [mem KB] [times called] [Name] 28 | CONV_2D 0.000 8.798 4.461 3.547% 3.547% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_0/Relu6] 29 | DEPTHWISE_CONV_2D 4.463 2.829 2.315 1.840% 5.387% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_1_depthwise/Relu6] 30 | CONV_2D 6.779 8.886 5.960 4.738% 10.125% 0.000 1 
[MobilenetV1/MobilenetV1/Conv2d_1_pointwise/Relu6] 31 | DEPTHWISE_CONV_2D 12.741 2.738 2.847 2.263% 12.388% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_2_depthwise/Relu6] 32 | CONV_2D 15.588 4.810 4.978 3.957% 16.345% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_2_pointwise/Relu6] 33 | DEPTHWISE_CONV_2D 20.568 2.086 2.139 1.700% 18.046% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_3_depthwise/Relu6] 34 | CONV_2D 22.708 8.560 8.714 6.928% 24.974% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_3_pointwise/Relu6] 35 | DEPTHWISE_CONV_2D 31.424 0.948 0.984 0.782% 25.756% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_4_depthwise/Relu6] 36 | CONV_2D 32.410 4.427 4.502 3.579% 29.335% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_4_pointwise/Relu6] 37 | DEPTHWISE_CONV_2D 36.913 1.078 1.085 0.863% 30.198% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_5_depthwise/Relu6] 38 | CONV_2D 37.999 8.485 8.829 7.019% 37.217% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_5_pointwise/Relu6] 39 | DEPTHWISE_CONV_2D 46.829 0.356 0.382 0.303% 37.520% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_6_depthwise/Relu6] 40 | CONV_2D 47.211 4.972 4.950 3.935% 41.455% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_6_pointwise/Relu6] 41 | DEPTHWISE_CONV_2D 52.162 0.517 0.521 0.414% 41.869% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_7_depthwise/Relu6] 42 | CONV_2D 52.683 9.370 9.326 7.414% 49.284% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_7_pointwise/Relu6] 43 | DEPTHWISE_CONV_2D 62.011 0.499 0.516 0.410% 49.694% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_8_depthwise/Relu6] 44 | CONV_2D 62.528 9.320 9.234 7.341% 57.035% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_8_pointwise/Relu6] 45 | DEPTHWISE_CONV_2D 71.763 0.660 0.516 0.410% 57.445% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_9_depthwise/Relu6] 46 | CONV_2D 72.279 9.270 9.201 7.315% 64.760% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_9_pointwise/Relu6] 47 | DEPTHWISE_CONV_2D 81.481 0.490 0.504 0.400% 65.161% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_10_depthwise/Relu6] 48 | CONV_2D 81.986 9.384 9.324 7.413% 72.573% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_10_pointwise/Relu6] 49 | DEPTHWISE_CONV_2D 91.311 0.510 0.517 0.411% 72.984% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_11_depthwise/Relu6] 50 | CONV_2D 91.829 9.312 9.514 7.564% 80.548% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_11_pointwise/Relu6] 51 | DEPTHWISE_CONV_2D 101.344 0.167 0.157 0.125% 80.673% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_12_depthwise/Relu6] 52 | CONV_2D 101.502 7.569 7.914 6.292% 86.965% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_12_pointwise/Relu6] 53 | DEPTHWISE_CONV_2D 109.418 0.291 0.287 0.228% 87.193% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_13_depthwise/Relu6] 54 | CONV_2D 109.705 15.029 15.311 12.172% 99.366% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_13_pointwise/Relu6] 55 | AVERAGE_POOL_2D 125.018 0.054 0.048 0.038% 99.404% 0.000 1 [MobilenetV1/Logits/AvgPool_1a/AvgPool] 56 | CONV_2D 125.066 2.131 0.719 0.572% 99.975% 0.000 1 [MobilenetV1/Logits/Conv2d_1c_1x1/BiasAdd] 57 | SQUEEZE 125.787 0.002 0.003 0.002% 99.977% 0.000 1 [MobilenetV1/Logits/SpatialSqueeze] 58 | SOFTMAX 125.790 0.035 0.028 0.023% 100.000% 0.000 1 [MobilenetV1/Predictions/Reshape_1] 59 | 60 | ============================== Top by Computation Time ============================== 61 | [node type] [start] [first] [avg ms] [%] [cdf%] [mem KB] [times called] [Name] 62 | CONV_2D 109.705 15.029 15.311 12.172% 12.172% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_13_pointwise/Relu6] 63 | CONV_2D 91.829 9.312 9.514 7.564% 19.736% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_11_pointwise/Relu6] 64 | CONV_2D 52.683 9.370 9.326 7.414% 
27.151% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_7_pointwise/Relu6] 65 | CONV_2D 81.986 9.384 9.324 7.413% 34.564% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_10_pointwise/Relu6] 66 | CONV_2D 62.528 9.320 9.234 7.341% 41.905% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_8_pointwise/Relu6] 67 | CONV_2D 72.279 9.270 9.201 7.315% 49.220% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_9_pointwise/Relu6] 68 | CONV_2D 37.999 8.485 8.829 7.019% 56.238% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_5_pointwise/Relu6] 69 | CONV_2D 22.708 8.560 8.714 6.928% 63.166% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_3_pointwise/Relu6] 70 | CONV_2D 101.502 7.569 7.914 6.292% 69.458% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_12_pointwise/Relu6] 71 | CONV_2D 6.779 8.886 5.960 4.738% 74.196% 0.000 1 [MobilenetV1/MobilenetV1/Conv2d_1_pointwise/Relu6] 72 | 73 | ============================== Summary by node type ============================== 74 | [Node type] [count] [avg ms] [avg %] [cdf %] [mem KB] [times called] 75 | CONV_2D 15 112.931 89.792% 89.792% 0.000 15 76 | DEPTHWISE_CONV_2D 13 12.762 10.147% 99.939% 0.000 13 77 | AVERAGE_POOL_2D 1 0.047 0.037% 99.976% 0.000 1 78 | SOFTMAX 1 0.028 0.022% 99.998% 0.000 1 79 | SQUEEZE 1 0.002 0.002% 100.000% 0.000 1 80 | 81 | Timings (microseconds): count=300 first=133583 curr=125003 min=118546 max=187932 avg=125785 std=6632 82 | Memory (bytes): count=300 curr=0(all same) 83 | 31 nodes observed 84 | 85 | 86 | -------------------------------------------------------------------------------- /models/mobilenet_v1.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import math 4 | 5 | 6 | def conv_bn(inp, oup, stride): 7 | return nn.Sequential( 8 | nn.Conv2d(inp, oup, 3, stride, 1, bias=False), 9 | nn.BatchNorm2d(oup), 10 | nn.ReLU(inplace=True) 11 | ) 12 | 13 | 14 | def conv_dw(inp, oup, stride): 15 | return nn.Sequential( 16 | nn.Conv2d(inp, inp, 3, stride, 1, groups=inp, bias=False), 17 | nn.BatchNorm2d(inp), 18 | nn.ReLU(inplace=True), 19 | 20 | nn.Conv2d(inp, oup, 1, 1, 0, bias=False), 21 | nn.BatchNorm2d(oup), 22 | nn.ReLU(inplace=True), 23 | ) 24 | 25 | 26 | class MobileNet(nn.Module): 27 | def __init__(self, n_class, profile='1.0'): 28 | super(MobileNet, self).__init__() 29 | # original 30 | if profile == '1.0': 31 | in_planes = 32 32 | cfg = [64, (128, 2), 128, (256, 2), 256, (512, 2), 512, 512, 512, 512, 512, (1024, 2), 1024] 33 | # 0.75 34 | elif profile == '0.75': 35 | in_planes = 24 36 | cfg = [48, (96, 2), 96, (192, 2), 192, (384, 2), 384, 384, 384, 384, 384, (768, 2), 768] 37 | # 0.5 AMC 38 | elif profile == '0.5flops': 39 | in_planes = 24 40 | cfg = [48, (96, 2), 80, (192, 2), 200, (328, 2), 352, 368, 360, 328, 400, (736, 2), 752] 41 | # 0.5 time 42 | elif profile == '0.5time': 43 | in_planes = 16 44 | cfg = [48, (88, 2), 80, (192, 2), 168, (336, 2), 360, 360, 352, 352, 368, (768, 2), 680] 45 | else: 46 | raise NotImplementedError 47 | 48 | self.conv1 = conv_bn(3, in_planes, stride=2) 49 | 50 | self.features = self._make_layers(in_planes, cfg, conv_dw) 51 | 52 | self.classifier = nn.Sequential( 53 | nn.Dropout(0.2), 54 | nn.Linear(cfg[-1], n_class), 55 | ) 56 | 57 | self._initialize_weights() 58 | 59 | def forward(self, x): 60 | x = self.conv1(x) 61 | x = self.features(x) 62 | x = x.mean(3).mean(2) # global average pooling 63 | 64 | x = self.classifier(x) 65 | return x 66 | 67 | def _make_layers(self, in_planes, cfg, layer): 68 | layers = [] 69 | for x in cfg: 70 | out_planes = x if isinstance(x, int) else 
x[0] 71 | stride = 1 if isinstance(x, int) else x[1] 72 | layers.append(layer(in_planes, out_planes, stride)) 73 | in_planes = out_planes 74 | return nn.Sequential(*layers) 75 | 76 | def _initialize_weights(self): 77 | for m in self.modules(): 78 | if isinstance(m, nn.Conv2d): 79 | n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels 80 | m.weight.data.normal_(0, math.sqrt(2. / n)) 81 | if m.bias is not None: 82 | m.bias.data.zero_() 83 | elif isinstance(m, nn.BatchNorm2d): 84 | m.weight.data.fill_(1) 85 | m.bias.data.zero_() 86 | elif isinstance(m, nn.Linear): 87 | n = m.weight.size(1) 88 | m.weight.data.normal_(0, 0.01) 89 | m.bias.data.zero_() 90 | -------------------------------------------------------------------------------- /models/mobilenet_v1_tf.py: -------------------------------------------------------------------------------- 1 | # Copyright 2017 The TensorFlow Authors. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | # ============================================================================= 15 | """MobileNet v1. 16 | 17 | MobileNet is a general architecture and can be used for multiple use cases. 18 | Depending on the use case, it can use different input layer size and different 19 | head (for example: embeddings, localization and classification). 20 | 21 | As described in https://arxiv.org/abs/1704.04861. 22 | 23 | MobileNets: Efficient Convolutional Neural Networks for 24 | Mobile Vision Applications 25 | Andrew G. 
Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, 26 | Tobias Weyand, Marco Andreetto, Hartwig Adam 27 | 28 | 100% Mobilenet V1 (base) with input size 224x224: 29 | 30 | See mobilenet_v1() 31 | 32 | Layer params macs 33 | -------------------------------------------------------------------------------- 34 | MobilenetV1/Conv2d_0/Conv2D: 864 10,838,016 35 | MobilenetV1/Conv2d_1_depthwise/depthwise: 288 3,612,672 36 | MobilenetV1/Conv2d_1_pointwise/Conv2D: 2,048 25,690,112 37 | MobilenetV1/Conv2d_2_depthwise/depthwise: 576 1,806,336 38 | MobilenetV1/Conv2d_2_pointwise/Conv2D: 8,192 25,690,112 39 | MobilenetV1/Conv2d_3_depthwise/depthwise: 1,152 3,612,672 40 | MobilenetV1/Conv2d_3_pointwise/Conv2D: 16,384 51,380,224 41 | MobilenetV1/Conv2d_4_depthwise/depthwise: 1,152 903,168 42 | MobilenetV1/Conv2d_4_pointwise/Conv2D: 32,768 25,690,112 43 | MobilenetV1/Conv2d_5_depthwise/depthwise: 2,304 1,806,336 44 | MobilenetV1/Conv2d_5_pointwise/Conv2D: 65,536 51,380,224 45 | MobilenetV1/Conv2d_6_depthwise/depthwise: 2,304 451,584 46 | MobilenetV1/Conv2d_6_pointwise/Conv2D: 131,072 25,690,112 47 | MobilenetV1/Conv2d_7_depthwise/depthwise: 4,608 903,168 48 | MobilenetV1/Conv2d_7_pointwise/Conv2D: 262,144 51,380,224 49 | MobilenetV1/Conv2d_8_depthwise/depthwise: 4,608 903,168 50 | MobilenetV1/Conv2d_8_pointwise/Conv2D: 262,144 51,380,224 51 | MobilenetV1/Conv2d_9_depthwise/depthwise: 4,608 903,168 52 | MobilenetV1/Conv2d_9_pointwise/Conv2D: 262,144 51,380,224 53 | MobilenetV1/Conv2d_10_depthwise/depthwise: 4,608 903,168 54 | MobilenetV1/Conv2d_10_pointwise/Conv2D: 262,144 51,380,224 55 | MobilenetV1/Conv2d_11_depthwise/depthwise: 4,608 903,168 56 | MobilenetV1/Conv2d_11_pointwise/Conv2D: 262,144 51,380,224 57 | MobilenetV1/Conv2d_12_depthwise/depthwise: 4,608 225,792 58 | MobilenetV1/Conv2d_12_pointwise/Conv2D: 524,288 25,690,112 59 | MobilenetV1/Conv2d_13_depthwise/depthwise: 9,216 451,584 60 | MobilenetV1/Conv2d_13_pointwise/Conv2D: 1,048,576 51,380,224 61 | -------------------------------------------------------------------------------- 62 | Total: 3,185,088 567,716,352 63 | 64 | 65 | 75% Mobilenet V1 (base) with input size 128x128: 66 | 67 | See mobilenet_v1_075() 68 | 69 | Layer params macs 70 | -------------------------------------------------------------------------------- 71 | MobilenetV1/Conv2d_0/Conv2D: 648 2,654,208 72 | MobilenetV1/Conv2d_1_depthwise/depthwise: 216 884,736 73 | MobilenetV1/Conv2d_1_pointwise/Conv2D: 1,152 4,718,592 74 | MobilenetV1/Conv2d_2_depthwise/depthwise: 432 442,368 75 | MobilenetV1/Conv2d_2_pointwise/Conv2D: 4,608 4,718,592 76 | MobilenetV1/Conv2d_3_depthwise/depthwise: 864 884,736 77 | MobilenetV1/Conv2d_3_pointwise/Conv2D: 9,216 9,437,184 78 | MobilenetV1/Conv2d_4_depthwise/depthwise: 864 221,184 79 | MobilenetV1/Conv2d_4_pointwise/Conv2D: 18,432 4,718,592 80 | MobilenetV1/Conv2d_5_depthwise/depthwise: 1,728 442,368 81 | MobilenetV1/Conv2d_5_pointwise/Conv2D: 36,864 9,437,184 82 | MobilenetV1/Conv2d_6_depthwise/depthwise: 1,728 110,592 83 | MobilenetV1/Conv2d_6_pointwise/Conv2D: 73,728 4,718,592 84 | MobilenetV1/Conv2d_7_depthwise/depthwise: 3,456 221,184 85 | MobilenetV1/Conv2d_7_pointwise/Conv2D: 147,456 9,437,184 86 | MobilenetV1/Conv2d_8_depthwise/depthwise: 3,456 221,184 87 | MobilenetV1/Conv2d_8_pointwise/Conv2D: 147,456 9,437,184 88 | MobilenetV1/Conv2d_9_depthwise/depthwise: 3,456 221,184 89 | MobilenetV1/Conv2d_9_pointwise/Conv2D: 147,456 9,437,184 90 | MobilenetV1/Conv2d_10_depthwise/depthwise: 3,456 221,184 91 | 
MobilenetV1/Conv2d_10_pointwise/Conv2D: 147,456 9,437,184 92 | MobilenetV1/Conv2d_11_depthwise/depthwise: 3,456 221,184 93 | MobilenetV1/Conv2d_11_pointwise/Conv2D: 147,456 9,437,184 94 | MobilenetV1/Conv2d_12_depthwise/depthwise: 3,456 55,296 95 | MobilenetV1/Conv2d_12_pointwise/Conv2D: 294,912 4,718,592 96 | MobilenetV1/Conv2d_13_depthwise/depthwise: 6,912 110,592 97 | MobilenetV1/Conv2d_13_pointwise/Conv2D: 589,824 9,437,184 98 | -------------------------------------------------------------------------------- 99 | Total: 1,800,144 106,002,432 100 | 101 | """ 102 | 103 | # Tensorflow mandates these. 104 | from __future__ import absolute_import 105 | from __future__ import division 106 | from __future__ import print_function 107 | 108 | from collections import namedtuple 109 | import functools 110 | 111 | import tensorflow as tf 112 | 113 | 114 | def wrapped_partial(func, *args, **kwargs): 115 | partial_func = functools.partial(func, *args, **kwargs) 116 | functools.update_wrapper(partial_func, func) 117 | return partial_func 118 | 119 | 120 | slim = tf.contrib.slim 121 | batch_norm = wrapped_partial(slim.batch_norm, scale=True, epsilon=1e-5, decay=0.9) 122 | 123 | # Conv and DepthSepConv namedtuple define layers of the MobileNet architecture 124 | # Conv defines 3x3 convolution layers 125 | # DepthSepConv defines 3x3 depthwise convolution followed by 1x1 convolution. 126 | # stride is the stride of the convolution 127 | # depth is the number of channels or filters in a layer 128 | Conv = namedtuple('Conv', ['kernel', 'stride', 'depth']) 129 | DepthSepConv = namedtuple('DepthSepConv', ['kernel', 'stride', 'depth']) 130 | 131 | 132 | def _fixed_padding(inputs, kernel_size, rate=1): 133 | """Pads the input along the spatial dimensions independently of input size. 134 | 135 | Pads the input such that if it was used in a convolution with 'VALID' padding, 136 | the output would have the same dimensions as if the unpadded input was used 137 | in a convolution with 'SAME' padding. 138 | 139 | Args: 140 | inputs: A tensor of size [batch, height_in, width_in, channels]. 141 | kernel_size: The kernel to be used in the conv2d or max_pool2d operation. 142 | rate: An integer, rate for atrous convolution. 143 | 144 | Returns: 145 | output: A tensor of size [batch, height_out, width_out, channels] with the 146 | input, either intact (if kernel_size == 1) or padded (if kernel_size > 1). 147 | """ 148 | kernel_size_effective = [kernel_size[0] + (kernel_size[0] - 1) * (rate - 1), 149 | kernel_size[0] + (kernel_size[0] - 1) * (rate - 1)] 150 | pad_total = [kernel_size_effective[0] - 1, kernel_size_effective[1] - 1] 151 | pad_beg = [pad_total[0] // 2, pad_total[1] // 2] 152 | pad_end = [pad_total[0] - pad_beg[0], pad_total[1] - pad_beg[1]] 153 | padded_inputs = tf.pad(inputs, [[0, 0], [pad_beg[0], pad_end[0]], 154 | [pad_beg[1], pad_end[1]], [0, 0]]) 155 | return padded_inputs 156 | 157 | 158 | def mobilenet_v1_base(inputs, 159 | final_endpoint='Conv2d_13_pointwise', 160 | min_depth=8, 161 | depth_multiplier=1.0, 162 | conv_defs=None, 163 | output_stride=None, 164 | use_explicit_padding=True, 165 | scope=None): 166 | """Mobilenet v1. 167 | 168 | Constructs a Mobilenet v1 network from inputs to the given final endpoint. 169 | 170 | Args: 171 | inputs: a tensor of shape [batch_size, height, width, channels]. 172 | final_endpoint: specifies the endpoint to construct the network up to. 
It 173 | can be one of ['Conv2d_0', 'Conv2d_1_pointwise', 'Conv2d_2_pointwise', 174 | 'Conv2d_3_pointwise', 'Conv2d_4_pointwise', 'Conv2d_5_pointwise', 175 | 'Conv2d_6_pointwise', 'Conv2d_7_pointwise', 'Conv2d_8_pointwise', 176 | 'Conv2d_9_pointwise', 'Conv2d_10_pointwise', 'Conv2d_11_pointwise', 177 | 'Conv2d_12_pointwise', 'Conv2d_13_pointwise']. 178 | min_depth: Minimum depth value (number of channels) for all convolution ops. 179 | Enforced when depth_multiplier < 1, and not an active constraint when 180 | depth_multiplier >= 1. 181 | depth_multiplier: Float multiplier for the depth (number of channels) 182 | for all convolution ops. The value must be greater than zero. Typical 183 | usage will be to set this value in (0, 1) to reduce the number of 184 | parameters or computation cost of the model. 185 | conv_defs: A list of ConvDef namedtuples specifying the net architecture. 186 | output_stride: An integer that specifies the requested ratio of input to 187 | output spatial resolution. If not None, then we invoke atrous convolution 188 | if necessary to prevent the network from reducing the spatial resolution 189 | of the activation maps. Allowed values are 8 (accurate fully convolutional 190 | mode), 16 (fast fully convolutional mode), 32 (classification mode). 191 | use_explicit_padding: Use 'VALID' padding for convolutions, but prepad 192 | inputs so that the output dimensions are the same as if 'SAME' padding 193 | were used. 194 | scope: Optional variable_scope. 195 | 196 | Returns: 197 | tensor_out: output tensor corresponding to the final_endpoint. 198 | end_points: a set of activations for external use, for example summaries or 199 | losses. 200 | 201 | Raises: 202 | ValueError: if final_endpoint is not set to one of the predefined values, 203 | or depth_multiplier <= 0, or the target output_stride is not 204 | allowed. 205 | """ 206 | depth = lambda d: max(int(d * depth_multiplier), min_depth) 207 | end_points = {} 208 | 209 | # Used to find thinned depths for each layer. 210 | if depth_multiplier <= 0: 211 | raise ValueError('depth_multiplier is not greater than zero.') 212 | 213 | if output_stride is not None and output_stride not in [8, 16, 32]: 214 | raise ValueError('Only allowed output_stride values are 8, 16, 32.') 215 | 216 | padding = 'SAME' 217 | if use_explicit_padding: 218 | padding = 'VALID' 219 | with tf.variable_scope(scope, 'MobilenetV1', [inputs]): 220 | with slim.arg_scope([slim.conv2d, slim.separable_conv2d], padding=padding): 221 | # The current_stride variable keeps track of the output stride of the 222 | # activations, i.e., the running product of convolution strides up to the 223 | # current network layer. This allows us to invoke atrous convolution 224 | # whenever applying the next convolution would result in the activations 225 | # having output stride larger than the target output_stride. 226 | current_stride = 1 227 | 228 | # The atrous convolution rate parameter. 229 | rate = 1 230 | 231 | net = inputs 232 | for i, conv_def in enumerate(conv_defs): 233 | end_point_base = 'Conv2d_%d' % i 234 | 235 | if output_stride is not None and current_stride == output_stride: 236 | # If we have reached the target output_stride, then we need to employ 237 | # atrous convolution with stride=1 and multiply the atrous rate by the 238 | # current unit's stride for use in subsequent layers. 
239 | layer_stride = 1 240 | layer_rate = rate 241 | rate *= conv_def.stride 242 | else: 243 | layer_stride = conv_def.stride 244 | layer_rate = 1 245 | current_stride *= conv_def.stride 246 | 247 | if isinstance(conv_def, Conv): 248 | end_point = end_point_base 249 | if use_explicit_padding: 250 | net = _fixed_padding(net, conv_def.kernel) 251 | net = slim.conv2d(net, depth(conv_def.depth), conv_def.kernel, 252 | stride=conv_def.stride, 253 | normalizer_fn=batch_norm, 254 | scope=end_point) 255 | end_points[end_point] = net 256 | if end_point == final_endpoint: 257 | return net, end_points 258 | 259 | elif isinstance(conv_def, DepthSepConv): 260 | end_point = end_point_base + '_depthwise' 261 | 262 | # By passing filters=None 263 | # separable_conv2d produces only a depthwise convolution layer 264 | if use_explicit_padding: 265 | net = _fixed_padding(net, conv_def.kernel, layer_rate) 266 | net = slim.separable_conv2d(net, None, conv_def.kernel, 267 | depth_multiplier=1, 268 | stride=layer_stride, 269 | rate=layer_rate, 270 | normalizer_fn=batch_norm, 271 | scope=end_point) 272 | 273 | end_points[end_point] = net 274 | if end_point == final_endpoint: 275 | return net, end_points 276 | 277 | end_point = end_point_base + '_pointwise' 278 | 279 | net = slim.conv2d(net, depth(conv_def.depth), [1, 1], 280 | stride=1, 281 | normalizer_fn=batch_norm, 282 | scope=end_point) 283 | 284 | end_points[end_point] = net 285 | if end_point == final_endpoint: 286 | return net, end_points 287 | else: 288 | raise ValueError('Unknown convolution type %s for layer %d' 289 | % (conv_def.ltype, i)) 290 | raise ValueError('Unknown final endpoint %s' % final_endpoint) 291 | 292 | 293 | def mobilenet_v1(inputs, 294 | num_classes=1000, 295 | dropout_keep_prob=0.999, 296 | is_training=True, 297 | min_depth=8, 298 | depth_multiplier=1.0, 299 | conv_defs=None, 300 | prediction_fn=tf.contrib.layers.softmax, 301 | spatial_squeeze=True, 302 | reuse=None, 303 | scope='MobilenetV1', 304 | global_pool=True): 305 | """Mobilenet v1 model for classification. 306 | 307 | Args: 308 | inputs: a tensor of shape [batch_size, height, width, channels]. 309 | num_classes: number of predicted classes. If 0 or None, the logits layer 310 | is omitted and the input features to the logits layer (before dropout) 311 | are returned instead. 312 | dropout_keep_prob: the percentage of activation values that are retained. 313 | is_training: whether or not the model is being trained. 314 | min_depth: Minimum depth value (number of channels) for all convolution ops. 315 | Enforced when depth_multiplier < 1, and not an active constraint when 316 | depth_multiplier >= 1. 317 | depth_multiplier: Float multiplier for the depth (number of channels) 318 | for all convolution ops. The value must be greater than zero. Typical 319 | usage will be to set this value in (0, 1) to reduce the number of 320 | parameters or computation cost of the model. 321 | conv_defs: A list of ConvDef namedtuples specifying the net architecture. 322 | prediction_fn: a function to get predictions out of logits. 323 | spatial_squeeze: if True, logits is of shape [B, C], if false logits is 324 | of shape [B, 1, 1, C], where B is batch_size and C is number of classes. 325 | reuse: whether or not the network and its variables should be reused. To be 326 | able to reuse, 'scope' must be given. 327 | scope: Optional variable_scope. 328 | global_pool: Optional boolean flag to control the avgpooling before the 329 | logits layer. 
If false or unset, pooling is done with a fixed window 330 | that reduces default-sized inputs to 1x1, while larger inputs lead to 331 | larger outputs. If true, any input size is pooled down to 1x1. 332 | 333 | Returns: 334 | net: a 2D Tensor with the logits (pre-softmax activations) if num_classes 335 | is a non-zero integer, or the non-dropped-out input to the logits layer 336 | if num_classes is 0 or None. 337 | end_points: a dictionary from components of the network to the corresponding 338 | activation. 339 | 340 | Raises: 341 | ValueError: Input rank is invalid. 342 | """ 343 | input_shape = inputs.get_shape().as_list() 344 | if len(input_shape) != 4: 345 | raise ValueError('Invalid input tensor rank, expected 4, was: %d' % 346 | len(input_shape)) 347 | 348 | with tf.variable_scope(scope, 'MobilenetV1', [inputs], reuse=reuse) as scope: 349 | with slim.arg_scope([batch_norm, slim.dropout], 350 | is_training=is_training): 351 | net, end_points = mobilenet_v1_base(inputs, scope=scope, 352 | min_depth=min_depth, 353 | depth_multiplier=depth_multiplier, 354 | conv_defs=conv_defs) 355 | with tf.variable_scope('Logits'): 356 | if global_pool: 357 | # Global average pooling. 358 | net = tf.reduce_mean(net, [1, 2], keep_dims=True, name='global_pool') 359 | end_points['global_pool'] = net 360 | else: 361 | # Pooling with a fixed kernel size. 362 | kernel_size = _reduced_kernel_size_for_small_input(net, [7, 7]) 363 | net = slim.avg_pool2d(net, kernel_size, padding='VALID', 364 | scope='AvgPool_1a') 365 | end_points['AvgPool_1a'] = net 366 | if not num_classes: 367 | return net, end_points 368 | # 1 x 1 x 1024 369 | net = slim.dropout(net, keep_prob=dropout_keep_prob, scope='Dropout_1b') 370 | logits = slim.conv2d(net, num_classes, [1, 1], activation_fn=None, 371 | normalizer_fn=None, scope='Conv2d_1c_1x1') 372 | if spatial_squeeze: 373 | logits = tf.squeeze(logits, [1, 2], name='SpatialSqueeze') 374 | end_points['Logits'] = logits 375 | # if prediction_fn: 376 | # end_points['Predictions'] = prediction_fn(logits, scope='Predictions') 377 | return logits, end_points 378 | 379 | 380 | mobilenet_v1.default_image_size = 224 381 | 382 | 383 | def wrapped_partial(func, *args, **kwargs): 384 | partial_func = functools.partial(func, *args, **kwargs) 385 | functools.update_wrapper(partial_func, func) 386 | return partial_func 387 | 388 | 389 | mobilenet_v1_075 = wrapped_partial(mobilenet_v1, depth_multiplier=0.75) 390 | mobilenet_v1_050 = wrapped_partial(mobilenet_v1, depth_multiplier=0.50) 391 | mobilenet_v1_025 = wrapped_partial(mobilenet_v1, depth_multiplier=0.25) 392 | 393 | 394 | def _reduced_kernel_size_for_small_input(input_tensor, kernel_size): 395 | """Define kernel size which is automatically reduced for small input. 396 | 397 | If the shape of the input images is unknown at graph construction time this 398 | function assumes that the input images are large enough. 399 | 400 | Args: 401 | input_tensor: input tensor of size [batch_size, height, width, channels]. 402 | kernel_size: desired kernel size of length 2: [kernel_height, kernel_width] 403 | 404 | Returns: 405 | a tensor with the kernel size. 
406 | """ 407 | shape = input_tensor.get_shape().as_list() 408 | if shape[1] is None or shape[2] is None: 409 | kernel_size_out = kernel_size 410 | else: 411 | kernel_size_out = [min(shape[1], kernel_size[0]), 412 | min(shape[2], kernel_size[1])] 413 | return kernel_size_out 414 | 415 | 416 | def mobilenet_v1_arg_scope(is_training=True, 417 | weight_decay=0.00004, 418 | stddev=0.09, 419 | regularize_depthwise=False, 420 | batch_norm_decay=0.9, 421 | batch_norm_epsilon=1e-5): 422 | """Defines the default MobilenetV1 arg scope. 423 | 424 | Args: 425 | is_training: Whether or not we're training the model. If this is set to 426 | None, the parameter is not added to the batch_norm arg_scope. 427 | weight_decay: The weight decay to use for regularizing the model. 428 | stddev: The standard deviation of the truncated normal weight initializer. 429 | regularize_depthwise: Whether or not to apply regularization on the depthwise weights. 430 | batch_norm_decay: Decay for batch norm moving average. 431 | batch_norm_epsilon: Small float added to variance to avoid dividing by zero 432 | in batch norm. 433 | 434 | Returns: 435 | An `arg_scope` to use for the mobilenet v1 model. 436 | """ 437 | batch_norm_params = { 438 | 'center': True, 439 | 'scale': True, 440 | 'decay': batch_norm_decay, 441 | 'epsilon': batch_norm_epsilon, 442 | } 443 | if is_training is not None: 444 | batch_norm_params['is_training'] = is_training 445 | 446 | # Set weight_decay for weights in Conv and DepthSepConv layers. 447 | weights_init = tf.truncated_normal_initializer(stddev=stddev) 448 | regularizer = tf.contrib.layers.l2_regularizer(weight_decay) 449 | if regularize_depthwise: 450 | depthwise_regularizer = regularizer 451 | else: 452 | depthwise_regularizer = None 453 | with slim.arg_scope([slim.conv2d, slim.separable_conv2d], 454 | weights_initializer=weights_init, 455 | activation_fn=tf.nn.relu, normalizer_fn=batch_norm): 456 | with slim.arg_scope([batch_norm], **batch_norm_params): 457 | with slim.arg_scope([slim.conv2d], weights_regularizer=regularizer): 458 | with slim.arg_scope([slim.separable_conv2d], 459 | weights_regularizer=depthwise_regularizer) as sc: 460 | return sc 461 | -------------------------------------------------------------------------------- /models/mobilenet_v2.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import math 4 | 5 | 6 | def conv_bn(inp, oup, stride): 7 | return nn.Sequential( 8 | nn.Conv2d(inp, oup, 3, stride, 1, bias=False), 9 | nn.BatchNorm2d(oup), 10 | nn.ReLU6(inplace=True) 11 | ) 12 | 13 | 14 | def conv_1x1_bn(inp, oup): 15 | return nn.Sequential( 16 | nn.Conv2d(inp, oup, 1, 1, 0, bias=False), 17 | nn.BatchNorm2d(oup), 18 | nn.ReLU6(inplace=True) 19 | ) 20 | 21 | 22 | class InvertedResidual(nn.Module): 23 | def __init__(self, inp, oup, stride, expand_ratio): 24 | super(InvertedResidual, self).__init__() 25 | self.stride = stride 26 | assert stride in [1, 2] 27 | 28 | hidden_dim = round(inp * expand_ratio) 29 | self.use_res_connect = self.stride == 1 and inp == oup 30 | 31 | if expand_ratio == 1: 32 | self.conv = nn.Sequential( 33 | # dw 34 | nn.Conv2d(hidden_dim, hidden_dim, 3, stride, 1, groups=hidden_dim, bias=False), 35 | nn.BatchNorm2d(hidden_dim), 36 | nn.ReLU6(inplace=True), 37 | # pw-linear 38 | nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False), 39 | nn.BatchNorm2d(oup), 40 | ) 41 | else: 42 | self.conv = nn.Sequential( 43 | # pw 44 | nn.Conv2d(inp, hidden_dim, 1, 1, 0, bias=False), 45 | 
nn.BatchNorm2d(hidden_dim), 46 | nn.ReLU6(inplace=True), 47 | # dw 48 | nn.Conv2d(hidden_dim, hidden_dim, 3, stride, 1, groups=hidden_dim, bias=False), 49 | nn.BatchNorm2d(hidden_dim), 50 | nn.ReLU6(inplace=True), 51 | # pw-linear 52 | nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False), 53 | nn.BatchNorm2d(oup), 54 | ) 55 | 56 | def forward(self, x): 57 | if self.use_res_connect: 58 | return x + self.conv(x) 59 | else: 60 | return self.conv(x) 61 | 62 | 63 | def parse_pruned_channels(pruned_channels): 64 | # [24, 24, 16, 72, 24, 104, 24] 65 | strides = [1, 2, 1, 2, 1, 1, 2, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1] 66 | input_channel = pruned_channels[1] 67 | last_channel = pruned_channels[-1] 68 | pruned_channels = pruned_channels[1:-1] 69 | interverted_residual_setting = [[1, pruned_channels[1], 1, strides[0]]] 70 | pruned_channels = pruned_channels[1:] 71 | s_idx = 1 72 | start = 0 73 | while start < len(pruned_channels) - 1: 74 | t = pruned_channels[start + 1] * 1. / pruned_channels[start] 75 | c = pruned_channels[start + 2] 76 | n = 1 77 | s = strides[s_idx] 78 | interverted_residual_setting.append([t, c, n, s]) 79 | s_idx += 1 80 | start += 2 81 | assert len(strides) == len(interverted_residual_setting) 82 | return input_channel, last_channel, interverted_residual_setting 83 | 84 | 85 | class MobileNetV2(nn.Module): 86 | def __init__(self, n_class=1000, input_size=224, width_mult=1., profile='normal'): 87 | super(MobileNetV2, self).__init__() 88 | if profile == 'normal': 89 | block = InvertedResidual 90 | input_channel = 32 91 | last_channel = 1280 92 | interverted_residual_setting = [ 93 | # t, c, n, s 94 | [1, 16, 1, 1], 95 | [6, 24, 2, 2], 96 | [6, 32, 3, 2], 97 | [6, 64, 4, 2], 98 | [6, 96, 3, 1], 99 | [6, 160, 3, 2], 100 | [6, 320, 1, 1], 101 | ] 102 | elif profile == '0.7flops': 103 | block = InvertedResidual 104 | pruned_channels = [3, 24, 16, 72, 24, 104, 24, 104, 32, 136, 32, 136, 32, 136, 64, 272, 64, 272, 64, 272, 105 | 64, 272, 96, 408, 96, 400, 96, 400, 160, 712, 160, 720, 160, 712, 248, 848] 106 | input_channel, last_channel, interverted_residual_setting = parse_pruned_channels(pruned_channels) 107 | else: 108 | raise NotImplementedError 109 | 110 | # building first layer 111 | assert input_size % 32 == 0 112 | input_channel = int(input_channel * width_mult) 113 | self.last_channel = int(last_channel * width_mult) if width_mult > 1.0 else last_channel 114 | self.features = [conv_bn(3, input_channel, 2)] 115 | # building inverted residual blocks 116 | for t, c, n, s in interverted_residual_setting: 117 | output_channel = int(c * width_mult) 118 | for i in range(n): 119 | if i == 0: 120 | self.features.append(block(input_channel, output_channel, s, expand_ratio=t)) 121 | else: 122 | self.features.append(block(input_channel, output_channel, 1, expand_ratio=t)) 123 | input_channel = output_channel 124 | # building last several layers 125 | self.features.append(conv_1x1_bn(input_channel, self.last_channel)) 126 | # make it nn.Sequential 127 | self.features = nn.Sequential(*self.features) 128 | 129 | # building classifier 130 | self.classifier = nn.Sequential( 131 | nn.Dropout(0.2), 132 | nn.Linear(self.last_channel, n_class), 133 | ) 134 | 135 | self._initialize_weights() 136 | 137 | def forward(self, x): 138 | x = self.features(x) 139 | x = x.mean(3).mean(2) 140 | x = self.classifier(x) 141 | return x 142 | 143 | def _initialize_weights(self): 144 | for m in self.modules(): 145 | if isinstance(m, nn.Conv2d): 146 | n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels 147 | 
m.weight.data.normal_(0, math.sqrt(2. / n)) 148 | if m.bias is not None: 149 | m.bias.data.zero_() 150 | elif isinstance(m, nn.BatchNorm2d): 151 | m.weight.data.fill_(1) 152 | m.bias.data.zero_() 153 | elif isinstance(m, nn.Linear): 154 | n = m.weight.size(1) 155 | m.weight.data.normal_(0, 0.01) 156 | m.bias.data.zero_() 157 | 158 | -------------------------------------------------------------------------------- /utils.py: -------------------------------------------------------------------------------- 1 | ''' Code partly from https://github.com/ShichenLiu/CondenseNet/blob/master/utils.py ''' 2 | 3 | import torch 4 | from torch.autograd import Variable 5 | import os 6 | import sys 7 | import time 8 | 9 | def get_num_gen(gen): 10 | return sum(1 for x in gen) 11 | 12 | 13 | def is_leaf(model): 14 | return get_num_gen(model.children()) == 0 15 | 16 | 17 | def get_layer_info(layer): 18 | layer_str = str(layer) 19 | type_name = layer_str[:layer_str.find('(')].strip() 20 | return type_name 21 | 22 | 23 | def get_layer_param(model): 24 | import operator 25 | import functools 26 | 27 | return sum([functools.reduce(operator.mul, i.size(), 1) for i in model.parameters()]) 28 | 29 | 30 | def measure_layer(layer, x): 31 | global count_ops, count_params 32 | delta_ops = 0 33 | delta_params = 0 34 | multi_add = 1 35 | type_name = get_layer_info(layer) 36 | 37 | # ops_conv 38 | if type_name in ['Conv2d']: 39 | out_h = int((x.size()[2] + 2 * layer.padding[0] - layer.kernel_size[0]) / 40 | layer.stride[0] + 1) 41 | out_w = int((x.size()[3] + 2 * layer.padding[1] - layer.kernel_size[1]) / 42 | layer.stride[1] + 1) 43 | delta_ops = layer.in_channels * layer.out_channels * layer.kernel_size[0] * \ 44 | layer.kernel_size[1] * out_h * out_w / layer.groups * multi_add 45 | delta_params = get_layer_param(layer) 46 | 47 | # ops_nonlinearity 48 | elif type_name in ['ReLU']: 49 | delta_ops = x.numel() / x.size(0) 50 | delta_params = get_layer_param(layer) 51 | 52 | # ops_pooling 53 | elif type_name in ['AvgPool2d']: 54 | in_w = x.size()[2] 55 | kernel_ops = layer.kernel_size * layer.kernel_size 56 | out_w = int((in_w + 2 * layer.padding - layer.kernel_size) / layer.stride + 1) 57 | out_h = int((in_w + 2 * layer.padding - layer.kernel_size) / layer.stride + 1) 58 | delta_ops = x.size()[1] * out_w * out_h * kernel_ops 59 | delta_params = get_layer_param(layer) 60 | 61 | elif type_name in ['AdaptiveAvgPool2d']: 62 | delta_ops = x.size()[1] * x.size()[2] * x.size()[3] 63 | delta_params = get_layer_param(layer) 64 | 65 | # ops_linear 66 | elif type_name in ['Linear']: 67 | weight_ops = layer.weight.numel() * multi_add 68 | bias_ops = layer.bias.numel() 69 | delta_ops = weight_ops + bias_ops 70 | delta_params = get_layer_param(layer) 71 | 72 | # ops_nothing 73 | elif type_name in ['BatchNorm2d', 'Dropout2d', 'DropChannel', 'Dropout']: 74 | delta_params = get_layer_param(layer) 75 | 76 | # unknown layer type 77 | else: 78 | delta_params = get_layer_param(layer) 79 | 80 | count_ops += delta_ops 81 | count_params += delta_params 82 | 83 | return 84 | 85 | 86 | def measure_model(model, H, W): 87 | global count_ops, count_params 88 | count_ops = 0 89 | count_params = 0 90 | data = Variable(torch.zeros(1, 3, H, W)) 91 | 92 | def should_measure(x): 93 | return is_leaf(x) 94 | 95 | def modify_forward(model): 96 | for child in model.children(): 97 | if should_measure(child): 98 | def new_forward(m): 99 | def lambda_forward(x): 100 | measure_layer(m, x) 101 | return m.old_forward(x) 102 | return lambda_forward 103 | 
child.old_forward = child.forward 104 | child.forward = new_forward(child) 105 | else: 106 | modify_forward(child) 107 | 108 | def restore_forward(model): 109 | for child in model.children(): 110 | # leaf node 111 | if is_leaf(child) and hasattr(child, 'old_forward'): 112 | child.forward = child.old_forward 113 | child.old_forward = None 114 | else: 115 | restore_forward(child) 116 | 117 | modify_forward(model) 118 | model.forward(data) 119 | restore_forward(model) 120 | 121 | return count_ops, count_params 122 | 123 | 124 | class AverageMeter(object): 125 | """Computes and stores the average and current value""" 126 | def __init__(self): 127 | self.reset() 128 | 129 | def reset(self): 130 | self.val = 0 131 | self.avg = 0 132 | self.sum = 0 133 | self.count = 0 134 | 135 | def update(self, val, n=1): 136 | self.val = val 137 | self.sum += val * n 138 | self.count += n 139 | if self.count > 0: 140 | self.avg = self.sum / self.count 141 | 142 | def accumulate(self, val, n=1): 143 | self.sum += val 144 | self.count += n 145 | if self.count > 0: 146 | self.avg = self.sum / self.count 147 | 148 | 149 | def accuracy(output, target, topk=(1,)): 150 | """Computes the precision@k for the specified values of k""" 151 | batch_size = target.size(0) 152 | num = output.size(1) 153 | target_topk = [] 154 | appendices = [] 155 | for k in topk: 156 | if k <= num: 157 | target_topk.append(k) 158 | else: 159 | appendices.append([0.0]) 160 | topk = target_topk 161 | maxk = max(topk) 162 | _, pred = output.topk(maxk, 1, True, True) 163 | pred = pred.t() 164 | correct = pred.eq(target.view(1, -1).expand_as(pred)) 165 | 166 | res = [] 167 | for k in topk: 168 | correct_k = correct[:k].view(-1).float().sum(0) 169 | res.append(correct_k.mul_(100.0 / batch_size)) 170 | return res + appendices 171 | 172 | 173 | def process_state_dict(state_dict): 174 | # process state dict so that it can be loaded by normal models 175 | for k in list(state_dict.keys()): 176 | state_dict[k.replace('module.', '')] = state_dict.pop(k) 177 | return state_dict 178 | 179 | 180 | # Custom progress bar 181 | _, term_width = os.popen('stty size', 'r').read().split() 182 | term_width = int(term_width) 183 | TOTAL_BAR_LENGTH = 40. 184 | last_time = time.time() 185 | begin_time = last_time 186 | 187 | 188 | def format_time(seconds): 189 | days = int(seconds / 3600/24) 190 | seconds = seconds - days*3600*24 191 | hours = int(seconds / 3600) 192 | seconds = seconds - hours*3600 193 | minutes = int(seconds / 60) 194 | seconds = seconds - minutes*60 195 | secondsf = int(seconds) 196 | seconds = seconds - secondsf 197 | millis = int(seconds*1000) 198 | 199 | f = '' 200 | i = 1 201 | if days > 0: 202 | f += str(days) + 'D' 203 | i += 1 204 | if hours > 0 and i <= 2: 205 | f += str(hours) + 'h' 206 | i += 1 207 | if minutes > 0 and i <= 2: 208 | f += str(minutes) + 'm' 209 | i += 1 210 | if secondsf > 0 and i <= 2: 211 | f += str(secondsf) + 's' 212 | i += 1 213 | if millis > 0 and i <= 2: 214 | f += str(millis) + 'ms' 215 | i += 1 216 | if f == '': 217 | f = '0ms' 218 | return f 219 | 220 | 221 | def progress_bar(current, total, msg=None): 222 | global last_time, begin_time 223 | if current == 0: 224 | begin_time = time.time() # Reset for new bar. 
225 | 226 | cur_len = int(TOTAL_BAR_LENGTH*current/total) 227 | rest_len = int(TOTAL_BAR_LENGTH - cur_len) - 1 228 | 229 | sys.stdout.write(' [') 230 | for i in range(cur_len): 231 | sys.stdout.write('=') 232 | sys.stdout.write('>') 233 | for i in range(rest_len): 234 | sys.stdout.write('.') 235 | sys.stdout.write(']') 236 | 237 | cur_time = time.time() 238 | step_time = cur_time - last_time 239 | last_time = cur_time 240 | tot_time = cur_time - begin_time 241 | 242 | L = [] 243 | L.append(' Step: %s' % format_time(step_time)) 244 | L.append(' | Tot: %s' % format_time(tot_time)) 245 | if msg: 246 | L.append(' | ' + msg) 247 | 248 | msg = ''.join(L) 249 | sys.stdout.write(msg) 250 | for i in range(term_width-int(TOTAL_BAR_LENGTH)-len(msg)-3): 251 | sys.stdout.write(' ') 252 | 253 | # Go back to the center of the bar. 254 | for i in range(term_width-int(TOTAL_BAR_LENGTH/2)+2): 255 | sys.stdout.write('\b') 256 | sys.stdout.write(' %d/%d ' % (current+1, total)) 257 | 258 | if current < total-1: 259 | sys.stdout.write('\r') 260 | else: 261 | sys.stdout.write('\n') 262 | sys.stdout.flush() 263 | --------------------------------------------------------------------------------
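A short usage sketch tying the pieces above together: it builds the AMC 50%-FLOPs MobileNetV1 profile from `models/mobilenet_v1.py`, counts MACs and parameters with `measure_model` from `utils.py`, and loads a checkpoint cleaned by `process_state_dict`. The checkpoint filename below is a placeholder (use whatever file you placed under `./checkpoints/torch`), and the sketch assumes that file stores a plain PyTorch state_dict.

```
import torch

from models.mobilenet_v1 import MobileNet
from utils import measure_model, process_state_dict

# Build the AMC 50%-FLOPs profile defined in models/mobilenet_v1.py.
model = MobileNet(n_class=1000, profile='0.5flops')

# measure_model temporarily patches each leaf module's forward() to accumulate
# per-layer op/parameter counts on a dummy 1x3x224x224 input, then restores it.
n_ops, n_params = measure_model(model, 224, 224)
print('approx. MACs: %.1fM, params: %.2fM' % (n_ops / 1e6, n_params / 1e6))

# Placeholder checkpoint path; process_state_dict strips 'module.' prefixes
# left over from nn.DataParallel training so the plain model can load it.
state_dict = torch.load('checkpoints/torch/mobilenet_0.5flops.pth', map_location='cpu')
model.load_state_dict(process_state_dict(state_dict))
model.eval()
```

The same pattern applies to the other profiles ('0.5time' for MobileNetV1, and '0.7flops' for MobileNetV2 via `models/mobilenet_v2.py`); only the constructor and the checkpoint file change.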