├── .gitignore
├── dataset
│   ├── README.md
│   └── dataset_download.sh
├── README.md
├── arm_edge
│   ├── docker_build_tf_serving_nano.sh
│   ├── models
│   │   ├── README.md
│   │   ├── models.config
│   │   └── model_download.sh
│   ├── docker_build_tf_serving_xavier_tx.sh
│   ├── docker_run_tf_serving.sh
│   ├── Dockerfile_tf_serving_nano
│   ├── Dockerfile_tf_serving_xavier_tx
│   └── README.md
├── bench
│   ├── README.md
│   ├── run_rest_bench.py
│   ├── run_grpc_bench.py
│   ├── bert_imdb
│   │   ├── preprocessing.py
│   │   └── grpc_bench.py
│   ├── distilbert_sst2
│   │   ├── preprocessing.py
│   │   └── grpc_bench.py
│   ├── module
│   │   ├── module_rest.py
│   │   ├── module_gspread.py
│   │   ├── module_grpc.py
│   │   └── put_data_into_sheet.py
│   ├── variables.py
│   ├── mobilenet_v1
│   │   ├── preprocessing.py
│   │   ├── rest_bench.py
│   │   └── grpc_bench.py
│   ├── mobilenet_v2
│   │   ├── preprocessing.py
│   │   ├── rest_bench.py
│   │   └── grpc_bench.py
│   ├── yolo_v5
│   │   ├── preprocessing.py
│   │   ├── rest_bench.py
│   │   └── grpc_bench.py
│   └── inception_v3
│       ├── preprocessing.py
│       ├── rest_bench.py
│       └── grpc_bench.py
├── model
│   ├── README.md
│   └── model_download.sh
├── x86
│   ├── inception_v3
│   │   ├── model_download.sh
│   │   ├── Dockerfile-REST
│   │   └── Dockerfile-gRPC
│   ├── mobilenet_v1
│   │   ├── model_download.sh
│   │   ├── Dockerfile-REST
│   │   └── Dockerfile-gRPC
│   ├── mobilenet_v2
│   │   ├── model_download.sh
│   │   ├── Dockerfile-REST
│   │   └── Dockerfile-gRPC
│   ├── yolo_v5
│   │   ├── model_download.sh
│   │   ├── Dockerfile-REST
│   │   └── Dockerfile-gRPC
│   └── README.md
└── requirements.txt
/.gitignore:
--------------------------------------------------------------------------------
1 | /bench/module/sheet_credential.json
2 | /bench/variables.py
--------------------------------------------------------------------------------
/dataset/README.md:
--------------------------------------------------------------------------------
1 | 
2 | ### Dataset download
3 | 
4 | ```
5 | chmod +x dataset_download.sh
6 | ./dataset_download.sh
7 | ```
8 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | ```
2 | pip3 install -r requirements.txt
3 | ```
4 | 
5 | - Before running a bench, set the basic configuration (server address, etc.) in /bench/variables.py.
--------------------------------------------------------------------------------
/arm_edge/docker_build_tf_serving_nano.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | 
3 | docker build -t edge-tf-serving -f Dockerfile_tf_serving_nano .
--------------------------------------------------------------------------------
/arm_edge/models/README.md:
--------------------------------------------------------------------------------
1 | ### Model download
2 | 
3 | ```shell
4 | chmod +x model_download.sh && ./model_download.sh
5 | ```
--------------------------------------------------------------------------------
/bench/README.md:
--------------------------------------------------------------------------------
1 | - To use module_gspread, the module directory must contain a "sheet_credential.json" file holding your Google Spreadsheet credentials.
--------------------------------------------------------------------------------
/arm_edge/docker_build_tf_serving_xavier_tx.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | 
3 | docker build -t edge-tf-serving -f Dockerfile_tf_serving_xavier_tx .
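4 | 
5 | # Usage sketch (mirrors arm_edge/README.md): make the scripts executable, then build and run:
6 | #   chmod +x docker_build_tf_serving_xavier_tx.sh docker_run_tf_serving.sh
7 | #   ./docker_build_tf_serving_xavier_tx.sh && ./docker_run_tf_serving.sh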
--------------------------------------------------------------------------------
/model/README.md:
--------------------------------------------------------------------------------
1 | ### Dataset download (run dataset_download.sh from the /dataset directory)
2 | 
3 | ```
4 | chmod +x dataset_download.sh
5 | ./dataset_download.sh
6 | ```
7 | ### Model download
8 | 
9 | ```
10 | chmod +x model_download.sh
11 | ./model_download.sh
12 | ```
13 | 
--------------------------------------------------------------------------------
/x86/inception_v3/model_download.sh:
--------------------------------------------------------------------------------
1 | ### inception_v3 model download
2 | curl -O https://edge-inference.s3.us-west-2.amazonaws.com/CNN/model/inception_v3/inception_v3.zip
3 | unzip -q inception_v3.zip && rm inception_v3.zip
4 | 
--------------------------------------------------------------------------------
/x86/mobilenet_v1/model_download.sh:
--------------------------------------------------------------------------------
1 | ### mobilenet_v1 model download
2 | curl -O https://edge-inference.s3.us-west-2.amazonaws.com/CNN/model/mobilenet_v1/mobilenet_v1.zip
3 | unzip -q mobilenet_v1.zip && rm mobilenet_v1.zip
4 | 
--------------------------------------------------------------------------------
/x86/mobilenet_v2/model_download.sh:
--------------------------------------------------------------------------------
1 | ### mobilenet_v2 model download
2 | curl -O https://edge-inference.s3.us-west-2.amazonaws.com/CNN/model/mobilenet_v2/mobilenet_v2.zip
3 | unzip -q mobilenet_v2.zip && rm mobilenet_v2.zip
4 | 
--------------------------------------------------------------------------------
/x86/yolo_v5/model_download.sh:
--------------------------------------------------------------------------------
1 | ### yolo_v5 model download
2 | curl -O https://edge-inference.s3.us-west-2.amazonaws.com/CNN/model/yolo_v5/yolo_v5.zip
3 | unzip -q yolo_v5.zip && rm yolo_v5.zip
4 | mv yolov5/yolov5s_saved_model yolo_v5
--------------------------------------------------------------------------------
/x86/yolo_v5/Dockerfile-REST:
--------------------------------------------------------------------------------
1 | FROM tensorflow/serving
2 | 
3 | COPY yolo_v5 /models/yolo_v5/1/
4 | 
5 | EXPOSE 8501
6 | 
7 | CMD ["--rest_api_port=8501","--model_name=yolo_v5","--model_base_path=/models/yolo_v5","--rest_api_num_threads=1000"]
8 | 
9 | 
--------------------------------------------------------------------------------
/arm_edge/docker_run_tf_serving.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | 
3 | this_pwd=$(pwd)
4 | 
5 | docker run --rm \
6 |     --gpus all \
7 |     -p 8500:8500 \
8 |     -p 8501:8501 \
9 |     -v "$this_pwd"/models/:/models/ \
10 |     edge-tf-serving:latest
--------------------------------------------------------------------------------
/x86/inception_v3/Dockerfile-REST:
--------------------------------------------------------------------------------
1 | FROM tensorflow/serving
2 | 
3 | COPY inception_v3 /models/inception_v3/1/
4 | 
5 | EXPOSE 8501
6 | 
7 | CMD ["--rest_api_port=8501","--model_name=inception_v3","--model_base_path=/models/inception_v3","--rest_api_num_threads=1000"]
8 | 
9 | 
--------------------------------------------------------------------------------
/x86/mobilenet_v1/Dockerfile-REST:
--------------------------------------------------------------------------------
1 | FROM tensorflow/serving
2 | 
3 | COPY mobilenet_v1 /models/mobilenet_v1/1/
4 | 
5 | EXPOSE 8501
6 | 
7 | CMD ["--rest_api_port=8501","--model_name=mobilenet_v1","--model_base_path=/models/mobilenet_v1","--rest_api_num_threads=1000"]
8 | 
9 | 
--------------------------------------------------------------------------------
/x86/mobilenet_v2/Dockerfile-REST:
--------------------------------------------------------------------------------
1 | FROM tensorflow/serving
2 | 
3 | COPY mobilenet_v2 /models/mobilenet_v2/1/
4 | 
5 | EXPOSE 8501
6 | 
7 | CMD ["--rest_api_port=8501","--model_name=mobilenet_v2","--model_base_path=/models/mobilenet_v2","--rest_api_num_threads=1000"]
8 | 
9 | 
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | # for gRPC requests
2 | tensorflow-serving-api>=2.11.0
3 | 
4 | # for image preprocessing
5 | # 1) mobilenet / inception
6 | numpy>=1.24.2
7 | Pillow>=9.4.0
8 | # 2) yolo
9 | opencv-python>=4.7.0.72
10 | 
11 | # for Google Spreadsheet access
12 | gspread>=5.7.2
13 | oauth2client>=4.1.3
--------------------------------------------------------------------------------
/bench/run_rest_bench.py:
--------------------------------------------------------------------------------
1 | import variables
2 | from module import put_data_into_sheet
3 | import importlib
4 | rest_bench = importlib.import_module(f"{variables.model_name}.rest_bench")
5 | 
6 | result = rest_bench.run_bench(variables.num_tasks, variables.rest_server_address)
7 | 
8 | put_data_into_sheet.put_data(variables.rest_spreadsheet_id, result, variables.num_tasks)
--------------------------------------------------------------------------------
/x86/yolo_v5/Dockerfile-gRPC:
--------------------------------------------------------------------------------
1 | FROM tensorflow/serving
2 | 
3 | COPY yolo_v5 /models/yolo_v5/1/
4 | 
5 | EXPOSE 8500
6 | 
7 | # grpc_channel_arguments takes a comma-separated list; values must be literal integers (52428800 = 50 MiB)
8 | CMD ["--port=8500","--model_name=yolo_v5","--model_base_path=/models/yolo_v5", "--grpc_channel_arguments=grpc.max_send_message_length=52428800,grpc.max_receive_message_length=52428800", "--grpc_max_threads=1000"]
9 | 
10 | 
--------------------------------------------------------------------------------
/bench/run_grpc_bench.py:
--------------------------------------------------------------------------------
1 | import variables
2 | from module import put_data_into_sheet
3 | import importlib
4 | grpc_bench = importlib.import_module(f"{variables.model_name}.grpc_bench")
5 | 
6 | result = grpc_bench.run_bench(variables.num_tasks, variables.grpc_server_address, variables.use_https)
7 | 
8 | put_data_into_sheet.put_data(variables.grpc_spreadsheet_id, result, variables.num_tasks)
--------------------------------------------------------------------------------
/x86/inception_v3/Dockerfile-gRPC:
--------------------------------------------------------------------------------
1 | FROM tensorflow/serving
2 | 
3 | COPY inception_v3 /models/inception_v3/1/
4 | 
5 | EXPOSE 8500
6 | 
7 | CMD ["--port=8500","--model_name=inception_v3","--model_base_path=/models/inception_v3", "--grpc_channel_arguments=grpc.max_send_message_length=52428800,grpc.max_receive_message_length=52428800", "--grpc_max_threads=1000"]
8 | 
9 | 
--------------------------------------------------------------------------------
/x86/mobilenet_v1/Dockerfile-gRPC:
--------------------------------------------------------------------------------
1 | FROM tensorflow/serving
2 | 
3 | COPY mobilenet_v1 /models/mobilenet_v1/1/
4 | 
5 | EXPOSE 8500
6 | 
7 | CMD ["--port=8500","--model_name=mobilenet_v1","--model_base_path=/models/mobilenet_v1", "--grpc_channel_arguments=grpc.max_send_message_length=52428800,grpc.max_receive_message_length=52428800", "--grpc_max_threads=1000"]
8 | 
9 | 
--------------------------------------------------------------------------------
/x86/mobilenet_v2/Dockerfile-gRPC:
--------------------------------------------------------------------------------
1 | FROM tensorflow/serving
2 | 
3 | COPY mobilenet_v2 /models/mobilenet_v2/1/
4 | 
5 | EXPOSE 8500
6 | 
7 | CMD ["--port=8500","--model_name=mobilenet_v2","--model_base_path=/models/mobilenet_v2", "--grpc_channel_arguments=grpc.max_send_message_length=52428800,grpc.max_receive_message_length=52428800", "--grpc_max_threads=1000"]
8 | 
9 | 
--------------------------------------------------------------------------------
/bench/bert_imdb/preprocessing.py:
--------------------------------------------------------------------------------
1 | # text preprocessing library
2 | import tensorflow as tf
3 | 
4 | def run_preprocessing(text):
5 |     tokenizer = tf.keras.preprocessing.text.Tokenizer()
6 |     tokenizer.fit_on_texts([text])
7 |     input_ids = tokenizer.texts_to_sequences([text])
8 |     input_mask = [[1] * len(input_ids[0])]
9 |     segment_ids = [[0] * len(input_ids[0])]
10 | 
11 |     return input_ids, input_mask, segment_ids
--------------------------------------------------------------------------------
/bench/distilbert_sst2/preprocessing.py:
--------------------------------------------------------------------------------
1 | # text preprocessing library
2 | import tensorflow as tf
3 | 
4 | def run_preprocessing(text):
5 |     tokenizer = tf.keras.preprocessing.text.Tokenizer()
6 |     tokenizer.fit_on_texts([text])
7 |     input_ids = tokenizer.texts_to_sequences([text])
8 |     input_mask = [[1] * len(input_ids[0])]
9 |     segment_ids = [[0] * len(input_ids[0])]
10 | 
11 |     return input_ids, input_mask, segment_ids
--------------------------------------------------------------------------------
/arm_edge/Dockerfile_tf_serving_nano:
--------------------------------------------------------------------------------
1 | FROM helmuthva/jetson-nano-tensorflow-serving-base
2 | 
3 | EXPOSE 8500
4 | EXPOSE 8501
5 | 
6 | CMD ["tensorflow_model_server", "--port=8500", "--rest_api_port=8501", "--model_config_file=/models/models.config", "--model_config_file_poll_wait_seconds=60", "--rest_api_num_threads=1000", "--grpc_channel_arguments=grpc.max_send_message_length=104857600,grpc.max_receive_message_length=104857600", "--grpc_max_threads=1000"]
--------------------------------------------------------------------------------
/arm_edge/Dockerfile_tf_serving_xavier_tx:
--------------------------------------------------------------------------------
1 | FROM helmuthva/jetson-xavier-tensorflow-serving-base
2 | 
3 | EXPOSE 8500
4 | EXPOSE 8501
5 | 
6 | CMD ["tensorflow_model_server", "--port=8500", "--rest_api_port=8501", "--model_config_file=/models/models.config", "--model_config_file_poll_wait_seconds=60", "--rest_api_num_threads=1000", "--grpc_channel_arguments=grpc.max_send_message_length=104857600,grpc.max_receive_message_length=104857600", "--grpc_max_threads=1000"]
--------------------------------------------------------------------------------
/bench/module/module_rest.py:
--------------------------------------------------------------------------------
1 | import requests
2 | import time
3 | import json
4 | 
5 | def predict(server_address, model_name, data):
6 |     headers = {"content-type": "application/json"}
7 |     url = server_address + "v1/models/" + model_name + ":predict"
8 |     request_time = time.time()
9 |     response = requests.post(url, data=data, headers=headers)
10 |     response_time = time.time()
11 |     elapsed_time = response_time - request_time
12 |     result = json.loads(response.text)
13 |     return result, elapsed_time
14 | 
--------------------------------------------------------------------------------
/bench/variables.py:
--------------------------------------------------------------------------------
1 | # Model server address, port, and model name settings
2 | 
3 | # Write only the DNS name or IP address (default port 443)
4 | # EX) serving-host
5 | grpc_server_address = ''
6 | # Set to 1 to use https for gRPC, 0 to use http
7 | use_https = 1
8 | # Write as a URI including the protocol (keep the trailing slash)
9 | # EX) https://serving-host/
10 | rest_server_address = ''
11 | # Google spreadsheet id where gRPC bench results are stored
12 | grpc_spreadsheet_id = ''
13 | # Google spreadsheet id where REST bench results are stored
14 | rest_spreadsheet_id = ''
15 | # Name of the model to run inference against
16 | model_name = ''
17 | # Number of inference requests (processed in parallel at once)
18 | num_tasks = 10
--------------------------------------------------------------------------------
/bench/mobilenet_v1/preprocessing.py:
--------------------------------------------------------------------------------
1 | # image preprocessing libraries
2 | import tensorflow as tf
3 | import numpy as np
4 | from PIL import Image
5 | import os
6 | 
7 | def get_file_path(filename):
8 |     return os.path.join(os.path.dirname(__file__), filename)
9 | 
10 | # Load and preprocess the image (for mobilenet)
11 | def run_preprocessing(image_file_path):
12 |     img = Image.open(get_file_path(image_file_path))
13 |     img = img.resize((224, 224))
14 |     img_array = np.array(img)
15 |     img_array = img_array.astype('float32') / 255.0
16 |     img_array = np.expand_dims(img_array, axis=0)
17 |     return img_array
--------------------------------------------------------------------------------
/bench/mobilenet_v2/preprocessing.py:
--------------------------------------------------------------------------------
1 | # image preprocessing libraries
2 | import tensorflow as tf
3 | import numpy as np
4 | from PIL import Image
5 | import os
6 | 
7 | def get_file_path(filename):
8 |     return os.path.join(os.path.dirname(__file__), filename)
9 | 
10 | # Load and preprocess the image (for mobilenet)
11 | def run_preprocessing(image_file_path):
12 |     img = Image.open(get_file_path(image_file_path))
13 |     img = img.resize((224, 224))
14 |     img_array = np.array(img)
15 |     img_array = img_array.astype('float32') / 255.0
16 |     img_array = np.expand_dims(img_array, axis=0)
17 |     return img_array
--------------------------------------------------------------------------------
/bench/yolo_v5/preprocessing.py:
--------------------------------------------------------------------------------
1 | # image preprocessing libraries
2 | import tensorflow as tf
3 | import numpy as np
4 | from PIL import Image
5 | import os
6 | import cv2
7 | 
8 | def get_file_path(filename):
9 |     return os.path.join(os.path.dirname(__file__), filename)
10 | 
11 | # Load and preprocess the image (for yolo)
12 | def run_preprocessing(image_file_path):
13 |     img = cv2.imread(get_file_path(image_file_path))
14 |     img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
15 |     img = cv2.resize(img, (640, 640))
16 |     img = img.astype('float32') / 255.0
17 |     img = np.expand_dims(img, axis=0)
18 | 
19 |     return img
20 | 
--------------------------------------------------------------------------------
/bench/module/module_gspread.py:
--------------------------------------------------------------------------------
1 | import os
2 | import gspread
3 | from oauth2client.service_account import ServiceAccountCredentials
4 | 
5 | def get_file_path(filename):
6 |     return os.path.join(os.path.dirname(__file__), filename)
7 | 
8 | def open_sheet(spreadsheet_key):
9 |     scope = [
10 |         'https://spreadsheets.google.com/feeds'
11 |     ]
12 |     cred_json_file_name = get_file_path('sheet_credential.json')
13 |     credentials = ServiceAccountCredentials.from_json_keyfile_name(cred_json_file_name, scope)
14 |     gc = gspread.authorize(credentials)
15 |     doc = gc.open_by_key(spreadsheet_key)
16 |     return doc
--------------------------------------------------------------------------------
/bench/inception_v3/preprocessing.py:
--------------------------------------------------------------------------------
1 | # image preprocessing libraries
2 | import tensorflow as tf
3 | import numpy as np
4 | from PIL import Image
5 | import os
6 | 
7 | def get_file_path(filename):
8 |     return os.path.join(os.path.dirname(__file__), filename)
9 | 
10 | # Load and preprocess the image (for inception)
11 | def run_preprocessing(image_file_path):
12 |     img = Image.open(get_file_path(image_file_path))
13 |     img = img.resize((299, 299))
14 |     img_array = np.array(img)
15 |     img_array = (img_array - np.mean(img_array)) / np.std(img_array)
16 |     img_array = img_array.astype(np.float32)
17 |     img_array = np.expand_dims(img_array, axis=0)
18 |     return img_array
--------------------------------------------------------------------------------
/x86/README.md:
--------------------------------------------------------------------------------
1 | ### Tests inference performance using tfserving on fully-managed cloud services (AWS App Runner, GCP Cloud Run).
2 | 
3 | - Before building each model's Docker image, download the model on the build host by running the model_download.sh located in the same folder.
4 | ```shell
5 | chmod +x ./model_download.sh && ./model_download.sh
6 | ```
7 | 
8 | - Docker build & run (example)
9 | ```shell
10 | tag="tfserving-cloud"
11 | version="latest"
12 | API="gRPC" # gRPC or REST
13 | HOST_PORT=8080
14 | CONTAINER_PORT=8500 # gRPC = 8500, REST = 8501
15 | docker build -t $tag:$version -f Dockerfile-$API .
16 | docker run -it -p $HOST_PORT:$CONTAINER_PORT $tag:$version
17 | ```
--------------------------------------------------------------------------------
/arm_edge/README.md:
--------------------------------------------------------------------------------
1 | Tests inference performance using tfserving on edge devices (NVIDIA Jetson, Raspberry Pi with Coral TPU).
2 | 
3 | ## Model download
4 | - Models can be downloaded by following models/README.md.
5 | 
6 | ## Dockerfiles available for each device
7 | - Separate Dockerfiles are provided for xavier/tx devices and for nano devices (Dockerfile_tf_serving_xavier_tx, Dockerfile_tf_serving_nano).
8 | 
9 | 
10 | ## Building and running the Dockerfiles
11 | - Build the Docker image with docker_build_tf_serving_xavier_tx.sh or docker_build_tf_serving_nano.sh.
12 | - Run the built image with docker_run_tf_serving.sh.
13 | - Before running, check that each script has execute permission and grant it if missing.
14 | - xavier/tx example
15 | ```shell
16 | chmod +x docker_build_tf_serving_xavier_tx.sh docker_run_tf_serving.sh
17 | ```
18 | - Once the permissions are set, run them as follows.
19 | - xavier/tx example
20 | ```shell
21 | ./docker_build_tf_serving_xavier_tx.sh && ./docker_run_tf_serving.sh
22 | ```
--------------------------------------------------------------------------------
/bench/module/module_grpc.py:
--------------------------------------------------------------------------------
1 | import os
2 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
3 | # gRPC imports
4 | import grpc
5 | from tensorflow_serving.apis import prediction_service_pb2_grpc
6 | 
7 | # timing library
8 | import time
9 | 
10 | def create_grpc_stub(server_address, use_https):
11 |     # Create the gRPC channel
12 |     if use_https:
13 |         channel = grpc.secure_channel(server_address, grpc.ssl_channel_credentials(), options=[('grpc.max_send_message_length', 50 * 1024 * 1024), ('grpc.max_receive_message_length', 50 * 1024 * 1024)])
14 |     else:
15 |         channel = grpc.insecure_channel(server_address, options=[('grpc.max_send_message_length', 50 * 1024 * 1024), ('grpc.max_receive_message_length', 50 * 1024 * 1024)])
16 | 
17 |     # Create the gRPC stub
18 |     stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
19 |     return stub
20 | 
21 | def predict(stub, request):
22 |     request_time = time.time()
23 |     result = stub.Predict(request, timeout=100.0)
24 |     response_time = time.time()
25 |     elapsed_time = response_time - request_time
26 |     return result, elapsed_time
--------------------------------------------------------------------------------
/bench/yolo_v5/rest_bench.py:
--------------------------------------------------------------------------------
1 | # preprocessing library
2 | from yolo_v5 import preprocessing
3 | import numpy as np
4 | 
5 | # REST request libraries
6 | from module import module_rest
7 | import json
8 | 
9 | # parallel processing library
10 | import concurrent.futures
11 | 
12 | def run_bench(num_tasks, server_address):
13 |     model_name = "yolo_v5"
14 |     image_file_path = "../../dataset/imagenet/imagenet_1000_raw/n01843383_1.JPEG"
15 | 
16 |     data = json.dumps({"instances": preprocessing.run_preprocessing(image_file_path).tolist()})
17 | 
18 |     # Process REST requests in parallel
19 |     with concurrent.futures.ThreadPoolExecutor(max_workers=num_tasks) as executor:
20 |         futures = [executor.submit(lambda: module_rest.predict(server_address, model_name, data)) for _ in range(num_tasks)]
21 | 
22 |     inference_times_include_network_latency = []
23 |     for future in concurrent.futures.as_completed(futures):
24 |         result, thread_elapsed_time = future.result()
25 |         inference_times_include_network_latency.append(thread_elapsed_time)
26 | 
27 |     return inference_times_include_network_latency
28 | 
--------------------------------------------------------------------------------
/bench/inception_v3/rest_bench.py:
--------------------------------------------------------------------------------
1 | # preprocessing library
2 | from inception_v3 import preprocessing
3 | import numpy as np
4 | 
5 | # REST request libraries
6 | from module import module_rest
7 | import json
8 | 
9 | # parallel processing library
10 | import concurrent.futures
11 | 
12 | def run_bench(num_tasks, server_address):
13 |     model_name = "inception_v3"
14 |     image_file_path = "../../dataset/imagenet/imagenet_1000_raw/n01843383_1.JPEG"
15 | 
16 |     data = json.dumps({"instances": preprocessing.run_preprocessing(image_file_path).tolist()})
17 | 
18 |     # Process REST requests in parallel
19 |     with concurrent.futures.ThreadPoolExecutor(max_workers=num_tasks) as executor:
20 |         futures = [executor.submit(lambda: module_rest.predict(server_address, model_name, data)) for _ in range(num_tasks)]
21 | 
22 |     inference_times_include_network_latency = []
23 |     for future in concurrent.futures.as_completed(futures):
24 |         result, thread_elapsed_time = future.result()
25 |         inference_times_include_network_latency.append(thread_elapsed_time)
26 | 
27 |     return inference_times_include_network_latency
28 | 
--------------------------------------------------------------------------------
/bench/mobilenet_v1/rest_bench.py:
--------------------------------------------------------------------------------
1 | # preprocessing library
2 | from mobilenet_v1 import preprocessing
3 | import numpy as np
4 | 
5 | # REST request libraries
6 | from module import module_rest
7 | import json
8 | 
9 | # parallel processing library
10 | import concurrent.futures
11 | 
12 | def run_bench(num_tasks, server_address):
13 |     model_name = "mobilenet_v1"
14 |     image_file_path = "../../dataset/imagenet/imagenet_1000_raw/n01843383_1.JPEG"
15 | 
16 |     data = json.dumps({"instances": preprocessing.run_preprocessing(image_file_path).tolist()})
17 | 
18 |     # Process REST requests in parallel
19 |     with concurrent.futures.ThreadPoolExecutor(max_workers=num_tasks) as executor:
20 |         futures = [executor.submit(lambda: module_rest.predict(server_address, model_name, data)) for _ in range(num_tasks)]
21 | 
22 |     inference_times_include_network_latency = []
23 |     for future in concurrent.futures.as_completed(futures):
24 |         result, thread_elapsed_time = future.result()
25 |         inference_times_include_network_latency.append(thread_elapsed_time)
26 | 
27 |     return inference_times_include_network_latency
28 | 
--------------------------------------------------------------------------------
/bench/mobilenet_v2/rest_bench.py:
--------------------------------------------------------------------------------
1 | # preprocessing library
2 | from mobilenet_v2 import preprocessing
3 | import numpy as np
4 | 
5 | # REST request libraries
6 | from module import module_rest
7 | import json
8 | 
9 | # parallel processing library
10 | import concurrent.futures
11 | 
12 | def run_bench(num_tasks, server_address):
13 |     model_name = "mobilenet_v2"
14 |     image_file_path = "../../dataset/imagenet/imagenet_1000_raw/n01843383_1.JPEG"
15 | 
16 |     data = json.dumps({"instances": preprocessing.run_preprocessing(image_file_path).tolist()})
17 | 
18 |     # Process REST requests in parallel
19 |     with concurrent.futures.ThreadPoolExecutor(max_workers=num_tasks) as executor:
20 |         futures = [executor.submit(lambda: module_rest.predict(server_address, model_name, data)) for _ in range(num_tasks)]
21 | 
22 |     inference_times_include_network_latency = []
23 |     for future in concurrent.futures.as_completed(futures):
24 |         result, thread_elapsed_time = future.result()
25 |         inference_times_include_network_latency.append(thread_elapsed_time)
26 | 
27 |     return inference_times_include_network_latency
28 | 
--------------------------------------------------------------------------------
/dataset/dataset_download.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | 
3 | 
4 | #image classification image dataset (tfrecord imagenet)
5 | mkdir imagenet
6 | curl -O https://edge-inference.s3.us-west-2.amazonaws.com/CNN/dataset/imagenet/imagenet_1000
7 | mv imagenet_1000 ./imagenet
8 | 
9 | #image classification image dataset (raw imagenet)
10 | curl -O https://edge-inference.s3.us-west-2.amazonaws.com/CNN/dataset/imagenet/imagenet_metadata.txt
11 | curl -O https://edge-inference.s3.us-west-2.amazonaws.com/CNN/dataset/imagenet/imagenet_1000_raw.zip
12 | mv imagenet_metadata.txt ./imagenet
13 | unzip -q imagenet_1000_raw.zip -d ./imagenet && rm imagenet_1000_raw.zip
14 | 
15 | 
16 | #object detection image dataset (coco_2017)
17 | mkdir coco_2017
18 | curl -O https://edge-inference.s3.us-west-2.amazonaws.com/CNN/dataset/coco_2017/val_dataset.py
19 | python3 val_dataset.py
20 | mv val_dataset.py ./coco_2017
21 | unzip -q coco2017val.zip -d ./ && rm coco2017val.zip
22 | 
23 | 
24 | #object detection video dataset
25 | mkdir video
26 | curl -O https://edge-inference.s3.us-west-2.amazonaws.com/CNN/dataset/video/road.mp4
27 | mv road.mp4 ./video
28 | 
--------------------------------------------------------------------------------
/arm_edge/models/models.config:
--------------------------------------------------------------------------------
1 | # models.config
2 | 
3 | model_config_list {
4 |     config {
5 |         name : "mobilenet_v1"
6 |         base_path: "/models/mobilenet_v1/"
7 |         model_platform: "tensorflow"
8 |     }
9 | 
10 |     # config {
11 |     #     name: "mobilenet_v2"
12 |     #     base_path: "/models/mobilenet_v2/"
13 |     #     model_platform: "tensorflow"
14 |     # }
15 |     #
16 |     # config {
17 |     #     name: "inception_v3"
18 |     #     base_path: "/models/inception_v3/"
19 |     #     model_platform: "tensorflow"
20 |     # }
21 |     #
22 |     # config {
23 |     #     name: "yolo_v5"
24 |     #     base_path: "/models/yolo_v5/"
25 |     #     model_platform: "tensorflow"
26 |     # }
27 |     #
28 |     # config {
29 |     #     name: "bert_imdb"
30 |     #     base_path: "/models/bert_imdb/"
31 |     #     model_platform: "tensorflow"
32 |     # }
33 |     #
34 |     # config {
35 |     #     name: "distilbert_sst2"
36 |     #     base_path: "/models/distilbert_sst2/"
37 |     #     model_platform: "tensorflow"
38 |     # }
39 | }
--------------------------------------------------------------------------------
/bench/module/put_data_into_sheet.py:
--------------------------------------------------------------------------------
1 | from datetime import datetime
2 | import numpy as np
3 | # Google Sheets API helper module
4 | from module import module_gspread
5 | 
6 | def put_data(spreadsheet_id, array, num_tasks):
7 |     now = datetime.now()
8 |     formatted_date = now.strftime("%y-%m-%d-%H:%M:%S")
9 |     spread_doc = module_gspread.open_sheet(spreadsheet_id)
10 |     open_worksheet = spread_doc.add_worksheet(
11 |         title=formatted_date, rows=10+num_tasks, cols=10)
12 |     open_worksheet.update(
13 |         'A1', "elapsed_inference_time (network latency included)")
14 |     open_worksheet.merge_cells("A1:C1")
15 |     open_worksheet.update('A2', "Minimum time")
16 |     open_worksheet.update(
17 |         'A3', f"=MIN(A6:A{6+num_tasks-1})", value_input_option='USER_ENTERED')
18 |     open_worksheet.update('B2', "Maximum time")
19 |     open_worksheet.update(
20 |         'B3', f"=MAX(A6:A{6+num_tasks-1})", value_input_option='USER_ENTERED')
21 |     open_worksheet.update('C2', "Average time")
22 |     open_worksheet.update(
23 |         'C3', f"=AVERAGE(A6:A{6+num_tasks-1})", value_input_option='USER_ENTERED')
24 |     open_worksheet.update('A5', "times")
25 | 
26 |     cell_list = open_worksheet.range(f'A6:A{6+num_tasks-1}')
27 |     for i, cell in enumerate(cell_list):
28 |         cell.value = array[i]
29 |     open_worksheet.update_cells(cell_list)
30 | 
--------------------------------------------------------------------------------
/arm_edge/models/model_download.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | 
3 | # cnn model (mobilenet_v1, mobilenet_v2, inception_v3)
4 | curl -O https://edge-inference.s3.us-west-2.amazonaws.com/CNN/model/mobilenet_v1/mobilenet_v1.zip
5 | unzip -q mobilenet_v1.zip && rm mobilenet_v1.zip
6 | mkdir mobilenet_v1/1/ && mv mobilenet_v1/* mobilenet_v1/1/
7 | 
8 | curl -O https://edge-inference.s3.us-west-2.amazonaws.com/CNN/model/mobilenet_v2/mobilenet_v2.zip
9 | unzip -q mobilenet_v2.zip && rm mobilenet_v2.zip
10 | mkdir mobilenet_v2/1/ && mv mobilenet_v2/* mobilenet_v2/1/
11 | 
12 | curl -O https://edge-inference.s3.us-west-2.amazonaws.com/CNN/model/inception_v3/inception_v3.zip
13 | unzip -q inception_v3.zip && rm inception_v3.zip
14 | mkdir inception_v3/1/ && mv inception_v3/* inception_v3/1/
15 | 
16 | #object detection model (yolo_v5)
17 | curl -O https://edge-inference.s3.us-west-2.amazonaws.com/CNN/model/yolo_v5/yolo_v5.zip
18 | unzip -q yolo_v5.zip && rm yolo_v5.zip
19 | mv yolov5/yolov5s_saved_model yolo_v5 && rm -r yolov5
20 | mkdir yolo_v5/1/ && mv yolo_v5/* yolo_v5/1/
21 | 
22 | # nlp model (bert_imdb, distilbert_sst2)
23 | curl -O https://edge-inference.s3.us-west-2.amazonaws.com/NLP/bert_imdb.zip
24 | unzip -q bert_imdb.zip && rm bert_imdb.zip
25 | mkdir bert_imdb/1/ && mv bert_imdb/* bert_imdb/1/
26 | 
27 | curl -O https://edge-inference.s3.us-west-2.amazonaws.com/NLP/distilbert_sst2.zip
28 | unzip -q distilbert_sst2.zip && rm distilbert_sst2.zip
29 | mkdir distilbert_sst2/1/ && mv distilbert_sst2/* distilbert_sst2/1/
30 | 
--------------------------------------------------------------------------------
/bench/yolo_v5/grpc_bench.py:
--------------------------------------------------------------------------------
1 | # import grpc module
2 | from module import module_grpc
3 | from tensorflow_serving.apis import predict_pb2
4 | 
5 | # tf log setting
6 | import os
7 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
8 | import tensorflow as tf
9 | import numpy as np
10 | 
11 | # preprocessing library
12 | from yolo_v5 import preprocessing
13 | 
14 | # parallel processing library
15 | import concurrent.futures
16 | 
17 | def run_bench(num_tasks, server_address, use_https):
18 |     model_name = "yolo_v5"
19 | 
20 |     image_file_path = "../../dataset/imagenet/imagenet_1000_raw/n01843383_1.JPEG"
21 |     data = tf.make_tensor_proto(preprocessing.run_preprocessing(image_file_path))
22 | 
23 |     stub = module_grpc.create_grpc_stub(server_address, use_https)
24 | 
25 |     # Build the gRPC request
26 |     request = predict_pb2.PredictRequest()
27 |     request.model_spec.name = model_name
28 |     request.model_spec.signature_name = 'serving_default'
29 |     request.inputs['x'].CopyFrom(data)
30 | 
31 |     # Process gRPC requests in parallel
32 |     with concurrent.futures.ThreadPoolExecutor(max_workers=num_tasks) as executor:
33 |         futures = [executor.submit(lambda: module_grpc.predict(stub, request)) for _ in range(num_tasks)]
34 | 
35 |     inference_times_include_network_latency = []
36 |     # Collect results
37 |     for future in concurrent.futures.as_completed(futures):
38 |         result, thread_elapsed_time = future.result()
39 |         inference_times_include_network_latency.append(thread_elapsed_time)
40 | 
41 |     return inference_times_include_network_latency
42 | 
--------------------------------------------------------------------------------
/bench/inception_v3/grpc_bench.py:
--------------------------------------------------------------------------------
1 | # import grpc module
2 | from module import module_grpc
3 | from tensorflow_serving.apis import predict_pb2
4 | 
5 | # tf log setting
6 | import os
7 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
8 | import tensorflow as tf
9 | import numpy as np
10 | 
11 | # preprocessing library
12 | from inception_v3 import preprocessing
13 | 
14 | # parallel processing library
15 | import concurrent.futures
16 | 
17 | def run_bench(num_tasks, server_address, use_https):
18 |     model_name = "inception_v3"
19 | 
20 |     image_file_path = "../../dataset/imagenet/imagenet_1000_raw/n01843383_1.JPEG"
21 |     data = tf.make_tensor_proto(preprocessing.run_preprocessing(image_file_path))
22 | 
23 |     stub = module_grpc.create_grpc_stub(server_address, use_https)
24 | 
25 |     # Build the gRPC request
26 |     request = predict_pb2.PredictRequest()
27 |     request.model_spec.name = model_name
28 |     request.model_spec.signature_name = 'serving_default'
29 |     request.inputs['input_3'].CopyFrom(data)
30 | 
31 |     # Process gRPC requests in parallel
32 |     with concurrent.futures.ThreadPoolExecutor(max_workers=num_tasks) as executor:
33 |         futures = [executor.submit(lambda: module_grpc.predict(stub, request)) for _ in range(num_tasks)]
34 | 
35 |     inference_times_include_network_latency = []
36 |     # Collect results
37 |     for future in concurrent.futures.as_completed(futures):
38 |         result, thread_elapsed_time = future.result()
39 |         inference_times_include_network_latency.append(thread_elapsed_time)
40 | 
41 |     return inference_times_include_network_latency
--------------------------------------------------------------------------------
/bench/mobilenet_v1/grpc_bench.py:
--------------------------------------------------------------------------------
1 | # import grpc module
2 | from module import module_grpc
3 | from tensorflow_serving.apis import predict_pb2
4 | 
5 | # tf log setting
6 | import os
7 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
8 | import tensorflow as tf
9 | import numpy as np
10 | 
11 | # preprocessing library
12 | from mobilenet_v1 import preprocessing
13 | 
14 | # parallel processing library
15 | import concurrent.futures
16 | 
17 | def run_bench(num_tasks, server_address, use_https):
18 |     model_name = "mobilenet_v1"
19 | 
20 |     image_file_path = "../../dataset/imagenet/imagenet_1000_raw/n01843383_1.JPEG"
21 |     data = tf.make_tensor_proto(preprocessing.run_preprocessing(image_file_path))
22 | 
23 |     stub = module_grpc.create_grpc_stub(server_address, use_https)
24 | 
25 |     # Build the gRPC request
26 |     request = predict_pb2.PredictRequest()
27 |     request.model_spec.name = model_name
28 |     request.model_spec.signature_name = 'serving_default'
29 |     request.inputs['input_1'].CopyFrom(data)
30 | 
31 |     # Process gRPC requests in parallel
32 |     with concurrent.futures.ThreadPoolExecutor(max_workers=num_tasks) as executor:
33 |         futures = [executor.submit(lambda: module_grpc.predict(stub, request)) for _ in range(num_tasks)]
34 | 
35 |     inference_times_include_network_latency = []
36 |     # Collect results
37 |     for future in concurrent.futures.as_completed(futures):
38 |         result, thread_elapsed_time = future.result()
39 |         inference_times_include_network_latency.append(thread_elapsed_time)
40 | 
41 |     return inference_times_include_network_latency
--------------------------------------------------------------------------------
/bench/mobilenet_v2/grpc_bench.py:
--------------------------------------------------------------------------------
1 | # import grpc module
2 | from module import module_grpc
3 | from tensorflow_serving.apis import predict_pb2
4 | 
5 | # tf log setting
6 | import os
7 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
8 | import tensorflow as tf
9 | import numpy as np
10 | 
11 | # preprocessing library
12 | from mobilenet_v2 import preprocessing
13 | 
14 | # parallel processing library
15 | import concurrent.futures
16 | 
17 | def run_bench(num_tasks, server_address, use_https):
18 |     model_name = "mobilenet_v2"
19 | 
20 |     image_file_path = "../../dataset/imagenet/imagenet_1000_raw/n01843383_1.JPEG"
21 |     data = tf.make_tensor_proto(preprocessing.run_preprocessing(image_file_path))
22 | 
23 |     stub = module_grpc.create_grpc_stub(server_address, use_https)
24 | 
25 |     # Build the gRPC request
26 |     request = predict_pb2.PredictRequest()
27 |     request.model_spec.name = model_name
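28 |     # The tensor key used in request.inputs below ('input_2') must match the model's serving
29 |     # signature input name; inspect it with: saved_model_cli show --dir <saved_model_dir> --all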
30 |     request.model_spec.signature_name = 'serving_default'
31 |     request.inputs['input_2'].CopyFrom(data)
32 | 
33 |     # Process gRPC requests in parallel
34 |     with concurrent.futures.ThreadPoolExecutor(max_workers=num_tasks) as executor:
35 |         futures = [executor.submit(lambda: module_grpc.predict(stub, request)) for _ in range(num_tasks)]
36 | 
37 |     inference_times_include_network_latency = []
38 |     # Collect results
39 |     for future in concurrent.futures.as_completed(futures):
40 |         result, thread_elapsed_time = future.result()
41 |         inference_times_include_network_latency.append(thread_elapsed_time)
42 | 
43 |     return inference_times_include_network_latency
--------------------------------------------------------------------------------
/bench/distilbert_sst2/grpc_bench.py:
--------------------------------------------------------------------------------
1 | # import grpc module
2 | from module import module_grpc
3 | from tensorflow_serving.apis import predict_pb2
4 | 
5 | # tf log setting
6 | import os
7 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
8 | import tensorflow as tf
9 | 
10 | # preprocessing library
11 | from distilbert_sst2 import preprocessing
12 | 
13 | # parallel processing library
14 | import concurrent.futures
15 | 
16 | 
17 | def run_bench(num_tasks, server_address, use_https):
18 |     model_name = "distilbert_sst2"
19 | 
20 |     text = "This is a sample sentence to test the BERT model."
21 |     input_ids, input_mask, segment_ids = preprocessing.run_preprocessing(text)
22 |     # Pad ids and mask to the fixed length 128 used below; make_tensor_proto
23 |     # requires the number of values to match the requested shape.
24 |     input_ids = [input_ids[0] + [0] * (128 - len(input_ids[0]))]
25 |     input_mask = [input_mask[0] + [0] * (128 - len(input_mask[0]))]
26 | 
27 |     stub = module_grpc.create_grpc_stub(server_address, use_https)
28 | 
29 |     # Build the gRPC request
30 |     request = predict_pb2.PredictRequest()
31 |     request.model_spec.name = model_name
32 |     request.model_spec.signature_name = 'serving_default'
33 | 
34 |     request.inputs['bert_input_ids'].CopyFrom(tf.make_tensor_proto(input_ids, shape=[1, 128]))
35 |     request.inputs['bert_input_masks'].CopyFrom(tf.make_tensor_proto(input_mask, shape=[1, 128]))
36 | 
37 | 
38 |     # Process gRPC requests in parallel
39 |     with concurrent.futures.ThreadPoolExecutor(max_workers=num_tasks) as executor:
40 |         futures = [executor.submit(lambda: module_grpc.predict(stub, request)) for _ in range(num_tasks)]
41 | 
42 |     inference_times_include_network_latency = []
43 |     # Collect results
44 |     for future in concurrent.futures.as_completed(futures):
45 |         result, thread_elapsed_time = future.result()
46 |         inference_times_include_network_latency.append(thread_elapsed_time)
47 | 
48 |     return inference_times_include_network_latency
--------------------------------------------------------------------------------
/bench/bert_imdb/grpc_bench.py:
--------------------------------------------------------------------------------
1 | # import grpc module
2 | from module import module_grpc
3 | from tensorflow_serving.apis import predict_pb2
4 | 
5 | # tf log setting
6 | import os
7 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
8 | import tensorflow as tf
9 | 
10 | # preprocessing library
11 | from bert_imdb import preprocessing
12 | 
13 | # parallel processing library
14 | import concurrent.futures
15 | 
16 | def run_bench(num_tasks, server_address, use_https):
17 |     model_name = "bert_imdb"
18 | 
19 |     text = "This is a sample sentence to test the BERT model."
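20 |     # Note: run_preprocessing fits a fresh Keras Tokenizer on this one sentence as a lightweight
21 |     # stand-in for a real BERT WordPiece tokenizer, so ids will not match a pretrained vocabulary.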
22 |     input_ids, input_mask, segment_ids = preprocessing.run_preprocessing(text)
23 | 
24 |     stub = module_grpc.create_grpc_stub(server_address, use_https)
25 | 
26 |     # Build the gRPC request
27 |     request = predict_pb2.PredictRequest()
28 |     request.model_spec.name = model_name
29 |     request.model_spec.signature_name = 'serving_default'
30 | 
31 |     request.inputs['input_ids'].CopyFrom(tf.make_tensor_proto(input_ids, shape=[1, len(input_ids[0])]))
32 |     request.inputs['input_masks'].CopyFrom(tf.make_tensor_proto(input_mask, shape=[1, len(input_mask[0])]))
33 |     request.inputs['segment_ids'].CopyFrom(tf.make_tensor_proto(segment_ids, shape=[1, len(segment_ids[0])]))
34 | 
35 | 
36 |     # Process gRPC requests in parallel
37 |     with concurrent.futures.ThreadPoolExecutor(max_workers=num_tasks) as executor:
38 |         futures = [executor.submit(lambda: module_grpc.predict(stub, request)) for _ in range(num_tasks)]
39 | 
40 |     inference_times_include_network_latency = []
41 |     # Collect results
42 |     for future in concurrent.futures.as_completed(futures):
43 |         result, thread_elapsed_time = future.result()
44 |         inference_times_include_network_latency.append(thread_elapsed_time)
45 | 
46 |     return inference_times_include_network_latency
--------------------------------------------------------------------------------
/model/model_download.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | 
3 | 
4 | #image classification TF/FP32 model (mobilenet v1, mobilenet v2, inception v3)
5 | curl -O https://edge-inference.s3.us-west-2.amazonaws.com/CNN/model/mobilenet_v1/mobilenet_v1.zip
6 | unzip -q mobilenet_v1.zip && rm mobilenet_v1.zip
7 | 
8 | curl -O https://edge-inference.s3.us-west-2.amazonaws.com/CNN/model/mobilenet_v2/mobilenet_v2.zip
9 | unzip -q mobilenet_v2.zip && rm mobilenet_v2.zip
10 | 
11 | curl -O https://edge-inference.s3.us-west-2.amazonaws.com/CNN/model/inception_v3/inception_v3.zip
12 | unzip -q inception_v3.zip && rm inception_v3.zip
13 | 
14 | #image classification EdgeTPU tflite INT8 model (mobilenet v1, mobilenet v2, inception v3)
15 | curl -O https://edge-inference.s3.us-west-2.amazonaws.com/CNN/model/classify.py
16 | mkdir mobilenet_v1_edgetpu_tflite
17 | ## coral model
18 | curl -O https://edge-inference.s3.us-west-2.amazonaws.com/CNN/model/mobilenet_v1_edgetpu_tflite/tf2_mobilenet_v1_1.0_224_ptq_edgetpu.tflite
19 | mv tf2_mobilenet_v1_1.0_224_ptq_edgetpu.tflite ./mobilenet_v1_edgetpu_tflite
20 | 
21 | mkdir mobilenet_v2_edgetpu_tflite
22 | ## coral model
23 | curl -O https://edge-inference.s3.us-west-2.amazonaws.com/CNN/model/mobilenet_v2_edgetpu_tflite/tf2_mobilenet_v2_1.0_224_ptq_edgetpu.tflite
24 | mv tf2_mobilenet_v2_1.0_224_ptq_edgetpu.tflite ./mobilenet_v2_edgetpu_tflite
25 | ## custom model
26 | curl -O https://edge-inference.s3.us-west-2.amazonaws.com/CNN/model/mobilenet_v2_edgetpu_tflite/custom-mobilenet_v2_edgetpu.tflite
27 | mv custom-mobilenet_v2_edgetpu.tflite ./mobilenet_v2_edgetpu_tflite
28 | 
29 | mkdir inception_v3_edgetpu_tflite
30 | ## coral model
31 | curl -O https://edge-inference.s3.us-west-2.amazonaws.com/CNN/model/inception_v3_edgetpu_tflite/inceptionv3_edgetpu.tflite
32 | mv inceptionv3_edgetpu.tflite ./inception_v3_edgetpu_tflite
33 | ## custom model
34 | curl -O https://edge-inference.s3.us-west-2.amazonaws.com/CNN/model/inception_v3_edgetpu_tflite/custom-inceptionv3_edgetpu.tflite
35 | mv custom-inceptionv3_edgetpu.tflite ./inception_v3_edgetpu_tflite
36 | 
37 | #image classification tflite INT8 model (mobilenet v1, mobilenet v2, inception v3)
38 | mkdir mobilenet_v1_quantization_tflite
39 | curl -O https://edge-inference.s3.us-west-2.amazonaws.com/CNN/model/mobilenet_v1_quantization_tflite/mobilenetv1.tflite
40 | mv mobilenetv1.tflite ./mobilenet_v1_quantization_tflite
41 | 
42 | mkdir mobilenet_v2_quantization_tflite
43 | curl -O https://edge-inference.s3.us-west-2.amazonaws.com/CNN/model/mobilenet_v2_quantization_tflite/mobilenetv2.tflite
44 | mv mobilenetv2.tflite ./mobilenet_v2_quantization_tflite
45 | 
46 | mkdir inception_v3_quantization_tflite
47 | curl -O https://edge-inference.s3.us-west-2.amazonaws.com/CNN/model/inception_v3_quantization_tflite/inceptionv3.tflite
48 | mv inceptionv3.tflite ./inception_v3_quantization_tflite
49 | 
50 | #object detection TF/FP32 model (yolo v5)
51 | curl -O https://edge-inference.s3.us-west-2.amazonaws.com/CNN/model/yolo_v5/yolo_v5.zip
52 | unzip -q yolo_v5.zip && rm yolo_v5.zip
53 | 
54 | #object detection EdgeTPU tflite INT8 model (yolo v5)
55 | mkdir yolo_v5_edgetpu_tflite
56 | curl -O https://edge-inference.s3.us-west-2.amazonaws.com/CNN/model/yolo_v5_edgetpu_tflite/yolov5s-int8_edgetpu.tflite
57 | mv yolov5s-int8_edgetpu.tflite ./yolo_v5_edgetpu_tflite
58 | 
59 | #object detection tflite FP16/INT8 model (yolo v5)
60 | mkdir yolo_v5_quantization_tflite
61 | curl -O https://edge-inference.s3.us-west-2.amazonaws.com/CNN/model/yolo_v5_quantization_tflite/yolov5s-fp16.tflite
62 | curl -O https://edge-inference.s3.us-west-2.amazonaws.com/CNN/model/yolo_v5_quantization_tflite/yolov5s-int8.tflite
63 | mv yolov5s-fp16.tflite yolov5s-int8.tflite ./yolo_v5_quantization_tflite
64 | 
--------------------------------------------------------------------------------