├── assets └── segmentation_workflow.png ├── env └── intel_env.yml ├── SECURITY.md ├── src ├── intel_neural_compressor │ ├── deploy.yaml │ ├── neural_compressor_conversion.py │ └── utils.py ├── create_frozen_graph.py ├── training.py ├── evaluation.py ├── run_inference.py └── utils.py ├── LICENSE ├── CONTRIBUTING.md ├── .gitignore ├── CODE_OF_CONDUCT.md └── README.md /assets/segmentation_workflow.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/oneapi-src/drone-navigation-inspection/HEAD/assets/segmentation_workflow.png -------------------------------------------------------------------------------- /env/intel_env.yml: -------------------------------------------------------------------------------- 1 | name: drone_navigation_intel 2 | channels: 3 | - intel 4 | - conda-forge 5 | dependencies: 6 | - intelpython3_core=2024.1.0 7 | - python=3.9 8 | - intel-aikit-tensorflow=2024.1 9 | - tqdm=4.66.2 10 | - pip=24.0 11 | - pip: 12 | - opencv-python==4.9.0.80 13 | -------------------------------------------------------------------------------- /SECURITY.md: -------------------------------------------------------------------------------- 1 | # Security Policy 2 | Intel is committed to rapidly addressing security vulnerabilities affecting our customers and providing clear guidance on the solution, impact, severity and mitigation. 3 | 4 | ## Reporting a Vulnerability 5 | Please report any security vulnerabilities in this project [utilizing the guidelines here](https://www.intel.com/content/www/us/en/security-center/vulnerability-handling-guidelines.html). 6 | 7 | -------------------------------------------------------------------------------- /src/intel_neural_compressor/deploy.yaml: -------------------------------------------------------------------------------- 1 | version: 1.0 2 | 3 | model: 4 | name: vgg_unet 5 | framework: tensorflow 6 | inputs: ["input_1"] 7 | outputs: ["Identity"] 8 | evaluation: # optional. required if user doesn't provide eval_func in neural_compressor.Quantization. 9 | accuracy: # optional. required if user doesn't provide eval_func in neural_compressor.Quantization. 10 | metric: 11 | topk: 1 12 | performance: # optional. used to benchmark performance of passing model. 13 | configs: 14 | cores_per_instance: 2 15 | num_of_instance: 1 16 | # quantization: # optional. tuning constraints on model-wise for advance user to reduce tuning space. 17 | # calibration: 18 | # sampling_size: 200 19 | 20 | tuning: 21 | accuracy_criterion: 22 | relative: 0.01 # optional. default value is relative, other value is absolute. this example allows relative accuracy loss: 1%. 23 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright (c) 2024, Intel Corporation 2 | 3 | Redistribution and use in source and binary forms, with or without 4 | modification, are permitted provided that the following conditions are met: 5 | 6 | * Redistributions of source code must retain the above copyright notice, 7 | this list of conditions and the following disclaimer. 8 | * Redistributions in binary form must reproduce the above copyright 9 | notice, this list of conditions and the following disclaimer in the 10 | documentation and/or other materials provided with the distribution. 
11 | * Neither the name of Intel Corporation nor the names of its contributors 12 | may be used to endorse or promote products derived from this software 13 | without specific prior written permission. 14 | 15 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 16 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 17 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 18 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE 19 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 20 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 21 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 22 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 23 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 24 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 25 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing 2 | 3 | ### License 4 | 5 | Drone Navigation Inspection is licensed under the terms in [LICENSE]. By contributing to the project, you agree to the license and copyright terms therein and release your contribution under these terms. 6 | 7 | ### Sign your work 8 | 9 | Please use the sign-off line at the end of the patch. Your signature certifies that you wrote the patch or otherwise have the right to pass it on as an open-source patch. The rules are pretty simple: if you can certify 10 | the below (from [developercertificate.org](http://developercertificate.org/)): 11 | 12 | ``` 13 | Developer Certificate of Origin 14 | Version 1.1 15 | 16 | Copyright (C) 2004, 2006 The Linux Foundation and its contributors. 17 | 660 York Street, Suite 102, 18 | San Francisco, CA 94110 USA 19 | 20 | Everyone is permitted to copy and distribute verbatim copies of this 21 | license document, but changing it is not allowed. 22 | 23 | Developer's Certificate of Origin 1.1 24 | 25 | By making a contribution to this project, I certify that: 26 | 27 | (a) The contribution was created in whole or in part by me and I 28 | have the right to submit it under the open source license 29 | indicated in the file; or 30 | 31 | (b) The contribution is based upon previous work that, to the best 32 | of my knowledge, is covered under an appropriate open source 33 | license and I have the right under that license to submit that 34 | work with modifications, whether created in whole or in part 35 | by me, under the same open source license (unless I am 36 | permitted to submit under a different license), as indicated 37 | in the file; or 38 | 39 | (c) The contribution was provided directly to me by some other 40 | person who certified (a), (b) or (c) and I have not modified 41 | it. 42 | 43 | (d) I understand and agree that this project and the contribution 44 | are public and that a record of the contribution (including all 45 | personal information I submit with it, including my sign-off) is 46 | maintained indefinitely and may be redistributed consistent with 47 | this project or the open source license(s) involved. 48 | ``` 49 | 50 | Then you just add a line to every git commit message: 51 | 52 | Signed-off-by: Joe Smith 53 | 54 | Use your real name (sorry, no pseudonyms or anonymous contributions.) 
55 | 56 | If you set your `user.name` and `user.email` git configs, you can sign your 57 | commit automatically with `git commit -s`. -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | 6 | # C extensions 7 | *.so 8 | 9 | # Distribution / packaging 10 | .Python 11 | build/ 12 | develop-eggs/ 13 | dist/ 14 | downloads/ 15 | eggs/ 16 | .eggs/ 17 | lib/ 18 | lib64/ 19 | parts/ 20 | sdist/ 21 | var/ 22 | wheels/ 23 | share/python-wheels/ 24 | *.egg-info/ 25 | .installed.cfg 26 | *.egg 27 | MANIFEST 28 | 29 | # PyInstaller 30 | # Usually these files are written by a python script from a template 31 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 32 | *.manifest 33 | *.spec 34 | 35 | # Installer logs 36 | pip-log.txt 37 | pip-delete-this-directory.txt 38 | 39 | # Unit test / coverage reports 40 | htmlcov/ 41 | .tox/ 42 | .nox/ 43 | .coverage 44 | .coverage.* 45 | .cache 46 | nosetests.xml 47 | coverage.xml 48 | *.cover 49 | *.py,cover 50 | .hypothesis/ 51 | .pytest_cache/ 52 | cover/ 53 | 54 | # Translations 55 | *.mo 56 | *.pot 57 | 58 | # Django stuff: 59 | *.log 60 | local_settings.py 61 | db.sqlite3 62 | db.sqlite3-journal 63 | 64 | # Flask stuff: 65 | instance/ 66 | .webassets-cache 67 | 68 | # Scrapy stuff: 69 | .scrapy 70 | 71 | # Sphinx documentation 72 | docs/_build/ 73 | 74 | # PyBuilder 75 | .pybuilder/ 76 | target/ 77 | 78 | # Jupyter Notebook 79 | .ipynb_checkpoints 80 | 81 | # IPython 82 | profile_default/ 83 | ipython_config.py 84 | 85 | # pyenv 86 | # For a library or package, you might want to ignore these files since the code is 87 | # intended to run in multiple environments; otherwise, check them in: 88 | # .python-version 89 | 90 | # pipenv 91 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. 92 | # However, in case of collaboration, if having platform-specific dependencies or dependencies 93 | # having no cross-platform support, pipenv may install dependencies that don't work, or not 94 | # install all needed dependencies. 95 | #Pipfile.lock 96 | 97 | # PEP 582; used by e.g. 
github.com/David-OConnor/pyflow 98 | __pypackages__/ 99 | 100 | # Celery stuff 101 | celerybeat-schedule 102 | celerybeat.pid 103 | 104 | # SageMath parsed files 105 | *.sage.py 106 | 107 | # Environments 108 | .env 109 | .venv 110 | venv/ 111 | ENV/ 112 | env.bak/ 113 | venv.bak/ 114 | 115 | # Spyder project settings 116 | .spyderproject 117 | .spyproject 118 | 119 | # Rope project settings 120 | .ropeproject 121 | 122 | # mkdocs documentation 123 | /site 124 | 125 | # mypy 126 | .mypy_cache/ 127 | .dmypy.json 128 | dmypy.json 129 | 130 | # Pyre type checker 131 | .pyre/ 132 | 133 | # pytype static type analyzer 134 | .pytype/ 135 | 136 | # Cython debug symbols 137 | cython_debug/ 138 | 139 | data/ 140 | output/ 141 | .vscode -------------------------------------------------------------------------------- /src/create_frozen_graph.py: -------------------------------------------------------------------------------- 1 | # Copyright (C) 2024 Intel Corporation 2 | # SPDX-License-Identifier: BSD-3-Clause 3 | 4 | """ 5 | Convert checkpoint to frozen graph 6 | """ 7 | # pylint: disable=E0401 C0301 E0611 E1136 8 | # flake8: noqa = E501 9 | import argparse 10 | import sys 11 | import os 12 | import pathlib 13 | import tensorflow as tf 14 | from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2 15 | from utils import vgg_unet 16 | 17 | 18 | if __name__ == "__main__": 19 | # Parameters 20 | parser = argparse.ArgumentParser() 21 | parser.add_argument('-m', 22 | '--model_path', 23 | type=str, 24 | required=False, 25 | default=None, 26 | help="Please provide the Latest Checkpoint path. Default is None.") 27 | 28 | 29 | parser.add_argument('-o', 30 | '--output_saved_dir', 31 | default=None, 32 | type=str, 33 | required=True, 34 | help="Directory to save frozen graph to." 
35 | ) 36 | FLAGS = parser.parse_args() 37 | model_path = FLAGS.model_path 38 | output_saved_dir = FLAGS.output_saved_dir 39 | N_CLASSES = 21 40 | model = vgg_unet(n_classes=N_CLASSES, input_height=416, input_width=608) 41 | 42 | # Loading weights of Trained Model 43 | if model_path is not None: 44 | latest_checkpoint = model_path # find_latest_checkpoint(checkpoints_path) 45 | if latest_checkpoint is not None: 46 | print("Loading the weights from latest checkpoint ", 47 | latest_checkpoint) 48 | model.load_weights(latest_checkpoint).expect_partial() 49 | else: 50 | print("Please check the checkpoint model path provided") 51 | sys.exit() 52 | # Convert Keras model to ConcreteFunction 53 | full_model = tf.function(model) 54 | full_model = full_model.get_concrete_function( 55 | tf.TensorSpec(model.inputs[0].shape, model.inputs[0].dtype, name="input_1")) 56 | # Get frozen ConcreteFunction 57 | frozen_func = convert_variables_to_constants_v2(full_model) 58 | frozen_func.graph.as_graph_def() 59 | layers = [op.name for op in frozen_func.graph.get_operations()] 60 | print("-" * 50) 61 | print("Frozen model layers: ") 62 | for layer in layers: 63 | print(layer) 64 | print("-" * 50) 65 | print("Frozen model inputs: ") 66 | print(frozen_func.inputs) 67 | print("Frozen model outputs: ") 68 | print(frozen_func.outputs) 69 | path = pathlib.Path(FLAGS.output_saved_dir) 70 | path.mkdir(parents=True, exist_ok=True) 71 | 72 | # Save frozen graph from frozen ConcreteFunction to hard drive 73 | tf.io.write_graph(graph_or_graph_def=frozen_func.graph, logdir='.', 74 | name=os.path.join(FLAGS.output_saved_dir, "frozen_graph.pb"), 75 | as_text=False 76 | ) 77 | 78 | -------------------------------------------------------------------------------- /src/training.py: -------------------------------------------------------------------------------- 1 | # Copyright (C) 2024 Intel Corporation 2 | # SPDX-License-Identifier: BSD-3-Clause 3 | 4 | """ 5 | Training of segmentation model 6 | """ 7 | # pylint: disable = E1129 W0611 8 | # flake8: noqa = E501 9 | import os 10 | import time 11 | import argparse 12 | import pathlib 13 | import sys 14 | from utils import vgg_unet, train, train_hyperparameters_tuning 15 | 16 | 17 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 18 | 19 | 20 | if __name__ == "__main__": 21 | # Parameters 22 | parser = argparse.ArgumentParser() 23 | parser.add_argument('-m', 24 | '--model_path', 25 | type=str, 26 | required=False, 27 | default=None, 28 | help='Please provide the Latest Checkpoint path. Default is None.') 29 | parser.add_argument('-d', 30 | '--data_path', 31 | type=str, 32 | required=True, 33 | help='Absolute path to the dataset folder containing ' 34 | '"original_images" and "label_images_semantic" folders.') 35 | parser.add_argument('-e', 36 | '--epochs', 37 | type=str, 38 | required=False, 39 | default=1, 40 | help='Provide the number of epochs to train for.') 41 | parser.add_argument('-hy', 42 | '--hyperparams', 43 | type=str, 44 | required=False, 45 | default=0, 46 | help='Enable hyperparameter tuning. 
Default is "0" to ' 47 | 'indicate unabled hyperparameter tuning.') 48 | 49 | 50 | FLAGS = parser.parse_args() 51 | train_images_path = os.path.join(FLAGS.data_path, "original_images") 52 | train_labels_path = os.path.join(FLAGS.data_path, "label_images_semantic") 53 | epochs = int(FLAGS.epochs) 54 | hyp_flag = int(FLAGS.hyperparams) 55 | path = pathlib.Path(FLAGS.model_path) 56 | os.makedirs(path,exist_ok=True) 57 | 58 | 59 | 60 | if FLAGS.model_path is None: 61 | print("Please provide path to save the model...\n") 62 | sys.exit(1) 63 | else: 64 | model_path = FLAGS.model_path + "/vgg_unet" 65 | 66 | 67 | # Model Initialization 68 | 69 | N_CLASSES = 21 # Aerial Semantic Segmentation Drone Dataset tree, 70 | 71 | model = vgg_unet(n_classes=N_CLASSES, input_height=416, input_width=608) 72 | model_from_name = {"vgg_unet": vgg_unet} 73 | 74 | # Train 75 | if not hyp_flag: 76 | print("Started data validation and Training for ", epochs, " epochs") 77 | start_time = time.time() 78 | train(model=model, train_images=train_images_path, 79 | train_annotations=train_labels_path, 80 | checkpoints_path=model_path, epochs=epochs 81 | ) 82 | #print("Time Taken for Training in seconds --> ", time.time()-start_time) 83 | else: 84 | print("Started Hyperprameter tuning ") 85 | start_time = time.time() 86 | total_time = train_hyperparameters_tuning(model=model, train_images=train_images_path, 87 | train_annotations=train_labels_path, epochs=epochs, 88 | load_weights=model_path) 89 | print("Time Taken for Total Hyper parameter Tuning and Model loading " 90 | "in seconds --> ", time.time() - start_time) 91 | print("total_time --> ", total_time) 92 | -------------------------------------------------------------------------------- /src/evaluation.py: -------------------------------------------------------------------------------- 1 | # Copyright (C) 2024 Intel Corporation 2 | # SPDX-License-Identifier: BSD-3-Clause 3 | 4 | """ 5 | Evaluating of segmentation model 6 | """ 7 | # pylint: disable = C0301 W0611 C0413 8 | # flake8: noqa = E501 9 | import os 10 | import argparse 11 | import time 12 | import itertools 13 | import random 14 | import tensorflow as tf 15 | tf.compat.v1.disable_eager_execution() 16 | import cv2 17 | import numpy as np 18 | import six 19 | from utils import predict, evaluate, vgg_unet, frozen 20 | 21 | 22 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 23 | 24 | 25 | if __name__ == "__main__": 26 | # Parameters 27 | parser = argparse.ArgumentParser() 28 | parser.add_argument('-m', 29 | '--model_path', 30 | type=str, 31 | required=False, 32 | default=None, 33 | help='Please provide the Latest Checkpoint path. 
Default is None.') 34 | parser.add_argument('-d', 35 | '--data_path', 36 | type=str, 37 | required=True, 38 | help='Absolute path to the dataset folder containing ' 39 | '"original_images" and "label_images_semantic" folders.') 40 | parser.add_argument('-t', 41 | '--model_type', 42 | type=int, 43 | required=False, 44 | default=0, 45 | help='0 for checkpoint, ' 46 | '1 for frozen_graph.') 47 | FLAGS = parser.parse_args() 48 | model_path = FLAGS.model_path 49 | model_type = FLAGS.model_type 50 | data_path = FLAGS.data_path 51 | N_CLASSES = 21 52 | INPUT_HEIGHT = 416 53 | INPUT_WIDTH = 608 54 | OUTPUT_HEIGHT = 208 55 | OUTPUT_WIDTH = 304 56 | 57 | 58 | train_images_path = os.path.join(FLAGS.data_path, "original_images") 59 | train_labels_path = os.path.join(FLAGS.data_path, "label_images_semantic") 60 | 61 | # Model Initialization 62 | N_CLASSES = 21 # Aerial Semantic Segmentation Drone Dataset classes 63 | 64 | model = vgg_unet(n_classes=N_CLASSES, input_height=416, input_width=608) 65 | model_from_name = {"vgg_unet": vgg_unet} 66 | if model_type == 0: 67 | 68 | out = evaluate(model=model, inp_images_dir=train_images_path, annotations_dir=train_labels_path, 69 | checkpoints_path=model_path) 70 | start = time.time() 71 | n_classes_names = ["unlabeled", "[TARGET CLASS] paved-area", "dirt", "grass", "gravel", "water", "rocks", 72 | "pool", "vegetation", "roof", "wall", "window", "door", "fence", "fence-pole", 73 | "person", "dog", "car", "bicycle", "tree", "bald-tree"] 74 | for i, elem in enumerate(out['class_wise_IU']): 75 | print(n_classes_names[i], f"=> {(elem * 100):.2f}") 76 | 77 | input_image = train_images_path + "/002.jpg" 78 | out = predict(model=model, inp=input_image, out_fname="out.png") 79 | print("Time Taken for Prediction in seconds --> ", time.time()-start) 80 | else: 81 | print("Inferencing using pb graph", model_path) 82 | out = frozen(model=model, inp_images_dir=train_images_path, annotations_dir=train_labels_path, 83 | checkpoints_path=model_path) 84 | start = time.time() 85 | n_classes_names = ["unlabeled", "[TARGET CLASS] paved-area", "dirt", "grass", "gravel", "water", "rocks", 86 | "pool", "vegetation", "roof", "wall", "window", "door", "fence", "fence-pole", 87 | "person", "dog", "car", "bicycle", "tree", "bald-tree"] 88 | for i, elem in enumerate(out['class_wise_IU']): 89 | print(n_classes_names[i], f"=> {(elem * 100):.2f}") 90 | 91 | -------------------------------------------------------------------------------- /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | # Contributor Covenant Code of Conduct 2 | 3 | ## Our Pledge 4 | 5 | We as members, contributors, and leaders pledge to make participation in our 6 | community a harassment-free experience for everyone, regardless of age, body 7 | size, visible or invisible disability, ethnicity, sex characteristics, gender 8 | identity and expression, level of experience, education, socio-economic status, 9 | nationality, personal appearance, race, caste, color, religion, or sexual 10 | identity and orientation. 11 | 12 | We pledge to act and interact in ways that contribute to an open, welcoming, 13 | diverse, inclusive, and healthy community. 
14 | 15 | ## Our Standards 16 | 17 | Examples of behavior that contributes to a positive environment for our 18 | community include: 19 | 20 | * Demonstrating empathy and kindness toward other people 21 | * Being respectful of differing opinions, viewpoints, and experiences 22 | * Giving and gracefully accepting constructive feedback 23 | * Accepting responsibility and apologizing to those affected by our mistakes, 24 | and learning from the experience 25 | * Focusing on what is best not just for us as individuals, but for the overall 26 | community 27 | 28 | Examples of unacceptable behavior include: 29 | 30 | * The use of sexualized language or imagery, and sexual attention or advances of 31 | any kind 32 | * Trolling, insulting or derogatory comments, and personal or political attacks 33 | * Public or private harassment 34 | * Publishing others' private information, such as a physical or email address, 35 | without their explicit permission 36 | * Other conduct which could reasonably be considered inappropriate in a 37 | professional setting 38 | 39 | ## Enforcement Responsibilities 40 | 41 | Community leaders are responsible for clarifying and enforcing our standards of 42 | acceptable behavior and will take appropriate and fair corrective action in 43 | response to any behavior that they deem inappropriate, threatening, offensive, 44 | or harmful. 45 | 46 | Community leaders have the right and responsibility to remove, edit, or reject 47 | comments, commits, code, wiki edits, issues, and other contributions that are 48 | not aligned to this Code of Conduct, and will communicate reasons for moderation 49 | decisions when appropriate. 50 | 51 | ## Scope 52 | 53 | This Code of Conduct applies within all community spaces, and also applies when 54 | an individual is officially representing the community in public spaces. 55 | Examples of representing our community include using an official e-mail address, 56 | posting via an official social media account, or acting as an appointed 57 | representative at an online or offline event. 58 | 59 | ## Enforcement 60 | 61 | Instances of abusive, harassing, or otherwise unacceptable behavior may be 62 | reported to the community leaders responsible for enforcement at 63 | CommunityCodeOfConduct AT intel DOT com. 64 | All complaints will be reviewed and investigated promptly and fairly. 65 | 66 | All community leaders are obligated to respect the privacy and security of the 67 | reporter of any incident. 68 | 69 | ## Enforcement Guidelines 70 | 71 | Community leaders will follow these Community Impact Guidelines in determining 72 | the consequences for any action they deem in violation of this Code of Conduct: 73 | 74 | ### 1. Correction 75 | 76 | **Community Impact**: Use of inappropriate language or other behavior deemed 77 | unprofessional or unwelcome in the community. 78 | 79 | **Consequence**: A private, written warning from community leaders, providing 80 | clarity around the nature of the violation and an explanation of why the 81 | behavior was inappropriate. A public apology may be requested. 82 | 83 | ### 2. Warning 84 | 85 | **Community Impact**: A violation through a single incident or series of 86 | actions. 87 | 88 | **Consequence**: A warning with consequences for continued behavior. No 89 | interaction with the people involved, including unsolicited interaction with 90 | those enforcing the Code of Conduct, for a specified period of time. 
This 91 | includes avoiding interactions in community spaces as well as external channels 92 | like social media. Violating these terms may lead to a temporary or permanent 93 | ban. 94 | 95 | ### 3. Temporary Ban 96 | 97 | **Community Impact**: A serious violation of community standards, including 98 | sustained inappropriate behavior. 99 | 100 | **Consequence**: A temporary ban from any sort of interaction or public 101 | communication with the community for a specified period of time. No public or 102 | private interaction with the people involved, including unsolicited interaction 103 | with those enforcing the Code of Conduct, is allowed during this period. 104 | Violating these terms may lead to a permanent ban. 105 | 106 | ### 4. Permanent Ban 107 | 108 | **Community Impact**: Demonstrating a pattern of violation of community 109 | standards, including sustained inappropriate behavior, harassment of an 110 | individual, or aggression toward or disparagement of classes of individuals. 111 | 112 | **Consequence**: A permanent ban from any sort of public interaction within the 113 | community. 114 | 115 | ## Attribution 116 | 117 | This Code of Conduct is adapted from the [Contributor Covenant][homepage], 118 | version 2.1, available at 119 | [https://www.contributor-covenant.org/version/2/1/code_of_conduct.html][v2.1]. 120 | 121 | Community Impact Guidelines were inspired by 122 | [Mozilla's code of conduct enforcement ladder][Mozilla CoC]. 123 | 124 | For answers to common questions about this code of conduct, see the FAQ at 125 | [https://www.contributor-covenant.org/faq][FAQ]. Translations are available at 126 | [https://www.contributor-covenant.org/translations][translations]. 127 | 128 | [homepage]: https://www.contributor-covenant.org 129 | [v2.1]: https://www.contributor-covenant.org/version/2/1/code_of_conduct.html 130 | [Mozilla CoC]: https://github.com/mozilla/diversity 131 | [FAQ]: https://www.contributor-covenant.org/faq -------------------------------------------------------------------------------- /src/intel_neural_compressor/neural_compressor_conversion.py: -------------------------------------------------------------------------------- 1 | # Copyright (C) 2024 Intel Corporation 2 | # SPDX-License-Identifier: BSD-3-Clause 3 | """ 4 | INC QUANTIZATION model saving 5 | """ 6 | # pylint: disable=C0301 E0401 C0103 I1101 R0913 R1708 E0611 W0612 W0611 C0413 7 | # flake8: noqa = E501 8 | import os 9 | import argparse 10 | import itertools 11 | import random 12 | import six 13 | import cv2 14 | import numpy as np 15 | import tensorflow as tf 16 | from tqdm import tqdm 17 | from neural_compressor.experimental import Quantization, common 18 | from neural_compressor.experimental import Benchmark 19 | from tensorflow.python.tools.optimize_for_inference_lib import optimize_for_inference 20 | from tensorflow.python.framework import dtypes 21 | tf.compat.v1.disable_eager_execution() 22 | from utils import vgg_unet, image_segmentation_generator, \ 23 | get_image_array, get_segmentation_array, get_pairs_from_paths 24 | 25 | 26 | class Dataset: 27 | """Creating Dataset class for getting Image and labels""" 28 | def __init__(self, data_root_path): 29 | self.data_root_path = data_root_path 30 | self.train_images = os.path.join(data_root_path, "original_images") 31 | self.train_annotations = os.path.join(data_root_path, "label_images_semantic") 32 | traingen = image_segmentation_generator(self.train_images, self.train_annotations, batch_size=2, 33 | n_classes=n_classes, 
input_height=input_height, 34 | input_width=input_width, output_height=output_height, 35 | output_width=output_width) 36 | self.test_images, self.test_labels = next(traingen) 37 | 38 | def __getitem__(self, index): 39 | return self.test_images[index], self.test_labels[index] 40 | 41 | def __len__(self): 42 | return len(self.test_images) 43 | 44 | def eval_function(self, graph_model): 45 | """ evaluate function to get relative accuracy of FP32""" 46 | n_classes = 21 47 | input_height = 416 48 | input_width = 608 49 | output_height = 208 50 | output_width = 304 51 | print("Model Input height , Model Input width, Model Output Height, Model Output Width") 52 | print(input_height, input_width, output_height, output_width) 53 | 54 | # Definfing input & output nodes of model 55 | INPUTS, OUTPUTS = 'input_1', 'Identity' 56 | output_graph = optimize_for_inference(graph_model.as_graph_def(), [INPUTS], [OUTPUTS], 57 | dtypes.float32.as_datatype_enum, False) 58 | 59 | # Initializing session 60 | tf.import_graph_def(output_graph, name="") 61 | l_input = graph_model.get_tensor_by_name('input_1:0') # Input Tensor 62 | l_output = graph_model.get_tensor_by_name('Identity:0') # Output Tensor 63 | config = tf.compat.v1.ConfigProto() 64 | sess = tf.compat.v1.Session(graph=graph_model, config=config) 65 | 66 | # Getting image path 67 | inp_images_dir = os.path.join(self.data_root_path, "original_images") 68 | annotations_dir = os.path.join(self.data_root_path, "label_images_semantic") 69 | paths = get_pairs_from_paths(inp_images_dir, annotations_dir, mode="eval") 70 | paths = list(zip(*paths)) 71 | inp_images = list(paths[0]) 72 | annotations = list(paths[1]) 73 | 74 | tp = np.zeros(21) 75 | fp = np.zeros(21) 76 | fn = np.zeros(21) 77 | n_pixels = np.zeros(21) 78 | 79 | for inp, ann in tqdm(zip(inp_images, annotations)): 80 | x = get_image_array(inp, input_width, input_height, ordering="channels_last") 81 | x = np.expand_dims(x, axis=0) 82 | pr = sess.run(l_output, feed_dict={l_input: x}) 83 | pr = pr[0] 84 | pr = pr.reshape((output_height, output_width, n_classes)).argmax(axis=2) 85 | gt = get_segmentation_array(ann, n_classes, output_width, output_height, no_reshape=True) 86 | gt = gt.argmax(-1) 87 | pr = pr.flatten() 88 | gt = gt.flatten() 89 | 90 | for cl_i in range(n_classes): 91 | tp[cl_i] += np.sum((pr == cl_i) * (gt == cl_i)) 92 | fp[cl_i] += np.sum((pr == cl_i) * (gt != cl_i)) 93 | fn[cl_i] += np.sum((pr != cl_i) * (gt == cl_i)) 94 | n_pixels[cl_i] += np.sum(gt == cl_i) 95 | 96 | cl_wise_score = tp / (tp + fp + fn + 0.000000000001) 97 | n_pixels_norm = n_pixels / np.sum(n_pixels) 98 | frequency_weighted_iu = np.sum(cl_wise_score * n_pixels_norm) 99 | mean_iu = np.mean(cl_wise_score) 100 | return cl_wise_score[1] 101 | 102 | 103 | # Define the command line arguments to get FP32 modelpath & Saving INT8 model path 104 | if __name__ == "__main__": 105 | parser = argparse.ArgumentParser() 106 | parser.add_argument('-m', 107 | '--modelpath', 108 | type=str, 109 | required=True, 110 | help='Path to the model trained with TensorFlow and saved as a ".pb" file.') 111 | parser.add_argument('-o', 112 | '--outpath', 113 | type=str, 114 | required=True, 115 | help='Directory to save the INT8 quantized model to.') 116 | parser.add_argument('-c', 117 | '--config', 118 | type=str, 119 | required=False, 120 | default=f'{os.environ["SRC_DIR"]}/intel_neural_compressor/deploy.yaml', 121 | help='Yaml file for quantizing model, default is "$SRC_DIR/intel_neural_compressor/deploy.yaml"') 122 | parser.add_argument('-d', 123 | 
'--data_path', 124 | type=str, 125 | required=True, 126 | help='Absolute path to the dataset folder containing ' 127 | '"original_images" and "label_images_semantic" folders.') 128 | parser.add_argument('-b', 129 | '--batchsize', 130 | type=str, 131 | required=False, 132 | default=1, 133 | help='batchsize for the dataloader. Default is 1.') 134 | 135 | FLAGS = parser.parse_args() 136 | model_path = FLAGS.modelpath 137 | config_path = FLAGS.config 138 | out_path = FLAGS.outpath 139 | data_path = FLAGS.data_path 140 | batchsize = int(FLAGS.batchsize) 141 | 142 | n_classes = 21 143 | input_height = 416 144 | input_width = 608 145 | output_height = 208 146 | output_width = 304 147 | 148 | # Quantization 149 | quantizer = Quantization(config_path) 150 | quantizer.model = model_path 151 | dataset = Dataset(data_path) 152 | quantizer.calib_dataloader = common.DataLoader(dataset) 153 | quantizer.eval_func = dataset.eval_function 154 | q_model = quantizer.fit() 155 | q_model.save(out_path) 156 | -------------------------------------------------------------------------------- /src/run_inference.py: -------------------------------------------------------------------------------- 1 | # Copyright (C) 2024 Intel Corporation 2 | # SPDX-License-Identifier: BSD-3-Clause 3 | """ 4 | inference of FP32 Model and INT8 Model 5 | """ 6 | # pylint: disable=E0401 C0301 R0913 I1101 C0103 R1708 E1129 7 | # flake8: noqa = E501 8 | import os 9 | import time 10 | import argparse 11 | import itertools 12 | import random 13 | import cv2 14 | import numpy as np 15 | import tensorflow as tf 16 | import six 17 | tf.compat.v1.disable_eager_execution() 18 | 19 | 20 | def get_image_array(image_input, width, height, imgnorm="sub_mean", ordering='channels_first'): 21 | """ Load image array from input """ 22 | 23 | if isinstance(image_input, np.ndarray): 24 | # It is already an array, use it as it is 25 | img = image_input 26 | elif isinstance(image_input, six.string_types): 27 | if not os.path.isfile(image_input): 28 | raise Exception(f"get_image_array: path {image_input} doesn't exist") 29 | img = cv2.imread(image_input, 1) 30 | else: 31 | raise Exception(f"get_image_array: Can't process input type {str(type(image_input))}") 32 | 33 | if imgnorm == "sub_and_divide": 34 | img = np.float32(cv2.resize(img, (width, height))) / 127.5 - 1 35 | elif imgnorm == "sub_mean": 36 | img = cv2.resize(img, (width, height)) 37 | img = img.astype(np.float32) 38 | img[:, :, 0] -= 103.939 39 | img[:, :, 1] -= 116.779 40 | img[:, :, 2] -= 123.68 41 | img = img[:, :, ::-1] 42 | elif imgnorm == "divide": 43 | img = cv2.resize(img, (width, height)) 44 | img = img.astype(np.float32) 45 | img = img/255.0 46 | 47 | if ordering == 'channels_first': 48 | img = np.rollaxis(img, 2, 0) 49 | return img 50 | 51 | 52 | def get_segmentation_array(image_input, num_classes, width, height, no_reshape=False): 53 | """ Load segmentation array from input """ 54 | 55 | seg_labels = np.zeros((height, width, num_classes)) 56 | 57 | if isinstance(image_input, np.ndarray): 58 | # It is already an array, use it as it is 59 | img = image_input 60 | elif isinstance(image_input, six.string_types): 61 | if not os.path.isfile(image_input): 62 | raise Exception(f"get_segmentation_array: path {image_input} doesn't exist") 63 | img = cv2.imread(image_input, 1) 64 | else: 65 | raise Exception(f"get_segmentation_array: Can't process input type {str(type(image_input))}") 66 | 67 | img = cv2.resize(img, (width, height), interpolation=cv2.INTER_NEAREST) 68 | img = img[:, :, 0] 69 | 70 | 
for c in range(N_CLASSES): 71 | seg_labels[:, :, c] = (img == c).astype(int) 72 | 73 | if not no_reshape: 74 | seg_labels = np.reshape(seg_labels, (width*height, N_CLASSES)) 75 | 76 | return seg_labels 77 | 78 | 79 | def get_pairs_from_paths(images_path, segs_path, ignore_non_matching=False): 80 | """ Find all the images from the images_path directory and 81 | the segmentation images from the segs_path directory 82 | while checking integrity of data """ 83 | 84 | acceptable_image_formats = [".jpg", ".jpeg", ".png", ".bmp"] 85 | acceptable_segmentation_formats = [".png", ".bmp"] 86 | 87 | image_files = [] 88 | segmentation_files = {} 89 | 90 | for dir_entry in os.listdir(images_path): 91 | if os.path.isfile(os.path.join(images_path, dir_entry)) \ 92 | and os.path.splitext(dir_entry)[1] in acceptable_image_formats: 93 | file_name, file_extension = os.path.splitext(dir_entry) 94 | image_files.append((file_name, file_extension, os.path.join(images_path, dir_entry))) 95 | 96 | for dir_entry in os.listdir(segs_path): 97 | if os.path.isfile(os.path.join(segs_path, dir_entry)) \ 98 | and os.path.splitext(dir_entry)[1] in acceptable_segmentation_formats: 99 | file_name, file_extension = os.path.splitext(dir_entry) 100 | if file_name in segmentation_files: 101 | raise Exception(f"Segmentation file with filename {file_name} already exists " 102 | f"and is ambiguous to resolve with path {os.path.join(segs_path, dir_entry)}." 103 | f"Please remove or rename the latter.") 104 | segmentation_files[file_name] = (file_extension, os.path.join(segs_path, dir_entry)) 105 | 106 | return_value = [] 107 | # Match the images and segmentations 108 | for image_file, _, image_full_path in image_files: 109 | if image_file in segmentation_files: 110 | return_value.append((image_full_path, segmentation_files[image_file][1])) 111 | elif ignore_non_matching: 112 | continue 113 | else: 114 | # Error out 115 | raise Exception(f"No corresponding segmentation found for image {image_full_path}.") 116 | 117 | return return_value 118 | 119 | 120 | def image_segmentation_generator(images_path, segs_path, batch_size, 121 | num_classes, in_height, in_width, 122 | out_height, out_width): 123 | """ 124 | image_segmentation_generator 125 | """ 126 | img_seg_pairs = get_pairs_from_paths(images_path, segs_path) 127 | random.shuffle(img_seg_pairs) 128 | zipped = itertools.cycle(img_seg_pairs) 129 | 130 | while True: 131 | x = [] 132 | y = [] 133 | for _ in range(batch_size): 134 | im, seg = next(zipped) 135 | 136 | im = cv2.imread(im, 1) 137 | seg = cv2.imread(seg, 1) 138 | 139 | x.append(get_image_array(im, in_width, 140 | in_height, ordering='channels_last')) 141 | y.append(get_segmentation_array( 142 | seg, num_classes, out_width, out_height)) 143 | 144 | yield np.array(x), np.array(y) 145 | 146 | 147 | class Dataset: 148 | """Creating Dataset class for getting Image and labels""" 149 | def __init__(self, data_root_path, batch_size): 150 | self.train_images = os.path.join(data_root_path, "original_images") 151 | self.train_annotations = os.path.join(data_root_path, "label_images_semantic") 152 | traingen = image_segmentation_generator(self.train_images, self.train_annotations, batch_size=batch_size, 153 | num_classes=N_CLASSES, in_height=INPUT_HEIGHT, 154 | in_width=INPUT_WIDTH, out_height=OUTPUT_HEIGHT, 155 | out_width=OUTPUT_WIDTH) 156 | self.test_images, self.test_labels = next(traingen) 157 | 158 | def __getitem__(self, index): 159 | return self.test_images, self.test_labels 160 | 161 | def __len__(self): 162 | return 
len(self.test_images) 163 | 164 | 165 | # Define the command line arguments for the model path, dataset path and batch size 166 | if __name__ == "__main__": 167 | parser = argparse.ArgumentParser() 168 | parser.add_argument('-m', 169 | '--modelpath', 170 | type=str, 171 | required=True, 172 | help='Provide frozen Model path ".pb" file. Users can also use INC INT8 quantized model here.') 173 | parser.add_argument('-d', 174 | '--data_path', 175 | type=str, 176 | required=True, 177 | help='Absolute path to the dataset folder containing ' 178 | '"original_images" and "label_images_semantic" folders.') 179 | parser.add_argument('-b', 180 | '--batchsize', 181 | type=str, 182 | required=False, 183 | default=1, 184 | help='Batch size used for inference. Default is 1.') 185 | 186 | 187 | FLAGS = parser.parse_args() 188 | model_path = FLAGS.modelpath 189 | data_path = FLAGS.data_path 190 | batchsize = int(FLAGS.batchsize) 191 | 192 | N_CLASSES = 21 193 | INPUT_HEIGHT = 416 194 | INPUT_WIDTH = 608 195 | OUTPUT_HEIGHT = 208 196 | OUTPUT_WIDTH = 304 197 | 198 | dataset = Dataset(data_path, batch_size=batchsize) 199 | 200 | # Load frozen graph using TensorFlow 1.x functions 201 | with tf.Graph().as_default() as graph: 202 | with tf.compat.v1.Session() as sess: 203 | # Load the graph in graph_def 204 | print("load graph") 205 | with tf.io.gfile.GFile(model_path, "rb") as f: 206 | graph_def = tf.compat.v1.GraphDef() 207 | loaded = graph_def.ParseFromString(f.read()) 208 | sess.graph.as_default() 209 | tf.import_graph_def(graph_def, input_map=None, 210 | return_elements=None, 211 | name="", 212 | op_dict=None, 213 | producer_op_list=None) 214 | l_input = graph.get_tensor_by_name('input_1:0') # Input Tensor 215 | l_output = graph.get_tensor_by_name('Identity:0') # Output Tensor 216 | # initialize_all_variables 217 | tf.compat.v1.global_variables_initializer() 218 | 219 | # Model warm-up runs so the graph reaches a steady state before timing 220 | print("Model warmup initiated ") 221 | for i in range(5): 222 | (images, labels) = next(iter(dataset)) 223 | Session_out = sess.run(l_output, feed_dict={l_input: images}) 224 | print("Model warm up completed for this inference run") 225 | 226 | # Run the segmentation model on batches from the dataset and time each inference 227 | TOTAL_INFER_TIME = 0 228 | for i in range(10): 229 | (images, labels) = next(iter(dataset)) 230 | # images = np.expand_dims(images, axis=0) 231 | start_time = time.time() 232 | Session_out = sess.run(l_output, feed_dict={l_input: images}) 233 | end_time = time.time()-start_time 234 | TOTAL_INFER_TIME += end_time 235 | print("Time Taken for model inference in seconds ---> ", end_time) 236 | print("Average Time Taken for model inference in seconds ---> ", TOTAL_INFER_TIME/10) 237 | 238 | -------------------------------------------------------------------------------- /src/intel_neural_compressor/utils.py: -------------------------------------------------------------------------------- 1 | # Copyright (C) 2024 Intel Corporation 2 | # SPDX-License-Identifier: BSD-3-Clause 3 | """ 4 | Support file for training and evaluation 5 | """ 6 | # pylint: disable=E0401 C0301 C0103 E1101 E0611 R0913 R0914 R1708 R0912 R0915 E1129 W0612 W0611 7 | # flake8: noqa = E501 8 | import os 9 | from random import SystemRandom 10 | import itertools 11 | import time 12 | import glob 13 | import json 14 | from types import MethodType 15 | from tqdm import tqdm 16 | import six 17 | import cv2 18 | import numpy as np 19 | import tensorflow as tf 20 | from tensorflow import keras 21 | from
tensorflow.keras.models import Model 22 | from tensorflow.keras.layers import Input, Activation, Permute, Conv2D, \ 23 | BatchNormalization, ZeroPadding2D, concatenate, UpSampling2D, MaxPooling2D, Reshape 24 | from tensorflow.python.tools.optimize_for_inference_lib import optimize_for_inference 25 | from tensorflow.python.framework import dtypes 26 | 27 | IMAGE_ORDERING_CHANNELS_FIRST = "channels_first" 28 | IMAGE_ORDERING_CHANNELS_LAST = "channels_last" 29 | # Default IMAGE_ORDERING = channels_last 30 | IMAGE_ORDERING = IMAGE_ORDERING_CHANNELS_LAST 31 | 32 | if IMAGE_ORDERING == 'channels_first': 33 | MERGE_AXIS = 1 34 | elif IMAGE_ORDERING == 'channels_last': 35 | MERGE_AXIS = -1 36 | 37 | if IMAGE_ORDERING == 'channels_first': 38 | pretrained_url = "https://github.com/fchollet/" \ 39 | "deep-learning-models/releases/download/v0.1/" \ 40 | "vgg16_weights_th_dim_ordering_th_kernels_notop.h5" 41 | elif IMAGE_ORDERING == 'channels_last': 42 | pretrained_url = "https://github.com/fchollet/" \ 43 | "deep-learning-models/releases/download/v0.1/" \ 44 | "vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5" 45 | 46 | cryptogen = SystemRandom() 47 | class_colors = [np.array((cryptogen.randint(0, 255), cryptogen.randint( 48 | 0, 255), cryptogen.randint(0, 255))) for _ in range(5000)] 49 | 50 | 51 | def get_colored_segmentation_image(seg_arr, n_classes, colors=None): 52 | """ get_colored_segmentation_image """ 53 | if colors is None: 54 | colors = class_colors 55 | output_height = seg_arr.shape[0] 56 | output_width = seg_arr.shape[1] 57 | 58 | seg_img = np.zeros((output_height, output_width, 3)) 59 | 60 | for c in range(n_classes): 61 | mask = (seg_arr == c) 62 | seg_img[mask] += colors[c].astype('uint8') 63 | 64 | return seg_img 65 | 66 | 67 | def visualize_segmentation(seg_arr, inp_img=None, n_classes=None, colors=None, 68 | overlay_img=False, prediction_width=None, prediction_height=None): 69 | """ visualize_segmentation """ 70 | if colors is None: 71 | colors = class_colors 72 | if n_classes is None: 73 | n_classes = np.max(seg_arr) 74 | 75 | seg_img = get_colored_segmentation_image(seg_arr, n_classes, colors=colors) 76 | 77 | if inp_img is not None: 78 | orininal_h = inp_img.shape[0] 79 | orininal_w = inp_img.shape[1] 80 | seg_img = cv2.resize(seg_img, (orininal_w, orininal_h)) 81 | 82 | if prediction_height is not None and prediction_width is not None: 83 | seg_img = cv2.resize(seg_img, (prediction_width, prediction_height)) 84 | if inp_img is not None: 85 | inp_img = cv2.resize(inp_img, (prediction_width, prediction_height)) 86 | 87 | if overlay_img: 88 | if inp_img is not None: 89 | # seg_img = overlay_seg_image(inp_img, seg_img) 90 | pass 91 | 92 | return seg_img 93 | 94 | 95 | def get_image_array(image_input, width, height, img_norm="sub_mean", 96 | ordering='channels_first'): 97 | """ Load image array from input """ 98 | 99 | if isinstance(image_input, np.ndarray): 100 | # It is already an array, use it as it is 101 | img = image_input 102 | elif isinstance(image_input, six.string_types): 103 | if not os.path.isfile(image_input): 104 | raise Exception(f"get_image_array: path {image_input} doesn't exist") 105 | img = cv2.imread(image_input, 1) 106 | else: 107 | raise Exception(f"get_image_array: Can't process input type {str(type(image_input))}") 108 | 109 | if img_norm == "sub_and_divide": 110 | img = np.float32(cv2.resize(img, (width, height))) / 127.5 - 1 111 | elif img_norm == "sub_mean": 112 | img = cv2.resize(img, (width, height)) 113 | img = img.astype(np.float32) 114 | img[:, :, 0] 
-= 103.939 115 | img[:, :, 1] -= 116.779 116 | img[:, :, 2] -= 123.68 117 | img = img[:, :, ::-1] 118 | elif img_norm == "divide": 119 | img = cv2.resize(img, (width, height)) 120 | img = img.astype(np.float32) 121 | img = img / 255.0 122 | 123 | if ordering == 'channels_first': 124 | img = np.rollaxis(img, 2, 0) 125 | return img 126 | 127 | 128 | def get_segmentation_array(image_input, nclasses, width, height, no_reshape=False): 129 | """ Load segmentation array from input """ 130 | 131 | seg_labels = np.zeros((height, width, nclasses)) 132 | 133 | if isinstance(image_input, np.ndarray): 134 | # It is already an array, use it as it is 135 | img = image_input 136 | elif isinstance(image_input, six.string_types): 137 | if not os.path.isfile(image_input): 138 | raise Exception(f"get_segmentation_array: path {image_input} doesn't exist") 139 | img = cv2.imread(image_input, 1) 140 | else: 141 | raise Exception(f"get_segmentation_array: Can't process input type {str(type(image_input))}") 142 | 143 | img = cv2.resize(img, (width, height), interpolation=cv2.INTER_NEAREST) 144 | img = img.mean(axis=-1) 145 | 146 | for c in range(nclasses): 147 | seg_labels[:, :, c] = (img == c).astype(int) 148 | 149 | if not no_reshape: 150 | seg_labels = np.reshape(seg_labels, (width * height, nclasses)) 151 | 152 | return seg_labels 153 | 154 | 155 | def image_segmentation_generator(images_path, segs_path, batch_size, 156 | n_classes, input_height, input_width, 157 | output_height, output_width, 158 | do_augment=False): 159 | """ image_segmentation_generator """ 160 | img_seg_pairs = get_pairs_from_paths(images_path, segs_path) 161 | cryptogen.shuffle(img_seg_pairs) 162 | zipped = itertools.cycle(img_seg_pairs) 163 | 164 | while True: 165 | x = [] 166 | y = [] 167 | for _ in range(batch_size): 168 | im, seg = next(zipped) 169 | 170 | im = cv2.imread(im, 1) 171 | seg = cv2.imread(seg, 1) 172 | 173 | if do_augment: 174 | # im, seg[:, :, 0] = augment_seg(im, seg[:, :, 0], augmentation_name=augmentation_name) 175 | pass 176 | 177 | x.append(get_image_array(im, input_width, 178 | input_height, ordering=IMAGE_ORDERING)) 179 | y.append(get_segmentation_array( 180 | seg, n_classes, output_width, output_height)) 181 | 182 | yield np.array(x), np.array(y) 183 | 184 | 185 | def get_pairs_from_paths(images_path, segs_path, mode="train", ignore_non_matching=False): 186 | """ Find all the images from the images_path directory and 187 | the segmentation images from the segs_path directory 188 | while checking integrity of data """ 189 | 190 | acceptable_img_formats = [".jpg", ".jpeg", ".png", ".bmp"] 191 | acceptable_segmentation_formats = [".png", ".bmp"] 192 | 193 | image_files = [] 194 | segmentation_files = {} 195 | if mode == "train": 196 | for dir_entry in os.listdir(images_path)[:int(len(os.listdir(images_path)) * 0.80)]: 197 | if os.path.isfile(os.path.join(images_path, dir_entry)) and os.path.splitext(dir_entry)[1] \ 198 | in acceptable_img_formats: 199 | file_name, file_extension = os.path.splitext(dir_entry) 200 | image_files.append((file_name, file_extension, os.path.join(images_path, dir_entry))) 201 | 202 | file_extension = acceptable_segmentation_formats[0] 203 | if file_name in segmentation_files: 204 | raise Exception( 205 | f"Segmentation file with filename {file_name} " 206 | f"already exists and is ambiguous " 207 | f"to resolve with path {os.path.join(segs_path, dir_entry)}" 208 | f". 
Please remove or rename the latter.") 209 | segmentation_files[file_name] = (file_extension, 210 | os.path.join(segs_path, dir_entry.split(".")[0] + 211 | acceptable_segmentation_formats[0])) 212 | print("80% of Data is considered for Training ===> ", int(len(os.listdir(segs_path)) * 0.80)) 213 | else: 214 | for dir_entry in os.listdir(images_path)[-int(len(os.listdir(images_path)) * 0.2):]: 215 | if os.path.isfile(os.path.join(images_path, dir_entry)) and os.path.splitext(dir_entry)[1] \ 216 | in acceptable_img_formats: 217 | file_name, file_extension = os.path.splitext(dir_entry) 218 | image_files.append((file_name, file_extension, os.path.join(images_path, dir_entry))) 219 | 220 | file_extension = acceptable_segmentation_formats[0] 221 | if file_name in segmentation_files: 222 | raise Exception( 223 | f"Segmentation file with filename {file_name} " 224 | f"already exists and is ambiguous " 225 | f"to resolve with path {os.path.join(segs_path, dir_entry)}" 226 | f". Please remove or rename the latter.") 227 | segmentation_files[file_name] = (file_extension, 228 | os.path.join(segs_path, dir_entry.split(".")[0] + 229 | acceptable_segmentation_formats[0])) 230 | print("20% of Data is considered for Evaluating===> ", int(len(os.listdir(segs_path)) * 0.2)) 231 | return_value = [] 232 | # Match the images and segmentations 233 | for image_file, _, image_full_path in image_files: 234 | if image_file in segmentation_files: 235 | return_value.append((image_full_path, segmentation_files[image_file][1])) 236 | elif ignore_non_matching: 237 | continue 238 | else: 239 | # Error out 240 | raise Exception(f"No corresponding segmentation found for image {image_full_path}.") 241 | 242 | return return_value 243 | 244 | 245 | def verify_segmentation_dataset(images_path, segs_path, n_classes, show_all_errors=False): 246 | """ verify_segmentation_dataset """ 247 | try: 248 | img_seg_pairs = get_pairs_from_paths(images_path, segs_path) 249 | if len(img_seg_pairs) <= 0: 250 | print(f"Couldn't load any data from images_path: {images_path, segs_path} and segmentations path: {1}") 251 | return False 252 | 253 | return_value = True 254 | for im_fn, seg_fn in tqdm(img_seg_pairs): 255 | img = cv2.imread(im_fn) 256 | seg = cv2.imread(seg_fn) 257 | # Check dimensions match 258 | if not img.shape == seg.shape: 259 | return_value = False 260 | print( 261 | f"The size of image {im_fn} and its segmentation {seg_fn} doesn't " 262 | f"match (possibly the files are corrupt).") 263 | if not show_all_errors: 264 | break 265 | else: 266 | max_pixel_value = np.max(seg[:, :, 0]) 267 | if max_pixel_value >= n_classes: 268 | return_value = False 269 | print( 270 | f"The pixel values of the segmentation image {seg_fn} violating " 271 | f"range [0, {str(n_classes - 1)}]. Found maximum pixel value {max_pixel_value}") 272 | if not show_all_errors: 273 | break 274 | if return_value: 275 | print("Dataset verified! 
") 276 | else: 277 | print("Dataset not verified!") 278 | return return_value 279 | except RuntimeError: 280 | print("Found error during data loading") 281 | return False 282 | 283 | 284 | def evaluate(model=None, inp_images=None, annotations=None, inp_images_dir=None, annotations_dir=None, 285 | checkpoints_path=None): 286 | """ evaluate """ 287 | n_classes = model.n_classes 288 | input_height = model.input_height 289 | input_width = model.input_width 290 | output_height = model.output_height 291 | output_width = model.output_width 292 | print("Model Input height , Model Input width, Model Output Height, Model Output Width") 293 | print(input_height, input_width, output_height, output_width) 294 | 295 | if checkpoints_path is not None: 296 | with open(checkpoints_path + "_config.json", "w", encoding="utf8") as f: 297 | json.dump({ 298 | "model_class": model.model_name, 299 | "n_classes": n_classes, 300 | "input_height": input_height, 301 | "input_width": input_width, 302 | "output_height": output_height, 303 | "output_width": output_width 304 | }, f) 305 | 306 | if checkpoints_path is not None: 307 | latest_checkpoint = checkpoints_path # find_latest_checkpoint(checkpoints_path) 308 | if latest_checkpoint is not None: 309 | print("Loading the weights from latest checkpoint ", 310 | latest_checkpoint) 311 | model.load_weights(latest_checkpoint) 312 | 313 | if inp_images is None: 314 | paths = get_pairs_from_paths(inp_images_dir, annotations_dir, mode="eval") 315 | paths = list(zip(*paths)) 316 | inp_images = list(paths[0]) 317 | annotations = list(paths[1]) 318 | 319 | tp = np.zeros(model.n_classes) 320 | fp = np.zeros(model.n_classes) 321 | fn = np.zeros(model.n_classes) 322 | n_pixels = np.zeros(model.n_classes) 323 | 324 | for inp, ann in tqdm(zip(inp_images, annotations)): 325 | pr = predict(model, inp) 326 | gt = get_segmentation_array(ann, model.n_classes, model.output_width, model.output_height, no_reshape=True) 327 | gt = gt.argmax(-1) 328 | pr = pr.flatten() 329 | gt = gt.flatten() 330 | 331 | for cl_i in range(model.n_classes): 332 | tp[cl_i] += np.sum((pr == cl_i) * (gt == cl_i)) 333 | fp[cl_i] += np.sum((pr == cl_i) * (gt != cl_i)) 334 | fn[cl_i] += np.sum((pr != cl_i) * (gt == cl_i)) 335 | n_pixels[cl_i] += np.sum(gt == cl_i) 336 | 337 | cl_wise_score = tp / (tp + fp + fn) 338 | cl_wise_score = np.nan_to_num(cl_wise_score, nan=0.0) 339 | 340 | n_pixels_norm = n_pixels / np.sum(n_pixels) 341 | frequency_weighted_iu = np.sum(cl_wise_score * n_pixels_norm) 342 | mean_iu = np.mean(cl_wise_score) 343 | return {"frequency_weighted_IU": frequency_weighted_iu, "mean_IU": mean_iu, "class_wise_IU": cl_wise_score} 344 | 345 | 346 | def frozen(model=None, inp_images=None, annotations=None, inp_images_dir=None, annotations_dir=None,checkpoints_path=None): 347 | """ evaluate """ 348 | n_classes = model.n_classes 349 | input_height = model.input_height 350 | input_width = model.input_width 351 | output_height = model.output_height 352 | output_width = model.output_width 353 | print("Model Input height , Model Input width, Model Output Height, Model Output Width") 354 | print(input_height, input_width, output_height, output_width) 355 | 356 | # Load frozen graph using TensorFlow 1.x functions 357 | with tf.Graph().as_default() as graph: 358 | with tf.compat.v1.Session() as sess: 359 | # Load the graph in graph_def 360 | print("load graph") 361 | with tf.io.gfile.GFile(checkpoints_path, "rb") as f: 362 | graph_def = tf.compat.v1.GraphDef() 363 | loaded = graph_def.ParseFromString(f.read()) 364 
| sess.graph.as_default() 365 | tf.import_graph_def(graph_def, input_map=None, 366 | return_elements=None, 367 | name="", 368 | op_dict=None, 369 | producer_op_list=None) 370 | l_input = graph.get_tensor_by_name('input_1:0') # Input Tensor 371 | l_output = graph.get_tensor_by_name('Identity:0') # Output Tensor 372 | # initialize_all_variables 373 | tf.compat.v1.global_variables_initializer() 374 | 375 | if inp_images is None: 376 | paths = get_pairs_from_paths(inp_images_dir, annotations_dir, mode="eval") 377 | paths = list(zip(*paths)) 378 | inp_images = list(paths[0]) 379 | annotations = list(paths[1]) 380 | 381 | tp = np.zeros(model.n_classes) 382 | fp = np.zeros(model.n_classes) 383 | fn = np.zeros(model.n_classes) 384 | n_pixels = np.zeros(model.n_classes) 385 | 386 | for inp, ann in tqdm(zip(inp_images, annotations)): 387 | x = get_image_array(inp, input_width, input_height, ordering=IMAGE_ORDERING) 388 | x = np.expand_dims(x, axis=0) 389 | pr = sess.run(l_output, feed_dict={l_input: x}) 390 | pr = pr[0] 391 | pr = pr.reshape((output_height, output_width, n_classes)).argmax(axis=2) 392 | gt = get_segmentation_array(ann, model.n_classes, model.output_width, model.output_height, no_reshape=True) 393 | gt = gt.argmax(-1) 394 | pr = pr.flatten() 395 | gt = gt.flatten() 396 | 397 | for cl_i in range(model.n_classes): 398 | tp[cl_i] += np.sum((pr == cl_i) * (gt == cl_i)) 399 | fp[cl_i] += np.sum((pr == cl_i) * (gt != cl_i)) 400 | fn[cl_i] += np.sum((pr != cl_i) * (gt == cl_i)) 401 | n_pixels[cl_i] += np.sum(gt == cl_i) 402 | 403 | cl_wise_score = tp / (tp + fp + fn + 0.000000000001) 404 | n_pixels_norm = n_pixels / np.sum(n_pixels) 405 | frequency_weighted_iu = np.sum(cl_wise_score * n_pixels_norm) 406 | mean_iu = np.mean(cl_wise_score) 407 | return {"frequency_weighted_IU": frequency_weighted_iu, "mean_IU": mean_iu, "class_wise_IU": cl_wise_score} 408 | 409 | 410 | 411 | def predict_multiple(model=None, inps=None, inp_dir=None, out_dir=None, 412 | checkpoints_path=None, overlay_img=False, 413 | colors=None, prediction_width=None, 414 | prediction_height=None): 415 | """ predict_multiple """ 416 | if colors is None: 417 | colors = class_colors 418 | if model is None and (checkpoints_path is not None): 419 | # model = model_from_checkpoint_path(checkpoints_path) 420 | pass 421 | 422 | if inps is None and (inp_dir is not None): 423 | inps = glob.glob(os.path.join(inp_dir, "*.jpg")) + glob.glob( 424 | os.path.join(inp_dir, "*.png")) + \ 425 | glob.glob(os.path.join(inp_dir, "*.jpeg")) 426 | 427 | all_prs = [] 428 | 429 | for i, inp in enumerate(tqdm(inps)): 430 | if out_dir is None: 431 | out_fname = None 432 | else: 433 | if isinstance(inp, six.string_types): 434 | out_fname = os.path.join(out_dir, os.path.basename(inp)) 435 | else: 436 | out_fname = os.path.join(out_dir, str(i) + ".jpg") 437 | 438 | pr = predict(model, inp, out_fname, 439 | overlay_img=overlay_img, 440 | colors=colors, prediction_width=prediction_width, 441 | prediction_height=prediction_height) 442 | 443 | all_prs.append(pr) 444 | 445 | return all_prs 446 | 447 | 448 | def predict(model=None, inp=None, out_fname=None, checkpoints_path=None, overlay_img=False, 449 | colors=None, prediction_width=None, prediction_height=None): 450 | """ predict """ 451 | if colors is None: 452 | colors = class_colors 453 | if model is None and (checkpoints_path is not None): 454 | # model = model_from_checkpoint_path(checkpoints_path) 455 | pass 456 | 457 | if isinstance(inp, six.string_types): 458 | inp = cv2.imread(inp) 459 | 460 | 
output_width = model.output_width 461 | output_height = model.output_height 462 | input_width = model.input_width 463 | input_height = model.input_height 464 | n_classes = model.n_classes 465 | 466 | x = get_image_array(inp, input_width, input_height, ordering=IMAGE_ORDERING) 467 | pr = model.predict(np.array([x]))[0] 468 | pr = pr.reshape((output_height, output_width, n_classes)).argmax(axis=2) 469 | 470 | if out_fname is not None: 471 | seg_img = visualize_segmentation(pr, inp, n_classes=n_classes, colors=colors, 472 | overlay_img=overlay_img, prediction_width=prediction_width, 473 | prediction_height=prediction_height) 474 | cv2.imwrite(out_fname, seg_img) 475 | 476 | return pr 477 | 478 | 479 | def train_hyperparameters_tuning(model, train_images, train_annotations, batch_size=4, 480 | steps_per_epoch=128, do_augment=False, epochs=3, load_weights=None): 481 | """ 482 | Hyperparameter Tuning 483 | """ 484 | 485 | # hyper-parameterss considered for tuning DL arch 486 | options = { 487 | "lr": [0.001, 0.01, 0.0001], 488 | "optimizer": ["Adam", "adadelta", "rmsprop"], 489 | "loss": ["categorical_crossentropy"]} 490 | 491 | # Replicating GridsearchCV functionality for params generation 492 | keys = options.keys() 493 | values = (options[key] for key in keys) 494 | p_combinations = [] 495 | for combination in itertools.product(*values): 496 | if len(combination) > 0: 497 | p_combinations.append(combination) 498 | 499 | steps_per_epoch = int(steps_per_epoch / batch_size) 500 | 501 | n_classes = model.n_classes 502 | input_height = model.input_height 503 | input_width = model.input_width 504 | output_height = model.output_height 505 | output_width = model.output_width 506 | print("Model Input height , Model Input width, Model Output Height, Model Output Width") 507 | print(input_height, input_width, output_height, output_width) 508 | print("Batch Size used for Training --> ", batch_size) 509 | 510 | train_gen = image_segmentation_generator( 511 | train_images, train_annotations, batch_size, n_classes, 512 | input_height, input_width, output_height, output_width, do_augment=do_augment) 513 | 514 | print("Total number of fits = ", len(p_combinations)) 515 | print("Take Break!!!\nThis will take time!") 516 | ctr = 0 517 | total_time = 0 518 | best_config = {"accuracy":0, "best_fit":0} 519 | for combination in p_combinations: 520 | if load_weights is not None and len(load_weights) > 0: 521 | print("Loading weights from ", load_weights) 522 | model.load_weights(load_weights) 523 | 524 | if len(combination) > 0: 525 | ctr += 1 526 | print("Current fit is at ", ctr) 527 | learning_r, optimizer, loss = combination 528 | print("Current fit parameters --> epochs=", epochs, " learning rate=", learning_r, 529 | " optimizer=", optimizer, " loss=", loss) 530 | if optimizer == "Adam": 531 | optimizer = keras.optimizers.Adam(learning_rate=learning_r) 532 | elif optimizer == "adadelta": 533 | optimizer = keras.optimizers.Adadelta(learning_rate=learning_r) 534 | else: 535 | optimizer = keras.optimizers.RMSprop(learning_rate=learning_r) 536 | 537 | model.compile(loss=loss, 538 | optimizer=optimizer, 539 | metrics=['accuracy', 'mae', keras.metrics.MeanIoU(num_classes=21)]) 540 | 541 | start_time = time.time() 542 | hist=model.fit_generator(train_gen, steps_per_epoch, epochs=epochs, workers=1, use_multiprocessing=False) 543 | total_time += time.time()-start_time 544 | print("Fit number: ", ctr, " ==> Time Taken for Training in seconds --> ", time.time()-start_time) 545 | if best_config["accuracy"] < 
hist.history["accuracy"][0]: 546 | best_config["accuracy"]=hist.history["accuracy"][0] 547 | best_config["best_fit"]=combination 548 | print("The best Tuningparameter combination is :",best_config) 549 | return total_time 550 | 551 | 552 | def train(model, train_images, train_annotations, verify_dataset=False, 553 | checkpoints_path=None, epochs=5, batch_size=4, validate=False, 554 | val_images=None, val_annotations=None, val_batch_size=4, 555 | auto_resume_checkpoint=False, load_weights=None, steps_per_epoch=512, val_steps_per_epoch=512, 556 | gen_use_multiprocessing=False, ignore_zero_class=False, optimizer_name='Adam', do_augment=False 557 | ): 558 | """ train """ 559 | steps_per_epoch = int(steps_per_epoch / batch_size) 560 | 561 | n_classes = model.n_classes 562 | input_height = model.input_height 563 | input_width = model.input_width 564 | output_height = model.output_height 565 | output_width = model.output_width 566 | print("Model Input height , Model Input width, Model Output Height, Model Output Width") 567 | print(input_height, input_width, output_height, output_width) 568 | print("Batch Size used for Training --> ", batch_size) 569 | print("Batch Size used for Validation --> ", val_batch_size) 570 | 571 | if optimizer_name is not None: 572 | 573 | if ignore_zero_class: 574 | loss_k = 'masked_categorical_crossentropy' 575 | else: 576 | loss_k = 'categorical_crossentropy' 577 | if optimizer_name == "Adam": 578 | optimizer_name = keras.optimizers.Adam(learning_rate=0.001) 579 | model.compile(loss=loss_k, 580 | optimizer=optimizer_name, 581 | metrics=['accuracy']) 582 | 583 | if checkpoints_path is not None: 584 | with open("train_config.json", "w", encoding="utf8") as f: 585 | json.dump({ 586 | "model_class": model.model_name, 587 | "n_classes": n_classes, 588 | "input_height": input_height, 589 | "input_width": input_width, 590 | "output_height": output_height, 591 | "output_width": output_width 592 | }, f) 593 | 594 | if load_weights is not None and len(load_weights) > 0: 595 | print("Loading weights from ", load_weights) 596 | model.load_weights(load_weights) 597 | 598 | if auto_resume_checkpoint and (checkpoints_path is not None): 599 | latest_checkpoint = checkpoints_path # find_latest_checkpoint(checkpoints_path) 600 | if latest_checkpoint is not None: 601 | print("Loading the weights from latest checkpoint ", 602 | latest_checkpoint) 603 | model.load_weights(latest_checkpoint) 604 | 605 | if verify_dataset: 606 | print("Verifying training dataset") 607 | verify_segmentation_dataset(train_images, train_annotations, n_classes) 608 | if validate: 609 | print("Verifying validation dataset") 610 | verify_segmentation_dataset(val_images, val_annotations, n_classes) 611 | 612 | train_gen = image_segmentation_generator( 613 | train_images, train_annotations, batch_size, n_classes, 614 | input_height, input_width, output_height, output_width, do_augment=do_augment) 615 | val_gen = None 616 | if validate: 617 | val_gen = image_segmentation_generator( 618 | val_images, val_annotations, val_batch_size, 619 | n_classes, input_height, input_width, output_height, output_width) 620 | if checkpoints_path is None: 621 | checkpoints_path = "vgg-unet" 622 | start_time = time.time() 623 | if not validate: 624 | for ep in range(epochs): 625 | print("Starting Epoch ", ep) 626 | model.fit_generator(train_gen, steps_per_epoch, epochs=1, use_multiprocessing=False) 627 | if checkpoints_path is not None: 628 | model.save_weights(checkpoints_path) 629 | print("saved ", checkpoints_path) 630 | 
print("Finished Epoch", ep) 631 | else: 632 | for ep in range(epochs): 633 | print("Starting Epoch ", ep) 634 | model.fit_generator(train_gen, steps_per_epoch, 635 | validation_data=val_gen, 636 | validation_steps=val_steps_per_epoch, epochs=1, 637 | use_multiprocessing=gen_use_multiprocessing) 638 | if checkpoints_path is not None: 639 | model.save_weights(checkpoints_path) 640 | print("saved ", checkpoints_path) 641 | print("Finished Epoch", ep) 642 | print("Time Taken for Training in seconds --> ", time.time() - start_time) 643 | 644 | 645 | def get_segmentation_model(input_data, output): 646 | """ get_segmentation_model """ 647 | output_width, output_height, n_classes, input_height, input_width = None, None, None, None, None 648 | img_input = input_data 649 | o = output 650 | 651 | o_shape = Model(img_input, o).output_shape 652 | i_shape = Model(img_input, o).input_shape 653 | 654 | if IMAGE_ORDERING == 'channels_first': 655 | output_height = o_shape[2] 656 | output_width = o_shape[3] 657 | input_height = i_shape[2] 658 | input_width = i_shape[3] 659 | n_classes = o_shape[1] 660 | o = (Reshape((-1, output_height * output_width)))(o) 661 | o = (Permute((2, 1)))(o) 662 | elif IMAGE_ORDERING == 'channels_last': 663 | output_height = o_shape[1] 664 | output_width = o_shape[2] 665 | input_height = i_shape[1] 666 | input_width = i_shape[2] 667 | n_classes = o_shape[3] 668 | o = (Reshape((output_height * output_width, -1)))(o) 669 | 670 | o = (Activation('softmax'))(o) 671 | model = Model(img_input, o) 672 | model.output_width = output_width 673 | model.output_height = output_height 674 | model.n_classes = n_classes 675 | model.input_height = input_height 676 | model.input_width = input_width 677 | model.model_name = "" 678 | 679 | model.train = MethodType(train, model) 680 | model.predict_segmentation = MethodType(predict, model) 681 | model.predict_multiple = MethodType(predict_multiple, model) 682 | model.evaluate_segmentation = MethodType(evaluate, model) 683 | 684 | return model 685 | 686 | 687 | def get_vgg_encoder(input_height=224, input_width=224, pretrained='imagenet'): 688 | """ get_vgg_encoder """ 689 | img_input = None 690 | if IMAGE_ORDERING == 'channels_first': 691 | img_input = Input(shape=(3, input_height, input_width)) 692 | elif IMAGE_ORDERING == 'channels_last': 693 | img_input = Input(shape=(input_height, input_width, 3)) 694 | 695 | x = Conv2D(64, (3, 3), activation='relu', padding='same', 696 | name='block1_conv1', data_format=IMAGE_ORDERING)(img_input) 697 | x = Conv2D(64, (3, 3), activation='relu', padding='same', 698 | name='block1_conv2', data_format=IMAGE_ORDERING)(x) 699 | x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool', 700 | data_format=IMAGE_ORDERING)(x) 701 | f1 = x 702 | # Block 2 703 | x = Conv2D(128, (3, 3), activation='relu', padding='same', 704 | name='block2_conv1', data_format=IMAGE_ORDERING)(x) 705 | x = Conv2D(128, (3, 3), activation='relu', padding='same', 706 | name='block2_conv2', data_format=IMAGE_ORDERING)(x) 707 | x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool', 708 | data_format=IMAGE_ORDERING)(x) 709 | f2 = x 710 | 711 | # Block 3 712 | x = Conv2D(256, (3, 3), activation='relu', padding='same', 713 | name='block3_conv1', data_format=IMAGE_ORDERING)(x) 714 | x = Conv2D(256, (3, 3), activation='relu', padding='same', 715 | name='block3_conv2', data_format=IMAGE_ORDERING)(x) 716 | x = Conv2D(256, (3, 3), activation='relu', padding='same', 717 | name='block3_conv3', data_format=IMAGE_ORDERING)(x) 718 | x = MaxPooling2D((2, 
2), strides=(2, 2), name='block3_pool', 719 | data_format=IMAGE_ORDERING)(x) 720 | f3 = x 721 | 722 | # Block 4 723 | x = Conv2D(512, (3, 3), activation='relu', padding='same', 724 | name='block4_conv1', data_format=IMAGE_ORDERING)(x) 725 | x = Conv2D(512, (3, 3), activation='relu', padding='same', 726 | name='block4_conv2', data_format=IMAGE_ORDERING)(x) 727 | x = Conv2D(512, (3, 3), activation='relu', padding='same', 728 | name='block4_conv3', data_format=IMAGE_ORDERING)(x) 729 | x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool', 730 | data_format=IMAGE_ORDERING)(x) 731 | f4 = x 732 | 733 | # Block 5 734 | x = Conv2D(512, (3, 3), activation='relu', padding='same', 735 | name='block5_conv1', data_format=IMAGE_ORDERING)(x) 736 | x = Conv2D(512, (3, 3), activation='relu', padding='same', 737 | name='block5_conv2', data_format=IMAGE_ORDERING)(x) 738 | x = Conv2D(512, (3, 3), activation='relu', padding='same', 739 | name='block5_conv3', data_format=IMAGE_ORDERING)(x) 740 | x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool', 741 | data_format=IMAGE_ORDERING)(x) 742 | f5 = x 743 | 744 | if pretrained == 'imagenet': 745 | vgg_weights_path = keras.utils.get_file(pretrained_url.rsplit('/', maxsplit=1)[-1], pretrained_url) 746 | Model(img_input, x).load_weights(vgg_weights_path) 747 | 748 | return img_input, [f1, f2, f3, f4, f5] 749 | 750 | 751 | def _unet(n_classes, encoder, l1_skip_conn=True, input_height=416, 752 | input_width=608): 753 | """ _unet """ 754 | img_input, levels = encoder( 755 | input_height=input_height, input_width=input_width) 756 | [f1, f2, f3, f4, _] = levels 757 | 758 | o = f4 759 | 760 | o = (ZeroPadding2D((1, 1), data_format=IMAGE_ORDERING))(o) 761 | o = (Conv2D(512, (3, 3), padding='valid', data_format=IMAGE_ORDERING))(o) 762 | o = (BatchNormalization())(o) 763 | 764 | o = (UpSampling2D((2, 2), data_format=IMAGE_ORDERING))(o) 765 | o = (concatenate([o, f3], axis=MERGE_AXIS)) 766 | o = (ZeroPadding2D((1, 1), data_format=IMAGE_ORDERING))(o) 767 | o = (Conv2D(256, (3, 3), padding='valid', data_format=IMAGE_ORDERING))(o) 768 | o = (BatchNormalization())(o) 769 | 770 | o = (UpSampling2D((2, 2), data_format=IMAGE_ORDERING))(o) 771 | o = (concatenate([o, f2], axis=MERGE_AXIS)) 772 | o = (ZeroPadding2D((1, 1), data_format=IMAGE_ORDERING))(o) 773 | o = (Conv2D(128, (3, 3), padding='valid', data_format=IMAGE_ORDERING))(o) 774 | o = (BatchNormalization())(o) 775 | 776 | o = (UpSampling2D((2, 2), data_format=IMAGE_ORDERING))(o) 777 | 778 | if l1_skip_conn: 779 | o = (concatenate([o, f1], axis=MERGE_AXIS)) 780 | 781 | o = (ZeroPadding2D((1, 1), data_format=IMAGE_ORDERING))(o) 782 | o = (Conv2D(64, (3, 3), padding='valid', data_format=IMAGE_ORDERING))(o) 783 | o = (BatchNormalization())(o) 784 | 785 | o = Conv2D(n_classes, (3, 3), padding='same', data_format=IMAGE_ORDERING)(o) 786 | 787 | model = get_segmentation_model(img_input, o) 788 | 789 | return model 790 | 791 | 792 | def vgg_unet(n_classes, input_height=416, input_width=608): 793 | """ vgg_unet """ 794 | model = _unet(n_classes, get_vgg_encoder, input_height=input_height, input_width=input_width) 795 | model.model_name = "vgg_unet" 796 | return model 797 | -------------------------------------------------------------------------------- /src/utils.py: -------------------------------------------------------------------------------- 1 | # Copyright (C) 2024 Intel Corporation 2 | # SPDX-License-Identifier: BSD-3-Clause 3 | """ 4 | Support file for training and evaluation 5 | """ 6 | # pylint: disable=E0401 C0301 
C0103 E1101 E0611 R0913 R0914 R1708 R0912 R0915 E1129 W0612 7 | # flake8: noqa = E501 8 | import os 9 | from random import SystemRandom 10 | import itertools 11 | import time 12 | import glob 13 | import json 14 | from types import MethodType 15 | from tqdm import tqdm 16 | import six 17 | import cv2 18 | import numpy as np 19 | import tensorflow as tf 20 | from tensorflow import keras 21 | from tensorflow.keras.models import Model 22 | from tensorflow.keras.layers import Input, Activation, Permute, Conv2D, \ 23 | BatchNormalization, ZeroPadding2D, concatenate, UpSampling2D, MaxPooling2D, Reshape 24 | 25 | IMAGE_ORDERING_CHANNELS_FIRST = "channels_first" 26 | IMAGE_ORDERING_CHANNELS_LAST = "channels_last" 27 | # Default IMAGE_ORDERING = channels_last 28 | IMAGE_ORDERING = IMAGE_ORDERING_CHANNELS_LAST 29 | 30 | if IMAGE_ORDERING == 'channels_first': 31 | MERGE_AXIS = 1 32 | elif IMAGE_ORDERING == 'channels_last': 33 | MERGE_AXIS = -1 34 | 35 | if IMAGE_ORDERING == 'channels_first': 36 | pretrained_url = "https://github.com/fchollet/" \ 37 | "deep-learning-models/releases/download/v0.1/" \ 38 | "vgg16_weights_th_dim_ordering_th_kernels_notop.h5" 39 | elif IMAGE_ORDERING == 'channels_last': 40 | pretrained_url = "https://github.com/fchollet/" \ 41 | "deep-learning-models/releases/download/v0.1/" \ 42 | "vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5" 43 | 44 | cryptogen = SystemRandom() 45 | class_colors = [np.array((cryptogen.randint(0, 255), cryptogen.randint( 46 | 0, 255), cryptogen.randint(0, 255))) for _ in range(5000)] 47 | 48 | 49 | def get_colored_segmentation_image(seg_arr, n_classes, colors=None): 50 | """ get_colored_segmentation_image """ 51 | if colors is None: 52 | colors = class_colors 53 | output_height = seg_arr.shape[0] 54 | output_width = seg_arr.shape[1] 55 | 56 | seg_img = np.zeros((output_height, output_width, 3)) 57 | 58 | for c in range(n_classes): 59 | mask = (seg_arr == c) 60 | seg_img[mask] += colors[c].astype('uint8') 61 | 62 | return seg_img 63 | 64 | 65 | def visualize_segmentation(seg_arr, inp_img=None, n_classes=None, colors=None, 66 | overlay_img=False, prediction_width=None, prediction_height=None): 67 | """ visualize_segmentation """ 68 | if colors is None: 69 | colors = class_colors 70 | if n_classes is None: 71 | n_classes = np.max(seg_arr) 72 | 73 | seg_img = get_colored_segmentation_image(seg_arr, n_classes, colors=colors) 74 | 75 | if inp_img is not None: 76 | orininal_h = inp_img.shape[0] 77 | orininal_w = inp_img.shape[1] 78 | seg_img = cv2.resize(seg_img, (orininal_w, orininal_h)) 79 | 80 | if prediction_height is not None and prediction_width is not None: 81 | seg_img = cv2.resize(seg_img, (prediction_width, prediction_height)) 82 | if inp_img is not None: 83 | inp_img = cv2.resize(inp_img, (prediction_width, prediction_height)) 84 | 85 | if overlay_img: 86 | if inp_img is not None: 87 | # seg_img = overlay_seg_image(inp_img, seg_img) 88 | pass 89 | 90 | return seg_img 91 | 92 | 93 | def get_image_array(image_input, width, height, img_norm="sub_mean", 94 | ordering='channels_first'): 95 | """ Load image array from input """ 96 | 97 | if isinstance(image_input, np.ndarray): 98 | # It is already an array, use it as it is 99 | img = image_input 100 | elif isinstance(image_input, six.string_types): 101 | if not os.path.isfile(image_input): 102 | raise Exception(f"get_image_array: path {image_input} doesn't exist") 103 | img = cv2.imread(image_input, 1) 104 | else: 105 | raise Exception(f"get_image_array: Can't process input type 
{str(type(image_input))}") 106 | 107 | if img_norm == "sub_and_divide": 108 | img = np.float32(cv2.resize(img, (width, height))) / 127.5 - 1 109 | elif img_norm == "sub_mean": 110 | img = cv2.resize(img, (width, height)) 111 | img = img.astype(np.float32) 112 | img[:, :, 0] -= 103.939 113 | img[:, :, 1] -= 116.779 114 | img[:, :, 2] -= 123.68 115 | img = img[:, :, ::-1] 116 | elif img_norm == "divide": 117 | img = cv2.resize(img, (width, height)) 118 | img = img.astype(np.float32) 119 | img = img / 255.0 120 | 121 | if ordering == 'channels_first': 122 | img = np.rollaxis(img, 2, 0) 123 | return img 124 | 125 | 126 | def get_segmentation_array(image_input, nclasses, width, height, no_reshape=False): 127 | """ Load segmentation array from input """ 128 | 129 | seg_labels = np.zeros((height, width, nclasses)) 130 | 131 | if isinstance(image_input, np.ndarray): 132 | # It is already an array, use it as it is 133 | img = image_input 134 | elif isinstance(image_input, six.string_types): 135 | if not os.path.isfile(image_input): 136 | raise Exception(f"get_segmentation_array: path {image_input} doesn't exist") 137 | img = cv2.imread(image_input, 1) 138 | else: 139 | raise Exception(f"get_segmentation_array: Can't process input type {str(type(image_input))}") 140 | 141 | img = cv2.resize(img, (width, height), interpolation=cv2.INTER_NEAREST) 142 | img = img.mean(axis=-1) 143 | 144 | for c in range(nclasses): 145 | seg_labels[:, :, c] = (img == c).astype(int) 146 | 147 | if not no_reshape: 148 | seg_labels = np.reshape(seg_labels, (width * height, nclasses)) 149 | 150 | return seg_labels 151 | 152 | 153 | def image_segmentation_generator(images_path, segs_path, batch_size, 154 | n_classes, input_height, input_width, 155 | output_height, output_width, 156 | do_augment=False): 157 | """ image_segmentation_generator """ 158 | img_seg_pairs = get_pairs_from_paths(images_path, segs_path) 159 | cryptogen.shuffle(img_seg_pairs) 160 | zipped = itertools.cycle(img_seg_pairs) 161 | 162 | while True: 163 | x = [] 164 | y = [] 165 | for _ in range(batch_size): 166 | im, seg = next(zipped) 167 | 168 | im = cv2.imread(im, 1) 169 | seg = cv2.imread(seg, 1) 170 | 171 | if do_augment: 172 | # im, seg[:, :, 0] = augment_seg(im, seg[:, :, 0], augmentation_name=augmentation_name) 173 | pass 174 | 175 | x.append(get_image_array(im, input_width, 176 | input_height, ordering=IMAGE_ORDERING)) 177 | y.append(get_segmentation_array( 178 | seg, n_classes, output_width, output_height)) 179 | 180 | yield np.array(x), np.array(y) 181 | 182 | 183 | def get_pairs_from_paths(images_path, segs_path, mode="train", ignore_non_matching=False): 184 | """ Find all the images from the images_path directory and 185 | the segmentation images from the segs_path directory 186 | while checking integrity of data """ 187 | 188 | acceptable_img_formats = [".jpg", ".jpeg", ".png", ".bmp"] 189 | acceptable_segmentation_formats = [".png", ".bmp"] 190 | 191 | image_files = [] 192 | segmentation_files = {} 193 | if mode == "train": 194 | for dir_entry in os.listdir(images_path)[:int(len(os.listdir(images_path)) * 0.80)]: 195 | if os.path.isfile(os.path.join(images_path, dir_entry)) and os.path.splitext(dir_entry)[1] \ 196 | in acceptable_img_formats: 197 | file_name, file_extension = os.path.splitext(dir_entry) 198 | image_files.append((file_name, file_extension, os.path.join(images_path, dir_entry))) 199 | 200 | file_extension = acceptable_segmentation_formats[0] 201 | if file_name in segmentation_files: 202 | raise Exception( 203 | f"Segmentation 
file with filename {file_name} " 204 | f"already exists and is ambiguous " 205 | f"to resolve with path {os.path.join(segs_path, dir_entry)}" 206 | f". Please remove or rename the latter.") 207 | segmentation_files[file_name] = (file_extension, 208 | os.path.join(segs_path, dir_entry.split(".")[0] + 209 | acceptable_segmentation_formats[0])) 210 | print("80% of Data is considered for Training ===> ", int(len(os.listdir(segs_path)) * 0.80)) 211 | else: 212 | for dir_entry in os.listdir(images_path)[-int(len(os.listdir(images_path)) * 0.2):]: 213 | if os.path.isfile(os.path.join(images_path, dir_entry)) and os.path.splitext(dir_entry)[1] \ 214 | in acceptable_img_formats: 215 | file_name, file_extension = os.path.splitext(dir_entry) 216 | image_files.append((file_name, file_extension, os.path.join(images_path, dir_entry))) 217 | 218 | file_extension = acceptable_segmentation_formats[0] 219 | if file_name in segmentation_files: 220 | raise Exception( 221 | f"Segmentation file with filename {file_name} " 222 | f"already exists and is ambiguous " 223 | f"to resolve with path {os.path.join(segs_path, dir_entry)}" 224 | f". Please remove or rename the latter.") 225 | segmentation_files[file_name] = (file_extension, 226 | os.path.join(segs_path, dir_entry.split(".")[0] + 227 | acceptable_segmentation_formats[0])) 228 | print("20% of Data is considered for Evaluating===> ", int(len(os.listdir(segs_path)) * 0.2)) 229 | return_value = [] 230 | # Match the images and segmentations 231 | for image_file, _, image_full_path in image_files: 232 | if image_file in segmentation_files: 233 | return_value.append((image_full_path, segmentation_files[image_file][1])) 234 | elif ignore_non_matching: 235 | continue 236 | else: 237 | # Error out 238 | raise Exception(f"No corresponding segmentation found for image {image_full_path}.") 239 | 240 | return return_value 241 | 242 | 243 | def verify_segmentation_dataset(images_path, segs_path, n_classes, show_all_errors=False): 244 | """ verify_segmentation_dataset """ 245 | try: 246 | img_seg_pairs = get_pairs_from_paths(images_path, segs_path) 247 | if len(img_seg_pairs) <= 0: 248 | print(f"Couldn't load any data from images_path: {images_path, segs_path} and segmentations path: {1}") 249 | return False 250 | 251 | return_value = True 252 | for im_fn, seg_fn in tqdm(img_seg_pairs): 253 | img = cv2.imread(im_fn) 254 | seg = cv2.imread(seg_fn) 255 | # Check dimensions match 256 | if not img.shape == seg.shape: 257 | return_value = False 258 | print( 259 | f"The size of image {im_fn} and its segmentation {seg_fn} doesn't " 260 | f"match (possibly the files are corrupt).") 261 | if not show_all_errors: 262 | break 263 | else: 264 | max_pixel_value = np.max(seg[:, :, 0]) 265 | if max_pixel_value >= n_classes: 266 | return_value = False 267 | print( 268 | f"The pixel values of the segmentation image {seg_fn} violating " 269 | f"range [0, {str(n_classes - 1)}]. Found maximum pixel value {max_pixel_value}") 270 | if not show_all_errors: 271 | break 272 | if return_value: 273 | print("Dataset verified! 
") 274 | else: 275 | print("Dataset not verified!") 276 | return return_value 277 | except RuntimeError: 278 | print("Found error during data loading") 279 | return False 280 | 281 | 282 | def evaluate(model=None, inp_images=None, annotations=None, inp_images_dir=None, annotations_dir=None, 283 | checkpoints_path=None): 284 | """ evaluate """ 285 | n_classes = model.n_classes 286 | input_height = model.input_height 287 | input_width = model.input_width 288 | output_height = model.output_height 289 | output_width = model.output_width 290 | print("Model Input height , Model Input width, Model Output Height, Model Output Width") 291 | print(input_height, input_width, output_height, output_width) 292 | 293 | if checkpoints_path is not None: 294 | with open(checkpoints_path + "_config.json", "w", encoding="utf8") as f: 295 | json.dump({ 296 | "model_class": model.model_name, 297 | "n_classes": n_classes, 298 | "input_height": input_height, 299 | "input_width": input_width, 300 | "output_height": output_height, 301 | "output_width": output_width 302 | }, f) 303 | 304 | latest_checkpoint = checkpoints_path # find_latest_checkpoint(checkpoints_path) 305 | if latest_checkpoint is not None: 306 | print("Loading the weights from latest checkpoint ", 307 | latest_checkpoint) 308 | model.load_weights(latest_checkpoint).expect_partial() 309 | 310 | if inp_images is None: 311 | paths = get_pairs_from_paths(inp_images_dir, annotations_dir, mode="eval") 312 | paths = list(zip(*paths)) 313 | inp_images = list(paths[0]) 314 | annotations = list(paths[1]) 315 | 316 | tp = np.zeros(model.n_classes) 317 | fp = np.zeros(model.n_classes) 318 | fn = np.zeros(model.n_classes) 319 | n_pixels = np.zeros(model.n_classes) 320 | 321 | for inp, ann in tqdm(zip(inp_images, annotations)): 322 | pr = predict(model, inp) 323 | gt = get_segmentation_array(ann, model.n_classes, model.output_width, model.output_height, no_reshape=True) 324 | gt = gt.argmax(-1) 325 | pr = pr.flatten() 326 | gt = gt.flatten() 327 | 328 | for cl_i in range(model.n_classes): 329 | tp[cl_i] += np.sum((pr == cl_i) * (gt == cl_i)) 330 | fp[cl_i] += np.sum((pr == cl_i) * (gt != cl_i)) 331 | fn[cl_i] += np.sum((pr != cl_i) * (gt == cl_i)) 332 | n_pixels[cl_i] += np.sum(gt == cl_i) 333 | 334 | cl_wise_score = tp / (tp + fp + fn) 335 | cl_wise_score = np.nan_to_num(cl_wise_score, nan=0.0) 336 | 337 | n_pixels_norm = n_pixels / np.sum(n_pixels) 338 | frequency_weighted_iu = np.sum(cl_wise_score * n_pixels_norm) 339 | mean_iu = np.mean(cl_wise_score) 340 | return {"frequency_weighted_IU": frequency_weighted_iu, "mean_IU": mean_iu, "class_wise_IU": cl_wise_score} 341 | 342 | 343 | def frozen(model=None, inp_images=None, annotations=None, inp_images_dir=None, annotations_dir=None,checkpoints_path=None): 344 | """ evaluate """ 345 | n_classes = model.n_classes 346 | input_height = model.input_height 347 | input_width = model.input_width 348 | output_height = model.output_height 349 | output_width = model.output_width 350 | print("Model Input height , Model Input width, Model Output Height, Model Output Width") 351 | print(input_height, input_width, output_height, output_width) 352 | 353 | # Load frozen graph using TensorFlow 1.x functions 354 | with tf.Graph().as_default() as graph: 355 | with tf.compat.v1.Session() as sess: 356 | # Load the graph in graph_def 357 | print("load graph") 358 | with tf.io.gfile.GFile(checkpoints_path, "rb") as f: 359 | graph_def = tf.compat.v1.GraphDef() 360 | loaded = graph_def.ParseFromString(f.read()) 361 | 
sess.graph.as_default() 362 | tf.import_graph_def(graph_def, input_map=None, 363 | return_elements=None, 364 | name="", 365 | op_dict=None, 366 | producer_op_list=None) 367 | l_input = graph.get_tensor_by_name('input_1:0') # Input Tensor 368 | l_output = graph.get_tensor_by_name('Identity:0') # Output Tensor 369 | # initialize_all_variables 370 | tf.compat.v1.global_variables_initializer() 371 | 372 | if inp_images is None: 373 | paths = get_pairs_from_paths(inp_images_dir, annotations_dir, mode="eval") 374 | paths = list(zip(*paths)) 375 | inp_images = list(paths[0]) 376 | annotations = list(paths[1]) 377 | 378 | tp = np.zeros(model.n_classes) 379 | fp = np.zeros(model.n_classes) 380 | fn = np.zeros(model.n_classes) 381 | n_pixels = np.zeros(model.n_classes) 382 | 383 | for inp, ann in tqdm(zip(inp_images, annotations)): 384 | x = get_image_array(inp, input_width, input_height, ordering=IMAGE_ORDERING) 385 | x = np.expand_dims(x, axis=0) 386 | pr = sess.run(l_output, feed_dict={l_input: x}) 387 | pr = pr[0] 388 | pr = pr.reshape((output_height, output_width, n_classes)).argmax(axis=2) 389 | gt = get_segmentation_array(ann, model.n_classes, model.output_width, model.output_height, no_reshape=True) 390 | gt = gt.argmax(-1) 391 | pr = pr.flatten() 392 | gt = gt.flatten() 393 | 394 | for cl_i in range(model.n_classes): 395 | tp[cl_i] += np.sum((pr == cl_i) * (gt == cl_i)) 396 | fp[cl_i] += np.sum((pr == cl_i) * (gt != cl_i)) 397 | fn[cl_i] += np.sum((pr != cl_i) * (gt == cl_i)) 398 | n_pixels[cl_i] += np.sum(gt == cl_i) 399 | 400 | cl_wise_score = tp / (tp + fp + fn + 0.000000000001) 401 | n_pixels_norm = n_pixels / np.sum(n_pixels) 402 | frequency_weighted_iu = np.sum(cl_wise_score * n_pixels_norm) 403 | mean_iu = np.mean(cl_wise_score) 404 | return {"frequency_weighted_IU": frequency_weighted_iu, "mean_IU": mean_iu, "class_wise_IU": cl_wise_score} 405 | 406 | 407 | def predict_multiple(model=None, inps=None, inp_dir=None, out_dir=None, 408 | checkpoints_path=None, overlay_img=False, 409 | colors=None, prediction_width=None, 410 | prediction_height=None): 411 | """ predict_multiple """ 412 | if colors is None: 413 | colors = class_colors 414 | if model is None and (checkpoints_path is not None): 415 | # model = model_from_checkpoint_path(checkpoints_path) 416 | pass 417 | 418 | if inps is None and (inp_dir is not None): 419 | inps = glob.glob(os.path.join(inp_dir, "*.jpg")) + glob.glob( 420 | os.path.join(inp_dir, "*.png")) + \ 421 | glob.glob(os.path.join(inp_dir, "*.jpeg")) 422 | 423 | all_prs = [] 424 | 425 | for i, inp in enumerate(tqdm(inps)): 426 | if out_dir is None: 427 | out_fname = None 428 | else: 429 | if isinstance(inp, six.string_types): 430 | out_fname = os.path.join(out_dir, os.path.basename(inp)) 431 | else: 432 | out_fname = os.path.join(out_dir, str(i) + ".jpg") 433 | 434 | pr = predict(model, inp, out_fname, 435 | overlay_img=overlay_img, 436 | colors=colors, prediction_width=prediction_width, 437 | prediction_height=prediction_height) 438 | 439 | all_prs.append(pr) 440 | 441 | return all_prs 442 | 443 | 444 | def predict(model=None, inp=None, out_fname=None, checkpoints_path=None, overlay_img=False, 445 | colors=None, prediction_width=None, prediction_height=None): 446 | """ predict """ 447 | if colors is None: 448 | colors = class_colors 449 | if model is None and (checkpoints_path is not None): 450 | # model = model_from_checkpoint_path(checkpoints_path) 451 | pass 452 | 453 | if isinstance(inp, six.string_types): 454 | inp = cv2.imread(inp) 455 | 456 | output_width = 
model.output_width 457 | output_height = model.output_height 458 | input_width = model.input_width 459 | input_height = model.input_height 460 | n_classes = model.n_classes 461 | 462 | x = get_image_array(inp, input_width, input_height, ordering=IMAGE_ORDERING) 463 | pr = model.predict(np.array([x]))[0] 464 | pr = pr.reshape((output_height, output_width, n_classes)).argmax(axis=2) 465 | 466 | if out_fname is not None: 467 | seg_img = visualize_segmentation(pr, inp, n_classes=n_classes, colors=colors, 468 | overlay_img=overlay_img, prediction_width=prediction_width, 469 | prediction_height=prediction_height) 470 | cv2.imwrite(out_fname, seg_img) 471 | 472 | return pr 473 | 474 | 475 | def train_hyperparameters_tuning(model, train_images, train_annotations, batch_size=4, 476 | steps_per_epoch=128, do_augment=False, epochs=3, load_weights=None): 477 | """ 478 | Hyperparameter Tuning 479 | """ 480 | 481 | # hyper-parameterss considered for tuning DL arch 482 | options = { 483 | "lr": [0.001, 0.01, 0.0001], 484 | "optimizer": ["Adam", "adadelta", "rmsprop"], 485 | "loss": ["categorical_crossentropy"]} 486 | 487 | # Replicating GridsearchCV functionality for params generation 488 | keys = options.keys() 489 | values = (options[key] for key in keys) 490 | p_combinations = [] 491 | for combination in itertools.product(*values): 492 | if len(combination) > 0: 493 | p_combinations.append(combination) 494 | 495 | steps_per_epoch = int(steps_per_epoch / batch_size) 496 | 497 | n_classes = model.n_classes 498 | input_height = model.input_height 499 | input_width = model.input_width 500 | output_height = model.output_height 501 | output_width = model.output_width 502 | print("Model Input height , Model Input width, Model Output Height, Model Output Width") 503 | print(input_height, input_width, output_height, output_width) 504 | print("Batch Size used for Training --> ", batch_size) 505 | 506 | train_gen = image_segmentation_generator( 507 | train_images, train_annotations, batch_size, n_classes, 508 | input_height, input_width, output_height, output_width, do_augment=do_augment) 509 | 510 | print("Total number of fits = ", len(p_combinations)) 511 | print("Take Break!!!\nThis will take time!") 512 | ctr = 0 513 | total_time = 0 514 | best_config = {"accuracy":0, "best_fit":0} 515 | for combination in p_combinations: 516 | if load_weights is not None and len(load_weights) > 0: 517 | print("Loading weights from ", load_weights) 518 | model.compile(loss='categorical_crossentropy', 519 | optimizer=keras.optimizers.Adam(learning_rate=0.001), 520 | metrics=['accuracy']) 521 | model.load_weights(load_weights).expect_partial() 522 | 523 | if len(combination) > 0: 524 | ctr += 1 525 | print("Current fit is at ", ctr) 526 | learning_r, optimizer, loss = combination 527 | print("Current fit parameters --> epochs=", epochs, " learning rate=", learning_r, 528 | " optimizer=", optimizer, " loss=", loss) 529 | if optimizer == "Adam": 530 | optimizer = keras.optimizers.Adam(learning_rate=learning_r) 531 | elif optimizer == "adadelta": 532 | optimizer = keras.optimizers.Adadelta(learning_rate=learning_r) 533 | else: 534 | optimizer = keras.optimizers.RMSprop(learning_rate=learning_r) 535 | 536 | model.compile(loss=loss, 537 | optimizer=optimizer, 538 | metrics=['accuracy', 'mae', keras.metrics.MeanIoU(num_classes=21)]) 539 | 540 | start_time = time.time() 541 | hist=model.fit_generator(train_gen, steps_per_epoch, epochs=epochs, workers=1, use_multiprocessing=False) 542 | total_time += time.time()-start_time 543 | 
print("Fit number: ", ctr, " ==> Time Taken for Training in seconds --> ", time.time()-start_time) 544 | if best_config["accuracy"] < hist.history["accuracy"][0]: 545 | best_config["accuracy"]=hist.history["accuracy"][0] 546 | best_config["best_fit"]=combination 547 | print("The best Tuningparameter combination is :",best_config) 548 | return total_time 549 | 550 | 551 | def train(model, train_images, train_annotations, verify_dataset=False, 552 | checkpoints_path=None, epochs=5, batch_size=4, validate=False, 553 | val_images=None, val_annotations=None, val_batch_size=4, 554 | auto_resume_checkpoint=False, load_weights=None, steps_per_epoch=512, val_steps_per_epoch=512, 555 | gen_use_multiprocessing=False, ignore_zero_class=False, optimizer_name='Adam', do_augment=False 556 | ): 557 | """ train """ 558 | steps_per_epoch = int(steps_per_epoch / batch_size) 559 | 560 | n_classes = model.n_classes 561 | input_height = model.input_height 562 | input_width = model.input_width 563 | output_height = model.output_height 564 | output_width = model.output_width 565 | print("Model Input height , Model Input width, Model Output Height, Model Output Width") 566 | print(input_height, input_width, output_height, output_width) 567 | print("Batch Size used for Training --> ", batch_size) 568 | print("Batch Size used for Validation --> ", val_batch_size) 569 | 570 | if optimizer_name is not None: 571 | 572 | if ignore_zero_class: 573 | loss_k = 'masked_categorical_crossentropy' 574 | else: 575 | loss_k = 'categorical_crossentropy' 576 | if optimizer_name == "Adam": 577 | optimizer_name = keras.optimizers.Adam(learning_rate=0.001) 578 | model.compile(loss=loss_k, 579 | optimizer=optimizer_name, 580 | metrics=['accuracy']) 581 | 582 | if checkpoints_path is not None: 583 | with open("train_config.json", "w", encoding="utf8") as f: 584 | json.dump({ 585 | "model_class": model.model_name, 586 | "n_classes": n_classes, 587 | "input_height": input_height, 588 | "input_width": input_width, 589 | "output_height": output_height, 590 | "output_width": output_width 591 | }, f) 592 | 593 | if load_weights is not None and len(load_weights) > 0: 594 | print("Loading weights from ", load_weights) 595 | model.load_weights(load_weights) 596 | 597 | if auto_resume_checkpoint and (checkpoints_path is not None): 598 | latest_checkpoint = checkpoints_path # find_latest_checkpoint(checkpoints_path) 599 | if latest_checkpoint is not None: 600 | print("Loading the weights from latest checkpoint ", 601 | latest_checkpoint) 602 | model.load_weights(latest_checkpoint) 603 | 604 | if verify_dataset: 605 | print("Verifying training dataset") 606 | verify_segmentation_dataset(train_images, train_annotations, n_classes) 607 | if validate: 608 | print("Verifying validation dataset") 609 | verify_segmentation_dataset(val_images, val_annotations, n_classes) 610 | 611 | train_gen = image_segmentation_generator( 612 | train_images, train_annotations, batch_size, n_classes, 613 | input_height, input_width, output_height, output_width, do_augment=do_augment) 614 | val_gen = None 615 | if validate: 616 | val_gen = image_segmentation_generator( 617 | val_images, val_annotations, val_batch_size, 618 | n_classes, input_height, input_width, output_height, output_width) 619 | if checkpoints_path is None: 620 | checkpoints_path = "vgg-unet" 621 | start_time = time.time() 622 | if not validate: 623 | for ep in range(epochs): 624 | print("Starting Epoch ", ep) 625 | model.fit_generator(train_gen, steps_per_epoch, epochs=1, use_multiprocessing=False) 626 
| if checkpoints_path is not None: 627 | model.save_weights(checkpoints_path) 628 | print("saved ", checkpoints_path) 629 | print("Finished Epoch", ep) 630 | else: 631 | for ep in range(epochs): 632 | print("Starting Epoch ", ep) 633 | model.fit_generator(train_gen, steps_per_epoch, 634 | validation_data=val_gen, 635 | validation_steps=val_steps_per_epoch, epochs=1, 636 | use_multiprocessing=gen_use_multiprocessing) 637 | if checkpoints_path is not None: 638 | model.save_weights(checkpoints_path) 639 | print("saved ", checkpoints_path) 640 | print("Finished Epoch", ep) 641 | print("Time Taken for Training in seconds --> ", time.time() - start_time) 642 | 643 | 644 | def get_segmentation_model(input_data, output): 645 | """ get_segmentation_model """ 646 | output_width, output_height, n_classes, input_height, input_width = None, None, None, None, None 647 | img_input = input_data 648 | o = output 649 | 650 | o_shape = Model(img_input, o).output_shape 651 | i_shape = Model(img_input, o).input_shape 652 | 653 | if IMAGE_ORDERING == 'channels_first': 654 | output_height = o_shape[2] 655 | output_width = o_shape[3] 656 | input_height = i_shape[2] 657 | input_width = i_shape[3] 658 | n_classes = o_shape[1] 659 | o = (Reshape((-1, output_height * output_width)))(o) 660 | o = (Permute((2, 1)))(o) 661 | elif IMAGE_ORDERING == 'channels_last': 662 | output_height = o_shape[1] 663 | output_width = o_shape[2] 664 | input_height = i_shape[1] 665 | input_width = i_shape[2] 666 | n_classes = o_shape[3] 667 | o = (Reshape((output_height * output_width, -1)))(o) 668 | 669 | o = (Activation('softmax'))(o) 670 | model = Model(img_input, o) 671 | model.output_width = output_width 672 | model.output_height = output_height 673 | model.n_classes = n_classes 674 | model.input_height = input_height 675 | model.input_width = input_width 676 | model.model_name = "" 677 | 678 | model.train = MethodType(train, model) 679 | model.predict_segmentation = MethodType(predict, model) 680 | model.predict_multiple = MethodType(predict_multiple, model) 681 | model.evaluate_segmentation = MethodType(evaluate, model) 682 | 683 | return model 684 | 685 | 686 | def get_vgg_encoder(input_height=224, input_width=224, pretrained='imagenet'): 687 | """ get_vgg_encoder """ 688 | img_input = None 689 | if IMAGE_ORDERING == 'channels_first': 690 | img_input = Input(shape=(3, input_height, input_width)) 691 | elif IMAGE_ORDERING == 'channels_last': 692 | img_input = Input(shape=(input_height, input_width, 3)) 693 | 694 | x = Conv2D(64, (3, 3), activation='relu', padding='same', 695 | name='block1_conv1', data_format=IMAGE_ORDERING)(img_input) 696 | x = Conv2D(64, (3, 3), activation='relu', padding='same', 697 | name='block1_conv2', data_format=IMAGE_ORDERING)(x) 698 | x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool', 699 | data_format=IMAGE_ORDERING)(x) 700 | f1 = x 701 | # Block 2 702 | x = Conv2D(128, (3, 3), activation='relu', padding='same', 703 | name='block2_conv1', data_format=IMAGE_ORDERING)(x) 704 | x = Conv2D(128, (3, 3), activation='relu', padding='same', 705 | name='block2_conv2', data_format=IMAGE_ORDERING)(x) 706 | x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool', 707 | data_format=IMAGE_ORDERING)(x) 708 | f2 = x 709 | 710 | # Block 3 711 | x = Conv2D(256, (3, 3), activation='relu', padding='same', 712 | name='block3_conv1', data_format=IMAGE_ORDERING)(x) 713 | x = Conv2D(256, (3, 3), activation='relu', padding='same', 714 | name='block3_conv2', data_format=IMAGE_ORDERING)(x) 715 | x = Conv2D(256, (3, 
3), activation='relu', padding='same', 716 | name='block3_conv3', data_format=IMAGE_ORDERING)(x) 717 | x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool', 718 | data_format=IMAGE_ORDERING)(x) 719 | f3 = x 720 | 721 | # Block 4 722 | x = Conv2D(512, (3, 3), activation='relu', padding='same', 723 | name='block4_conv1', data_format=IMAGE_ORDERING)(x) 724 | x = Conv2D(512, (3, 3), activation='relu', padding='same', 725 | name='block4_conv2', data_format=IMAGE_ORDERING)(x) 726 | x = Conv2D(512, (3, 3), activation='relu', padding='same', 727 | name='block4_conv3', data_format=IMAGE_ORDERING)(x) 728 | x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool', 729 | data_format=IMAGE_ORDERING)(x) 730 | f4 = x 731 | 732 | # Block 5 733 | x = Conv2D(512, (3, 3), activation='relu', padding='same', 734 | name='block5_conv1', data_format=IMAGE_ORDERING)(x) 735 | x = Conv2D(512, (3, 3), activation='relu', padding='same', 736 | name='block5_conv2', data_format=IMAGE_ORDERING)(x) 737 | x = Conv2D(512, (3, 3), activation='relu', padding='same', 738 | name='block5_conv3', data_format=IMAGE_ORDERING)(x) 739 | x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool', 740 | data_format=IMAGE_ORDERING)(x) 741 | f5 = x 742 | 743 | if pretrained == 'imagenet': 744 | vgg_weights_path = keras.utils.get_file(pretrained_url.rsplit('/', maxsplit=1)[-1], pretrained_url) 745 | Model(img_input, x).load_weights(vgg_weights_path) 746 | 747 | return img_input, [f1, f2, f3, f4, f5] 748 | 749 | 750 | def _unet(n_classes, encoder, l1_skip_conn=True, input_height=416, 751 | input_width=608): 752 | """ _unet """ 753 | img_input, levels = encoder( 754 | input_height=input_height, input_width=input_width) 755 | [f1, f2, f3, f4, _] = levels 756 | 757 | o = f4 758 | 759 | o = (ZeroPadding2D((1, 1), data_format=IMAGE_ORDERING))(o) 760 | o = (Conv2D(512, (3, 3), padding='valid', data_format=IMAGE_ORDERING))(o) 761 | o = (BatchNormalization())(o) 762 | 763 | o = (UpSampling2D((2, 2), data_format=IMAGE_ORDERING))(o) 764 | o = (concatenate([o, f3], axis=MERGE_AXIS)) 765 | o = (ZeroPadding2D((1, 1), data_format=IMAGE_ORDERING))(o) 766 | o = (Conv2D(256, (3, 3), padding='valid', data_format=IMAGE_ORDERING))(o) 767 | o = (BatchNormalization())(o) 768 | 769 | o = (UpSampling2D((2, 2), data_format=IMAGE_ORDERING))(o) 770 | o = (concatenate([o, f2], axis=MERGE_AXIS)) 771 | o = (ZeroPadding2D((1, 1), data_format=IMAGE_ORDERING))(o) 772 | o = (Conv2D(128, (3, 3), padding='valid', data_format=IMAGE_ORDERING))(o) 773 | o = (BatchNormalization())(o) 774 | 775 | o = (UpSampling2D((2, 2), data_format=IMAGE_ORDERING))(o) 776 | 777 | if l1_skip_conn: 778 | o = (concatenate([o, f1], axis=MERGE_AXIS)) 779 | 780 | o = (ZeroPadding2D((1, 1), data_format=IMAGE_ORDERING))(o) 781 | o = (Conv2D(64, (3, 3), padding='valid', data_format=IMAGE_ORDERING))(o) 782 | o = (BatchNormalization())(o) 783 | 784 | o = Conv2D(n_classes, (3, 3), padding='same', data_format=IMAGE_ORDERING)(o) 785 | 786 | model = get_segmentation_model(img_input, o) 787 | 788 | return model 789 | 790 | 791 | def vgg_unet(n_classes, input_height=416, input_width=608): 792 | """ vgg_unet """ 793 | model = _unet(n_classes, get_vgg_encoder, input_height=input_height, input_width=input_width) 794 | model.model_name = "vgg_unet" 795 | return model 796 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | PROJECT NOT UNDER ACTIVE MANAGEMENT 2 | 3 | This project 
will no longer be maintained by Intel.
4 | 
5 | Intel has ceased development and contributions including, but not limited to, maintenance, bug fixes, new releases, or updates, to this project.
6 | 
7 | Intel no longer accepts patches to this project.
8 | 
9 | If you have an ongoing need to use this project, are interested in independently developing it, or would like to maintain patches for the open source software community, please create your own fork of this project.
10 | 
11 | Contact: webadmin@linux.intel.com
12 | # Drone Navigation Inspection
13 | 
14 | ## Introduction
15 | Build an optimized semantic segmentation solution based on the Visual Geometry Group (VGG)-UNET architecture, designed to help drones land safely by identifying and segmenting paved areas. The proposed system uses Intel® oneDNN optimized TensorFlow\* to accelerate the training and inference performance of drones equipped with Intel® hardware, whereas Intel® Neural Compressor is applied to compress the trained segmentation model to further speed up inference. Check out the [Developer Catalog](https://developer.intel.com/aireferenceimplementations) for information about different use cases.
16 | 
17 | ## Solution Technical Overview
18 | Drones are unmanned aerial vehicles (UAVs) or unmanned aircraft systems. Essentially, a drone is a flying robot that can be operated through remote control devices that communicate with it. While drones have many applications in sectors such as urban development, construction and infrastructure, and supply chain and logistics, safety is a major concern.
19 | 
20 | Drones are used commercially as first-aid vehicles, as tools for investigation by police departments, in high-tech photography and as recording devices for real estate properties, concerts, sporting events, etc. This reference kit has been built with the objective of improving the safety of autonomous drone flight and landing procedures at the edge (running on CPU-based hardware) without ground-based controllers or human pilots onsite.
21 | 
22 | Drones at construction sites are used to scan, record, and map locations or buildings, as well as for land surveys, machine tracking, remote monitoring, construction site security, building inspection, and worker safety. However, drone crashes are dangerous and can lead to devastation.
23 | 
24 | In the utilities sector, inspecting growing numbers of towers, powerlines, and wind turbines is difficult and creates prime opportunities for drones to replace human inspection with accurate image-based inspection and diagnosis. Drones transform the way inspection and maintenance personnel do their jobs at utility companies. If a drone meets with an accident while landing, it could damage assets and injure personnel.
25 | 
26 | Safe landing of drones without injuring people or damaging property is vital for widespread adoption of drones in day-to-day life. Considering the risks associated with drone landing, paved areas dedicated for drones to land are considered safe. The Artificial Intelligence (AI) system introduced in this project presents a deep learning model which segments paved areas for safe landing.
Furthermore, the proposed solution enables efficient deployment while maintaining accuracy and speeding up inference time by leveraging the following Intel® oneAPI packages:
27 | 
28 | * ***Intel® Distribution for Python\****
29 | 
30 | The [Intel® Distribution for Python*](https://www.intel.com/content/www/us/en/developer/tools/oneapi/distribution-for-python.html) provides:
31 | 
32 | * Scalable performance using all available CPU cores on laptops, desktops, and powerful servers
33 | * Support for the latest CPU instructions
34 | * Near-native performance through acceleration of core numerical and machine learning packages with libraries like the Intel® oneAPI Math Kernel Library (oneMKL) and Intel® oneAPI Data Analytics Library
35 | * Productivity tools for compiling Python* code into optimized instructions
36 | * Essential Python\* bindings for easing integration of Intel® native tools with your Python\* project
37 | 
38 | * ***[Intel® Optimizations for TensorFlow\*](https://www.intel.com/content/www/us/en/developer/tools/oneapi/optimization-for-tensorflow.html#gs.174f5y)***
39 | 
40 | * Accelerate AI performance with Intel® oneAPI Deep Neural Network Library (oneDNN) features such as graph optimizations and memory pool allocation.
41 | * Automatically use Intel® Deep Learning Boost instruction set features to parallelize and accelerate AI workloads.
42 | * Reduce inference latency for models deployed using TensorFlow Serving.
43 | * Starting with TensorFlow 2.9, take advantage of oneDNN optimizations automatically.
44 | * Enable optimizations by setting the environment variable TF_ENABLE_ONEDNN_OPTS=1 in TensorFlow\* 2.5 through 2.8 (see the short sketch at the end of this section).
45 | 
46 | * ***Intel® Neural Compressor***
47 | 
48 | [Intel® Neural Compressor](https://www.intel.com/content/www/us/en/developer/tools/oneapi/neural-compressor.html#gs.5vjr1p) performs model compression to reduce the model size and increase the speed of deep learning inference for deployment on CPUs or GPUs. This open source Python\* library automates popular model compression technologies, such as quantization, pruning, and knowledge distillation across multiple deep learning frameworks.
49 | 
50 | The use of AI in the context of drones can be further optimized using Intel® oneAPI, which improves the performance of compute-intensive image processing, reduces training/inference time and scales the usage of complex models by compressing them to run efficiently on edge devices. Intel® oneDNN optimized TensorFlow\* provides additional optimizations for an extra performance boost on Intel® CPUs.
51 | 
52 | For more details, visit [Intel® Distribution for Python\*](https://www.intel.com/content/www/us/en/developer/tools/oneapi/distribution-for-python.html), [Intel® Optimizations for TensorFlow\*](https://www.intel.com/content/www/us/en/developer/tools/oneapi/optimization-for-tensorflow.html#gs.174f5y), [Intel® Neural Compressor](https://www.intel.com/content/www/us/en/developer/tools/oneapi/neural-compressor.html#gs.5vjr1p), and the [Drone Navigation Inspection](https://github.com/oneapi-src/drone-navigation-inspection) GitHub repository.
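As a quick, hedged illustration of the oneDNN switch mentioned in the Intel® Optimizations for TensorFlow\* bullets above, the short sketch below (a hypothetical `check_onednn.py`, not part of this repository) sets `TF_ENABLE_ONEDNN_OPTS` before importing TensorFlow\* and prints the version in use. On TensorFlow\* 2.9 and later the oneDNN optimizations are already on by default, so the environment variable is only needed for releases 2.5 through 2.8.

```python
# check_onednn.py -- illustrative sketch only, not part of this repository.
import os

# Request oneDNN optimizations explicitly. This is only required for TensorFlow 2.5-2.8;
# starting with TensorFlow 2.9 the oneDNN optimizations are enabled by default on CPU.
os.environ.setdefault("TF_ENABLE_ONEDNN_OPTS", "1")

import tensorflow as tf  # import after setting the environment variable

print("TensorFlow version:", tf.__version__)
print("TF_ENABLE_ONEDNN_OPTS =", os.environ.get("TF_ENABLE_ONEDNN_OPTS"))
```
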
53 | 
54 | ## Solution Technical Details
55 | This reference kit leverages Intel® oneAPI to demonstrate the application of TensorFlow\*-based AI models to drone technology, helping segment paved areas and thereby increasing the probability of landing drones safely.
56 | 
57 | The focus of the experiment is drone navigation for inspections. Therefore, the experiment aims to segment the paved area and different objects around the drone path in order to land the drone safely on the paved area. The goal is to take an image captured by the drone camera as input and pass it through the semantic segmentation model (VGG-UNET architecture) to accurately recognize entities like paved area, people, vehicles or dogs, and then benchmark the speed and accuracy of batch/real-time training and inference using Intel®'s technology. When it comes to deploying this model on edge devices with limited computing and memory resources, such as the drones themselves, the model is quantized and compressed while preserving the same level of accuracy and making efficient use of the underlying computing resources. Model optimization and compression are done using Intel® Neural Compressor.
58 | 
59 | ### Dataset
60 | The pixel-accurate annotations of this drone dataset focus on semantic understanding of urban scenes to increase the safety of drone landing procedures. The imagery depicts more than 20 houses from a nadir (bird's eye) view, acquired at an altitude of 5 to 30 meters above ground. A high-resolution camera was used to acquire images at a size of 6000x4000px (24Mpx). The complexity of the dataset is limited to 20 classes and the target output is the paved area class. The training set contains 320 publicly available images, and the test set is made up of 80 images. The train and test dataset split is therefore 80:20.
61 | 
62 | | **Use case** | Paved Area Segmentation
63 | | :--- | :---
64 | | **Object of interest** | Paved Area
65 | | **Size** | Total 400 Labelled Images
66 | | **Train : Test Split** | 80:20
67 | | **Source** | https://www.kaggle.com/datasets/bulentsiyah/semantic-drone-dataset
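For a quick, illustrative sanity check of the data described in the table above, one image and its semantic mask can be loaded as shown below. The sample file name `001` is an assumption; adjust it to any pair present under the `original_images` and `label_images_semantic` folders after the dataset has been downloaded and arranged as described later in this guide.

```python
# Illustrative sketch -- the sample file name "001" is a placeholder.
import cv2
import numpy as np

root = "data/Aerial_Semantic_Segmentation_Drone_Dataset/dataset/semantic_drone_dataset"
image = cv2.imread(f"{root}/original_images/001.jpg")           # RGB drone image
mask = cv2.imread(f"{root}/label_images_semantic/001.png", 0)   # per-pixel class IDs

print("image shape:", image.shape)          # e.g. (4000, 6000, 3) before resizing
print("classes present:", np.unique(mask))  # IDs expected to fall in [0, 19]
```
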
68 | 
69 | Instructions on how to download and manage the dataset can be found in this [subsection](#download-the-dataset).
70 | 
71 | > *Please see this dataset's applicable license for terms and conditions. Intel® does not own the rights to this data set and does not confer any rights to it.*
72 | 
73 | ## Validated Hardware Details
74 | There are workflow-specific hardware and software setup requirements depending on
75 | how the workflow is run.
76 | 
77 | | Recommended Hardware | Precision
78 | | ----------------------------------------------------------------|-
79 | | CPU: Intel® 2nd Gen Xeon® Platinum 8280L CPU @ 2.70GHz or higher | FP32, INT8
80 | | RAM: 187 GB |
81 | | Recommended Free Disk Space: 20 GB or more |
82 | 
83 | Code was tested on Ubuntu\* 22.04 LTS.
84 | 
85 | ## How it Works
86 | The semantic segmentation pipeline presented in this reference kit enables the optimization of the training, hyperparameter tuning and inference modalities by using Intel® oneAPI specialized packages. The next diagram illustrates the workflow of these processes and how the Intel® optimization features are applied in each stage.
87 | 
88 | ![segmentation-flow](assets/segmentation_workflow.png)
89 | 
90 | ### Intel® oneDNN optimized TensorFlow\*
91 | Training a convolutional neural network like the VGG-UNET model used in this reference kit, and running inference with it, are usually compute-intensive tasks. To address these requirements and to gain a performance boost on Intel® hardware, the training and inference stages of this reference kit use TensorFlow\* optimized via Intel® oneDNN.
92 | 
93 | Regarding the training step, the present solution makes it possible to perform regular training and to undertake an exhaustive search for optimal hyperparameters by implementing a hyperparameter tuning scheme.
94 | 
95 | In the case of the regular training of the VGG-UNET architecture, the efficiency of the process is increased by using transfer learning based on the pre-trained VGG encoder. Also, the machine learning practitioner can set different epoch values to assess the performance of multiple segmentation models. Please refer to this [subsection](#training-vgg-unet-model) to see how the regular training procedure is executed.
96 | 
97 | The semantic segmentation model fitted through regular training can obtain a performance boost by implementing a hyperparameter tuning process using different values for `learning rate`, `optimizer` and `loss function`. In this reference kit, the hyperparameter search space is confined to the few hyperparameters listed in the next table:
98 | 
99 | | **Hyperparameter** | Values
100 | | :--- | :---
101 | | **Learning rates** | [0.001, 0.01, 0.0001]
102 | | **Optimizers** | ["Adam", "adadelta", "rmsprop"]
103 | | **Loss function** | ["categorical_crossentropy"]
104 | 
105 | As part of the hyperparameter tuning process, it is important to state that the dataset remains the same, with an 80:20 split for training and testing (see [here](#dataset) for more details about the dataset). Once the best combination of hyperparameters is found, the model can be retrained with that combination to achieve better accuracy. The hyperparameter tuning execution is shown in this [subsection](#hyperparameter-tuning), and the sketch below illustrates how the search grid is formed.
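The following minimal sketch mirrors how `train_hyperparameters_tuning` in `src/utils.py` builds the grid of candidate fits from the search space above (a Cartesian product over the listed values), without running any training:

```python
# Sketch of the tuning grid construction -- mirrors the search space above,
# not a verbatim copy of src/utils.py.
import itertools

options = {
    "lr": [0.001, 0.01, 0.0001],
    "optimizer": ["Adam", "adadelta", "rmsprop"],
    "loss": ["categorical_crossentropy"],
}

# Cartesian product of all hyperparameter values -> 3 x 3 x 1 = 9 candidate fits.
combinations = list(itertools.product(*options.values()))
for learning_rate, optimizer, loss in combinations:
    print(f"candidate fit: lr={learning_rate}, optimizer={optimizer}, loss={loss}")
```
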
106 | 
107 | Another important aspect of the VGG-UNET model trained with Intel® oneDNN optimized TensorFlow\* is that this model is trained using FP32 precision.
108 | 
109 | ### Intel® Neural Compressor
110 | After training the VGG-UNET model using Intel® oneDNN optimized TensorFlow\*, its inference efficiency can be accelerated even further with the Intel® Neural Compressor library. This project enables the use of Intel® Neural Compressor to convert the trained FP32 VGG-UNET model into an INT8 VGG-UNET model by implementing post-training quantization, which, apart from reducing model size, increases inference speed.
111 | 
112 | The quantization of the trained FP32 VGG-UNET model into an INT8 VGG-UNET model and the other operations based on Intel® Neural Compressor optimizations can be inspected [here](#optimizations-with-intel-neural-compressor). A hedged sketch of what post-training quantization looks like in code is shown below.
113 | 
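As an illustration only, the snippet below sketches post-training quantization of a frozen FP32 graph with Intel® Neural Compressor's Python API. The paths, the random calibration images and the use of the 2.x `fit` API are assumptions made for this sketch; the conversion script shipped with this reference kit may be configured differently (for example, through a YAML file), so treat this as a conceptual outline rather than the project's actual implementation.

```python
# Conceptual sketch only -- paths, shapes and the calibration loader are placeholders.
import numpy as np
from neural_compressor.config import PostTrainingQuantConfig
from neural_compressor.quantization import fit


class CalibDataLoader:
    """Minimal calibration loader: yields (input, label) batches and exposes batch_size."""

    def __init__(self, images, batch_size=1):
        self.images = images          # preprocessed arrays shaped like the model input
        self.batch_size = batch_size

    def __iter__(self):
        for img in self.images:
            yield np.expand_dims(img, axis=0), None  # labels are not needed for calibration


# Placeholder calibration data matching the VGG-UNET input size (416x608x3).
calib_images = [np.random.rand(416, 608, 3).astype(np.float32)]

q_model = fit(
    model="output/frozen_graph/frozen_model.pb",      # placeholder path to the FP32 frozen graph
    conf=PostTrainingQuantConfig(approach="static"),  # static post-training quantization
    calib_dataloader=CalibDataLoader(calib_images),
)
q_model.save("output/inc_int8_model")                 # INT8 model ready for benchmarking/inference
```
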
174 | 
175 | ```bash
176 | conda install -n base conda-libmamba-solver
177 | conda config --set solver libmamba
178 | ```
179 | 
180 | | Packages | Version |
181 | | -------- | ------- |
182 | | intelpython3_core | 2024.1.0 |
183 | | python | 3.9 |
185 | | intel-aikit-tensorflow | 2024.1 |
186 | | tqdm | 4.66.2 |
187 | | pip | 24.0 |
188 | | opencv-python | 4.9.0.80 |
189 | 
190 | The dependencies required to properly execute this workflow can be found in the YAML file [$WORKSPACE/env/intel_env.yml](env/intel_env.yml).
191 | 
192 | Proceed to create the conda environment.
193 | 
194 | ```bash
195 | conda env create -f $WORKSPACE/env/intel_env.yml
196 | ```
197 | 
198 | Environment setup is required only once. This step does not clean up an existing environment with the same name, so make sure there is no conda environment named `drone_navigation_intel` before running it. During this setup, the `drone_navigation_intel` conda environment will be created with the dependencies listed in the YAML configuration.
199 | 
200 | Activate the `drone_navigation_intel` conda environment as follows:
201 | 
202 | ```bash
203 | conda activate drone_navigation_intel
204 | ```
205 | 
206 | ### Download the Dataset
207 | Please follow the next instructions to correctly download and set up the dataset required for this semantic segmentation workload.
208 | 
209 | 1. Install the [Kaggle\* API](https://github.com/Kaggle/kaggle-api) and configure your [credentials](https://github.com/Kaggle/kaggle-api#api-credentials) and [proxies](https://github.com/Kaggle/kaggle-api#set-a-configuration-value).
210 | 
211 | 2. Navigate inside the `data` folder and download the dataset from https://www.kaggle.com/datasets/bulentsiyah/semantic-drone-dataset.
212 | 
213 | ```bash
214 | cd $DATA_DIR
215 | kaggle datasets download -d bulentsiyah/semantic-drone-dataset
216 | ```
217 | 
218 | 3. Unzip the dataset file.
219 | 
220 | ```bash
221 | unzip semantic-drone-dataset.zip
222 | ```
223 | 
224 | 4. Move the dataset and the image masks into the proper locations.
225 | 
226 | ```bash
227 | mkdir Aerial_Semantic_Segmentation_Drone_Dataset
228 | mv ./dataset ./Aerial_Semantic_Segmentation_Drone_Dataset
229 | mv ./RGB_color_image_masks ./Aerial_Semantic_Segmentation_Drone_Dataset
230 | ```
231 | 
232 | After completing the previous steps, the `data` folder should have the following structure:
233 | 
234 | ```
235 | - Aerial_Semantic_Segmentation_Drone_Dataset
236 |   - dataset
237 |     - semantic_drone_dataset
238 |       - label_images_semantic
239 |       - original_images
240 |   - RGB_color_image_masks
241 | ```
242 | 
243 | ## Supported Runtime Environment
244 | The execution of this reference kit is compatible with the following environments:
245 | * Bare Metal
246 | 
247 | ### Run Using Bare Metal
248 | 
249 | #### Set Up System Software
250 | 
251 | Our examples use the `conda` package and environment manager on your local computer. If you don't already have `conda` installed or the `conda` environment created, go to [Set Up Conda*](#set-up-conda) or see the [Conda* Linux installation instructions](https://docs.conda.io/projects/conda/en/stable/user-guide/install/linux.html).
252 | 
253 | > *Note: It is assumed that the present working directory is the root directory of this code repository.
Use the following command to go to the root directory.*
254 | 
255 | ```bash
256 | cd $WORKSPACE
257 | ```
258 | 
259 | ### Run Workflow
260 | The following subsections provide the commands for an optimized execution of this semantic segmentation workflow based on Intel® oneDNN optimized TensorFlow\* and Intel® Neural Compressor. As an illustrative guideline for understanding how the Intel® specialized packages are used to optimize the performance of the VGG-UNET semantic segmentation model, please check the [How it Works section](#how-it-works).
261 | 
262 | ### Optimizations with Intel® oneDNN optimized TensorFlow\*
263 | Based on TensorFlow\* optimized by Intel® oneDNN, the stages of training, hyperparameter tuning, conversion to frozen graph, inference and evaluation are executed below.
264 | 
265 | ### Training VGG-UNET Model
266 | The Python\* script given below needs to be executed to start training the VGG-UNET model. For more details about the training process, see this [subsection](#intel®-onednn-optimized-tensorflow). About the training data, please check this [subsection](#dataset).
267 | 
268 | ```
269 | usage: training.py [-h] [-m MODEL_PATH] -d DATA_PATH [-e EPOCHS] [-hy HYPERPARAMS]
270 | 
271 | optional arguments:
272 |   -h, --help            show this help message and exit.
273 |   -m MODEL_PATH, --model_path MODEL_PATH
274 |                         Please provide the latest checkpoint path. Default is None.
275 |   -d DATA_PATH, --data_path DATA_PATH
276 |                         Absolute path to the dataset folder containing "original_images" and "label_images_semantic" folders.
277 |   -e EPOCHS, --epochs EPOCHS
278 |                         Provide the number of epochs to train.
279 |   -hy HYPERPARAMS, --hyperparams HYPERPARAMS
280 |                         Enable hyperparameter tuning. Default is "0" to indicate disabled hyperparameter tuning.
281 | ```
282 | 
283 | Example:
284 | 
285 | [//]: # (capture: baremetal)
286 | ```bash
287 | python $SRC_DIR/training.py -d $DATA_DIR/Aerial_Semantic_Segmentation_Drone_Dataset/dataset/semantic_drone_dataset -e 10 -m $OUTPUT_DIR/model
288 | ```
289 | 
290 | In this example, Intel® oneDNN optimized TensorFlow\* is applied to boost training performance, and the generated TensorFlow\* checkpoint model will be saved in the `$OUTPUT_DIR/model` folder.
291 | 
292 | ### Hyperparameter Tuning
293 | The Python\* script given below needs to be executed to start hyperparameter-tuned training. The model generated using the regular training approach is regarded as the pretrained model on which the fine-tuning process is applied. To obtain more details about the hyperparameter tuning modality, refer to this [subsection](#intel®-onednn-optimized-tensorflow).
294 | 
295 | [//]: # (capture: baremetal)
296 | ```bash
297 | python $SRC_DIR/training.py -d $DATA_DIR/Aerial_Semantic_Segmentation_Drone_Dataset/dataset/semantic_drone_dataset -e 3 -m $OUTPUT_DIR/model -hy 1
298 | ```
299 | 
300 | > **Note**: **The best combination of hyperparameters is printed at the end of the script. The model can be retrained for longer (a larger number of epochs) with this best combination of hyperparameters to achieve better accuracy.**
301 | 
302 | ### Convert the Model to Frozen Graph
303 | Run the Python\* conversion script given below to convert the TensorFlow\* checkpoint model format to the frozen graph format. This frozen graph can later be used when performing inference with Intel® Neural Compressor.
304 | 
305 | ```
306 | usage: create_frozen_graph.py [-h] [-m MODEL_PATH] -o OUTPUT_SAVED_DIR
307 | 
308 | optional arguments:
309 |   -h, --help            show this help message and exit.
310 |   -m MODEL_PATH, --model_path MODEL_PATH
311 |                         Please provide the latest checkpoint path. Default is None.
312 |   -o OUTPUT_SAVED_DIR, --output_saved_dir OUTPUT_SAVED_DIR
313 |                         Directory to save the frozen graph to.
314 | 
315 | ```
316 | 
317 | Example:
318 | 
319 | [//]: # (capture: baremetal)
320 | ```bash
321 | python $SRC_DIR/create_frozen_graph.py -m $OUTPUT_DIR/model/vgg_unet --output_saved_dir $OUTPUT_DIR/model
322 | ```
323 | 
324 | In this example, the generated frozen graph will be saved in the `$OUTPUT_DIR/model` folder with the name `frozen_graph.pb`.
325 | 
326 | ### Inference
327 | 
328 | The Python\* script given below needs to be executed to perform inference with the segmentation model converted into a frozen graph.
329 | 
330 | ```
331 | usage: run_inference.py [-h] [-m MODELPATH] -d DATA_PATH [-b BATCHSIZE]
332 | 
333 | optional arguments:
334 |   -h, --help            show this help message and exit.
335 |   -m MODELPATH, --modelpath MODELPATH
336 |                         Provide the frozen model ".pb" file path. Users can also use the Intel® Neural Compressor INT8 quantized model here.
337 |   -d DATA_PATH, --data_path DATA_PATH
338 |                         Absolute path to the dataset folder containing "original_images" and "label_images_semantic" folders.
339 |   -b BATCHSIZE, --batchsize BATCHSIZE
340 |                         Batch size used for inference.
341 | ```
342 | 
343 | Example:
344 | 
345 | [//]: # (capture: baremetal)
346 | ```bash
347 | python $SRC_DIR/run_inference.py -m $OUTPUT_DIR/model/frozen_graph.pb -d $DATA_DIR/Aerial_Semantic_Segmentation_Drone_Dataset/dataset/semantic_drone_dataset -b 1
348 | ```
349 | 
350 | > The above inference script can be run using different batch sizes.
351 | > The same script can be used to benchmark the Intel® Neural Compressor INT8 quantized model. For more details, please refer to the [Intel® Neural Compressor section](#optimizations-with-intel-neural-compressor).
352 | > By using different batch sizes, one can observe the performance gain obtained with Intel® oneDNN optimized TensorFlow\*.
353 | > Run this script multiple times to record several trials; the average can then be calculated.
354 | 
355 | ### Evaluating the Model on Test Dataset
356 | Run the Python\* script given below to evaluate the semantic segmentation model and find out the class-wise accuracy score.
357 | 
358 | ```
359 | usage: evaluation.py [-h] [-m MODEL_PATH] -d DATA_PATH [-t MODEL_TYPE]
360 | 
361 | optional arguments:
362 |   -h, --help            Show this help message and exit.
363 |   -m MODEL_PATH, --model_path MODEL_PATH
364 |                         Please provide the latest checkpoint path. Default is None.
365 |   -d DATA_PATH, --data_path DATA_PATH
366 |                         Absolute path to the dataset folder containing "original_images" and "label_images_semantic" folders.
367 |   -t MODEL_TYPE, --model_type MODEL_TYPE
368 |                         0 for checkpoint, 1 for frozen_graph.
369 | ```
370 | 
371 | Example to run evaluation using the original TensorFlow\* checkpoint model:
372 | 
373 | [//]: # (capture: baremetal)
374 | ```bash
375 | python $SRC_DIR/evaluation.py -d $DATA_DIR/Aerial_Semantic_Segmentation_Drone_Dataset/dataset/semantic_drone_dataset -m $OUTPUT_DIR/model/vgg_unet -t 0
376 | ```
377 | 
378 | Example to run evaluation using the frozen graph:
379 | 
380 | [//]: # (capture: baremetal)
381 | ```bash
382 | python $SRC_DIR/evaluation.py -d $DATA_DIR/Aerial_Semantic_Segmentation_Drone_Dataset/dataset/semantic_drone_dataset -m $OUTPUT_DIR/model/frozen_graph.pb -t 1
383 | ```
384 | 
385 | > The same script can be used for evaluating the Intel® Neural Compressor INT8 quantized model. For more details, please refer to the [Intel® Neural Compressor section](#optimizations-with-intel-neural-compressor).
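
For readers who want to see what consuming the frozen graph looks like outside the provided scripts, the following is a minimal, hypothetical sketch; it is not `run_inference.py` or `evaluation.py`. It assumes the tensor names `input_1` and `Identity` declared in `src/intel_neural_compressor/deploy.yaml`, the 416x608 input size reported by the training logs, and a random batch in place of real image preprocessing.

```python
# Hypothetical sketch: load a TensorFlow* frozen graph and run a single forward pass.
# Tensor names follow src/intel_neural_compressor/deploy.yaml ("input_1" / "Identity");
# the random batch below is a placeholder for real image preprocessing.
import numpy as np
import tensorflow as tf

def load_frozen_graph(pb_path):
    # Parse the serialized GraphDef and import it into a fresh graph.
    graph_def = tf.compat.v1.GraphDef()
    with tf.io.gfile.GFile(pb_path, "rb") as f:
        graph_def.ParseFromString(f.read())
    graph = tf.Graph()
    with graph.as_default():
        tf.compat.v1.import_graph_def(graph_def, name="")
    return graph

graph = load_frozen_graph("output/model/frozen_graph.pb")  # path relative to $WORKSPACE
input_tensor = graph.get_tensor_by_name("input_1:0")
output_tensor = graph.get_tensor_by_name("Identity:0")

# Dummy batch matching the model input size reported by the training logs (416 x 608 x 3).
dummy_batch = np.random.rand(1, 416, 608, 3).astype(np.float32)

with tf.compat.v1.Session(graph=graph) as sess:
    predictions = sess.run(output_tensor, feed_dict={input_tensor: dummy_batch})

print("Output shape:", predictions.shape)
```

The same loading pattern should also apply to the INT8 `output.pb` produced in the next section, since Intel® Neural Compressor saves the quantized model as a frozen graph as well.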
386 | 
387 | ### Optimizations with Intel® Neural Compressor
388 | Intel® Neural Compressor is used to quantize the FP32 VGG-UNET model into an INT8 model. In this case, the post-training quantization method is used to quantize the FP32 model.
389 | 
390 | ### Conversion of FP32 VGG-UNET Model to INT8 Model
391 | Run the Python\* script given below to convert the FP32 VGG-UNET model, in the form of a frozen graph, into an INT8 model.
392 | 
393 | ```
394 | usage: neural_compressor_conversion.py [-h] -m MODELPATH -o OUTPATH [-c CONFIG] -d DATA_PATH [-b BATCHSIZE]
395 | 
396 | optional arguments:
397 |   -h, --help            show this help message and exit.
398 |   -m MODELPATH, --modelpath MODELPATH
399 |                         Path to the model trained with TensorFlow and saved as a ".pb" file.
400 |   -o OUTPATH, --outpath OUTPATH
401 |                         Directory to save the INT8 quantized model to.
402 |   -c CONFIG, --config CONFIG
403 |                         YAML file for quantizing the model, default is "$SRC_DIR/intel_neural_compressor/deploy.yaml".
404 |   -d DATA_PATH, --data_path DATA_PATH
405 |                         Absolute path to the dataset folder containing "original_images" and "label_images_semantic" folders.
406 |   -b BATCHSIZE, --batchsize BATCHSIZE
407 |                         Batch size for the dataloader. Default is 1.
408 | ```
409 | 
410 | [//]: # (capture: baremetal)
411 | ```bash
412 | python $SRC_DIR/intel_neural_compressor/neural_compressor_conversion.py -d $DATA_DIR/Aerial_Semantic_Segmentation_Drone_Dataset/dataset/semantic_drone_dataset -m $OUTPUT_DIR/model/frozen_graph.pb -o $OUTPUT_DIR/model/inc_compressed_model/output
413 | ```
414 | 
415 | After conversion, the quantized model will be stored in the `$OUTPUT_DIR/model/inc_compressed_model/` folder with the name `output.pb`.
416 | 
417 | ### Inference Using Quantized INT8 VGG-UNET Model
418 | The Python\* script given below needs to be executed to perform inference with the quantized INT8 VGG-UNET model.
419 | 
420 | ```
421 | usage: run_inference.py [-h] [-m MODELPATH] [-d DATA_PATH] [-b BATCHSIZE]
422 | 
423 | optional arguments:
424 |   -h, --help            show this help message and exit.
425 |   -m MODELPATH, --modelpath MODELPATH
426 |                         Provide the frozen model ".pb" file path. Users can also use the Intel® Neural Compressor INT8 quantized model here. Default is $OUTPUT_DIR/model/frozen_graph.pb.
427 |   -d DATA_PATH, --data_path DATA_PATH
428 |                         Absolute path to the dataset folder containing "original_images" and "label_images_semantic" folders. Default is $DATA_DIR/Aerial_Semantic_Segmentation_Drone_Dataset/dataset/semantic_drone_dataset.
429 |   -b BATCHSIZE, --batchsize BATCHSIZE
430 |                         Batch size used for inference.
431 | ```
432 | 
433 | Example:
434 | 
435 | [//]: # (capture: baremetal)
436 | ```bash
437 | python $SRC_DIR/run_inference.py -m $OUTPUT_DIR/model/inc_compressed_model/output.pb -d $DATA_DIR/Aerial_Semantic_Segmentation_Drone_Dataset/dataset/semantic_drone_dataset -b 1
438 | ```
439 | 
440 | > Use `-b` to test with different batch sizes (e.g. `-b 10`).
441 | 
442 | ### Evaluating the Quantized INT8 VGG-UNET Model on Test Dataset
443 | Run the Python\* script given below to evaluate the quantized INT8 VGG-UNET model and find out the class-wise accuracy score.
444 | 
445 | ```
446 | usage: evaluation.py [-h] [-m MODEL_PATH] [-d DATA_PATH] [-t MODEL_TYPE]
447 | 
448 | optional arguments:
449 |   -h, --help            Show this help message and exit.
450 |   -m MODEL_PATH, --model_path MODEL_PATH
451 |                         Please provide the latest checkpoint path. Default is None.
452 | -d DATA_PATH, --data_path DATA_PATH 453 | Absolute path to the dataset folder containing "original_images" and "label_images_semantic" folders. 454 | -t MODEL_TYPE, --model_type MODEL_TYPE 455 | 0 for checkpoint, 1 for frozen_graph. 456 | ``` 457 | 458 | Example to run evaluation using the frozen graph: 459 | 460 | [//]: # (capture: baremetal) 461 | ```bash 462 | python $SRC_DIR/evaluation.py -d $DATA_DIR/Aerial_Semantic_Segmentation_Drone_Dataset/dataset/semantic_drone_dataset -m $OUTPUT_DIR/model/inc_compressed_model/output.pb -t 1 463 | ``` 464 | 465 | #### Clean Up Bare Metal 466 | The next commands are useful to remove the previously generated conda environment, as well as the dataset and the multiple models and files created during the workflow execution. Before proceeding with the clean up process, it is recommended to back up the data you want to preserve. 467 | 468 | ```bash 469 | conda deactivate #Run this line if the drone_navigation_intel environment is still active 470 | conda env remove -n drone_navigation_intel 471 | rm -rf $WORKSPACE 472 | ``` 473 | 474 | --- 475 | 476 | ### Expected Output 477 | A successful execution of the different stages of this workflow should produce outputs similar to the following: 478 | 479 | #### Regular Training Output with Intel® oneDNN optimized TensorFlow\* 480 | 481 | ``` 482 | 2023-12-07 06:53:46.349193: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`. 483 | 2023-12-07 06:53:46.379141: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. 484 | To enable the following instructions: SSE3 SSE4.1 SSE4.2 AVX AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags. 485 | Started data validation and Training for 10 epochs 486 | Model Input height , Model Input width, Model Output Height, Model Output Width 487 | 416 608 208 304 488 | Batch Size used for Training --> 4 489 | Batch Size used for Validation --> 4 490 | ``` 491 | ... 
492 | ``` 493 | Starting Epoch 5 494 | 128/128 [==============================] - 298s 2s/step - loss: 1.0430 - accuracy: 0.6553 495 | saved //drone-navigation-inspection/output/model/vgg_unet 496 | Finished Epoch 5 497 | Starting Epoch 6 498 | 128/128 [==============================] - 298s 2s/step - loss: 1.0443 - accuracy: 0.6564 499 | saved //drone-navigation-inspection/output/model/vgg_unet 500 | Finished Epoch 6 501 | Starting Epoch 7 502 | 128/128 [==============================] - 297s 2s/step - loss: 1.0667 - accuracy: 0.6519 503 | saved //drone-navigation-inspection/output/model/vgg_unet 504 | Finished Epoch 7 505 | Starting Epoch 8 506 | 128/128 [==============================] - 298s 2s/step - loss: 1.0777 - accuracy: 0.6458 507 | saved //drone-navigation-inspection/output/model/vgg_unet 508 | Finished Epoch 8 509 | Starting Epoch 9 510 | 128/128 [==============================] - 297s 2s/step - loss: 1.0888 - accuracy: 0.6441 511 | saved //drone-navigation-inspection/output/model/vgg_unet 512 | Finished Epoch 9 513 | Time Taken for Training in seconds --> 3034.8698456287384 514 | ``` 515 | 516 | #### Hyperparameter Tuning Output with Intel® oneDNN optimized TensorFlow\* 517 | 518 | ``` 519 | 2023-12-07 12:11:35.758937: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`. 520 | 2023-12-07 12:11:35.789043: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. 521 | To enable the following instructions: SSE3 SSE4.1 SSE4.2 AVX AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags. 522 | Started Hyperprameter tuning 523 | Model Input height , Model Input width, Model Output Height, Model Output Width 524 | 416 608 208 304 525 | Batch Size used for Training --> 4 526 | Total number of fits = 9 527 | Take Break!!! 528 | This will take time! 529 | Loading weights from //drone-navigation-inspection/output/model/vgg_unet 530 | Current fit is at 1 531 | Current fit parameters --> epochs= 3 learning rate= 0.001 optimizer= Adam loss= categorical_crossentropy 532 | /drone-navigation-inspection/src/utils.py:542: UserWarning: `Model.fit_generator` is deprecated and will be removed in a future version. Please use `Model.fit`, which supports generators. 533 | hist=model.fit_generator(train_gen, steps_per_epoch, epochs=epochs, workers=1, use_multiprocessing=False) 534 | 80% of Data is considered for Training ===> 320 535 | ``` 536 | ... 
537 | ``` 538 | Current fit is at 7 539 | Current fit parameters --> epochs= 3 learning rate= 0.0001 optimizer= Adam loss= categorical_crossentropy 540 | Epoch 1/3 541 | 32/32 [==============================] - 76s 2s/step - loss: 0.9938 - accuracy: 0.6624 - mae: 0.0451 - mean_io_u_6: 0.4771 542 | Epoch 2/3 543 | 32/32 [==============================] - 75s 2s/step - loss: 0.9846 - accuracy: 0.6790 - mae: 0.0444 - mean_io_u_6: 0.4770 544 | Epoch 3/3 545 | 32/32 [==============================] - 75s 2s/step - loss: 0.9586 - accuracy: 0.6818 - mae: 0.0446 - mean_io_u_6: 0.4771 546 | Fit number: 7 ==> Time Taken for Training in seconds --> 230.05844235420227 547 | The best Tuningparameter combination is : {'accuracy': 0.6624360680580139, 'best_fit': (0.0001, 'Adam', 'categorical_crossentropy')} 548 | Loading weights from //drone-navigation-inspection/output/model/vgg_unet 549 | Current fit is at 8 550 | Current fit parameters --> epochs= 3 learning rate= 0.0001 optimizer= adadelta loss= categorical_crossentropy 551 | Epoch 1/3 552 | 32/32 [==============================] - 76s 2s/step - loss: 1.0724 - accuracy: 0.6356 - mae: 0.0461 - mean_io_u_7: 0.4771 553 | Epoch 2/3 554 | 32/32 [==============================] - 75s 2s/step - loss: 1.0755 - accuracy: 0.6294 - mae: 0.0465 - mean_io_u_7: 0.4771 555 | Epoch 3/3 556 | 32/32 [==============================] - 76s 2s/step - loss: 1.0776 - accuracy: 0.6274 - mae: 0.0466 - mean_io_u_7: 0.4771 557 | Fit number: 8 ==> Time Taken for Training in seconds --> 230.5231795310974 558 | The best Tuningparameter combination is : {'accuracy': 0.6624360680580139, 'best_fit': (0.0001, 'Adam', 'categorical_crossentropy')} 559 | Loading weights from //drone-navigation-inspection/output/model/vgg_unet 560 | Current fit is at 9 561 | Current fit parameters --> epochs= 3 learning rate= 0.0001 optimizer= rmsprop loss= categorical_crossentropy 562 | Epoch 1/3 563 | 32/32 [==============================] - 75s 2s/step - loss: 1.0011 - accuracy: 0.6676 - mae: 0.0449 - mean_io_u_8: 0.4770 564 | Epoch 2/3 565 | 32/32 [==============================] - 75s 2s/step - loss: 0.9739 - accuracy: 0.6791 - mae: 0.0451 - mean_io_u_8: 0.4771 566 | Epoch 3/3 567 | 32/32 [==============================] - 75s 2s/step - loss: 0.9588 - accuracy: 0.6840 - mae: 0.0440 - mean_io_u_8: 0.4771 568 | Fit number: 9 ==> Time Taken for Training in seconds --> 229.0527949333191 569 | The best Tuningparameter combination is : {'accuracy': 0.6676027774810791, 'best_fit': (0.0001, 'rmsprop', 'categorical_crossentropy')} 570 | Time Taken for Total Hyper parameter Tuning and Model loading in seconds --> 2080.2837102413177 571 | total_time --> 2079.7287237644196 572 | ``` 573 | 574 | #### Inference Output with Intel® oneDNN optimized TensorFlow\* 575 | 576 | ``` 577 | 2023-12-07 13:09:59.528826: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`. 578 | 2023-12-07 13:09:59.558645: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. 579 | To enable the following instructions: SSE3 SSE4.1 SSE4.2 AVX AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags. 580 | load graph 581 | ``` 582 | ... 
583 | ``` 584 | Time Taken for model inference in seconds ---> 0.05593061447143555 585 | Time Taken for model inference in seconds ---> 0.05590033531188965 586 | Time Taken for model inference in seconds ---> 0.055976152420043945 587 | Time Taken for model inference in seconds ---> 0.0559389591217041 588 | Time Taken for model inference in seconds ---> 0.055884599685668945 589 | Time Taken for model inference in seconds ---> 0.05602383613586426 590 | Time Taken for model inference in seconds ---> 0.0559384822845459 591 | Time Taken for model inference in seconds ---> 0.055918216705322266 592 | Time Taken for model inference in seconds ---> 0.05595517158508301 593 | Time Taken for model inference in seconds ---> 0.05599474906921387 594 | Average Time Taken for model inference in seconds ---> 0.05594611167907715 595 | ``` 596 | 597 | #### Evaluation Output with Intel® oneDNN optimized TensorFlow\* 598 | Output using the original TensorFlow checkpoint model: 599 | 600 | ``` 601 | 2023-12-07 13:18:49.975518: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`. 602 | 2023-12-07 13:18:50.004278: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. 603 | To enable the following instructions: SSE3 SSE4.1 SSE4.2 AVX AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags. 604 | ``` 605 | ... 606 | ``` 607 | [TARGET CLASS] paved-area => 64.86 608 | dirt => 19.10 609 | grass => 58.54 610 | gravel => 12.78 611 | water => 12.97 612 | rocks => 0.09 613 | pool => 33.43 614 | vegetation => 33.72 615 | roof => 49.62 616 | wall => 2.15 617 | window => 0.00 618 | door => 0.00 619 | fence => 0.00 620 | fence-pole => 0.34 621 | person => 12.24 622 | dog => 0.00 623 | car => 9.96 624 | bicycle => 1.97 625 | tree => 3.10 626 | bald-tree => 0.00 627 | Time Taken for Prediction in seconds --> 0.7498071193695068 628 | ``` 629 | 630 | Output using the frozen graph: 631 | 632 | ``` 633 | 2023-12-07 13:26:49.874614: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`. 634 | 2023-12-07 13:26:49.903073: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. 635 | To enable the following instructions: SSE3 SSE4.1 SSE4.2 AVX AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags. 636 | ``` 637 | ... 
638 | ``` 639 | [TARGET CLASS] paved-area => 67.84 640 | dirt => 20.96 641 | grass => 47.30 642 | gravel => 33.26 643 | water => 9.31 644 | rocks => 0.11 645 | pool => 43.17 646 | vegetation => 32.72 647 | roof => 25.90 648 | wall => 3.51 649 | window => 0.00 650 | door => 0.00 651 | fence => 0.00 652 | fence-pole => 0.00 653 | person => 5.84 654 | dog => 0.00 655 | car => 0.00 656 | bicycle => 0.04 657 | tree => 0.15 658 | bald-tree => 0.01 659 | ``` 660 | 661 | #### Expected Output for Conversion of FP32 VGG-UNET Model to INT8 Model Using Intel® Neural Compressor 662 | 663 | ``` 664 | 2023-12-07 13:30:13.110902: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`. 665 | 2023-12-07 13:30:13.140783: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. 666 | To enable the following instructions: SSE3 SSE4.1 SSE4.2 AVX AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags. 667 | ``` 668 | ... 669 | ``` 670 | 2023-12-07 13:31:02 [INFO] |****Mixed Precision Statistics****| 671 | 2023-12-07 13:31:02 [INFO] +------------+-------+------+------+ 672 | 2023-12-07 13:31:02 [INFO] | Op Type | Total | INT8 | FP32 | 673 | 2023-12-07 13:31:02 [INFO] +------------+-------+------+------+ 674 | 2023-12-07 13:31:02 [INFO] | ConcatV2 | 3 | 0 | 3 | 675 | 2023-12-07 13:31:02 [INFO] | Conv2D | 15 | 15 | 0 | 676 | 2023-12-07 13:31:02 [INFO] | MaxPool | 4 | 4 | 0 | 677 | 2023-12-07 13:31:02 [INFO] | QuantizeV2 | 5 | 5 | 0 | 678 | 2023-12-07 13:31:02 [INFO] | Dequantize | 8 | 8 | 0 | 679 | 2023-12-07 13:31:02 [INFO] +------------+-------+------+------+ 680 | 2023-12-07 13:31:02 [INFO] Pass quantize model elapsed time: 4136.69 ms 681 | Model Input height , Model Input width, Model Output Height, Model Output Width 682 | 416 608 208 304 683 | 20% of Data is considered for Evaluating===> 80 684 | 80it [00:36, 2.18it/s] 685 | 2023-12-07 13:31:39 [INFO] Tune 1 result is: [Accuracy (int8|fp32): 0.6624|0.6784, Duration (seconds) (int8|fp32): 36.7283|38.3752], Best tune result is: [Accuracy: 0.6624, Duration (seconds): 36.7283] 686 | 2023-12-07 13:31:39 [INFO] |**********************Tune Result Statistics**********************| 687 | 2023-12-07 13:31:39 [INFO] +--------------------+----------+---------------+------------------+ 688 | 2023-12-07 13:31:39 [INFO] | Info Type | Baseline | Tune 1 result | Best tune result | 689 | 2023-12-07 13:31:39 [INFO] +--------------------+----------+---------------+------------------+ 690 | 2023-12-07 13:31:39 [INFO] | Accuracy | 0.6784 | 0.6624 | 0.6624 | 691 | 2023-12-07 13:31:39 [INFO] | Duration (seconds) | 38.3752 | 36.7283 | 36.7283 | 692 | 2023-12-07 13:31:39 [INFO] +--------------------+----------+---------------+------------------+ 693 | 2023-12-07 13:31:39 [INFO] Save tuning history to /drone-navigation-inspection/src/intel_neural_compressor/nc_workspace/2023-12-07_13-30-16/./history.snapshot. 694 | 2023-12-07 13:31:39 [INFO] Specified timeout or max trials is reached! Found a quantized model which meet accuracy goal. Exit. 
695 | 2023-12-07 13:31:39 [INFO] Save deploy yaml to /drone-navigation-inspection/src/intel_neural_compressor/nc_workspace/2023-12-07_13-30-16/deploy.yaml
696 | 2023-12-07 13:31:39 [INFO] Save quantized model to //drone-navigation-inspection/output/model/inc_compressed_model/output.pb.
697 | ```
698 | 
699 | #### Inference Output with Intel® Neural Compressor
700 | 
701 | ```
702 | 2023-12-07 13:33:52.653916: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
703 | 2023-12-07 13:33:52.684842: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
704 | ```
705 | ...
706 | ```
707 | Time Taken for model inference in seconds ---> 0.027923583984375
708 | Time Taken for model inference in seconds ---> 0.027917861938476562
709 | Time Taken for model inference in seconds ---> 0.027941226959228516
710 | Time Taken for model inference in seconds ---> 0.02792835235595703
711 | Time Taken for model inference in seconds ---> 0.024005651473999023
712 | Time Taken for model inference in seconds ---> 0.02797222137451172
713 | Time Taken for model inference in seconds ---> 0.027898311614990234
714 | Time Taken for model inference in seconds ---> 0.02796459197998047
715 | Time Taken for model inference in seconds ---> 0.027928590774536133
716 | Time Taken for model inference in seconds ---> 0.027961254119873047
717 | Average Time Taken for model inference in seconds ---> 0.027544164657592775
718 | ```
719 | 
720 | #### Evaluation Output with Intel® Neural Compressor
721 | ```
722 | 2023-12-07 13:36:36.775949: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
723 | 2023-12-07 13:36:36.805139: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
724 | ```
725 | ...
726 | ```
727 | [TARGET CLASS] paved-area => 66.24
728 | dirt => 22.63
729 | grass => 46.54
730 | gravel => 29.26
731 | water => 11.96
732 | rocks => 0.07
733 | pool => 37.07
734 | vegetation => 32.53
735 | roof => 21.08
736 | wall => 1.17
737 | window => 0.00
738 | door => 0.00
739 | fence => 0.00
740 | fence-pole => 0.00
741 | person => 4.47
742 | dog => 0.00
743 | car => 0.00
744 | bicycle => 0.02
745 | tree => 0.08
746 | bald-tree => 0.01
747 | ```
748 | 
749 | ## Summary and Next Steps
750 | 
751 | This reference kit presents an AI semantic segmentation solution specialized in accurately recognizing entities and segmenting paved areas from input images captured by drones. Thus, this system could contribute to the safe landing of drones in dedicated paved areas, reducing the risk of injuring people or damaging property.
752 | 
753 | To carry out the segmentation task, the system makes use of a semantic segmentation model called VGG-UNET. Furthermore, the VGG-UNET model leverages the optimizations given by Intel® oneDNN optimized TensorFlow\* and Intel® Neural Compressor to accelerate its training, hyperparameter tuning and inference capabilities while maintaining accuracy.
754 | 
755 | As next steps, the machine learning practitioner could adapt this semantic segmentation solution for different drone navigation scenarios by including a larger and more complex dataset, which could be used for more sophisticated training based on TensorFlow\* optimized with Intel® oneDNN. Finally, the trained model could be quantized with Intel® Neural Compressor to meet the resource-constrained demands of drone technology.
756 | 
757 | ## Learn More
758 | For more information about this drone navigation inspection workflow or to read about other relevant workflow examples, see these guides and software resources:
759 | 
760 | - [Intel® AI Analytics Toolkit (AI Kit)](https://www.intel.com/content/www/us/en/developer/tools/oneapi/ai-analytics-toolkit.html)
761 | - [Intel® Distribution for Python](https://www.intel.com/content/www/us/en/developer/tools/oneapi/distribution-for-python.html)
762 | - [Intel® oneDNN optimized TensorFlow\*](https://www.intel.com/content/www/us/en/developer/tools/oneapi/optimization-for-tensorflow.html#gs.174f5y)
763 | - [Intel® Neural Compressor](https://www.intel.com/content/www/us/en/developer/tools/oneapi/neural-compressor.html#gs.5vjr1p)
764 | 
765 | ## Troubleshooting
766 | 1. libGL.so.1/libgthread-2.0.so.0: cannot open shared object file: No such file or directory
767 | 
768 |    **Issue:**
769 |    ```
770 |    ImportError: libGL.so.1: cannot open shared object file: No such file or directory
771 |    or
772 |    libgthread-2.0.so.0: cannot open shared object file: No such file or directory
773 |    ```
774 | 
775 |    **Solution:**
776 | 
777 |    Install the libgl1-mesa-glx and libglib2.0-0 libraries. For Ubuntu\* this will be:
778 | 
779 |    ```bash
780 |    apt install libgl1-mesa-glx
781 |    apt install libglib2.0-0
782 |    ```
783 | 
784 | ## Support
785 | If you have questions or issues about this workflow, want help with troubleshooting, want to report a bug, or want to submit enhancement requests, please submit a GitHub issue.
786 | 
787 | ## Appendix
788 | \*Names and brands that may be claimed as the property of others. [Trademarks](https://www.intel.com/content/www/us/en/legal/trademarks.html).
789 | 
790 | ### Disclaimer
791 | 
792 | To the extent that any public or non-Intel datasets or models are referenced by or accessed using tools or code on this site, those datasets or models are provided by the third party indicated as the content source. Intel does not create the content and does not warrant its accuracy or quality. By accessing the public content, or using materials trained on or with such content, you agree to the terms associated with that content and that your use complies with the applicable license.
793 | 
794 | Intel expressly disclaims the accuracy, adequacy, or completeness of any such public content, and is not liable for any errors, omissions, or defects in the content, or for any reliance on the content. Intel is not liable for any liability or damages relating to your use of public content.
795 | --------------------------------------------------------------------------------