├── .env
├── .gitignore
├── README.md
├── assets
│   └── sample_image.jpeg
├── main.py
├── output
│   ├── .gitkeep
│   ├── result.avi
│   └── sample_image.jpeg
├── requirements.txt
├── weights
│   └── .gitkeep
└── yolov9.py

--------------------------------------------------------------------------------
/.env:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/danielsyahputra/yolov9-onnx/93d6454534323236c2230c860f3e375651fde71e/.env

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
.DS_Store
**__pycache__**
tmp
*.onnx
*.yaml
*.mp4

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# YOLOv9 with ONNX & ONNXRuntime

Performing object detection with YOLOv9 using ONNX and ONNXRuntime.

![! ONNX YOLOv9 Object Detection](https://github.com/danielsyahputra/yolov9-onnx/blob/master/output/sample_image.jpeg)

## Requirements

* Check the **requirements.txt** file.
* For ONNX, if you have an NVIDIA GPU, install **onnxruntime-gpu**; otherwise, use the **onnxruntime** library.

## Installation

```shell
git clone https://github.com/danielsyahputra/yolov9-onnx.git
cd yolov9-onnx
pip install -r requirements.txt
```

### ONNX Runtime

For NVIDIA GPU machines:
`pip install onnxruntime-gpu`

Otherwise:
`pip install onnxruntime`

## ONNX model and class metadata

You can download the ONNX model and the class metadata file from the link below:

```
https://drive.google.com/drive/folders/1QH5RCF5WOk53SfdzsHTFkXAdzMLbbQeO?usp=sharing
```

## Examples

### Arguments

The arguments available in `main.py`:

- `--source`: Path to an image or video file
- `--weights`: Path to a YOLOv9 ONNX file (e.g. `weights/yolov9-c.onnx`)
- `--classes`: Path to a YAML file containing the model's class list (e.g. `weights/metadata.yaml`)
- `--score-threshold`: Score threshold for inference, in the range 0-1
- `--conf-threshold`: Confidence threshold for inference, in the range 0-1
- `--iou-threshold`: IoU threshold for inference, in the range 0-1
- `--image`: Image inference mode
- `--video`: Video inference mode
- `--show`: Show the result in a pop-up window
- `--device`: Device used for inference, default = `cpu`

Note: If you want to use `cuda` for inference, make sure you have installed `onnxruntime-gpu` before running the script. A full invocation using these flags is shown below.
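For example, a GPU run that overrides the default thresholds might look like this (assuming the model and metadata files from the download link above have been placed in `weights/`):

```
python main.py --source assets/sample_image.jpeg --weights weights/yolov9-c.onnx --classes weights/metadata.yaml --image --device cuda --conf-threshold 0.5 --iou-threshold 0.5
```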
This code provides two modes of inference: image and video. Simply add the `--image` flag for image inference or the `--video` flag for video inference when running the script.

If you have your own custom model, don't forget to provide a YAML file listing the classes your model predicts. Here is an example of YAML content defining custom classes:

```
names:
  0: person
  1: bicycle
  2: car
  3: motorcycle
  4: airplane
  .
  .
  .
  n: object
```

### Inference on Image

```
python main.py --source assets/sample_image.jpeg --weights weights/yolov9-c.onnx --classes weights/metadata.yaml --image
```

### Inference on Video

```
python main.py --source assets/road.mp4 --weights weights/yolov9-c.onnx --classes weights/metadata.yaml --video
```
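### Using the detector from Python

If you prefer to call the detector from your own code instead of through `main.py`, here is a minimal sketch mirroring the demo block at the bottom of `yolov9.py` (it assumes the sample image and the downloaded weight/metadata files are in place):

```python
import cv2

from yolov9 import YOLOv9

# Load the sample image and size the detector to it
image = cv2.imread("assets/sample_image.jpeg")
h, w = image.shape[:2]
detector = YOLOv9(model_path="weights/yolov9-c.onnx",
                  class_mapping_path="weights/metadata.yaml",
                  original_size=(w, h))

# Run inference and draw the detections in place
detections = detector.detect(image)
detector.draw_detections(image, detections=detections)
cv2.imwrite("output/sample_image.jpeg", image)
```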
## References

* YOLOv9 model: [https://github.com/WongKinYiu/yolov9](https://github.com/WongKinYiu/yolov9)

--------------------------------------------------------------------------------
/assets/sample_image.jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/danielsyahputra/yolov9-onnx/93d6454534323236c2230c860f3e375651fde71e/assets/sample_image.jpeg

--------------------------------------------------------------------------------
/main.py:
--------------------------------------------------------------------------------
import os
import cv2
from pathlib import Path

from yolov9 import YOLOv9


def get_detector(args):
    weights_path = args.weights
    classes_path = args.classes
    source_path = args.source
    assert os.path.isfile(weights_path), f"There's no weight file with name {weights_path}"
    assert os.path.isfile(classes_path), f"There's no classes file with name {classes_path}"
    assert os.path.isfile(source_path), f"There's no source file with name {source_path}"

    if args.image:
        image = cv2.imread(source_path)
        h, w = image.shape[:2]
    elif args.video:
        # Open the video only to read its dimensions, then release it
        cap = cv2.VideoCapture(source_path)
        w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
        h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
        cap.release()

    detector = YOLOv9(model_path=weights_path,
                      class_mapping_path=classes_path,
                      original_size=(w, h),
                      score_threshold=args.score_threshold,
                      conf_threshold=args.conf_threshold,
                      iou_threshold=args.iou_threshold,
                      device=args.device)
    return detector


def inference_on_image(args):
    print("[INFO] Initialize Model")
    detector = get_detector(args)
    image = cv2.imread(args.source)

    print("[INFO] Inference Image")
    detections = detector.detect(image)
    detector.draw_detections(image, detections=detections)

    output_path = f"output/{Path(args.source).name}"
    print(f"[INFO] Saving result to {output_path}")
    cv2.imwrite(output_path, image)

    if args.show:
        cv2.imshow("Result", image)
        cv2.waitKey(0)


def inference_on_video(args):
    print("[INFO] Initialize Model")
    detector = get_detector(args)

    cap = cv2.VideoCapture(args.source)
    w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    video_fps = int(cap.get(cv2.CAP_PROP_FPS))
    writer = cv2.VideoWriter('output/result.avi', cv2.VideoWriter_fourcc(*'MJPG'), video_fps, (w, h))

    print("[INFO] Inference on Video")
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        detections = detector.detect(frame)
        detector.draw_detections(frame, detections=detections)
        writer.write(frame)
        if args.show:
            cv2.imshow("Result", frame)
            key = cv2.waitKey(1) & 0xFF
            if key == ord("q"):
                break

    cap.release()
    writer.release()
    cv2.destroyAllWindows()
    print("[INFO] Finished. Result saved to output/result.avi")


if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser(description="Arguments for YOLOv9 inference using ONNXRuntime")

    parser.add_argument("--source", type=str, required=True, help="Path to image or video file")
    parser.add_argument("--weights", type=str, required=True, help="Path to yolov9 onnx file")
    parser.add_argument("--classes", type=str, required=True, help="Path to yaml file listing the model's classes")
    parser.add_argument("--score-threshold", type=float, required=False, default=0.1)
    parser.add_argument("--conf-threshold", type=float, required=False, default=0.4)
    parser.add_argument("--iou-threshold", type=float, required=False, default=0.4)
    parser.add_argument("--image", action="store_true", required=False, help="Image inference mode")
    parser.add_argument("--video", action="store_true", required=False, help="Video inference mode")
    # action="store_true" avoids the argparse pitfall where type=bool treats
    # any non-empty string (including "False") as True
    parser.add_argument("--show", action="store_true", required=False, help="Show result in a pop-up window")
    parser.add_argument("--device", type=str, required=False, help="Device to use (cpu or cuda)", choices=["cpu", "cuda"], default="cpu")

    args = parser.parse_args()

    if args.image:
        inference_on_image(args=args)
    elif args.video:
        inference_on_video(args=args)
    else:
        raise ValueError("Cannot run inference: no source type (--image or --video) was specified in the arguments")

--------------------------------------------------------------------------------
/output/.gitkeep:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/danielsyahputra/yolov9-onnx/93d6454534323236c2230c860f3e375651fde71e/output/.gitkeep

--------------------------------------------------------------------------------
/output/result.avi:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/danielsyahputra/yolov9-onnx/93d6454534323236c2230c860f3e375651fde71e/output/result.avi

--------------------------------------------------------------------------------
/output/sample_image.jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/danielsyahputra/yolov9-onnx/93d6454534323236c2230c860f3e375651fde71e/output/sample_image.jpeg

--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
hydra-core
PyYAML
python-dotenv
pyrootutils
opencv-python-headless
numpy
openvino-dev[onnx]
matplotlib
colorlog
Pillow
scipy
ipyfilechooser
onnx
onnxruntime
requests
pandas

--------------------------------------------------------------------------------
/weights/.gitkeep:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/danielsyahputra/yolov9-onnx/93d6454534323236c2230c860f3e375651fde71e/weights/.gitkeep

--------------------------------------------------------------------------------
/yolov9.py:
--------------------------------------------------------------------------------
import pyrootutils

ROOT = pyrootutils.setup_root(
    search_from=__file__,
    indicator=["requirements.txt"],
    pythonpath=True,
    dotenv=True,
)

import cv2
import yaml
import onnxruntime
import numpy as np
from typing import Tuple, List


class YOLOv9:
    def __init__(self,
                 model_path: str,
                 class_mapping_path: str,
                 original_size: Tuple[int, int] = (1280, 720),
                 score_threshold: float = 0.1,
                 conf_threshold: float = 0.4,
                 iou_threshold: float = 0.4,
                 device: str = "cpu") -> None:
        self.model_path = model_path
        self.class_mapping_path = class_mapping_path

        self.device = device
        self.score_threshold = score_threshold
        self.conf_threshold = conf_threshold
        self.iou_threshold = iou_threshold
        self.image_width, self.image_height = original_size
        self.create_session()

    def create_session(self) -> None:
        opt_session = onnxruntime.SessionOptions()
        opt_session.graph_optimization_level = onnxruntime.GraphOptimizationLevel.ORT_DISABLE_ALL
        # ONNXRuntime tries providers in list order, so CUDA must be placed
        # before the CPU fallback for GPU inference to actually be used
        providers = ['CPUExecutionProvider']
        if self.device.casefold() != "cpu":
            providers.insert(0, "CUDAExecutionProvider")
        session = onnxruntime.InferenceSession(self.model_path, providers=providers)
        self.session = session
        self.model_inputs = self.session.get_inputs()
        self.input_names = [model_input.name for model_input in self.model_inputs]
        self.input_shape = self.model_inputs[0].shape
        self.model_output = self.session.get_outputs()
        self.output_names = [model_output.name for model_output in self.model_output]
        self.input_height, self.input_width = self.input_shape[2:]

        if self.class_mapping_path is not None:
            with open(self.class_mapping_path, 'r') as file:
                yaml_file = yaml.safe_load(file)
                self.classes = yaml_file['names']
                self.color_palette = np.random.uniform(0, 255, size=(len(self.classes), 3))

    def preprocess(self, img: np.ndarray) -> np.ndarray:
        image_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        resized = cv2.resize(image_rgb, (self.input_width, self.input_height))

        # Scale pixel values to 0-1, reorder to channels-first, and add a
        # batch dimension: (H, W, C) -> (1, C, H, W)
        input_image = resized / 255.0
        input_image = input_image.transpose(2, 0, 1)
        input_tensor = input_image[np.newaxis, :, :, :].astype(np.float32)
        return input_tensor

    def xywh2xyxy(self, x):
        # Convert bounding boxes from center format (cx, cy, w, h)
        # to corner format (x1, y1, x2, y2)
        y = np.copy(x)
        y[..., 0] = x[..., 0] - x[..., 2] / 2
        y[..., 1] = x[..., 1] - x[..., 3] / 2
        y[..., 2] = x[..., 0] + x[..., 2] / 2
        y[..., 3] = x[..., 1] + x[..., 3] / 2
        return y
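    # Worked example for the conversion above (illustrative numbers, not taken
    # from a real detection): a center-format box (cx=50, cy=40, w=20, h=10)
    # becomes the corner-format box (x1=40, y1=35, x2=60, y2=45).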
    def postprocess(self, outputs):
        # The model output is (1, 4 + num_classes, N); squeeze and transpose
        # so each row holds one candidate box
        predictions = np.squeeze(outputs).T
        scores = np.max(predictions[:, 4:], axis=1)
        predictions = predictions[scores > self.conf_threshold, :]
        scores = scores[scores > self.conf_threshold]
        class_ids = np.argmax(predictions[:, 4:], axis=1)

        # Rescale boxes from the model input size to the original image size
        boxes = predictions[:, :4]
        input_shape = np.array([self.input_width, self.input_height, self.input_width, self.input_height])
        boxes = np.divide(boxes, input_shape, dtype=np.float32)
        boxes *= np.array([self.image_width, self.image_height, self.image_width, self.image_height])
        boxes = boxes.astype(np.int32)

        # Non-maximum suppression to drop overlapping candidates
        indices = cv2.dnn.NMSBoxes(boxes, scores, score_threshold=self.score_threshold, nms_threshold=self.iou_threshold)
        detections = []
        for bbox, score, label in zip(self.xywh2xyxy(boxes[indices]), scores[indices], class_ids[indices]):
            detections.append({
                "class_index": label,
                "confidence": score,
                "box": bbox,
                "class_name": self.get_label_name(label)
            })
        return detections

    def get_label_name(self, class_id: int) -> str:
        return self.classes[class_id]

    def detect(self, img: np.ndarray) -> List:
        input_tensor = self.preprocess(img)
        outputs = self.session.run(self.output_names, {self.input_names[0]: input_tensor})[0]
        return self.postprocess(outputs)

    def draw_detections(self, img, detections: List):
        """
        Draws bounding boxes and labels on the input image based on the detected objects.

        Args:
            img: The input image to draw detections on.
            detections: List of detection results, each containing box, score, and class index.

        Returns:
            None
        """
        for detection in detections:
            # Extract the coordinates of the bounding box
            x1, y1, x2, y2 = detection['box'].astype(int)
            class_id = detection['class_index']
            confidence = detection['confidence']

            # Retrieve the color for the class ID
            color = self.color_palette[class_id]

            # Draw the bounding box on the image
            cv2.rectangle(img, (x1, y1), (x2, y2), color, 2)

            # Create the label text with class name and score
            label = f"{self.classes[class_id]}: {confidence:.2f}"

            # Calculate the dimensions of the label text
            (label_width, label_height), _ = cv2.getTextSize(label, cv2.FONT_HERSHEY_SIMPLEX, 0.5, 1)

            # Calculate the position of the label text
            label_x = x1
            label_y = y1 - 10 if y1 - 10 > label_height else y1 + 10

            # Draw a filled rectangle as the background for the label text
            cv2.rectangle(
                img, (label_x, label_y - label_height), (label_x + label_width, label_y + label_height), color, cv2.FILLED
            )

            # Draw the label text on the image
            cv2.putText(img, label, (label_x, label_y), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 0), 1, cv2.LINE_AA)


if __name__ == "__main__":
    weight_path = "weights/yolov9-c.onnx"
    image = cv2.imread("assets/sample_image.jpeg")
    h, w = image.shape[:2]
    detector = YOLOv9(model_path=weight_path,
                      class_mapping_path="weights/metadata.yaml",
                      original_size=(w, h))
    detections = detector.detect(image)
    detector.draw_detections(image, detections=detections)

    cv2.imshow("Result Preview", image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
--------------------------------------------------------------------------------