├── .env
├── .gitignore
├── README.md
├── assets
│   └── sample_image.jpeg
├── main.py
├── output
│   ├── .gitkeep
│   ├── result.avi
│   └── sample_image.jpeg
├── requirements.txt
├── weights
│   └── .gitkeep
└── yolov9.py

--------------------------------------------------------------------------------
/.env:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/danielsyahputra/yolov9-onnx/93d6454534323236c2230c860f3e375651fde71e/.env

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
.DS_Store
**__pycache__**
tmp
*.onnx
*.yaml
*.mp4

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# YOLOv9 with ONNX & ONNXRuntime

Performing object detection with YOLOv9 using ONNX and ONNXRuntime.

![! ONNX YOLOv9 Object Detection](https://github.com/danielsyahputra/yolov9-onnx/blob/master/output/sample_image.jpeg)

## Requirements

* Check the **requirements.txt** file.
* For ONNX, if you have an NVIDIA GPU, install **onnxruntime-gpu**; otherwise, use the **onnxruntime** library.

## Installation

```shell
git clone https://github.com/danielsyahputra/yolov9-onnx.git
cd yolov9-onnx
pip install -r requirements.txt
```

### ONNX Runtime

For NVIDIA GPU machines:
`pip install onnxruntime-gpu`

Otherwise:
`pip install onnxruntime`

## ONNX model and class metadata

You can download the ONNX model and the class metadata file from the link below:

```
https://drive.google.com/drive/folders/1QH5RCF5WOk53SfdzsHTFkXAdzMLbbQeO?usp=sharing
```

## Examples

### Arguments

The arguments available in `main.py`:

- `--source`: Path to an image or video file
- `--weights`: Path to a YOLOv9 ONNX file (e.g. `weights/yolov9-c.onnx`)
- `--classes`: Path to a YAML file containing the model's class list (e.g. `weights/metadata.yaml`)
- `--score-threshold`: Score threshold for inference, in the range 0-1
- `--conf-threshold`: Confidence threshold for inference, in the range 0-1
- `--iou-threshold`: IoU threshold for inference, in the range 0-1
- `--image`: Image inference mode
- `--video`: Video inference mode
- `--show`: Show the result in a pop-up window
- `--device`: Device used for inference, default = `cpu`

Note: If you want to use `cuda` for inference, make sure you have installed `onnxruntime-gpu` before running the script. A full invocation using these flags is shown below.
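For example, a GPU run that overrides the default thresholds might look like this (assuming the model and metadata files from the download link above have been placed in `weights/`):

```
python main.py --source assets/sample_image.jpeg --weights weights/yolov9-c.onnx --classes weights/metadata.yaml --image --device cuda --conf-threshold 0.5 --iou-threshold 0.5
```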
This code provides two modes of inference: image and video. Simply add the `--image` flag for image inference or the `--video` flag for video inference when running the script.

If you have your own custom model, don't forget to provide a YAML file listing the classes your model predicts. Here is an example of YAML content defining custom classes:

```
names:
  0: person
  1: bicycle
  2: car
  3: motorcycle
  4: airplane
  .
  .
  .
  n: object
```

### Inference on Image

```
python main.py --source assets/sample_image.jpeg --weights weights/yolov9-c.onnx --classes weights/metadata.yaml --image
```

### Inference on Video

```
python main.py --source assets/road.mp4 --weights weights/yolov9-c.onnx --classes weights/metadata.yaml --video
```
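### Using the detector from Python

If you prefer to call the detector from your own code instead of through `main.py`, here is a minimal sketch mirroring the demo block at the bottom of `yolov9.py` (it assumes the sample image and the downloaded weight/metadata files are in place):

```python
import cv2

from yolov9 import YOLOv9

# Load the sample image and size the detector to it
image = cv2.imread("assets/sample_image.jpeg")
h, w = image.shape[:2]
detector = YOLOv9(model_path="weights/yolov9-c.onnx",
                  class_mapping_path="weights/metadata.yaml",
                  original_size=(w, h))

# Run inference and draw the detections in place
detections = detector.detect(image)
detector.draw_detections(image, detections=detections)
cv2.imwrite("output/sample_image.jpeg", image)
```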
## References

* YOLOv9 model: [https://github.com/WongKinYiu/yolov9](https://github.com/WongKinYiu/yolov9)

--------------------------------------------------------------------------------
/assets/sample_image.jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/danielsyahputra/yolov9-onnx/93d6454534323236c2230c860f3e375651fde71e/assets/sample_image.jpeg

--------------------------------------------------------------------------------
/main.py:
--------------------------------------------------------------------------------
import os
import cv2
from pathlib import Path

from yolov9 import YOLOv9


def get_detector(args):
    weights_path = args.weights
    classes_path = args.classes
    source_path = args.source
    assert os.path.isfile(weights_path), f"There's no weight file with name {weights_path}"
    assert os.path.isfile(classes_path), f"There's no classes file with name {classes_path}"
    assert os.path.isfile(source_path), f"There's no source file with name {source_path}"

    if args.image:
        image = cv2.imread(source_path)
        h, w = image.shape[:2]
    elif args.video:
        # Open the video only to read its dimensions, then release it
        cap = cv2.VideoCapture(source_path)
        w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
        h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
        cap.release()

    detector = YOLOv9(model_path=weights_path,
                      class_mapping_path=classes_path,
                      original_size=(w, h),
                      score_threshold=args.score_threshold,
                      conf_threshold=args.conf_threshold,
                      iou_threshold=args.iou_threshold,
                      device=args.device)
    return detector


def inference_on_image(args):
    print("[INFO] Initialize Model")
    detector = get_detector(args)
    image = cv2.imread(args.source)

    print("[INFO] Inference Image")
    detections = detector.detect(image)
    detector.draw_detections(image, detections=detections)

    output_path = f"output/{Path(args.source).name}"
    print(f"[INFO] Saving result to {output_path}")
    cv2.imwrite(output_path, image)

    if args.show:
        cv2.imshow("Result", image)
        cv2.waitKey(0)


def inference_on_video(args):
    print("[INFO] Initialize Model")
    detector = get_detector(args)

    cap = cv2.VideoCapture(args.source)
    w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    video_fps = int(cap.get(cv2.CAP_PROP_FPS))
    writer = cv2.VideoWriter('output/result.avi', cv2.VideoWriter_fourcc(*'MJPG'), video_fps, (w, h))

    print("[INFO] Inference on Video")
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        detections = detector.detect(frame)
        detector.draw_detections(frame, detections=detections)
        writer.write(frame)
        if args.show:
            cv2.imshow("Result", frame)
            key = cv2.waitKey(1) & 0xFF
            if key == ord("q"):
                break

    cap.release()
    writer.release()
    cv2.destroyAllWindows()
    print("[INFO] Finished. Result saved to output/result.avi")


if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser(description="Arguments for YOLOv9 inference using ONNXRuntime")

    parser.add_argument("--source", type=str, required=True, help="Path to image or video file")
    parser.add_argument("--weights", type=str, required=True, help="Path to yolov9 onnx file")
    parser.add_argument("--classes", type=str, required=True, help="Path to yaml file listing the model's classes")
    parser.add_argument("--score-threshold", type=float, required=False, default=0.1)
    parser.add_argument("--conf-threshold", type=float, required=False, default=0.4)
    parser.add_argument("--iou-threshold", type=float, required=False, default=0.4)
    parser.add_argument("--image", action="store_true", required=False, help="Image inference mode")
    parser.add_argument("--video", action="store_true", required=False, help="Video inference mode")
    # action="store_true" avoids the argparse pitfall where type=bool treats
    # any non-empty string (including "False") as True
    parser.add_argument("--show", action="store_true", required=False, help="Show result in a pop-up window")
    parser.add_argument("--device", type=str, required=False, help="Device to use (cpu or cuda)", choices=["cpu", "cuda"], default="cpu")

    args = parser.parse_args()

    if args.image:
        inference_on_image(args=args)
    elif args.video:
        inference_on_video(args=args)
    else:
        raise ValueError("Cannot run inference: no source type (--image or --video) was specified in the arguments")

--------------------------------------------------------------------------------
/output/.gitkeep:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/danielsyahputra/yolov9-onnx/93d6454534323236c2230c860f3e375651fde71e/output/.gitkeep

--------------------------------------------------------------------------------
/output/result.avi:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/danielsyahputra/yolov9-onnx/93d6454534323236c2230c860f3e375651fde71e/output/result.avi

--------------------------------------------------------------------------------
/output/sample_image.jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/danielsyahputra/yolov9-onnx/93d6454534323236c2230c860f3e375651fde71e/output/sample_image.jpeg

--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
hydra-core
PyYAML
python-dotenv
pyrootutils
opencv-python-headless
numpy
openvino-dev[onnx]
matplotlib
colorlog
Pillow
scipy
ipyfilechooser
onnx
onnxruntime
requests
pandas

--------------------------------------------------------------------------------
/weights/.gitkeep:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/danielsyahputra/yolov9-onnx/93d6454534323236c2230c860f3e375651fde71e/weights/.gitkeep

--------------------------------------------------------------------------------
/yolov9.py:
--------------------------------------------------------------------------------
import pyrootutils

ROOT = pyrootutils.setup_root(
    search_from=__file__,
    indicator=["requirements.txt"],
    pythonpath=True,
    dotenv=True,
)

import cv2
import yaml
import onnxruntime
import numpy as np
from typing import Tuple, List


class YOLOv9:
    def __init__(self,
                 model_path: str,
                 class_mapping_path: str,
                 original_size: Tuple[int, int] = (1280, 720),
                 score_threshold: float = 0.1,
                 conf_threshold: float = 0.4,
                 iou_threshold: float = 0.4,
                 device: str = "cpu") -> None:
        self.model_path = model_path
        self.class_mapping_path = class_mapping_path

        self.device = device
        self.score_threshold = score_threshold
        self.conf_threshold = conf_threshold
        self.iou_threshold = iou_threshold
        self.image_width, self.image_height = original_size
        self.create_session()

    def create_session(self) -> None:
        opt_session = onnxruntime.SessionOptions()
        opt_session.graph_optimization_level = onnxruntime.GraphOptimizationLevel.ORT_DISABLE_ALL
        # ONNXRuntime tries providers in list order, so CUDA must be placed
        # before the CPU fallback for GPU inference to actually be used
        providers = ['CPUExecutionProvider']
        if self.device.casefold() != "cpu":
            providers.insert(0, "CUDAExecutionProvider")
        session = onnxruntime.InferenceSession(self.model_path, providers=providers)
        self.session = session
        self.model_inputs = self.session.get_inputs()
        self.input_names = [model_input.name for model_input in self.model_inputs]
        self.input_shape = self.model_inputs[0].shape
        self.model_output = self.session.get_outputs()
        self.output_names = [model_output.name for model_output in self.model_output]
        self.input_height, self.input_width = self.input_shape[2:]

        if self.class_mapping_path is not None:
            with open(self.class_mapping_path, 'r') as file:
                yaml_file = yaml.safe_load(file)
                self.classes = yaml_file['names']
                self.color_palette = np.random.uniform(0, 255, size=(len(self.classes), 3))

    def preprocess(self, img: np.ndarray) -> np.ndarray:
        image_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        resized = cv2.resize(image_rgb, (self.input_width, self.input_height))

        # Scale pixel values to 0-1, reorder to channels-first, and add a
        # batch dimension: (H, W, C) -> (1, C, H, W)
        input_image = resized / 255.0
        input_image = input_image.transpose(2, 0, 1)
        input_tensor = input_image[np.newaxis, :, :, :].astype(np.float32)
        return input_tensor

    def xywh2xyxy(self, x):
        # Convert bounding boxes from center format (cx, cy, w, h)
        # to corner format (x1, y1, x2, y2)
        y = np.copy(x)
        y[..., 0] = x[..., 0] - x[..., 2] / 2
        y[..., 1] = x[..., 1] - x[..., 3] / 2
        y[..., 2] = x[..., 0] + x[..., 2] / 2
        y[..., 3] = x[..., 1] + x[..., 3] / 2
        return y
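    # Worked example for the conversion above (illustrative numbers, not taken
    # from a real detection): a center-format box (cx=50, cy=40, w=20, h=10)
    # becomes the corner-format box (x1=40, y1=35, x2=60, y2=45).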
    def postprocess(self, outputs):
        # The model output is (1, 4 + num_classes, N); squeeze and transpose
        # so each row holds one candidate box
        predictions = np.squeeze(outputs).T
        scores = np.max(predictions[:, 4:], axis=1)
        predictions = predictions[scores > self.conf_threshold, :]
        scores = scores[scores > self.conf_threshold]
        class_ids = np.argmax(predictions[:, 4:], axis=1)

        # Rescale boxes from the model input size to the original image size
        boxes = predictions[:, :4]
        input_shape = np.array([self.input_width, self.input_height, self.input_width, self.input_height])
        boxes = np.divide(boxes, input_shape, dtype=np.float32)
        boxes *= np.array([self.image_width, self.image_height, self.image_width, self.image_height])
        boxes = boxes.astype(np.int32)

        # Non-maximum suppression to drop overlapping candidates
        indices = cv2.dnn.NMSBoxes(boxes, scores, score_threshold=self.score_threshold, nms_threshold=self.iou_threshold)
        detections = []
        for bbox, score, label in zip(self.xywh2xyxy(boxes[indices]), scores[indices], class_ids[indices]):
            detections.append({
                "class_index": label,
                "confidence": score,
                "box": bbox,
                "class_name": self.get_label_name(label)
            })
        return detections

    def get_label_name(self, class_id: int) -> str:
        return self.classes[class_id]

    def detect(self, img: np.ndarray) -> List:
        input_tensor = self.preprocess(img)
        outputs = self.session.run(self.output_names, {self.input_names[0]: input_tensor})[0]
        return self.postprocess(outputs)

    def draw_detections(self, img, detections: List):
        """
        Draws bounding boxes and labels on the input image based on the detected objects.

        Args:
            img: The input image to draw detections on.
            detections: List of detection results, each containing box, score, and class index.

        Returns:
            None
        """
        for detection in detections:
            # Extract the coordinates of the bounding box
            x1, y1, x2, y2 = detection['box'].astype(int)
            class_id = detection['class_index']
            confidence = detection['confidence']

            # Retrieve the color for the class ID
            color = self.color_palette[class_id]

            # Draw the bounding box on the image
            cv2.rectangle(img, (x1, y1), (x2, y2), color, 2)

            # Create the label text with class name and score
            label = f"{self.classes[class_id]}: {confidence:.2f}"

            # Calculate the dimensions of the label text
            (label_width, label_height), _ = cv2.getTextSize(label, cv2.FONT_HERSHEY_SIMPLEX, 0.5, 1)

            # Calculate the position of the label text
            label_x = x1
            label_y = y1 - 10 if y1 - 10 > label_height else y1 + 10

            # Draw a filled rectangle as the background for the label text
            cv2.rectangle(
                img, (label_x, label_y - label_height), (label_x + label_width, label_y + label_height), color, cv2.FILLED
            )

            # Draw the label text on the image
            cv2.putText(img, label, (label_x, label_y), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 0), 1, cv2.LINE_AA)


if __name__ == "__main__":
    weight_path = "weights/yolov9-c.onnx"
    image = cv2.imread("assets/sample_image.jpeg")
    h, w = image.shape[:2]
    detector = YOLOv9(model_path=weight_path,
                      class_mapping_path="weights/metadata.yaml",
                      original_size=(w, h))
    detections = detector.detect(image)
    detector.draw_detections(image, detections=detections)

    cv2.imshow("Result Preview", image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
--------------------------------------------------------------------------------