├── CLA.md
├── LICENSE.md
├── NanoOWL_Layout.json
├── README.md
├── assets
│   ├── Foxglove-WebSocket-connection.png
│   ├── Foxglove-default.png
│   ├── Foxglove-import-layout.png
│   ├── Foxglove-open-connection.png
│   ├── Foxglove-publish-panel.png
│   ├── ROS2-NanoOWL-query.png
│   ├── forklift_detection.png
│   ├── ladder_detection.png
│   ├── pallet_detection.png
│   └── people_detection.png
├── launch
│   ├── camera_input_example.launch.py
│   └── nano_owl_example.launch.py
├── package.xml
├── resource
│   └── ros2_nanoowl
├── ros2_nanoowl
│   ├── __init__.py
│   └── nano_owl_py.py
├── setup.cfg
├── setup.py
└── test
    ├── test_copyright.py
    ├── test_flake8.py
    └── test_pep257.py
/CLA.md:
--------------------------------------------------------------------------------
1 | ## Individual Contributor License Agreement (CLA)
2 |
3 | **Thank you for submitting your contributions to this project.**
4 |
5 | By signing this CLA, you agree that the following terms apply to all of your past, present and future contributions
6 | to the project.
7 |
8 | ### License.
9 |
10 | You hereby represent that all present, past and future contributions are governed by the
11 | [Apache-2.0 License](https://www.apache.org/licenses/LICENSE-2.0)
12 | copyright statement.
13 |
14 | This entails that to the extent possible under law, you transfer all copyright and related or neighboring rights
15 | of the code or documents you contribute to the project itself or its maintainers.
16 | Furthermore, you also represent that you have the authority to perform the above waiver
17 | with respect to the entirety of your contributions.
18 |
19 | ### Moral Rights.
20 |
21 | To the fullest extent permitted under applicable law, you hereby waive, and agree not to
22 | assert, all of your “moral rights” in or relating to your contributions for the benefit of the project.
23 |
24 | ### Third Party Content.
25 |
26 | If your Contribution includes or is based on any source code, object code, bug fixes, configuration changes, tools,
27 | specifications, documentation, data, materials, feedback, information or other works of authorship that were not
28 | authored by you (“Third Party Content”) or if you are aware of any third party intellectual property or proprietary
29 | rights associated with your Contribution (“Third Party Rights”),
30 | then you agree to include with the submission of your Contribution full details respecting such Third Party
31 | Content and Third Party Rights, including, without limitation, identification of which aspects of your
32 | Contribution contain Third Party Content or are associated with Third Party Rights, the owner/author of the
33 | Third Party Content and Third Party Rights, where you obtained the Third Party Content, and any applicable
34 | third party license terms or restrictions respecting the Third Party Content and Third Party Rights. For greater
35 | certainty, the foregoing obligations respecting the identification of Third Party Content and Third Party Rights
36 | do not apply to any portion of a Project that is incorporated into your Contribution to that same Project.
37 |
38 | ### Representations.
39 |
40 | You represent that, other than the Third Party Content and Third Party Rights identified by
41 | you in accordance with this Agreement, you are the sole author of your Contributions and are legally entitled
42 | to grant the foregoing licenses and waivers in respect of your Contributions. If your Contributions were
43 | created in the course of your employment with your past or present employer(s), you represent that such
44 | employer(s) has authorized you to make your Contributions on behalf of such employer(s) or such employer
45 | (s) has waived all of their right, title or interest in or to your Contributions.
46 |
47 | ### Disclaimer.
48 |
49 | To the fullest extent permitted under applicable law, your Contributions are provided on an "as is"
50 | basis, without any warranties or conditions, express or implied, including, without limitation, any implied
51 | warranties or conditions of non-infringement, merchantability or fitness for a particular purpose. You are not
52 | required to provide support for your Contributions, except to the extent you desire to provide support.
53 |
54 | ### No Obligation.
55 |
56 | You acknowledge that the maintainers of this project are under no obligation to use or incorporate your contributions
57 | into the project. The decision to use or incorporate your contributions into the project will be made at the
58 | sole discretion of the maintainers or their authorized delegates.
59 |
--------------------------------------------------------------------------------
/LICENSE.md:
--------------------------------------------------------------------------------
1 | Apache License
2 | Version 2.0, January 2004
3 | http://www.apache.org/licenses/
4 |
5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
6 |
7 | 1. Definitions.
8 |
9 | "License" shall mean the terms and conditions for use, reproduction,
10 | and distribution as defined by Sections 1 through 9 of this document.
11 |
12 | "Licensor" shall mean the copyright owner or entity authorized by
13 | the copyright owner that is granting the License.
14 |
15 | "Legal Entity" shall mean the union of the acting entity and all
16 | other entities that control, are controlled by, or are under common
17 | control with that entity. For the purposes of this definition,
18 | "control" means (i) the power, direct or indirect, to cause the
19 | direction or management of such entity, whether by contract or
20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the
21 | outstanding shares, or (iii) beneficial ownership of such entity.
22 |
23 | "You" (or "Your") shall mean an individual or Legal Entity
24 | exercising permissions granted by this License.
25 |
26 | "Source" form shall mean the preferred form for making modifications,
27 | including but not limited to software source code, documentation
28 | source, and configuration files.
29 |
30 | "Object" form shall mean any form resulting from mechanical
31 | transformation or translation of a Source form, including but
32 | not limited to compiled object code, generated documentation,
33 | and conversions to other media types.
34 |
35 | "Work" shall mean the work of authorship, whether in Source or
36 | Object form, made available under the License, as indicated by a
37 | copyright notice that is included in or attached to the work
38 | (an example is provided in the Appendix below).
39 |
40 | "Derivative Works" shall mean any work, whether in Source or Object
41 | form, that is based on (or derived from) the Work and for which the
42 | editorial revisions, annotations, elaborations, or other modifications
43 | represent, as a whole, an original work of authorship. For the purposes
44 | of this License, Derivative Works shall not include works that remain
45 | separable from, or merely link (or bind by name) to the interfaces of,
46 | the Work and Derivative Works thereof.
47 |
48 | "Contribution" shall mean any work of authorship, including
49 | the original version of the Work and any modifications or additions
50 | to that Work or Derivative Works thereof, that is intentionally
51 | submitted to Licensor for inclusion in the Work by the copyright owner
52 | or by an individual or Legal Entity authorized to submit on behalf of
53 | the copyright owner. For the purposes of this definition, "submitted"
54 | means any form of electronic, verbal, or written communication sent
55 | to the Licensor or its representatives, including but not limited to
56 | communication on electronic mailing lists, source code control systems,
57 | and issue tracking systems that are managed by, or on behalf of, the
58 | Licensor for the purpose of discussing and improving the Work, but
59 | excluding communication that is conspicuously marked or otherwise
60 | designated in writing by the copyright owner as "Not a Contribution."
61 |
62 | "Contributor" shall mean Licensor and any individual or Legal Entity
63 | on behalf of whom a Contribution has been received by Licensor and
64 | subsequently incorporated within the Work.
65 |
66 | 2. Grant of Copyright License. Subject to the terms and conditions of
67 | this License, each Contributor hereby grants to You a perpetual,
68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable
69 | copyright license to reproduce, prepare Derivative Works of,
70 | publicly display, publicly perform, sublicense, and distribute the
71 | Work and such Derivative Works in Source or Object form.
72 |
73 | 3. Grant of Patent License. Subject to the terms and conditions of
74 | this License, each Contributor hereby grants to You a perpetual,
75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable
76 | (except as stated in this section) patent license to make, have made,
77 | use, offer to sell, sell, import, and otherwise transfer the Work,
78 | where such license applies only to those patent claims licensable
79 | by such Contributor that are necessarily infringed by their
80 | Contribution(s) alone or by combination of their Contribution(s)
81 | with the Work to which such Contribution(s) was submitted. If You
82 | institute patent litigation against any entity (including a
83 | cross-claim or counterclaim in a lawsuit) alleging that the Work
84 | or a Contribution incorporated within the Work constitutes direct
85 | or contributory patent infringement, then any patent licenses
86 | granted to You under this License for that Work shall terminate
87 | as of the date such litigation is filed.
88 |
89 | 4. Redistribution. You may reproduce and distribute copies of the
90 | Work or Derivative Works thereof in any medium, with or without
91 | modifications, and in Source or Object form, provided that You
92 | meet the following conditions:
93 |
94 | (a) You must give any other recipients of the Work or
95 | Derivative Works a copy of this License; and
96 |
97 | (b) You must cause any modified files to carry prominent notices
98 | stating that You changed the files; and
99 |
100 | (c) You must retain, in the Source form of any Derivative Works
101 | that You distribute, all copyright, patent, trademark, and
102 | attribution notices from the Source form of the Work,
103 | excluding those notices that do not pertain to any part of
104 | the Derivative Works; and
105 |
106 | (d) If the Work includes a "NOTICE" text file as part of its
107 | distribution, then any Derivative Works that You distribute must
108 | include a readable copy of the attribution notices contained
109 | within such NOTICE file, excluding those notices that do not
110 | pertain to any part of the Derivative Works, in at least one
111 | of the following places: within a NOTICE text file distributed
112 | as part of the Derivative Works; within the Source form or
113 | documentation, if provided along with the Derivative Works; or,
114 | within a display generated by the Derivative Works, if and
115 | wherever such third-party notices normally appear. The contents
116 | of the NOTICE file are for informational purposes only and
117 | do not modify the License. You may add Your own attribution
118 | notices within Derivative Works that You distribute, alongside
119 | or as an addendum to the NOTICE text from the Work, provided
120 | that such additional attribution notices cannot be construed
121 | as modifying the License.
122 |
123 | You may add Your own copyright statement to Your modifications and
124 | may provide additional or different license terms and conditions
125 | for use, reproduction, or distribution of Your modifications, or
126 | for any such Derivative Works as a whole, provided Your use,
127 | reproduction, and distribution of the Work otherwise complies with
128 | the conditions stated in this License.
129 |
130 | 5. Submission of Contributions. Unless You explicitly state otherwise,
131 | any Contribution intentionally submitted for inclusion in the Work
132 | by You to the Licensor shall be under the terms and conditions of
133 | this License, without any additional terms or conditions.
134 | Notwithstanding the above, nothing herein shall supersede or modify
135 | the terms of any separate license agreement you may have executed
136 | with Licensor regarding such Contributions.
137 |
138 | 6. Trademarks. This License does not grant permission to use the trade
139 | names, trademarks, service marks, or product names of the Licensor,
140 | except as required for reasonable and customary use in describing the
141 | origin of the Work and reproducing the content of the NOTICE file.
142 |
143 | 7. Disclaimer of Warranty. Unless required by applicable law or
144 | agreed to in writing, Licensor provides the Work (and each
145 | Contributor provides its Contributions) on an "AS IS" BASIS,
146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
147 | implied, including, without limitation, any warranties or conditions
148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
149 | PARTICULAR PURPOSE. You are solely responsible for determining the
150 | appropriateness of using or redistributing the Work and assume any
151 | risks associated with Your exercise of permissions under this License.
152 |
153 | 8. Limitation of Liability. In no event and under no legal theory,
154 | whether in tort (including negligence), contract, or otherwise,
155 | unless required by applicable law (such as deliberate and grossly
156 | negligent acts) or agreed to in writing, shall any Contributor be
157 | liable to You for damages, including any direct, indirect, special,
158 | incidental, or consequential damages of any character arising as a
159 | result of this License or out of the use or inability to use the
160 | Work (including but not limited to damages for loss of goodwill,
161 | work stoppage, computer failure or malfunction, or any and all
162 | other commercial damages or losses), even if such Contributor
163 | has been advised of the possibility of such damages.
164 |
165 | 9. Accepting Warranty or Additional Liability. While redistributing
166 | the Work or Derivative Works thereof, You may choose to offer,
167 | and charge a fee for, acceptance of support, warranty, indemnity,
168 | or other liability obligations and/or rights consistent with this
169 | License. However, in accepting such obligations, You may act only
170 | on Your own behalf and on Your sole responsibility, not on behalf
171 | of any other Contributor, and only if You agree to indemnify,
172 | defend, and hold each Contributor harmless for any liability
173 | incurred by, or claims asserted against, such Contributor by reason
174 | of your accepting any such warranty or additional liability.
175 |
176 | END OF TERMS AND CONDITIONS
177 |
178 | APPENDIX: How to apply the Apache License to your work.
179 |
180 | To apply the Apache License to your work, attach the following
181 | boilerplate notice, with the fields enclosed by brackets "[]"
182 | replaced with your own identifying information. (Don't include
183 | the brackets!) The text should be enclosed in the appropriate
184 | comment syntax for the file format. We also recommend that a
185 | file or class name and description of purpose be included on the
186 | same "printed page" as the copyright notice for easier
187 | identification within third-party archives.
188 |
189 | Copyright [yyyy] [name of copyright owner]
190 |
191 | Licensed under the Apache License, Version 2.0 (the "License");
192 | you may not use this file except in compliance with the License.
193 | You may obtain a copy of the License at
194 |
195 | http://www.apache.org/licenses/LICENSE-2.0
196 |
197 | Unless required by applicable law or agreed to in writing, software
198 | distributed under the License is distributed on an "AS IS" BASIS,
199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
200 | See the License for the specific language governing permissions and
201 | limitations under the License.
202 |
--------------------------------------------------------------------------------
/NanoOWL_Layout.json:
--------------------------------------------------------------------------------
1 | {
2 | "configById": {
3 | "Image!3mnp456": {
4 | "cameraState": {
5 | "distance": 20,
6 | "perspective": true,
7 | "phi": 60,
8 | "target": [
9 | 0,
10 | 0,
11 | 0
12 | ],
13 | "targetOffset": [
14 | 0,
15 | 0,
16 | 0
17 | ],
18 | "targetOrientation": [
19 | 0,
20 | 0,
21 | 0,
22 | 1
23 | ],
24 | "thetaOffset": 45,
25 | "fovy": 45,
26 | "near": 0.5,
27 | "far": 5000
28 | },
29 | "followMode": "follow-pose",
30 | "scene": {},
31 | "transforms": {},
32 | "topics": {},
33 | "layers": {},
34 | "publish": {
35 | "type": "point",
36 | "poseTopic": "/move_base_simple/goal",
37 | "pointTopic": "/clicked_point",
38 | "poseEstimateTopic": "/initialpose",
39 | "poseEstimateXDeviation": 0.5,
40 | "poseEstimateYDeviation": 0.5,
41 | "poseEstimateThetaDeviation": 0.26179939
42 | },
43 | "imageMode": {
44 | "imageTopic": "/input_image"
45 | }
46 | },
47 | "Image!1cdv3dh": {
48 | "cameraState": {
49 | "distance": 20,
50 | "perspective": true,
51 | "phi": 60,
52 | "target": [
53 | 0,
54 | 0,
55 | 0
56 | ],
57 | "targetOffset": [
58 | 0,
59 | 0,
60 | 0
61 | ],
62 | "targetOrientation": [
63 | 0,
64 | 0,
65 | 0,
66 | 1
67 | ],
68 | "thetaOffset": 45,
69 | "fovy": 45,
70 | "near": 0.5,
71 | "far": 5000
72 | },
73 | "followMode": "follow-pose",
74 | "scene": {},
75 | "transforms": {},
76 | "topics": {},
77 | "layers": {},
78 | "publish": {
79 | "type": "point",
80 | "poseTopic": "/move_base_simple/goal",
81 | "pointTopic": "/clicked_point",
82 | "poseEstimateTopic": "/initialpose",
83 | "poseEstimateXDeviation": 0.5,
84 | "poseEstimateYDeviation": 0.5,
85 | "poseEstimateThetaDeviation": 0.26179939
86 | },
87 | "imageMode": {
88 | "imageTopic": "/output_image"
89 | }
90 | },
91 | "Publish!1ujcm34": {
92 | "buttonText": "Publish",
93 | "buttonTooltip": "",
94 | "advancedView": true,
95 | "value": "{\n \"data\": \"a cone, a box, a ladder, a pallet, a person\"\n}",
96 | "topicName": "input_query",
97 | "datatype": "std_msgs/String",
98 | "buttonColor": "#ea1d53"
99 | }
100 | },
101 | "globalVariables": {},
102 | "userNodes": {},
103 | "playbackConfig": {
104 | "speed": 1
105 | },
106 | "layout": {
107 | "first": {
108 | "first": "Image!3mnp456",
109 | "second": "Image!1cdv3dh",
110 | "direction": "row"
111 | },
112 | "second": "Publish!1ujcm34",
113 | "direction": "column",
114 | "splitPercentage": 78.99860917941585
115 | }
116 | }
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # ROS2 NanoOWL
2 |
3 | ## ROS2 node for open-vocabulary object detection using [NanoOWL](https://github.com/NVIDIA-AI-IOT/nanoowl).
4 |
5 | [NanoOWL](https://github.com/NVIDIA-AI-IOT/nanoowl) optimizes [OWL-ViT](https://huggingface.co/docs/transformers/model_doc/owlvit) to run real-time on NVIDIA Jetson Orin with [TensorRT](https://developer.nvidia.com/tensorrt). This project provides a ROS 2 package for object detection using NanoOWL.
6 |
7 | ![People detection](assets/people_detection.png)
8 |
9 | ![Forklift detection](assets/forklift_detection.png)
10 |
11 | ![Ladder detection](assets/ladder_detection.png)
12 |
13 | ![Pallet detection](assets/pallet_detection.png)
14 |
20 | ## Setup
21 |
22 | 1. Set up your Isaac ROS development environment following instructions [here](https://nvidia-isaac-ros.github.io/getting_started/dev_env_setup.html).
23 | 2. Clone required projects under ```${ISAAC_ROS_WS}/src```:
24 | ```
25 | cd ${ISAAC_ROS_WS}/src
26 | git clone https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_common.git
27 | git clone https://github.com/NVIDIA-AI-IOT/ROS2-NanoOWL.git
28 | git clone https://github.com/NVIDIA-AI-IOT/nanoowl
29 | git clone https://github.com/NVIDIA-AI-IOT/torch2trt
30 | git clone --branch humble https://github.com/ros2/demos.git
31 | ```
32 | 3. Launch the docker container using the ```run_dev.sh``` script:
33 | ```
34 | cd ${ISAAC_ROS_WS}/src/isaac_ros_common
35 | ./scripts/run_dev.sh
36 | ```
37 | 4. Install dependencies:
38 | * **PyTorch**: The Isaac ROS development environment set up in step 1 comes with PyTorch preinstalled. Check your PyTorch version by running ```python3``` in a terminal and entering these commands in the interactive interpreter:
39 | ```
40 | import torch
41 | torch.__version__
42 | ```
43 | * **NVIDIA TensorRT**: If you’re developing on an NVIDIA Jetson, TensorRT comes preinstalled with JetPack. Verify the installation by running ```python3``` in a terminal and entering ```import tensorrt``` in the interactive interpreter. If this raises ```ModuleNotFoundError```, try the following command and check again following the steps above:
44 | ```
45 | sudo apt-get install python3-libnvinfer-dev
46 | ```
47 | If this fails, run the following command and try again:
48 | ```
49 | sudo apt-get install apt-utils
50 | ```
51 | If the ```ModuleNotFoundError``` still shows up: the Python bindings to TensorRT live in ```dist-packages```, which may not be visible to your environment. Add ```dist-packages``` to ```PYTHONPATH``` to make this work:
52 | ```
53 | export PYTHONPATH=/usr/lib/python3.8/dist-packages:$PYTHONPATH
54 | ```
55 | If ```tensorrt``` still fails to import, try the following command:
56 | ```
57 | pip install pycuda
58 | ```
59 | * **Torchvision**: Identify which version of torchvision is compatible with your PyTorch version from [here](https://pytorch.org/get-started/previous-versions/). Clone and install that specific version from source in your workspace's ```src``` folder: ```git clone --branch <version> https://github.com/pytorch/vision.git```. For example:
60 | ```
61 | cd ${ISAAC_ROS_WS}/src
62 | git clone --branch v0.13.0 https://github.com/pytorch/vision.git
63 | cd vision
64 | pip install .
65 | ```
66 | Verify that torchvision was installed correctly. Run ```cd ../``` first so Python doesn't import from the local source tree, then launch ```python3``` and enter these commands in the interactive interpreter:
67 | ```
68 | import torchvision
69 | torchvision.__version__
70 | ```
72 | If this raises ```ModuleNotFoundError```, try each of the following and check again following the steps above:
73 | ```
74 | sudo apt install nvidia-cuda-dev
75 | pip install ninja
76 | sudo apt-get install ninja-build
77 | ```
78 | * **Transformers library**:
79 | ```
80 | pip install transformers
81 | ```
82 | * **Matplotlib**:
83 | ```
84 | pip install matplotlib
85 | ```
86 | * **torch2trt**:
87 | Enter the torch2trt repository cloned in step 2 and install the package:
88 | ```
89 | cd ${ISAAC_ROS_WS}/src/torch2trt
90 | pip install .
91 | ```
92 | * **NanoOWL**:
93 | Enter the NanoOWL repository cloned in step 2 and install the package:
94 | ```
95 | cd ${ISAAC_ROS_WS}/src/nanoowl
96 | pip install .
97 | ```
98 | * **cam2image**:
99 | We use the [image_tools](https://github.com/ros2/demos/tree/rolling/image_tools) package from the ```demos``` repository cloned in step 2 to take input from an attached USB camera. Build and source this package from your workspace:
100 | ```
101 | cd ${ISAAC_ROS_WS}
102 | colcon build --symlink-install --packages-select image_tools
103 | source install/setup.bash
104 | ```
105 | Verify that the cam2image node works by running the following command in a terminal and viewing topic ```/image``` in RViz/Foxglove from another terminal:
106 | ```
107 | ros2 run image_tools cam2image
108 | ```
109 | 5. Build ros2_nanoowl:
110 | ```
111 | cd ${ISAAC_ROS_WS}
112 | colcon build --symlink-install --packages-select ros2_nanoowl
113 | source install/setup.bash
114 | ```
115 | 6. Build the TensorRT engine for the OWL-ViT vision encoder - this step may take a few minutes:
116 | ```
117 | cd ${ISAAC_ROS_WS}/src/nanoowl
118 | mkdir -p data
119 | python3 -m nanoowl.build_image_encoder_engine data/owl_image_encoder_patch32.engine
120 | ```
121 | Copy this ```data``` folder with the generated engine file to the ROS2-NanoOWL folder:
122 | ```
123 | cp -r data/ ${ISAAC_ROS_WS}/src/ROS2-NanoOWL
124 | ```
125 | 7. Run the image publisher node to publish input images for inference. We can use the sample image in ```${ISAAC_ROS_WS}/src/nanoowl/assets/```:
126 | ```
127 | cd ${ISAAC_ROS_WS}
128 | ros2 run image_publisher image_publisher_node src/nanoowl/assets/owl_glove_small.jpg --ros-args --remap /image_raw:=/input_image
129 | ```
130 | 8. You can also play a rosbag for inference. Make sure to remap the image topic to ```input_image```. For example:
131 | ```
132 | ros2 bag play <path-to-rosbag> --remap /front/stereo_camera/left/rgb:=/input_image
133 | ```
134 | 9. From another terminal, publish your input query as a list of objects on the ```input_query``` topic using the command below. This query can be changed anytime while the ```ros2_nanoowl``` node is running to detect different objects. Another way to publish your query is through the ```publish``` panel in [Foxglove](https://foxglove.dev/) (instructions below); a scripted rclpy alternative is sketched at the end of this list.
135 | ```
136 | ros2 topic pub /input_query std_msgs/String 'data: a person, a box, a forklift'
137 | ```
138 | 10. Run the launch file to start detecting objects. Find more information on usage and arguments below:
139 | ```
140 | ros2 launch ros2_nanoowl nano_owl_example.launch.py thresholds:=0.1 image_encoder_engine:='src/ROS2-NanoOWL/data/owl_image_encoder_patch32.engine'
141 | ```
142 | 11. The ```ros2_nanoowl``` node prints the current query to terminal, so you can check that your most recent query is being used:
143 | ![ros2_nanoowl query output](assets/ROS2-NanoOWL-query.png)
144 |
145 | If an older query is being published, please update it:
146 | * If using Foxglove: Check that the query on the panel is correct and click the Publish button again. Remember to click the Publish button every time you update your query!
147 | * If using command line: Rerun the ```ros2 topic pub``` command (given in step 9) with the updated query.
148 | 12. Visualize output on topic ```/output_image``` using RViz or Foxglove. Output bounding boxes are published on topic ```/output_detections```.
149 | 13. To perform inference on a live camera stream, run the following launch file. Publish a query as given in step 9:
150 | ```
151 | ros2 launch ros2_nanoowl camera_input_example.launch.py thresholds:=0.1 image_encoder_engine:='src/ROS2-NanoOWL/data/owl_image_encoder_patch32.engine'
152 | ```
153 |
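As referenced in step 9, the query can also be published from a short rclpy script instead of ```ros2 topic pub```. A minimal sketch follows; the node name ```query_publisher``` and the once-per-second republish are illustrative, while the topic name and message type are the ones this package subscribes to:
```
import rclpy
from rclpy.node import Node
from std_msgs.msg import String


def main():
    rclpy.init()
    node = Node('query_publisher')  # illustrative node name
    pub = node.create_publisher(String, 'input_query', 10)

    msg = String()
    msg.data = 'a person, a box, a forklift'  # comma-separated list of objects

    # Republish once per second so late-joining subscribers still receive the query
    node.create_timer(1.0, lambda: pub.publish(msg))
    rclpy.spin(node)


if __name__ == '__main__':
    main()
```
Like the ```ros2 topic pub``` command, this keeps publishing until stopped, so the ```ros2_nanoowl``` node picks up the query regardless of start order.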
154 | ## Usage
155 |
156 | ```ros2 launch ros2_nanoowl nano_owl_example.launch.py thresholds:=<threshold_value> image_encoder_engine:=<path_to_image_encoder_engine>```
157 |
158 | ## ROS Parameters
159 |
160 | | ROS Parameter | Type | Default | Description |
161 | | --- | --- | --- | --- |
162 | | thresholds | float | 0.1 | Threshold for filtering detections |
163 | | image_encoder_engine | string | "src/ROS2-NanoOWL/data/owl_image_encoder_patch32.engine" | Path to the TensorRT engine for the OWL-ViT vision encoder |
164 |
165 | ## Topics Subscribed
166 |
167 | | ROS Topic | Interface | Description |
168 | | --- | --- | --- |
169 | | input_image | [sensor_msgs/Image](https://github.com/ros2/common_interfaces/blob/humble/sensor_msgs/msg/Image.msg) | The image on which detection is to be performed |
170 | | input_query | [std_msgs/String](https://github.com/ros2/common_interfaces/blob/humble/std_msgs/msg/String.msg) | List of objects to be detected in the image |
171 |
172 | ## Topics Published
173 |
174 | | ROS Topic | Interface | Description |
175 | | --- | --- | --- |
176 | | output_image | [sensor_msgs/Image](https://github.com/ros2/common_interfaces/blob/humble/sensor_msgs/msg/Image.msg) | The output image with bounding boxes and labels around detected objects |
177 | | output_detections | [vision_msgs/Detection2DArray](https://github.com/ros-perception/vision_msgs/blob/ros2/vision_msgs/msg/Detection2DArray.msg) | Output detections including bounding box coordinates and label information for each detected object in the image |
178 |
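To consume these topics from your own code, a minimal listener might look like the following sketch; the node name ```detection_logger``` and the log format are illustrative, while the topic names and interfaces are the ones in the tables above:
```
import rclpy
from rclpy.node import Node
from vision_msgs.msg import Detection2DArray


class DetectionLogger(Node):
    """Illustrative consumer that logs each detection published by ros2_nanoowl."""

    def __init__(self):
        super().__init__('detection_logger')
        self.create_subscription(
            Detection2DArray, 'output_detections', self.on_detections, 10)

    def on_detections(self, msg):
        for det in msg.detections:
            # class_id holds the index of the label within the current query
            class_id = det.results[0].hypothesis.class_id if det.results else '?'
            self.get_logger().info(
                'label %s: center=(%.1f, %.1f) size=%.1fx%.1f' % (
                    class_id,
                    det.bbox.center.position.x, det.bbox.center.position.y,
                    det.bbox.size_x, det.bbox.size_y))


def main():
    rclpy.init()
    rclpy.spin(DetectionLogger())


if __name__ == '__main__':
    main()
```
Build this into your own package, or run it directly with ```python3``` after sourcing your workspace.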
179 | ## Using Foxglove for visualization and publishing queries
180 |
181 | 1. [Download](https://foxglove.dev/download) and install Foxglove on your Jetson.
182 | 2. Open Foxglove and click on **Open connection**.
183 | ![Foxglove open connection](assets/Foxglove-open-connection.png)
184 | 3. Click on the **Foxglove WebSocket** option, which connects to your system using the [Foxglove WebSocket](https://docs.foxglove.dev/docs/connecting-to-data/frameworks/ros2#foxglove-websocket) protocol. This option requires running an extra ROS node, the **foxglove_bridge**.
185 | ![Foxglove WebSocket connection](assets/Foxglove-WebSocket-connection.png)
186 | 4. Follow instructions on installing and launching the [Foxglove bridge](https://docs.foxglove.dev/docs/connecting-to-data/ros-foxglove-bridge).
187 | 5. Once you’ve successfully launched foxglove_bridge in a terminal, Foxglove should connect to your system and show the default layout.
188 | ![Foxglove default layout](assets/Foxglove-default.png)
189 | 6. Use the **Import from file** option to import the **NanoOWL_Layout.json** file included in this repository.
190 | ![Foxglove import layout](assets/Foxglove-import-layout.png)
191 | 7. From the panel at the bottom, you can publish and update queries to the ros2_nanoowl node. Type in the objects you want to detect and click the red Publish button to start inference!
192 | ![Foxglove publish panel](assets/Foxglove-publish-panel.png)
193 |
194 | ## Resources
195 |
196 | 1. [NanoOWL](https://github.com/NVIDIA-AI-IOT/nanoowl) - A project that optimizes OWL-ViT for real-time inference with NVIDIA TensorRT.
197 | 2. [torch2trt](https://github.com/NVIDIA-AI-IOT/torch2trt) - An easy-to-use PyTorch-to-TensorRT converter.
198 |
--------------------------------------------------------------------------------
/assets/Foxglove-WebSocket-connection.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NVIDIA-AI-IOT/ROS2-NanoOWL/de2236f6dce43d43ada35ac1e2ba4455faeee69e/assets/Foxglove-WebSocket-connection.png
--------------------------------------------------------------------------------
/assets/Foxglove-default.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NVIDIA-AI-IOT/ROS2-NanoOWL/de2236f6dce43d43ada35ac1e2ba4455faeee69e/assets/Foxglove-default.png
--------------------------------------------------------------------------------
/assets/Foxglove-import-layout.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NVIDIA-AI-IOT/ROS2-NanoOWL/de2236f6dce43d43ada35ac1e2ba4455faeee69e/assets/Foxglove-import-layout.png
--------------------------------------------------------------------------------
/assets/Foxglove-open-connection.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NVIDIA-AI-IOT/ROS2-NanoOWL/de2236f6dce43d43ada35ac1e2ba4455faeee69e/assets/Foxglove-open-connection.png
--------------------------------------------------------------------------------
/assets/Foxglove-publish-panel.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NVIDIA-AI-IOT/ROS2-NanoOWL/de2236f6dce43d43ada35ac1e2ba4455faeee69e/assets/Foxglove-publish-panel.png
--------------------------------------------------------------------------------
/assets/ROS2-NanoOWL-query.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NVIDIA-AI-IOT/ROS2-NanoOWL/de2236f6dce43d43ada35ac1e2ba4455faeee69e/assets/ROS2-NanoOWL-query.png
--------------------------------------------------------------------------------
/assets/forklift_detection.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NVIDIA-AI-IOT/ROS2-NanoOWL/de2236f6dce43d43ada35ac1e2ba4455faeee69e/assets/forklift_detection.png
--------------------------------------------------------------------------------
/assets/ladder_detection.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NVIDIA-AI-IOT/ROS2-NanoOWL/de2236f6dce43d43ada35ac1e2ba4455faeee69e/assets/ladder_detection.png
--------------------------------------------------------------------------------
/assets/pallet_detection.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NVIDIA-AI-IOT/ROS2-NanoOWL/de2236f6dce43d43ada35ac1e2ba4455faeee69e/assets/pallet_detection.png
--------------------------------------------------------------------------------
/assets/people_detection.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NVIDIA-AI-IOT/ROS2-NanoOWL/de2236f6dce43d43ada35ac1e2ba4455faeee69e/assets/people_detection.png
--------------------------------------------------------------------------------
/launch/camera_input_example.launch.py:
--------------------------------------------------------------------------------
1 | # SPDX-FileCopyrightText: Copyright (c) NVIDIA CORPORATION & AFFILIATES. All rights reserved.
2 | # SPDX-License-Identifier: Apache-2.0
3 | #
4 | # Licensed under the Apache License, Version 2.0 (the "License");
5 | # you may not use this file except in compliance with the License.
6 | # You may obtain a copy of the License at
7 | #
8 | # http://www.apache.org/licenses/LICENSE-2.0
9 | #
10 | # Unless required by applicable law or agreed to in writing, software
11 | # distributed under the License is distributed on an "AS IS" BASIS,
12 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | # See the License for the specific language governing permissions and
14 | # limitations under the License.
15 |
16 | from launch import LaunchDescription
17 | from launch_ros.actions import Node
18 | from launch.substitutions import LaunchConfiguration
19 | from launch.actions import DeclareLaunchArgument
20 |
21 | def generate_launch_description():
22 | launch_args = [
23 | DeclareLaunchArgument(
24 | 'thresholds',
25 | default_value='0.1',
26 | description='Threshold for filtering detections'),
27 | DeclareLaunchArgument(
28 | 'image_encoder_engine',
29 | default_value='src/ROS2-NanoOWL/data/owl_image_encoder_patch32.engine',
30 | description='Path to the TensorRT engine for the OWL-ViT vision encoder'),
31 | ]
32 |
33 | # NanoOWL parameters
34 | thresholds = LaunchConfiguration('thresholds')
35 | image_encoder_engine = LaunchConfiguration('image_encoder_engine')
36 |
37 | cam2image_node = Node(
38 | package='image_tools',
39 | executable='cam2image',
40 | remappings=[('image', 'input_image')]
41 | )
42 |
43 | nanoowl_node = Node(
44 | package='ros2_nanoowl',
45 | executable='nano_owl_py',
46 | parameters=[{
47 | 'model': 'google/owlvit-base-patch32',
48 | 'image_encoder_engine': image_encoder_engine,
49 | 'thresholds': thresholds,
50 | }]
51 | )
52 |
53 | final_launch_description = launch_args + [cam2image_node] + [nanoowl_node]
54 |
55 | return LaunchDescription(final_launch_description)
56 |
--------------------------------------------------------------------------------
/launch/nano_owl_example.launch.py:
--------------------------------------------------------------------------------
1 | # SPDX-FileCopyrightText: Copyright (c) NVIDIA CORPORATION & AFFILIATES. All rights reserved.
2 | # SPDX-License-Identifier: Apache-2.0
3 | #
4 | # Licensed under the Apache License, Version 2.0 (the "License");
5 | # you may not use this file except in compliance with the License.
6 | # You may obtain a copy of the License at
7 | #
8 | # http://www.apache.org/licenses/LICENSE-2.0
9 | #
10 | # Unless required by applicable law or agreed to in writing, software
11 | # distributed under the License is distributed on an "AS IS" BASIS,
12 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | # See the License for the specific language governing permissions and
14 | # limitations under the License.
15 |
16 | from launch import LaunchDescription
17 | from launch_ros.actions import Node
18 | from launch.substitutions import LaunchConfiguration
19 | from launch.actions import DeclareLaunchArgument
20 |
21 | def generate_launch_description():
22 | launch_args = [
23 | DeclareLaunchArgument(
24 | 'thresholds',
25 | default_value='0.1',
26 | description='Threshold for filtering detections'),
27 | DeclareLaunchArgument(
28 | 'image_encoder_engine',
29 | default_value='src/ROS2-NanoOWL/data/owl_image_encoder_patch32.engine',
30 | description='Path to the TensorRT engine for the OWL-ViT vision encoder'),
31 | ]
32 |
33 | # NanoOWL parameters
34 | thresholds = LaunchConfiguration('thresholds')
35 | image_encoder_engine = LaunchConfiguration('image_encoder_engine')
36 |
37 | nanoowl_node = Node(
38 | package='ros2_nanoowl',
39 | executable='nano_owl_py',
40 | parameters=[{
41 | 'model': 'google/owlvit-base-patch32',
42 | 'image_encoder_engine': image_encoder_engine,
43 | 'thresholds': thresholds,
44 | }]
45 | )
46 |
47 | final_launch_description = launch_args + [nanoowl_node]
48 |
49 | return LaunchDescription(final_launch_description)
50 |
--------------------------------------------------------------------------------
/package.xml:
--------------------------------------------------------------------------------
1 | <?xml version="1.0"?>
2 | <?xml-model href="http://download.ros.org/schema/package_format3.xsd" schematypens="http://www.w3.org/2001/XMLSchema"?>
3 | <package format="3">
4 |   <name>ros2_nanoowl</name>
5 |   <version>0.0.0</version>
6 |   <description>ROS 2 package for object detection using NanoOWL on NVIDIA Jetson</description>
7 |   <maintainer email="asawareeb@nvidia.com">asawareeb</maintainer>
8 |   <license>TODO: License declaration</license>
9 |
10 |   <test_depend>ament_copyright</test_depend>
11 |   <test_depend>ament_flake8</test_depend>
12 |   <test_depend>ament_pep257</test_depend>
13 |   <test_depend>python3-pytest</test_depend>
14 |
15 |   <exec_depend>rclpy</exec_depend>
16 |   <exec_depend>std_msgs</exec_depend>
17 |   <exec_depend>sensor_msgs</exec_depend>
18 |   <exec_depend>vision_msgs</exec_depend>
19 |   <exec_depend>ros2launch</exec_depend>
20 |   <exec_depend>opencv2</exec_depend>
21 |   <exec_depend>cv_bridge</exec_depend>
22 |
23 |   <export>
24 |     <build_type>ament_python</build_type>
25 |   </export>
26 | </package>
27 |
--------------------------------------------------------------------------------
/resource/ros2_nanoowl:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NVIDIA-AI-IOT/ROS2-NanoOWL/de2236f6dce43d43ada35ac1e2ba4455faeee69e/resource/ros2_nanoowl
--------------------------------------------------------------------------------
/ros2_nanoowl/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NVIDIA-AI-IOT/ROS2-NanoOWL/de2236f6dce43d43ada35ac1e2ba4455faeee69e/ros2_nanoowl/__init__.py
--------------------------------------------------------------------------------
/ros2_nanoowl/nano_owl_py.py:
--------------------------------------------------------------------------------
1 | # SPDX-FileCopyrightText: Copyright (c) NVIDIA CORPORATION & AFFILIATES. All rights reserved.
2 | # SPDX-License-Identifier: Apache-2.0
3 | #
4 | # Licensed under the Apache License, Version 2.0 (the "License");
5 | # you may not use this file except in compliance with the License.
6 | # You may obtain a copy of the License at
7 | #
8 | # http://www.apache.org/licenses/LICENSE-2.0
9 | #
10 | # Unless required by applicable law or agreed to in writing, software
11 | # distributed under the License is distributed on an "AS IS" BASIS,
12 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | # See the License for the specific language governing permissions and
14 | # limitations under the License.
15 |
16 | import rclpy
17 | from rclpy.node import Node
18 | from std_msgs.msg import String
19 | from sensor_msgs.msg import Image
20 | from vision_msgs.msg import Detection2D, Detection2DArray, ObjectHypothesisWithPose
21 | from cv_bridge import CvBridge
22 | import cv2
23 | import numpy as np
24 | from PIL import Image as im
25 | from nanoowl.owl_predictor import OwlPredictor
26 | from nanoowl.owl_drawing import draw_owl_output
27 |
28 | class Nano_OWL_Subscriber(Node):
29 |
30 | def __init__(self):
31 | super().__init__('nano_owl_subscriber')
32 |
33 | self.declare_parameter('model', 'google/owlvit-base-patch32')
34 | self.declare_parameter('image_encoder_engine', '../data/owl_image_encoder_patch32.engine')
35 | self.declare_parameter('thresholds', rclpy.Parameter.Type.DOUBLE)
36 |
37 | # Subscriber for input query
38 | self.query_subscription = self.create_subscription(
39 | String,
40 | 'input_query',
41 | self.query_listener_callback,
42 | 10)
43 | self.query_subscription # prevent unused variable warning
44 |
45 | # Subscriber for input image
46 | self.image_subscription = self.create_subscription(
47 | Image,
48 | 'input_image',
49 | self.listener_callback,
50 | 10)
51 | self.image_subscription # prevent unused variable warning
52 |
53 | # To convert ROS image message to OpenCV image
54 | self.cv_br = CvBridge()
55 |
56 | self.output_publisher = self.create_publisher(Detection2DArray, 'output_detections', 10)
57 | self.output_image_publisher = self.create_publisher(Image, 'output_image', 10)
58 |
59 | self.image_encoder_engine = self.get_parameter('image_encoder_engine').get_parameter_value().string_value
60 |
61 | self.predictor = OwlPredictor(
62 | 'google/owlvit-base-patch32',
63 | image_encoder_engine=self.image_encoder_engine
64 | )
65 |
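        # Default query, used until one arrives on the input_query topic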
66 | self.query = "a person, a box"
67 |
68 | def query_listener_callback(self, msg):
69 | self.query = msg.data
70 |
71 |
72 | def listener_callback(self, data):
73 | input_query = self.query
74 | input_model = self.get_parameter('model').get_parameter_value().string_value
75 | input_image_encoder_engine = self.get_parameter('image_encoder_engine').get_parameter_value().string_value
76 | thresholds = self.get_parameter('thresholds').get_parameter_value().double_value
77 |
78 | # call model with input_query and input_image
79 | cv_img = self.cv_br.imgmsg_to_cv2(data, 'rgb8')
80 | PIL_img = im.fromarray(cv_img)
81 |
82 | # Parsing input text prompt
83 | prompt = input_query.strip("][()")
84 | text = prompt.split(',')
85 | self.get_logger().info('Your query: %s' % text)
86 |
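        # The predictor expects one threshold per query label, so replicate the scalar parameter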
87 | thresholds = [thresholds] * len(text)
88 |
89 | text_encodings = self.predictor.encode_text(text)
90 |
91 | output = self.predictor.predict(
92 | image=PIL_img,
93 | text=text,
94 | text_encodings=text_encodings,
95 | threshold=thresholds,
96 | pad_square=False
97 | )
98 |
99 | detections_arr = Detection2DArray()
100 | detections_arr.header = data.header
101 |
102 | num_detections = len(output.labels)
103 |
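        # Pack each detection into a vision_msgs/Detection2D: center/size bbox plus the label index as class_id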
104 | for i in range(num_detections):
105 | box = output.boxes[i]
106 | label_index = int(output.labels[i])
107 | box = [float(x) for x in box]
110 | obj = Detection2D()
111 | obj.bbox.size_x = abs(box[2] - box[0])
112 | obj.bbox.size_y = abs(box[1] - box[3])
113 | obj.bbox.center.position.x = (box[0] + box[2]) / 2.0
114 | obj.bbox.center.position.y = (box[1] + box[3]) / 2.0
115 | hyp = ObjectHypothesisWithPose()
116 | hyp.hypothesis.class_id = str(label_index)
117 | obj.results.append(hyp)
118 | obj.header = data.header
119 | detections_arr.detections.append(obj)
120 |
121 | self.output_publisher.publish(detections_arr)
122 |
123 | image = draw_owl_output(PIL_img, output, text=text, draw_text=True)
124 | # convert PIL image to ROS2 image message before publishing
125 | image = np.array(image)
126 | # convert RGB to BGR
127 | image = image[:, :, ::-1].copy()
128 |
129 | self.output_image_publisher.publish(self.cv_br.cv2_to_imgmsg(image, "bgr8"))
130 |
131 |
132 |
133 | def main(args=None):
134 | rclpy.init(args=args)
135 |
136 | nano_owl_subscriber = Nano_OWL_Subscriber()
137 |
138 | rclpy.spin(nano_owl_subscriber)
139 |
140 | # Destroy the node explicitly
141 | # (optional - otherwise it will be done automatically
142 | # when the garbage collector destroys the node object)
143 | nano_owl_subscriber.destroy_node()
144 | rclpy.shutdown()
145 |
146 |
147 | if __name__ == '__main__':
148 | main()
149 |
--------------------------------------------------------------------------------
/setup.cfg:
--------------------------------------------------------------------------------
1 | [develop]
2 | script_dir=$base/lib/ros2_nanoowl
3 | [install]
4 | install_scripts=$base/lib/ros2_nanoowl
5 |
--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
1 | # SPDX-FileCopyrightText: Copyright (c) NVIDIA CORPORATION & AFFILIATES. All rights reserved.
2 | # SPDX-License-Identifier: Apache-2.0
3 | #
4 | # Licensed under the Apache License, Version 2.0 (the "License");
5 | # you may not use this file except in compliance with the License.
6 | # You may obtain a copy of the License at
7 | #
8 | # http://www.apache.org/licenses/LICENSE-2.0
9 | #
10 | # Unless required by applicable law or agreed to in writing, software
11 | # distributed under the License is distributed on an "AS IS" BASIS,
12 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | # See the License for the specific language governing permissions and
14 | # limitations under the License.
15 |
16 | import os
17 | from glob import glob
18 | from setuptools import find_packages, setup
19 |
20 | package_name = 'ros2_nanoowl'
22 |
23 | setup(
24 | name=package_name,
25 | version='0.0.0',
26 | packages=find_packages(exclude=['test']),
27 | data_files=[
28 | ('share/ament_index/resource_index/packages',
29 | ['resource/' + package_name]),
30 | ('share/' + package_name, ['package.xml']),
31 | ('share/' + package_name, glob('launch/*.launch.py'))
32 | ],
33 | install_requires=['setuptools'],
34 | zip_safe=True,
35 | maintainer='asawareeb',
36 | maintainer_email='asawareeb@nvidia.com',
37 | description='ROS 2 package for object detection using NanoOWL on NVIDIA Jetson',
38 | license='TODO: License declaration',
39 | tests_require=['pytest'],
40 | entry_points={
41 | 'console_scripts': [
42 | 'nano_owl_py = ros2_nanoowl.nano_owl_py:main'
43 | ],
44 | },
45 | )
46 |
--------------------------------------------------------------------------------
/test/test_copyright.py:
--------------------------------------------------------------------------------
1 | # Copyright 2015 Open Source Robotics Foundation, Inc.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 |
15 | from ament_copyright.main import main
16 | import pytest
17 |
18 |
19 | # Remove the `skip` decorator once the source file(s) have a copyright header
20 | @pytest.mark.skip(reason='No copyright header has been placed in the generated source file.')
21 | @pytest.mark.copyright
22 | @pytest.mark.linter
23 | def test_copyright():
24 | rc = main(argv=['.', 'test'])
25 | assert rc == 0, 'Found errors'
26 |
--------------------------------------------------------------------------------
/test/test_flake8.py:
--------------------------------------------------------------------------------
1 | # Copyright 2017 Open Source Robotics Foundation, Inc.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 |
15 | from ament_flake8.main import main_with_errors
16 | import pytest
17 |
18 |
19 | @pytest.mark.flake8
20 | @pytest.mark.linter
21 | def test_flake8():
22 | rc, errors = main_with_errors(argv=[])
23 | assert rc == 0, \
24 | 'Found %d code style errors / warnings:\n' % len(errors) + \
25 | '\n'.join(errors)
26 |
--------------------------------------------------------------------------------
/test/test_pep257.py:
--------------------------------------------------------------------------------
1 | # Copyright 2015 Open Source Robotics Foundation, Inc.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 |
15 | from ament_pep257.main import main
16 | import pytest
17 |
18 |
19 | @pytest.mark.linter
20 | @pytest.mark.pep257
21 | def test_pep257():
22 | rc = main(argv=['.', 'test'])
23 | assert rc == 0, 'Found code style errors / warnings'
24 |
--------------------------------------------------------------------------------