├── .gitignore ├── LICENSE ├── README.md └── assets ├── TAPTRv1.png ├── TAPTRv2.png ├── TAPTRv3.png └── performance.png /.gitignore: -------------------------------------------------------------------------------- 1 | .nfs* 2 | *.ipynb 3 | *.pyc 4 | .dumbo.json 5 | .DS_Store 6 | .*.swp 7 | *.pth 8 | **/__pycache__/** 9 | *.tmp 10 | *.pkl 11 | **/.mypy_cache/* 12 | .mypy_cache/* 13 | .vscode 14 | logs 15 | *.sub 16 | *.pyc 17 | 18 | models/dino/ops/build/ 19 | models/dino/ops/*.egg-info/ 20 | models/dino/ops/dist/ 21 | vis_results 22 | datas 23 | saved_videos 24 | checkpoint/*.pth 25 | checkpoints 26 | *.tar.gz 27 | *.mp4 28 | *.gif 29 | checkpoints 30 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | IDEA License 1.0 2 | 3 | This License Agreement (as may be amended in accordance with this License Agreement, “License”), between you, or your employer or other entity (if you are entering into this agreement on behalf of your employer or other entity) (“Licensee” or “you”) and the International Digital Economy Academy (“IDEA” or “we”) applies to your use of any computer program, algorithm, source code, object code, or software that is made available by IDEA under this License (“Software”) and any specifications, manuals, documentation, and other written information provided by IDEA related to the Software (“Documentation”). 4 | 5 | By downloading the Software or by using the Software, you agree to the terms of this License. If you do not agree to this License, then you do not have any rights to use the Software or Documentation (collectively, the “Software Products”), and you must immediately cease using the Software Products. 
If you are agreeing to be bound by the terms of this License on behalf of your employer or other entity, you represent and warrant to IDEA that you have full legal authority to bind your employer or such entity to this License. If you do not have the requisite authority, you may not accept the License or access the Software Products on behalf of your employer or other entity. 6 | 7 | 1. LICENSE GRANT 8 | 9 | a. You are granted a non-exclusive, worldwide, transferable, sublicensable, irrevocable, royalty free and limited license under IDEA’s copyright interests to use, reproduce, distribute, copy, create derivative works of, and make modifications to the Software solely for your non-commercial research purposes. 10 | 11 | b. The grant of rights expressly set forth in this Section 1 (License Grant) are the complete grant of rights to you in the Software Products, and no other licenses are granted, whether by waiver, estoppel, implication, equity or otherwise. IDEA and its licensors reserve all rights not expressly granted by this License. 12 | 13 | c. If you intend to use the Software Products for any commercial purposes, you must request a license from IDEA, which IDEA may grant to you in its sole discretion. 14 | 15 | 2. REDISTRIBUTION AND USE 16 | 17 | a. If you distribute or make the Software Products, or any derivative works thereof, available to a third party, you shall provide a copy of this Agreement to such third party. 18 | 19 | b. You must retain in all copies of the Software Products that you distribute the following attribution notice: "[https://github.com/IDEA-Research/DiffHOI] is licensed under the IDEA License 1.0, Copyright (c) IDEA. All Rights Reserved." 20 | 21 | d. Your use of the Software Products must comply with applicable laws and regulations (including trade compliance laws and regulations). 22 | 23 | e. 
You will not, and will not permit, assist or cause any third party to use, modify, copy, reproduce, create derivative works of, or distribute the Software Products (or any derivative works thereof, works incorporating the Software Products, or any data produced by the Software), in whole or in part, for in any manner that infringes, misappropriates, or otherwise violates any third-party rights. 24 | 25 | 3. DISCLAIMER OF WARRANTY 26 | 27 | UNLESS REQUIRED BY APPLICABLE LAW, THE SOFTWARE PRODUCTS AND ANY OUTPUT AND RESULTS THEREFROM ARE PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING THE SOFTWARE PRODUCTS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR USE OF THE SOFTWARE PRODUCTS AND ANY OUTPUT AND RESULTS. 28 | 29 | 4. LIMITATION OF LIABILITY 30 | 31 | IN NO EVENT WILL IDEA OR ITS AFFILIATES BE LIABLE UNDER ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, TORT, NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING OUT OF THIS AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN IF IDEA OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF ANY OF THE FOREGOING. 32 | 33 | 5. 
INDEMNIFICATION 34 | 35 | You will indemnify, defend and hold harmless IDEA and our subsidiaries and affiliates, and each of our respective shareholders, directors, officers, employees, agents, successors, and assigns (collectively, the “IDEA Parties”) from and against any losses, liabilities, damages, fines, penalties, and expenses (including reasonable attorneys’ fees) incurred by any IDEA Party in connection with any claim, demand, allegation, lawsuit, proceeding, or investigation (collectively, “Claims”) arising out of or related to: (a) your access to or use of the Software Products (as well as any results or data generated from such access or use); (b) your violation of this License; or (c) your violation, misappropriation or infringement of any rights of another (including intellectual property or other proprietary rights and privacy rights). You will promptly notify the IDEA Parties of any such Claims, and cooperate with IDEA Parties in defending such Claims. You will also grant the IDEA Parties sole control of the defense or settlement, at IDEA’s sole option, of any Claims. This indemnity is in addition to, and not in lieu of, any other indemnities or remedies set forth in a written agreement between you and IDEA or the other IDEA Parties. 36 | 37 | 6. TERMINATION; SURVIVAL 38 | 39 | a. This License will automatically terminate upon any breach by you of the terms of this License. 40 | 41 | b. If you institute litigation or other proceedings against IDEA or any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Software Products, or any portion of any of the foregoing, constitutes infringement of intellectual property or other rights owned or licensable by you, then any licenses granted to you under this Agreement shall terminate as of the date such litigation or claim is filed or instituted. 42 | 43 | c. 
The following sections survive termination of this License: 2 (Redistribution and use), 3 (Disclaimers of Warranty), 4 (Limitation of Liability), 5 (Indemnification), 6 (Termination; Survival), 7 (Trademarks) and 8 (Applicable Law; Dispute Resolution). 44 | 45 | 7. TRADEMARKS 46 | 47 | Licensee has not been granted any trademark license as part of this License and may not use any name or mark associated with IDEA without the prior written permission of IDEA, except to the extent necessary to make the reference required by the attribution notice of this Agreement. 48 | 49 | 8. APPLICABLE LAW; DISPUTE RESOLUTION 50 | 51 | This License will be governed and construed under the laws of the People’s Republic of China without regard to conflicts of law provisions. The parties expressly agree that the United Nations Convention on Contracts for the International Sale of Goods will not apply. Any suit or proceeding arising out of or relating to this License will be brought in the courts, as applicable, in Shenzhen, Guangdong, and each party irrevocably submits to the jurisdiction and venue of such courts. 52 | 53 | ----------------License for [https://github.com/fundamentalvision/BEVFormer]------------------ 54 | 55 | Apache2.0 56 | Copyright (c) [2022] [fundamentalvision] 57 | Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License. You may obtain a copy of the License at 58 | http://www.apache.org/licenses/LICENSE-2.0 59 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. 
60 | 61 | ----------------License for [https://github.com/open-mmlab/mmcv/tree/main]------------------ 62 | 63 | Apache2.0 64 | Copyright (c) [2018-2019] [Open-MMLab] 65 | Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License. You may obtain a copy of the License at 66 | http://www.apache.org/licenses/LICENSE-2.0 67 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. 68 | 69 | ----------------License for [https://github.com/open-mmlab/mmdetection3d]------------------ 70 | 71 | Apache2.0 72 | Copyright (c) [2018-2019] [Open-MMLab] 73 | Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License. You may obtain a copy of the License at 74 | http://www.apache.org/licenses/LICENSE-2.0 75 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. 
76 | 77 | ----------------License for [https://github.com/Megvii-BaseDetection/BEVDepth/tree/main]------------------ 78 | 79 | MIT 80 | Copyright (c) [2022] [Megvii-BaseDetection] 81 | Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: 82 | The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. 83 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 84 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # TAPTR: **T**racking **A**ny **P**oint **TR**ansformers 2 | 3 | We open-source TAPTRv1, TAPTRv2, and TAPTRv3 (not yet) in this repository. You can find them in their corresponding branches. 
4 | 5 | These works were completed by [Hongyang Li](https://scholar.google.com.hk/citations?view_op=list_works&hl=zh-CN&user=zdgHNmkAAAAJ&gmla=AMpAcmTJNHoetv6zgfzZkIRcYsFr0UkGGDyl5tAp5etuBqhz3lzYZCQrVDot02xVQ1XTbnMS1fPdAfe0-2--aTXOtewokjyShNLOQQyyhtkolwaz0hvENZpi-pJ-Wg), [Jinyuan Qu](https://scholar.google.com/citations?user=-RSeOl0AAAAJ&hl=zh-CN), [Hao Zhang](https://scholar.google.com/citations?user=B8hPxMQAAAAJ&hl=zh-CN), [Shilong Liu](https://scholar.google.com/citations?hl=zh-CN&user=nkSVY3MAAAAJ), [Zhaoyang Zeng](https://scholar.google.com.hk/citations?user=U_cvvUwAAAAJ&hl=zh-CN&oi=sra), [Tianhe Ren](https://scholar.google.com.hk/citations?user=cW4ILs0AAAAJ&hl=zh-CN&oi=sra), [Feng Li](https://scholar.google.com.hk/citations?user=ybRe9GcAAAAJ&hl=zh-CN&oi=sra), [Bohan Li](https://scholar.google.com.hk/citations?hl=zh-CN&user=V-YdQiAAAAAJ) and [Lei Zhang](https://scholar.google.com/citations?hl=zh-CN&user=fIlGZToAAAAJ) :email:. 6 | 7 | ### Paper Links: [TAPTRv1](https://arxiv.org/pdf/2403.13042) | [TAPTRv2](https://arxiv.org/abs/2407.16291) | [TAPTRv3](https://arxiv.org/abs/2411.18671) 8 | ### More Links: [TAPTR Project Page](https://taptr.github.io) | [Demo (v3)](https://huggingface.co/spaces/HYeungLee/TAPTR) | [BibTeX](#citing-taptr) | [IDEA-Research](https://github.com/IDEA-Research) 9 | 10 | 11 | # :fire: News 12 | 13 | [2024/11/28] We release our TAPTRv3 paper. 14 | 15 | [2024/11/25] TAPTRv2's code is released. 16 | 17 | [2024/9/26] TAPTRv2 is accepted by NeurIPS2024. 18 | 19 | [2024/7/24] We release our TAPTRv2 paper. 20 | 21 | [2024/7/16] TAPTRv1's code is released. 22 | 23 | [2024/7/9] TAPTRv1 is accepted by ECCV2024. 24 | 25 | [2024/3/15] We release our TAPTRv1 paper. 26 | 27 | 28 | # :dna: What is TAPTR. 29 | Inspired by recent visual prompt-based detection [1], we propose to convert the Track Any Point (TAP) task into a point-level visual prompt detection task.
Building upon the recent advanced DEtection TRansformer (DETR) [2, 3, 4, 5], we propose our Track Any Point TRansformer (TAPTR). 30 | 31 | [1] T-rex2: Towards Generic Object Detection via Text-visual Prompt Synergy. IDEA-Research. ECCV2024. 32 | 33 | [2] DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR. IDEA-Research. ICLR2022. 34 | 35 | [3] DN-DETR: Accelerate DETR Training by Introducing Query DeNoising. IDEA-Research. CVPR2022. 36 | 37 | [4] DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection. IDEA-Research. ICLR2023. 38 | 39 | [5] DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding. IDEA-Research. ArXiv2024. 40 | 41 | # :footprints: From V1 to V3, a brief overview. 42 | 43 | ### TAPTRv1 - Simple yet strong baseline. 44 | TAPTRv1 first proposes to address the TAP task __from the perspective of detection__. Instead of building upon traditional optical flow methods, TAPTRv1 is also the first to adopt the more advanced DETR-like framework for the TAP task. Compared with previous methods, TAPTRv1 has a __clearer and simpler definition of the point query and better performance__. 45 | 46 | _Although TAPTRv1 achieves SoTA performance with the DETR-like framework, it still needs the resource-consuming cost volume to obtain its optimal performance._ 47 | 48 |
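
To make the cost-volume remark concrete, here is a minimal NumPy sketch of a generic point-tracking cost volume: each point-query feature is correlated with every pixel of every frame. It is an illustration under assumptions, not TAPTR's actual implementation. The function names (`cost_volume`, `argmax_locations`) and the tensor layouts are hypothetical choices made for this example.

```python
import numpy as np


def cost_volume(query_feats: np.ndarray, frame_feats: np.ndarray) -> np.ndarray:
    """Dot-product similarity between point queries and per-frame feature maps.

    query_feats: (N, C), one feature vector per tracked point (assumed layout).
    frame_feats: (T, H, W, C), one feature map per video frame (assumed layout).
    Returns an array of shape (N, T, H, W).
    """
    return np.einsum("nc,thwc->nthw", query_feats, frame_feats)


def argmax_locations(volume: np.ndarray) -> np.ndarray:
    """Pick the best-matching (x, y) pixel for every point in every frame."""
    n, t, h, w = volume.shape
    flat_idx = volume.reshape(n, t, h * w).argmax(axis=-1)  # (N, T)
    ys, xs = np.unravel_index(flat_idx, (h, w))
    return np.stack([xs, ys], axis=-1)  # (N, T, 2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    queries = rng.standard_normal((8, 16))        # 8 points, 16 channels
    frames = rng.standard_normal((4, 32, 32, 16))  # 4 frames of 32x32 features
    volume = cost_volume(queries, frames)
    print(volume.shape)                    # (8, 4, 32, 32)
    print(argmax_locations(volume).shape)  # (8, 4, 2)
```

The volume holds `N * T * H * W` scores, so its memory and compute grow with the number of points, the number of frames, and the feature-map resolution. This is the sense in which a cost volume is resource-consuming.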