├── Figures
│   ├── fig1.jpeg
│   ├── fig2.jpeg
│   ├── fig3.jpeg
│   ├── fig4.jpeg
│   ├── fig5.jpeg
│   ├── fig6.jpeg
│   └── overview.png
├── README.md
├── dataset_summary.md
├── ideal_scenarios_methods.md
├── new_trends.md
└── real_world_methods.md
/Figures/fig1.jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/CatOneTwo/Collaborative-Perception-in-Autonomous-Driving/c9490c2816dbea5da0366685d64d587119355ac8/Figures/fig1.jpeg
--------------------------------------------------------------------------------
/Figures/fig2.jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/CatOneTwo/Collaborative-Perception-in-Autonomous-Driving/c9490c2816dbea5da0366685d64d587119355ac8/Figures/fig2.jpeg
--------------------------------------------------------------------------------
/Figures/fig3.jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/CatOneTwo/Collaborative-Perception-in-Autonomous-Driving/c9490c2816dbea5da0366685d64d587119355ac8/Figures/fig3.jpeg
--------------------------------------------------------------------------------
/Figures/fig4.jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/CatOneTwo/Collaborative-Perception-in-Autonomous-Driving/c9490c2816dbea5da0366685d64d587119355ac8/Figures/fig4.jpeg
--------------------------------------------------------------------------------
/Figures/fig5.jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/CatOneTwo/Collaborative-Perception-in-Autonomous-Driving/c9490c2816dbea5da0366685d64d587119355ac8/Figures/fig5.jpeg
--------------------------------------------------------------------------------
/Figures/fig6.jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/CatOneTwo/Collaborative-Perception-in-Autonomous-Driving/c9490c2816dbea5da0366685d64d587119355ac8/Figures/fig6.jpeg
--------------------------------------------------------------------------------
/Figures/overview.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/CatOneTwo/Collaborative-Perception-in-Autonomous-Driving/c9490c2816dbea5da0366685d64d587119355ac8/Figures/overview.png
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Collaborative Perception in Autonomous Driving Survey
2 |
3 | This repo collects and categorizes papers on collaborative perception, following our ITSM survey paper:
4 | ***Collaborative Perception in Autonomous Driving: Methods, Datasets and Challenges*** [[arXiv](https://arxiv.org/abs/2301.06262)] [[ITSM](https://ieeexplore.ieee.org/document/10248946)] [[Zhihu](https://zhuanlan.zhihu.com/p/644931857)]
5 |
6 | ![Overview](Figures/overview.png)
7 |
21 | ## Methods
22 | ### Methods for Ideal Scenarios
23 | - Raw data fusion
24 | - Customized communication mechanism
25 | - Feature fusion
26 | - Customized loss function
27 | - Output fusion
28 |
29 | 👉 View details in [**Methods for Ideal Scenarios**](ideal_scenarios_methods.md)
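To make the taxonomy above concrete, the following minimal NumPy sketch contrasts the three collaboration schemes (raw-data/early, feature/intermediate, and output/late fusion). The helper names, tensor shapes, and element-wise max fusion are illustrative assumptions only, not code from any surveyed method.

```python
import numpy as np

def encode(bev):
    """Stand-in for a per-agent BEV feature extractor."""
    return bev.mean(axis=0, keepdims=True)

def detect(feat):
    """Stand-in for a detection head (returns a dummy output)."""
    return {"score": float(feat.sum())}

def early_fusion(raw_list):
    """Raw data fusion: merge raw BEV grids first, then run a single model."""
    fused_raw = np.maximum.reduce(raw_list)
    return detect(encode(fused_raw))

def intermediate_fusion(raw_list):
    """Feature fusion: each agent encodes locally; features are fused (here: max)."""
    feats = [encode(r) for r in raw_list]
    return detect(np.maximum.reduce(feats))

def late_fusion(raw_list):
    """Output fusion: each agent detects independently; outputs are merged afterwards."""
    return [detect(encode(r)) for r in raw_list]  # in practice: cross-agent NMS / box matching

agents = [np.random.rand(4, 32, 32) for _ in range(3)]  # three agents with toy BEV grids
print(early_fusion(agents), intermediate_fusion(agents), len(late_fusion(agents)))
```

The bandwidth/accuracy trade-off follows the same order: early fusion shares the most information at the highest communication cost, late fusion the least, and intermediate fusion sits in between, which is why most surveyed methods adopt it.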
30 |
31 | ### Methods for Real-world Issues
32 | - Localization errors
33 | - Communication issues
34 | - Model or task discrepancies
35 | - Privacy and security issues
36 |
37 | 👉 View details in [**Methods for Real-World Issues**](real_world_methods.md)
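As a toy illustration of the first issue, the sketch below warps a collaborator's detections into the ego frame with an SE(2) relative pose; the geometry is standard, but all poses, boxes, and noise values are hypothetical:

```python
import numpy as np

def se2(x, y, yaw):
    """Homogeneous 2D transform for a pose (x, y, yaw)."""
    c, s = np.cos(yaw), np.sin(yaw)
    return np.array([[c, -s, x], [s, c, y], [0.0, 0.0, 1.0]])

def to_ego(points_xy, T_ego_from_cav):
    """Map 2D box centers from a collaborator's frame into the ego frame."""
    pts = np.c_[points_xy, np.ones(len(points_xy))]  # homogeneous coordinates
    return (T_ego_from_cav @ pts.T).T[:, :2]

cav_boxes = np.array([[10.0, 2.0], [25.0, -4.0]])  # box centers in the CAV frame
T_true = se2(5.0, 1.0, np.deg2rad(10))             # true relative pose
T_noisy = se2(5.4, 1.3, np.deg2rad(13))            # GPS/IMU-corrupted estimate

shift = np.linalg.norm(to_ego(cav_boxes, T_true) - to_ego(cav_boxes, T_noisy), axis=1)
print(shift)  # per-box displacement in meters; grows with range and yaw error
```

Even a sub-meter, few-degree pose error displaces distant shared boxes by more than a typical IoU-matching threshold, which motivates the pose-robust methods catalogued in the linked page.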
38 |
39 |
40 | ## Datasets
41 | - Real-world or simulated source
42 | - V2V, V2I or I2I collaboration
43 |
44 | 👉 View details in [**Datasets Summary**](dataset_summary.md)
45 |
46 | ## Challenges
47 | - Transmission Efficiency in Collaborative Perception
48 | - Collaborative Perception in Complex Scenes
49 | - Federated Learning-based Collaborative Perception
50 | - Collaborative Perception with Low Labeling Dependence
51 |
52 | 👉 View details in [**New Trends**](new_trends.md)
53 |
54 | ## Citation
55 | If you find this work useful, please cite our paper:
56 | ```
57 | @article{han2023collaborative,
58 |   author={Han, Yushan and Zhang, Hui and Li, Huifang and Jin, Yi and Lang, Congyan and Li, Yidong},
59 |   journal={IEEE Intelligent Transportation Systems Magazine},
60 |   title={Collaborative Perception in Autonomous Driving: Methods, Datasets, and Challenges},
61 |   year={2023},
62 |   volume={15},
63 |   number={6},
64 |   pages={131-151},
65 |   doi={10.1109/MITS.2023.3298534}
66 | }
67 | ```
68 |
69 |
--------------------------------------------------------------------------------
/dataset_summary.md:
--------------------------------------------------------------------------------
1 | # A summary of large-scale collaborative perception datasets
2 |
3 | **Usage of Common Datasets** (from Papers with Code):
4 | - [V2X-SIM](https://paperswithcode.com/dataset/v2x-sim) | [OPV2V](https://paperswithcode.com/dataset/opv2v) | [V2XSet](https://paperswithcode.com/dataset/v2xset) | [DAIR-V2X](https://paperswithcode.com/dataset/dair-v2x) | [V2V4Real](https://paperswithcode.com/dataset/v2v4real) | [DAIR-V2X-Seq](https://paperswithcode.com/dataset/dair-v2x-seq)
5 |
6 |
7 | | **Dataset** | **Venue** | **Source** | **Frame** | **V2V** | **V2I** | **I2I** | **Agents** | **Camera** | **LiDAR** | **Depth** | **OD** | **SS** | **OT** | **MP** | **Website** |
8 | |:----------------:|:---------:|:----------:|:---------:|:--------:|:--------:|:----------:|:----------:|:----------:|:---------:|:---------:|:---------:|:---------:|:---------:|:---------:|:----------------------------------------------------:|
9 | | V2V-Sim [1] | ECCV'20 | Simu | 51K | ✔ | - | - | 1-7 | - | ✔ | - | ✔ | - | - | ✔ | - |
10 | | V2X-Sim [2] | RAL'21 | Simu | 10K | ✔ | ✔ | - | 1-5 | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | - | [Link](https://ai4ce.github.io/V2X-Sim) |
11 | | OPV2V [3] | ICRA'22 | Simu | 11K | ✔ | - | - | 1-7 | ✔ | ✔ | - | ✔ | ✔ | - | - | [Link](https://mobility-lab.seas.ucla.edu/opv2v) |
12 | | DAIR-V2X-C [4] | CVPR'22 | Real | 39K | - | ✔ | - | 2 | ✔ | ✔ | - | ✔ | - | - | - | [Link](https://thudair.baai.ac.cn/coop-dtest) |
13 | | V2XSet [5] | ECCV'22 | Simu | 11K | ✔ | ✔ | - | 2-5 | ✔ | ✔ | - | ✔ | - | - | - | [Link](https://github.com/DerrickXuNu/v2x-vit) |
14 | | DOLPHINS [6] | ACCV'22 | Simu | 42K | ✔ | ✔ | - | 3 | ✔ | ✔ | - | ✔ | - | - | - | [Link](https://dolphins-dataset.net) |
15 | | V2V4Real [7] | CVPR'23 | Real | 20K | ✔ | - | - | 2 | ✔ | ✔ | - | ✔ | - | ✔ | - | [Link](https://mobility-lab.seas.ucla.edu/v2v4real/) |
16 | | V2X-Seq [8] | CVPR'23 | Real | 15K | - | ✔ | - | 2 | ✔ | ✔ | - | ✔ | - | ✔ | ✔ | [Link](https://thudair.baai.ac.cn/coop-forecast) |
17 | | DeepAccident [9] | AAAI'24 | Simu | 57K | ✔ | ✔ | - | 1-5 | ✔ | ✔ | - | ✔ | ✔ | ✔ | ✔ | [Link](https://deepaccident.github.io/index.html) |
18 | | HoloVIC [10] | CVPR'24 | Real | 100K | - | ✔ | - | 2 | ✔ | ✔ | - | ✔ | - | ✔ | - | [Link](https://holovic.net/) |
19 | | TUMTraf-V2X [11] | CVPR'24 | Real | 2K | - | ✔ | - | 2 | ✔ | ✔ | - | ✔ | - | ✔ | - | [Link](https://tum-traffic-dataset.github.io/tumtraf-v2x/) |
20 | | RCooper [12] | CVPR'24 | Real | 30K | - | - | ✔ | 2 | ✔ | ✔ | - | ✔ | - | ✔ | - | [Link](https://github.com/AIR-THU/DAIR-RCooper) |
21 | | MARS [13] | CVPR'24 | Real | 1.5K | ✔ | - | - | 2-3 | ✔ | ✔ | - | - | - | - | - | [Link](https://ai4ce.github.io/MARS/) |
22 | | V2X-Real [14] | ECCV'24 | Real | 33K | ✔ | ✔ | ✔ | 2-4 | ✔ | ✔ | - | ✔ | - | - | - | [Link](https://github.com/ucla-mobility/V2X-Real) |
23 |
24 | Notes:
25 | - Source: simulation (Simu) or real world (Real).
26 | - Frame is the number of annotated LiDAR-based cooperative perception frames.
27 | - Supported common perception tasks: 3D object detection (OD), BEV semantic segmentation (SS), 3D object tracking (OT), and motion prediction (MP).
28 |
29 |
30 | Back to [Contents](README.md) 🔙
31 |
32 | References:
33 | 1. [V2VNet: Vehicle-to-Vehicle Communication for Joint Perception and Prediction](https://arxiv.org/abs/2008.07519) (ECCV'20)
34 | 2. [V2X-Sim: Multi-Agent Collaborative Perception Dataset and Benchmark for Autonomous Driving](https://arxiv.org/abs/2202.08449) (RAL'21)
35 | 3. [OPV2V: An open benchmark dataset and fusion pipeline for perception with vehicle-to-vehicle communication](https://arxiv.org/abs/2109.07644) (ICRA'22)
36 | 4. [DAIR-V2X: A Large-Scale Dataset for Vehicle-Infrastructure Cooperative 3D Object Detection](https://arxiv.org/abs/2204.05575) (CVPR'22)
37 | 5. [V2X-ViT: Vehicle-to-everything cooperative perception with vision transformer](https://arxiv.org/abs/2203.10638) (ECCV'22)
38 | 6. [DOLPHINS: Dataset for Collaborative Perception enabled Harmonious and Interconnected Self-driving](https://arxiv.org/abs/2207.07609) (ACCV'22)
39 | 7. [V2V4Real: A Real-world Large-scale Dataset for Vehicle-to-Vehicle Cooperative Perception](https://arxiv.org/abs/2303.07601) (CVPR'23)
40 | 8. [V2X-Seq: A large-scale sequential dataset for vehicle-infrastructure cooperative perception and forecasting](https://arxiv.org/abs/2305.05938) (CVPR'23)
41 | 9. [DeepAccident: A Motion and Accident Prediction Benchmark for V2X Autonomous Driving](https://arxiv.org/abs/2304.01168) (AAAI'24)
42 | 10. [HoloVIC: Large-scale Dataset and Benchmark for Multi-Sensor Holographic Intersection and Vehicle-Infrastructure Cooperative](https://arxiv.org/abs/2403.02640) (CVPR'24)
43 | 11. [TUMTraf V2X Cooperative Perception Dataset](https://arxiv.org/abs/2403.01316) (CVPR'24)
44 | 12. [RCooper: A Real-world Large-scale Dataset for Roadside Cooperative Perception](https://arxiv.org/abs/2403.10145) (CVPR'24)
45 | 13. [Multiagent Multitraversal Multimodal Self-Driving: Open MARS Dataset](https://openaccess.thecvf.com/content/CVPR2024/papers/Li_Multiagent_Multitraversal_Multimodal_Self-Driving_Open_MARS_Dataset_CVPR_2024_paper.pdf) (CVPR'24)
46 | 14. [V2X-Real: A Large-Scale Dataset for Vehicle-to-Everything Cooperative Perception](https://arxiv.org/abs/2403.16034) (ECCV'24)
47 |
48 |
--------------------------------------------------------------------------------
/ideal_scenarios_methods.md:
--------------------------------------------------------------------------------
1 | # A summary of state-of-the-art collaborative perception methods for ideal scenarios
2 |
3 |
10 | ## Table
11 | | Method | Venue | Modality | Scheme | Data Fusion | Comm Mech | Feat Fusion | Loss Func | Code |
12 | |:-----------------:|:----------:|:-------------:|:------------:|:-----------------:|:----------------:|:-----------------:|:---------------:|:------------------------------------------------------------------:|
13 | | Cooper [1] | ICDCS'19 | LiDAR | E | Raw | - | - | - | - |
14 | | F-Cooper [2] | SEC'19 | LiDAR | I | - | - | Trad | - | [Link<sup>n</sup>](https://github.com/Aug583/F-COOPER) |
15 | | Who2com [3] | ICRA'20 | Camera | I | - | Agent | Trad | - | - |
16 | | When2com [4] | CVPR'20 | Camera | I | - | Agent | Trad | - | [Link<sup>n</sup>](https://github.com/GT-RIPL/MultiAgentPerception) |
17 | | V2VNet [5] | ECCV'20 | LiDAR | I | - | - | Graph | - | - |
18 | | Coop3D [6] | TITS'20 | LiDAR | E, L | Raw, Out | - | - | - | [Link<sup>n</sup>](https://github.com/eduardohenriquearnold/coop-3dod-infra) |
19 | | CoFF [7] | IoT'21 | LiDAR | I | - | - | Trad | - | - |
20 | | DiscoNet [8] | NeurIPS'21 | LiDAR | I | Raw | - | Graph | - | [Link<sup>c</sup>](https://github.com/ai4ce/DiscoNet) |
21 | | MP-Pose [9] | RAL'22 | Camera | I | - | - | Graph | - | - |
22 | | FPV-RCNN [10] | RAL'22 | LiDAR | I | Out | Feat | Trad | - | [Link<sup>n</sup>](https://github.com/YuanYunshuang/FPV_RCNN) |
23 | | AttFusion [11] | ICRA'22 | LiDAR | I | - | - | Atten | - | [Link<sup>o</sup>](https://github.com/DerrickXuNu/OpenCOOD) |
24 | | TCLF [12] | CVPR'22 | LiDAR | L | Out | - | - | - | [Link<sup>v</sup>](https://github.com/AIR-THU/DAIR-V2X) |
25 | | COOPERNAUT [13] | CVPR'22 | LiDAR | I | - | - | Atten | - | [Link<sup>n</sup>](https://github.com/UT-Austin-RPL/Coopernaut) |
26 | | V2X-ViT [14] | ECCV'22 | LiDAR | I | - | - | Atten | - | [Link<sup>o</sup>](https://github.com/DerrickXuNu/v2x-vit) |
27 | | CRCNet [15] | MM'22 | LiDAR | I | - | - | Atten | Redund | - |
28 | | CoBEVT [16] | CoRL'22 | Camera | I | - | - | Atten | - | [Link<sup>o</sup>](https://github.com/DerrickXuNu/CoBEVT) |
29 | | Where2comm [17] | NeurIPS'22 | LiDAR | I | - | Agent, Feat | Atten | - | [Link<sup>o</sup>](https://github.com/MediaBrain-SJTU/Where2comm) |
30 | | Double-M [18] | ICRA'23 | LiDAR | E, I, L | - | - | - | Uncert | [Link<sup>c</sup>](https://github.com/coperception/double-m-quantification) |
31 | | CoCa3D [19] | CVPR'23 | Camera | I | - | Feat | Trad | - | [Link<sup>o</sup>](https://github.com/MediaBrain-SJTU/CoCa3D) |
32 | | HM-ViT [20] | ICCV'23 | LiDAR, Camera | I | - | - | Atten | - | [Link<sup>o</sup>](https://github.com/XHwind/HM-ViT) |
33 | | CORE [21] | ICCV'23 | LiDAR | I | Raw | Feat | Atten | Recon | [Link<sup>o</sup>](https://github.com/zllxot/CORE) |
34 | | SCOPE [22] | ICCV'23 | LiDAR | I | - | - | Atten (ST) | - | [Link<sup>o</sup>](https://github.com/starfdu1418/SCOPE) |
35 | | TransIFF [23] | ICCV'23 | LiDAR | I | - | Feat | Atten | - | - |
36 | | UMC [24] | ICCV'23 | LiDAR | I | - | Feat | Graph | - | [Link<sup>c</sup>](https://github.com/ispc-lab/UMC) |
37 | | HYDRO-3D [25] | TIV'23 | LiDAR | I | - | - | Atten (ST) | - | - |
38 | | MKD-Cooper [26] | TIV'23 | LiDAR | I | Raw | - | Atten | - | [Link<sup>o</sup>](https://github.com/EricLee523/MKD-Cooper) |
39 | | V2VFormer++ [27] | TITS'23 | LiDAR, Camera | I | - | - | Atten | - | - |
40 | | How2comm [28] | NeurIPS'23 | LiDAR | I | - | Feat | Atten (ST) | - | [Link<sup>o</sup>](https://github.com/ydk122024/How2comm) |
41 | | What2comm [29] | MM'23 | LiDAR | I | - | Feat | Atten (ST) | - | - |
42 | | BM2CP [30] | CoRL'23 | LiDAR, Camera | I | - | Feat | Atten | - | [Link<sup>o</sup>](https://github.com/byzhaoAI/BM2CP) |
43 | | DI-V2X [31] | AAAI'24 | LiDAR | I | Raw | Feat | Atten | - | [Link<sup>o</sup>](https://github.com/Serenos/DI-V2X) |
44 | | QUEST [32] | ICRA'24 | Camera | I, L | - | Feat | Atten | - | - |
45 | | CMiMC [33] | AAAI'24 | LiDAR | I | - | Feat | - | Contrast | [Link<sup>c</sup>](https://github.com/77SWF/CMiMC) |
46 | | Select2Col [34] | TVT'24 | LiDAR | I | - | Agent | Atten (ST) | - | [Link<sup>o</sup>](https://github.com/huangqzj/select2col) |
47 | | MOT-CUP [35] | RAL'24 | LiDAR | E, I, L | - | - | - | Uncert | [Link<sup>c</sup>](https://github.com/susanbao/mot_cup) |
48 | | CodeFilling [36] | CVPR'24 | LiDAR, Camera | I | - | Feat | Trad | - | [Link<sup>o</sup>](https://github.com/PhyllisH/CodeFilling) |
49 | | IFTR [37] | ECCV'24 | Camera | I | - | Feat | Atten | - | [Link<sup>o</sup>](https://github.com/wangsh0111/IFTR) |
50 | | VIMI [38] | ICRA'24 | Camera | I | - | - | Atten | - | [Link<sup>v</sup>](https://github.com/Bosszhe/EMIFF) |
51 | | CPPC [39] | ICLR'25 | LiDAR | I | - | Feat | Trad | - | - |
52 | | CoSDH [40] | CVPR'25 | LiDAR | I, L | - | Feat | Trad | - | [Link<sup>o</sup>](https://github.com/Xu2729/CoSDH) |
53 | | CoGMP [41] | CVPR'25 | Camera | I | - | Feat | Trad | - | - |
54 |
55 |
56 | Notes:
57 | - Schemes include early (E), intermediate (I) and late (L) collaboration.
58 | - **Data Fusion**: raw data fusion (**Raw**) and output fusion (**Out**).
59 | - **Comm Mech**: communication mechanisms include agent selection (**Agent**) and feature selection (**Feat**); a minimal sketch of feature selection follows below.
60 | - **Feat Fusion**: feature fusion can be divided into traditional (**Trad**), graph-based (**Graph**) and attention-based (**Atten**) feature fusion. (ST: spatio-temporal)
61 | - **Loss Func**: customized loss functions target uncertainty estimation (**Uncert**), redundancy minimization (**Redund**), reconstruction (**Recon**), contrastive learning (**Contrast**), etc.
62 | - **Code Framework**: o ([OpenCOOD](https://github.com/DerrickXuNu/OpenCOOD)), v ([VIC3D](https://github.com/AIR-THU/DAIR-V2X)), c ([CoPerception](https://github.com/coperception/coperception)), n (non-mainstream framework)
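As an illustration of the feature-selection communication mechanism noted above, here is a hedged NumPy sketch in the spirit of Where2comm [17]: only the BEV cells with the highest detection confidence are transmitted. Shapes, the keep ratio, and all helper names are assumptions for illustration, not the published implementation.

```python
import numpy as np

def select_features(feat, conf, keep_ratio=0.1):
    """feat: (C, H, W) BEV features; conf: (H, W) per-cell confidence map."""
    k = max(1, int(conf.size * keep_ratio))
    thresh = np.partition(conf.ravel(), -k)[-k]         # k-th largest confidence
    ys, xs = np.nonzero(conf >= thresh)                 # cells worth transmitting
    return feat[:, ys, xs], np.stack([ys, xs], axis=1)  # sparse features + indices

def scatter_features(sparse_feat, idx, shape):
    """Receiver side: re-embed the sparse features into an empty BEV canvas."""
    canvas = np.zeros(shape)
    canvas[:, idx[:, 0], idx[:, 1]] = sparse_feat
    return canvas

feat = np.random.rand(64, 100, 100)
conf = np.random.rand(100, 100)
sparse, idx = select_features(feat, conf, keep_ratio=0.05)
recon = scatter_features(sparse, idx, feat.shape)
print(sparse.shape, idx.shape, recon.shape)  # bandwidth is ~5% of the dense message
```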
63 |
64 | Back to [Contents](README.md) 🔙
65 |
66 | ## References
67 | ### Published
68 | 1. Cooper: Cooperative perception for connected autonomous vehicles based on 3d point clouds (ICDCS'19) [[`pdf`](https://arxiv.org/abs/1905.05265)]
69 | 2. F-Cooper: Feature based cooperative perception for autonomous vehicle edge computing system using 3D point clouds (SEC'19) [[`pdf`](https://arxiv.org/abs/1909.06459)]
70 | 3. Who2com: Collaborative perception via learnable handshake communication (ICRA'20) [[`pdf`](https://arxiv.org/abs/2003.09575)]
71 | 4. When2com: Multi-agent perception via communication graph grouping (CVPR'20) [[`pdf`](https://arxiv.org/abs/2006.00176)]
72 | 5. V2VNet: Vehicle-to-Vehicle Communication for Joint Perception and Prediction (ECCV'20) [[`pdf`](https://arxiv.org/abs/2008.07519)]
73 | 6. Cooperative perception for 3D object detection in driving scenarios using infrastructure sensors (TITS'20) [[`pdf`](https://arxiv.org/abs/1912.12147)]
74 | 7. CoFF: Cooperative spatial feature fusion for 3-d object detection on autonomous vehicles (IoT'21) [[`pdf`](https://arxiv.org/abs/2009.11975)]
75 | 8. Learning distilled collaboration graph for multi-agent perception (NeurIPS'21) [[`pdf`](https://arxiv.org/abs/2111.00643)] [[`code`](https://github.com/ai4ce/DiscoNet)]
76 | 9. Multi-Robot Collaborative Perception with Graph Neural Networks (RAL'22) [[`pdf`](https://arxiv.org/abs/2201.01760)]
77 | 10. Keypoints-Based Deep Feature Fusion for Cooperative Vehicle Detection of Autonomous Driving (RAL'22) [[`pdf`](https://arxiv.org/abs/2109.11615)] [[`code`](https://github.com/YuanYunshuang/FPV_RCNN)]
78 | 11. OPV2V: An open benchmark dataset and fusion pipeline for perception with vehicle-to-vehicle communication (ICRA'22) [[`pdf`](https://arxiv.org/abs/2109.07644)] [[`code`](https://github.com/DerrickXuNu/OpenCOOD)]
79 | 12. DAIR-V2X: A Large-Scale Dataset for Vehicle-Infrastructure Cooperative 3D Object Detection (CVPR'22) [[`pdf`](https://arxiv.org/abs/2204.05575)] [[`code`](https://github.com/AIR-THU/DAIR-V2X)]
80 | 13. COOPERNAUT: End-to-End Driving with Cooperative Perception for Networked Vehicles (CVPR'22) [[`pdf`](https://arxiv.org/abs/2205.02222)] [[`code`](https://github.com/UT-Austin-RPL/Coopernaut)]
81 | 14. V2X-ViT: Vehicle-to-everything cooperative perception with vision transformer (ECCV'22) [[`pdf`](https://arxiv.org/abs/2203.10638)] [[`code`](https://github.com/DerrickXuNu/v2x-vit)]
82 | 15. Complementarity-Enhanced and Redundancy-Minimized Collaboration Network for Multi-agent Perception (MM'22) [[`pdf`](https://dl.acm.org/doi/abs/10.1145/3503161.3548197)]
83 | 16. CoBEVT: Cooperative Bird's Eye View Semantic Segmentation with Sparse Transformers (CoRL'22) [[`pdf`](https://arxiv.org/abs/2207.02202)] [[`code`](https://github.com/DerrickXuNu/CoBEVT)]
84 | 17. Where2comm: Communication-Efficient Collaborative Perception via Spatial Confidence Maps (NeurIPS'22) [[`pdf`](https://arxiv.org/abs/2209.12836)] [[`code`](https://github.com/MediaBrain-SJTU/Where2comm)]
85 | 18. Uncertainty Quantification of Collaborative Detection for Self-Driving (ICRA'23) [[`pdf`](https://arxiv.org/abs/2209.08162)] [[`code`](https://github.com/coperception/double-m-quantification)]
86 | 19. Collaboration Helps Camera Overtake LiDAR in 3D Detection (CVPR'23) [[`pdf`](https://arxiv.org/abs/2303.13560)] [[`code`](https://github.com/MediaBrain-SJTU/CoCa3D)]
87 | 20. HM-ViT: Hetero-modal Vehicle-to-Vehicle Cooperative perception with vision transformer (ICCV'23) [[`pdf`](https://arxiv.org/abs/2304.10628)]
88 | 21. CORE: Cooperative Reconstruction for Multi-Agent Perception (ICCV'23) [[`pdf`](https://arxiv.org/abs/2307.11514)] [[`code`](https://github.com/zllxot/CORE)]
89 | 22. Spatio-Temporal Domain Awareness for Multi-Agent Collaborative Perception (ICCV'23) [[`pdf`](https://arxiv.org/abs/2307.13929)] [[`code`](https://github.com/starfdu1418/SCOPE)]
90 | 23. TransIFF: An Instance-Level Feature Fusion Framework for Vehicle-Infrastructure Cooperative 3D Detection with Transformers (ICCV'23) [[`pdf`](https://openaccess.thecvf.com/content/ICCV2023/papers/Chen_TransIFF_An_Instance-Level_Feature_Fusion_Framework_for_Vehicle-Infrastructure_Cooperative_3D_ICCV_2023_paper.pdf)]
91 | 24. UMC: A Unified Bandwidth-efficient and Multi-resolution based Collaborative Perception Framework (ICCV'23) [[`pdf`](https://arxiv.org/abs/2303.12400)] [[`code`](https://github.com/ispc-lab/UMC)]
92 | 25. HYDRO-3D: Hybrid Object Detection and Tracking for Cooperative Perception Using 3D LiDAR (TIV'23) [[`pdf`](https://ieeexplore.ieee.org/abstract/document/10148929)]
93 | 26. MKD-Cooper: Cooperative 3D Object Detection for Autonomous Driving via Multi-teacher Knowledge Distillation (TIV'23) [[`pdf`](https://ieeexplore.ieee.org/abstract/document/10236578)] [[`code`](https://github.com/EricLee523/MKD-Cooper)]
94 | 27. V2VFormer++: Multi-Modal Vehicle-to-Vehicle Cooperative Perception via Global-Local Transformer (TITS'23) [[`pdf`](https://ieeexplore.ieee.org/document/10265751/)]
95 | 28. How2comm: Communication-Efficient and Collaboration-Pragmatic Multi-Agent Perception (NeurIPS'23) [[`pdf`](https://openreview.net/forum?id=Dbaxm9ujq6)] [[`code`](https://github.com/ydk122024/How2comm)]
96 | 29. What2comm: Towards Communication-efficient Collaborative Perception via Feature Decoupling (MM'23) [[`pdf`](https://dl.acm.org/doi/10.1145/3581783.3611699)]
97 | 30. BM2CP: Efficient Collaborative Perception with LiDAR-Camera Modalities (CoRL'23) [[`pdf`](https://openreview.net/forum?id=uJqxFjF1xWp)] [[`code`](https://github.com/byzhaoAI/BM2CP)]
98 | 31. DI-V2X: Learning Domain-Invariant Representation for Vehicle-Infrastructure Collaborative 3D Object Detection (AAAI'24) [[`pdf`](https://arxiv.org/abs/2312.15742)] [[`code`](https://github.com/Serenos/DI-V2X)]
99 | 32. QUEST: Query Stream for Practical Cooperative Perception (ICRA'24) [[`pdf`](https://arxiv.org/abs/2308.01804)]
100 | 33. What Makes Good Collaborative Views? Contrastive Mutual Information Maximization for Multi-Agent Perception (AAAI'24) [[`pdf`](https://arxiv.org/abs/2403.10068)] [[`code`](https://github.com/77SWF/CMiMC)]
101 | 34. Select2Col: Leveraging Spatial-Temporal Importance of Semantic Information for Efficient Collaborative Perception (TVT'24) [[`pdf`](https://arxiv.org/abs/2307.16517)] [[`code`](https://github.com/huangqzj/select2col)]
102 | 35. Collaborative Multi-Object Tracking with Conformal Uncertainty Propagation (RAL'24) [[`pdf`](https://arxiv.org/abs/2303.14346)] [[`code`](https://github.com/susanbao/mot_cup)]
103 | 36. Communication-Efficient Collaborative Perception via Information Filling with Codebook (CVPR'24) [[`pdf`](https://arxiv.org/abs/2405.04966)] [[`code`](https://github.com/PhyllisH/CodeFilling)]
104 | 37. IFTR: An Instance-Level Fusion Transformer for Visual Collaborative Perception (ECCV'24) [[`pdf`](https://arxiv.org/abs/2407.09857)] [[`code`](https://github.com/wangsh0111/IFTR)]
105 | 38. EMIFF: Enhanced Multi-scale Image Feature Fusion for Vehicle-Infrastructure Cooperative 3D Object Detection (ICRA'24) [[`pdf`](https://arxiv.org/abs/2303.10975)] [[`code`](https://github.com/Bosszhe/EMIFF)]
106 | 39. Point Cluster: A Compact Message Unit for Communication-Efficient Collaborative Perception (ICLR'25) [[`pdf`](https://openreview.net/forum?id=54XlM8Clkg)]
107 | 40. CoSDH: Communication-Efficient Collaborative Perception via Supply-Demand Awareness and Intermediate-Late Hybridization (CVPR'25) [[`pdf`](https://arxiv.org/abs/2503.03430)] [[`code`](https://github.com/Xu2729/CoSDH)]
108 | 41. Generative Map Priors for Collaborative BEV Semantic Segmentation (CVPR'25) [[`pdf`](https://openaccess.thecvf.com/content/CVPR2025/papers/Fu_Generative_Map_Priors_for_Collaborative_BEV_Semantic_Segmentation_CVPR_2025_paper.pdf)]
109 |
110 |
111 |
112 |
--------------------------------------------------------------------------------
/new_trends.md:
--------------------------------------------------------------------------------
1 |
2 | ## New Trends
3 | - **Label Efficient**
4 | - Unsupervised / Self-supervised learning: CO3 [1], DOtA [14]
5 | - Weakly / Sparsely supervised learning: SSC3OD [2], CoDTS [11]
6 | - Domain adaptation: S2R-ViT [3], DUSA [4], CUDA-X [18]
7 | - Learning from others' predictions: R&B-POP [17]
8 | - **Model Adaptation**
9 | - MACP [5]
10 | - CoPEFT [13]
11 | - **Open Heterogeneous Collaborative Perception**
12 | - HEAL [6]
13 | - **New Perception Tasks**
14 | - Multi-Object Cooperative Tracking: DMSTrack [7], MOT-CUP [8]
15 | - Collaborative Semantic Occupancy Prediction: CoHFF [9]
16 | - Cooperative Trajectory Forecasting: V2X-Graph [10]
17 | - **Extreme Environments**
18 | - DSRC [12], MDD [15], RCP-Bench [19]
19 | - **End-to-End Autonomous Driving**
20 | - UniV2X [16]
21 |
22 | ## References
23 | 1. CO3: Cooperative Unsupervised 3D Representation Learning for Autonomous Driving (*ICLR'23*) [[`pdf`](https://arxiv.org/abs/2206.04028)] [[`code`](https://github.com/Runjian-Chen/CO3)] 
24 | 2. SSC3OD: Sparsely Supervised Collaborative 3D Object Detection from LiDAR Point Clouds (*SMC'23*) [[`pdf`](https://arxiv.org/abs/2307.00717)]
25 | 3. S2R-ViT for Multi-Agent Cooperative Perception: Bridging the Gap from Simulation to Reality (*arXiv'23*) [[`pdf`](https://arxiv.org/abs/2307.07935)]
26 | 4. DUSA: Decoupled Unsupervised Sim2Real Adaptation for Vehicle-to-Everything Collaborative Perception (*MM'23*) [[`pdf`](https://dl.acm.org/doi/10.1145/3581783.3611948)] [[`code`](https://github.com/refkxh/DUSA)] 
27 | 5. MACP: Efficient Model Adaptation for Cooperative Perception (*WACV'24*) [[`pdf`](https://arxiv.org/abs/2310.16870)] [[`code`](https://github.com/PurdueDigitalTwin/MACP)] 
28 | 6. HEAL: An Extensible Framework for Open Heterogeneous Collaborative Perception (*ICLR'24*) [[`pdf`](https://openreview.net/forum?id=KkrDUGIASk)] [[`code`](https://github.com/yifanlu0227/HEAL)] 
29 | 7. Probabilistic 3D Multi-Object Cooperative Tracking for Autonomous Driving via Differentiable Multi-Sensor Kalman Filter (*ICRA'24*) [[`pdf`](https://arxiv.org/abs/2309.14655)] [[`code`](https://github.com/eddyhkchiu/DMSTrack)] 
30 | 8. Collaborative Multi-Object Tracking with Conformal Uncertainty Propagation (*RAL'24*) [[`pdf`](https://arxiv.org/abs/2303.14346)] [[`code`](https://github.com/susanbao/mot_cup)] 
31 | 9. Collaborative Semantic Occupancy Prediction with Hybrid Feature Fusion in Connected Automated Vehicles (*CVPR'24*) [[`pdf`](https://arxiv.org/abs/2402.07635)] [[`code`](https://github.com/rruisong/CoHFF)] 
32 | 10. Learning Cooperative Trajectory Representations for Motion Forecasting (*NeurIPS'24*) [[`pdf`](https://openreview.net/pdf?id=mcY221BgKi)] [[`code`](https://github.com/AIR-THU/V2X-Graph)] 
33 | 11. CoDTS: Enhancing Sparsely Supervised Collaborative Perception with a Dual Teacher-Student Framework (*AAAI'25*) [[`pdf`](https://arxiv.org/abs/2412.08344)] [[`code`](https://github.com/CatOneTwo/CoDTS)] 
34 | 12. DSRC: Learning Density-insensitive and Semantic-aware Collaborative Representation against Corruptions (*AAAI'25*) [[`pdf`](https://arxiv.org/abs/2412.10739)] [[`code`](https://github.com/Terry9a/DSRC)] 
35 | 13. CoPEFT: Fast Adaptation Framework for Multi-Agent Collaborative Perception with Parameter-Efficient Fine-Tuning (*AAAI'25*) [[`pdf`](https://arxiv.org/abs/2502.10705)] [[`code`](https://github.com/fengxueguiren/CoPEFT)] 
36 | 14. Learning to Detect Objects from Multi-Agent LiDAR Scans without Manual Labels (*CVPR'25*) [[`pdf`](https://arxiv.org/abs/2503.08421)] [[`code`](https://github.com/xmuqimingxia/DOtA)] 
37 | 15. V2X-R: Cooperative LiDAR-4D Radar Fusion with Denoising Diffusion for 3D Object Detection (*CVPR'25*) [[`pdf`](https://arxiv.org/abs/2411.08402)] [[`code`](https://github.com/ylwhxht/V2X-R)] 
38 | 16. End-to-End Autonomous Driving through V2X Cooperation (*AAAI'25*) [[`pdf`](https://arxiv.org/abs/2404.00717)] [[`code`](https://github.com/AIR-THU/UniV2X)] 
39 | 17. Learning 3D Perception from Others' Predictions (*ICLR'25*) [[`pdf`](https://openreview.net/forum?id=Ylk98vWQuQ)] [[`code`](https://github.com/jinsuyoo/rnb-pop)] 
40 | 18. CUDA-X: Unsupervised Domain-Adaptive Vehicle-to-Everything Collaboration via Knowledge Transfer and Alignment (*TNNLS'25*) [[`pdf`](https://ieeexplore.ieee.org/document/10891961)]
41 | 19. RCP-Bench: Benchmarking Robustness for Collaborative Perception Under Diverse Corruptions (*CVPR'25*) [[`pdf`](https://openaccess.thecvf.com/content/CVPR2025/papers/Du_RCP-Bench_Benchmarking_Robustness_for_Collaborative_Perception_Under_Diverse_Corruptions_CVPR_2025_paper.pdf)] [[`code`](https://github.com/LuckyDush/RCP-Bench)] 
42 |
--------------------------------------------------------------------------------
/real_world_methods.md:
--------------------------------------------------------------------------------
1 | # A summary of state-of-the-art collaborative perception methods for real-world issues
2 |
3 |
10 | ## Table
11 | | Method | Venue | Modality | Scheme | Loc Error | Comm Issue | Discrep | Security | Code |
12 | |:----------------------:|:---------------:|:----------------:|:--------------------:|:---------------------:|:--------------------:|:-------------------:|:---------------------:|:-------------------------------------------------------------------------------:|
13 | | RobustV2VNet [1] | CoRL'20 | LiDAR | I | Loc, Pos | - | - | - | - |
14 | | AOMAC [2] | ICCV'21 | LiDAR | I | - | - | - | Attack | - |
15 | | P-CNN [3] | IoT'21 | Camera | E | - | - | - | Privacy | - |
16 | | FPV-RCNN [4] | RAL'22 | LiDAR | I | Loc, Pos | - | - | - | [Link<sup>n</sup>](https://github.com/YuanYunshuang/FPV_RCNN) |
17 | | TCLF [5] | CVPR'22 | LiDAR | L | - | Laten | - | - | [Link<sup>v</sup>](https://github.com/AIR-THU/DAIR-V2X) |
18 | | V2X-ViT [6] | ECCV'22 | LiDAR | I | Loc, Pos | Laten | - | - | [Link<sup>o</sup>](https://github.com/DerrickXuNu/v2x-vit) |
19 | | SyncNet [7] | ECCV'22 | LiDAR | I | - | Laten | - | - | [Link<sup>c</sup>](https://github.com/MediaBrain-SJTU/SyncNet) |
20 | | TaskAgnostic [8] | CoRL'22 | LiDAR | I | - | - | Task | - | [Link<sup>c</sup>](https://github.com/coperception/star) |
21 | | SecPCV [9] | TITS'22 | LiDAR | E | - | - | - | Privacy | - |
22 | | ModelAgnostic [10] | ICRA'23 | LiDAR | L | - | - | Model | - | [Link<sup>o</sup>](https://github.com/DerrickXuNu/model_anostic) |
23 | | MPDA [11] | ICRA'23 | LiDAR | I | - | - | Model | - | [Link<sup>o</sup>](https://github.com/DerrickXuNu/MPDA) |
24 | | CoAlign [12] | ICRA'23 | LiDAR | I, L | Loc, Pos | - | - | - | [Link<sup>o</sup>](https://github.com/yifanlu0227/CoAlign) |
25 | | LCRN [13] | TIV'23 | LiDAR | L | - | Loss | - | - | - |
26 | | OptiMatch [14] | IV'23 | LiDAR | L | Loc, Pos | - | - | - | - |
27 | | P2OD [15] | IoT'23 | Camera | E | - | - | - | Privacy | - |
28 | | ROBOSAC [16] | ICCV'23 | LiDAR | I | - | - | - | Attack | [Link<sup>c</sup>](https://github.com/coperception/ROBOSAC) |
29 | | FFNet [17] | NeurIPS'23 | LiDAR | I | - | Laten | - | - | [Link<sup>v</sup>](https://github.com/haibao-yu/FFNet-VIC3D) |
30 | | CoBEVFlow [18] | NeurIPS'23 | LiDAR | I | - | Laten | - | - | [Link<sup>o</sup>](https://github.com/MediaBrain-SJTU/CoBEVFlow) |
31 | | How2comm [19] | NeurIPS'23 | LiDAR | I | - | Laten | - | - | [Link<sup>o</sup>](https://github.com/ydk122024/How2comm) |
32 | | FeaCo [20] | MM'23 | LiDAR | I | Loc, Pos | - | - | - | [Link<sup>o</sup>](https://github.com/jmgu0212/FeaCo) |
33 | | ERMVP [21] | CVPR'24 | LiDAR | I | Loc, Pos | - | - | - | [Link<sup>o</sup>](https://github.com/Terry9a/ERMVP) |
34 | | MRCNet [22] | CVPR'24 | LiDAR | I | Loc, Pos | - | - | - | [Link<sup>o</sup>](https://github.com/IndigoChildren/collaborative-perception-MRCNet) |
35 | | RoCo [23] | MM'24 | LiDAR | I | Loc, Pos | - | - | - | [Link<sup>o</sup>](https://github.com/HuangZhe885/RoCo) |
36 | | PnPDA [24] | ECCV'24 | LiDAR | I | - | - | Model | - | [Link<sup>o</sup>](https://github.com/luotianyou349/PnPDA) |
37 | | Hetecooper [25] | ECCV'24 | LiDAR | I | - | - | Model | - | - |
38 | | NEAT [26] | ECCV'24 | LiDAR | I | Loc, Pos | Laten | - | - | - |
39 | | V2X-INCOP [27] | TIV'24 | LiDAR | I | - | Inter | - | - | - |
40 | | MADE [28] | IROS'24 | LiDAR | I | - | - | - | Attack | - |
41 | | CP-Guard [29] | AAAI'25 | LiDAR | I | - | - | - | Attack | [Link<sup>c</sup>](https://github.com/CP-Security/CP-Guard) |
42 | | PLDA [30] | AAAI'25 | LiDAR | I | - | - | Model | - | - |
43 | | BEVSync [31] | AAAI'25 | Camera | I | - | Laten | - | - | - |
44 | | STAMP [32] | ICLR'25 | LiDAR, Camera | I | - | - | Model, Task | - | [Link<sup>o</sup>](https://github.com/taco-group/STAMP) |
45 | | PolyInter [33] | CVPR'25 | LiDAR | I | - | - | Model | - | - |
46 |
47 |
48 | Notes:
49 | - Schemes include early (E), intermediate (I) and late (L) collaboration.
50 | - **Loc Error** includes localization (**Loc**) and pose (**Pos**) errors.
51 | - **Comm Issue** includes latency (**Laten**), interruption (**Inter**) and loss (**Loss**); see the latency-compensation sketch below.
52 | - **Discrep** includes model (**Model**) and task (**Task**) discrepancies.
53 | - **Security** includes attack defense (**Attack**) and privacy protection (**Privacy**).
54 | - **Code Framework**: o ([OpenCOOD](https://github.com/DerrickXuNu/OpenCOOD)), v ([VIC3D](https://github.com/AIR-THU/DAIR-V2X)), c ([CoPerception](https://github.com/coperception/coperception)), n (Non-mainstream framework)
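For intuition on the latency issue, here is a hedged NumPy sketch of constant-velocity latency compensation for late fusion, loosely in the spirit of TCLF [5]; all names, timestamps, and velocities are illustrative assumptions:

```python
import numpy as np

def compensate_latency(boxes, velocities, t_capture, t_ego):
    """boxes: (N, 2) centers; velocities: (N, 2) in m/s, tracked from past frames."""
    dt = t_ego - t_capture          # transmission + processing delay in seconds
    return boxes + velocities * dt  # forecast centers to the ego timestamp

recv_boxes = np.array([[12.0, 3.0], [30.0, -2.0]])  # received (stale) detections
recv_vel = np.array([[8.0, 0.0], [-5.0, 1.0]])      # velocities from earlier messages
aligned = compensate_latency(recv_boxes, recv_vel, t_capture=10.00, t_ego=10.20)
print(aligned)  # a box moving at 8 m/s shifts 1.6 m over a 200 ms delay
```

Feature-level methods such as SyncNet [7] and CoBEVFlow [18] perform the analogous compensation on intermediate features rather than on output boxes.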
55 |
56 | Back to [Contents](README.md) 🔙
57 |
58 | ## References
59 | ### Published
60 | 1. Learning to Communicate and Correct Pose Errors (CoRL'20) [[`pdf`](https://arxiv.org/abs/2011.05289)]
61 | 2. Adversarial attacks on multi-agent communication (ICCV'21) [[`pdf`](https://arxiv.org/abs/2101.06560)]
62 | 3. Toward lightweight, privacy-preserving cooperative object classification for connected autonomous vehicles (IoT'21) [[`pdf`](https://ieeexplore.ieee.org/document/9468670)]
63 | 4. Keypoints-Based Deep Feature Fusion for Cooperative Vehicle Detection of Autonomous Driving (RAL'22) [[`pdf`](https://arxiv.org/abs/2109.11615)] [[`code`](https://github.com/YuanYunshuang/FPV_RCNN)]
64 | 5. DAIR-V2X: A Large-Scale Dataset for Vehicle-Infrastructure Cooperative 3D Object Detection (CVPR'22) [[`pdf`](https://arxiv.org/abs/2204.05575)] [[`code`](https://github.com/AIR-THU/DAIR-V2X)]
65 | 6. V2X-ViT: Vehicle-to-everything cooperative perception with vision transformer (ECCV'22) [[`pdf`](https://arxiv.org/abs/2203.10638)] [[`code`](https://github.com/DerrickXuNu/v2x-vit)]
66 | 7. Latency-Aware Collaborative Perception (ECCV'22) [[`pdf`](https://arxiv.org/abs/2207.08560)] [[`code`](https://github.com/MediaBrain-SJTU/SyncNet)]
67 | 8. Multi-robot scene completion: Towards task-agnostic collaborative perception (CoRL'22) [[`pdf`](https://openreview.net/forum?id=hW0tcXOJas2)] [[`code`](https://github.com/coperception/star)]
68 | 9. Edge-Cooperative Privacy-Preserving Object Detection Over Random Point Cloud Shares for Connected Autonomous Vehicles (TITS'22) [[`pdf`](https://ieeexplore.ieee.org/document/9928424)]
69 | 10. Model-Agnostic Multi-Agent Perception Framework (ICRA'23) [[`pdf`](https://arxiv.org/abs/2203.13168)] [[`code`](https://github.com/DerrickXuNu/model_anostic)]
70 | 11. Bridging the Domain Gap for Multi-Agent Perception (ICRA'23) [[`pdf`](https://arxiv.org/abs/2210.08451)] [[`code`](https://github.com/DerrickXuNu/MPDA)]
71 | 12. Robust Collaborative 3D Object Detection in Presence of Pose Errors (ICRA'23) [[`pdf`](https://arxiv.org/abs/2211.07214)] [[`code`](https://github.com/yifanlu0227/CoAlign)]
72 | 13. Learning for Vehicle-to-Vehicle Cooperative Perception under Lossy Communication (TIV'23) [[`pdf`](https://arxiv.org/abs/2212.08273)]
73 | 14. A Cooperative Perception System Robust to Localization Errors (IV'23) [[`pdf`](https://arxiv.org/abs/2210.06289)]
74 | 15. Achieving Lightweight and Privacy-Preserving Object Detection for Connected Autonomous Vehicles (IoT'23) [[`pdf`](https://ieeexplore.ieee.org/document/9913215)]
75 | 16. Among Us: Adversarially Robust Collaborative Perception by Consensus (ICCV'23) [[`pdf`](https://arxiv.org/abs/2303.09495)] [[`code`](https://github.com/coperception/ROBOSAC)]
76 | 17. Flow-Based Feature Fusion for Vehicle-Infrastructure Cooperative 3D Object Detection (NeurIPS'23) [[`pdf`](https://openreview.net/forum?id=gsglrhvQxX)] [[`code`](https://github.com/haibao-yu/FFNet-VIC3D)]
77 | 18. Robust Asynchronous Collaborative 3D Detection via Bird’s Eye View Flow (NeurIPS'23) [[`pdf`](https://arxiv.org/abs/2309.16940)] [[`code`](https://github.com/MediaBrain-SJTU/CoBEVFlow)]
78 | 19. How2comm: Communication-Efficient and Collaboration-Pragmatic Multi-Agent Perception (NeurIPS'23) [[`pdf`](https://openreview.net/forum?id=Dbaxm9ujq6)] [[`code`](https://github.com/ydk122024/How2comm)]
79 | 20. FeaCo: Reaching Robust Feature-Level Consensus in Noisy Pose Conditions (MM'23) [[`pdf`](https://dl.acm.org/doi/abs/10.1145/3581783.3611880)]
80 | 21. ERMVP: Communication-Efficient and Collaboration-Robust Multi-Vehicle Perception in Challenging Environments (CVPR'24) [[`pdf`](https://openaccess.thecvf.com/content/CVPR2024/papers/Zhang_ERMVP_Communication-Efficient_and_Collaboration-Robust_Multi-Vehicle_Perception_in_Challenging_Environments_CVPR_2024_paper.pdf)]
81 | 22. Multi-agent Collaborative Perception via Motion-aware Robust Communication Network (CVPR'24) [[`pdf`](https://openaccess.thecvf.com/content/CVPR2024/papers/Hong_Multi-agent_Collaborative_Perception_via_Motion-aware_Robust_Communication_Network_CVPR_2024_paper.pdf)]
82 | 23. RoCo: Robust Cooperative Perception By Iterative Object Matching and Pose Adjustment (MM'24) [[`pdf`](https://openreview.net/forum?id=TFFnsgu2Pr)]
83 | 24. Plug and Play: A Representation Enhanced Domain Adapter for Collaborative Perception (ECCV'24) [[`pdf`](https://www.ecva.net/papers/eccv_2024/papers_ECCV/papers/10564.pdf)] [[`code`](https://github.com/luotianyou349/PnPDA)]
84 | 25. Hetecooper: Feature Collaboration Graph for Heterogeneous Collaborative Perception (ECCV'24) [[`pdf`](https://www.ecva.net/papers/eccv_2024/papers_ECCV/papers/07071.pdf)]
85 | 26. Align before Collaborate: Mitigating Feature Misalignment for Robust Multi-Agent Perception (ECCV'24) [[`pdf`](https://www.ecva.net/papers/eccv_2024/papers_ECCV/papers/00560.pdf)]
86 | 27. Interruption-Aware Cooperative Perception for V2X Communication-Aided Autonomous Driving (TIV'24) [[`pdf`](https://arxiv.org/abs/2304.11821)]
87 | 28. Malicious Agent Detection for Robust Multi-Agent Collaborative Perception (IROS'24) [[`pdf`](https://arxiv.org/abs/2310.11901)]
88 | 29. CP-Guard: Malicious Agent Detection and Defense in Collaborative Bird’s Eye View Perception (AAAI'25) [[`pdf`](https://arxiv.org/abs/2412.12000)] [[`code`](https://github.com/CP-Security/CP-Guard)]
89 | 30. Privacy-Preserving V2X Collaborative Perception Integrating Unknown Collaborators (AAAI'25) [[`pdf`](https://ojs.aaai.org/index.php/AAAI/article/view/32619)]
90 | 31. BEVSync: Asynchronous Data Alignment for Camera-based Vehicle-Infrastructure Cooperative Perception Under Uncertain Delays (AAAI'25) [[`pdf`](https://ojs.aaai.org/index.php/AAAI/article/view/33611)]
91 | 32. STAMP: Scalable Task And Model-agnostic Collaborative Perception (ICLR'25) [[`pdf`](https://arxiv.org/abs/2501.18616)] [[`code`](https://github.com/taco-group/STAMP)]
92 | 33. One is Plenty: A Polymorphic Feature Interpreter for Immutable Heterogeneous Collaborative Perception (CVPR'25) [[`pdf`](https://arxiv.org/abs/2411.16799)]
93 |
94 |
--------------------------------------------------------------------------------