├── README.md
└── NeurIPS2023.md

/README.md:
--------------------------------------------------------------------------------
# NeurIPS2024-Papers-about-Autonomous-Driving

## End-to-End Autonomous Driving

**Bench2Drive: Towards Multi-Ability Benchmarking of Closed-Loop End-To-End Autonomous Driving**

- paper: https://arxiv.org/pdf/2406.03877
- code: https://github.com/Thinklab-SJTU/Bench2Drive

**NAVSIM: Data-Driven Non-Reactive Autonomous Vehicle Simulation and Benchmarking**

- paper: https://arxiv.org/pdf/2406.15349
- code: https://github.com/autonomousvision/navsim

**E2E-MFD: Towards End-to-End Synchronous Multimodal Fusion Detection**

- paper: https://arxiv.org/pdf/2403.09323
- code: https://github.com/icey-zhang/E2E-MFD

**Autonomous Driving with Spiking Neural Networks**

- paper: https://arxiv.org/pdf/2405.19687
- code: https://github.com/ridgerchu/SAD

## Vision Foundation Model

**Fine-grained Image-to-LiDAR Contrastive Distillation with Visual Foundation Models**

- paper: https://arxiv.org/pdf/2405.14271
- code: https://github.com/Eaphan/OLIVINE

**Learning Frequency-Adapted Vision Foundation Model for Domain Generalized Semantic Segmentation**

- paper:
- code: https://github.com/BiQiWHU/FADA

## World Model

**DrivingDojo Dataset: Advancing Interactive and Knowledge-Enriched Driving World Model**

- paper:
- code: https://github.com/Robertwyq/Drivingdojo

**Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability**

- paper: https://arxiv.org/pdf/2405.17398
- code: https://github.com/OpenDriveLab/Vista

## 3D Object Detection

**LION: Linear Group RNN for 3D Object Detection in Point Clouds**

- paper: https://arxiv.org/pdf/2407.18232
- code: https://github.com/happinesslz/LION

**Real-time Stereo-based 3D Object Detection for Streaming Perception**

- paper: https://arxiv.org/pdf/2410.12394
- code: https://github.com/weiyangdaren/streamDSGN-pytorch

## LLM

**Continuously Learning, Adapting, and Improving: A Dual-Process Approach to Autonomous Driving**

- paper: https://arxiv.org/pdf/2405.15324
- code: https://github.com/PJLab-ADG/LeapAD

**VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks**

- paper: https://arxiv.org/pdf/2406.08394
- code:

## Smart Agent Simulation

**BehaviorGPT: Smart Agent Simulation for Autonomous Driving with Next-Patch Prediction**

- paper: https://arxiv.org/pdf/2405.17372
- code:

## OCC

**RadarOcc: Robust 3D Occupancy Prediction with 4D Imaging Radar**

- paper: https://arxiv.org/pdf/2405.14014
- code: https://github.com/Toytiny/RadarOcc/

**ZOPP: A Framework of Zero-shot Offboard Panoptic Perception for Autonomous Driving**

- paper: https://arxiv.org/pdf/2411.05311
- code: https://github.com/PJLab-ADG/ZOPP

**OPUS: Occupancy Prediction Using a Sparse Set**

- paper: https://arxiv.org/pdf/2409.09350
- code: https://github.com/jbwang1997/OPUS

## Point Cloud

**Is Your LiDAR Placement Optimized for 3D Scene Understanding?**

- paper: https://arxiv.org/pdf/2403.17009
- code: https://github.com/ywyeli/Place3D

**TARSS-Net: Temporal-Aware Radar Semantic Segmentation Network**

- paper: https://openreview.net/pdf?id=5AeLrXb9sQ
- code: https://github.com/zlw9161/TARSS-Net

## Visual Grounding

**SimVG: A Simple Framework for Visual Grounding with Decoupled Multi-modal Fusion**

- paper: https://arxiv.org/pdf/2409.17531
- code: https://github.com/Dmmm1997/SimVG

## NeRF

**NeuroGauss4D-PCI: 4D Neural Fields and Gaussian Deformation Fields for Point Cloud Interpolation**

- paper: https://arxiv.org/pdf/2405.14241
- code: https://github.com/jiangchaokang/NeuroGauss4D-PCI

**Rad-NeRF: Ray-decoupled Training of Neural Radiance Field**

- paper: https://openreview.net/pdf?id=nBrnfYeKf9
- code: https://github.com/thu-nics/Rad-NeRF

## Topology Reasoning

**Reasoning Multi-Agent Behavioral Topology for Interactive Autonomous Driving**

- paper: https://arxiv.org/pdf/2409.18031
- code: https://github.com/OpenDriveLab/BeTop

## Multi-Modal

## Pre-Training

## Radar

## Depth Estimation

## Motion Planning

## Map Vectorization

## Simulator

## Trajectory Prediction

## Navigation

## Postscript

This repository was compiled by [Rujia Wang](https://github.com/shenxiaowrj).

If you have any questions about the paper list, please do not hesitate to email [me](mailto:rujiawang329@gmail.com) or open an issue on GitHub.

--------------------------------------------------------------------------------
/NeurIPS2023.md:
--------------------------------------------------------------------------------
# NeurIPS2023-Papers-about-Autonomous-Driving

## 3D Object Detection

**Query-based Temporal Fusion with Explicit Motion for 3D Object Detection**

- Paper: https://openreview.net/attachment?id=gySmwdmVDF&name=pdf
- Code: https://github.com/AlmoonYsl/QTNet

**RangePerception: Taming LiDAR Range View for Efficient and Accurate 3D Object Detection**

- Paper: https://openreview.net/attachment?id=9kFQEJSyCM&name=pdf

**CluB: Cluster Meets BEV for LiDAR-Based 3D Object Detection**

- Paper: https://openreview.net/attachment?id=jIhX7SpfCz&name=pdf

**3D Copy-Paste: Physically Plausible Indoor Object Insertion for Monocular 3D Object Detection**

- Paper: https://openreview.net/attachment?id=d86B6Mdweq&name=pdf
- Code: https://github.com/gyhandy/3D-Copy-Paste

**Diffusion-SS3D: Diffusion Model for Semi-supervised 3D Object Detection**

- Paper: https://openreview.net/attachment?id=YoghyvSG0H&name=pdf
- Code: https://github.com/luluho1208/Diffusion-SS3D

**Unleash The Potential of Image Branch for Cross-modal 3D Object Detection**

- Paper: https://openreview.net/attachment?id=eYCGrGdKf3&name=pdf

**Depth-discriminative Metric Learning for Monocular 3D Object Detection**

- Paper: https://openreview.net/attachment?id=ZNBblMEP16&name=pdf

**Collaborative Novel Box Discovery and Cross-modal Alignment for Open-vocabulary 3D Object Detection**

- Paper: https://openreview.net/attachment?id=QW5ouyyIgG&name=pdf
- Code: https://github.com/yangcaoai/CoDA_NeurIPS2023

**HEDNet: A Hierarchical Encoder-Decoder Network for 3D Object Detection in Point Clouds**

- Paper: https://openreview.net/attachment?id=MUwr2YVJfN&name=pdf
- Code: https://github.com/zhanggang001/HEDNet

**STXD: Structural and Temporal Cross-Modal Distillation for Multi-View 3D Object Detection**

- Paper: https://openreview.net/attachment?id=Grz2ijKrWI&name=pdf

**Leveraging Vision-Centric Multi-Modal Expertise for 3D Object Detection**

- Paper: https://openreview.net/attachment?id=pQF9kbM8Ea&name=pdf
- Code: https://github.com/OpenDriveLab/Birds-eye-view-Perception

## Multi-Modal

**OpenShape: Scaling Up 3D Shape Representation Towards Open-World Understanding**

- Paper: https://openreview.net/attachment?id=Eu4Kkefq7p&name=pdf
- Code: https://colin97.github.io/OpenShape/

**PointGPT: Auto-regressively Generative Pre-training from Point Clouds**

- Paper: https://openreview.net/attachment?id=rqE0fEQDqs&name=pdf
- Code: https://github.com/CGuangyan-BIT/PointGPT

## Pre-Training

**AD-PT: Autonomous Driving Pre-Training with Large-scale Point Cloud Dataset**

- Paper: https://openreview.net/attachment?id=eIFZtkshgH&name=pdf
- Code: https://jiakangyuan.github.io/AD-PT.github.io/

**PRED: Pre-training Via Semantic Rendering on LiDAR Point Clouds**

- Paper: https://openreview.net/attachment?id=rUldfB4SPT&name=pdf
- Code: https://github.com/PRED4pc/PRED

## OCC

**Occ3D: A Large-Scale 3D Occupancy Prediction Benchmark for Autonomous Driving**

- Paper: https://openreview.net/attachment?id=ApqgcSnhjh&name=pdf
- Code: https://github.com/Tsinghua-MARS-Lab/Occ3D

**POP-3D: Open-Vocabulary 3D Occupancy Prediction from Images**

- Paper: https://openreview.net/attachment?id=eBXM62SqKY&name=pdf
- Code: https://github.com/vobecant/POP3D

## Cooperative Perception

**Flow-Based Feature Fusion for Vehicle-Infrastructure Cooperative 3D Object Detection**

- Paper: https://openreview.net/attachment?id=gsglrhvQxX&name=pdf
- Code: https://github.com/haibao-yu/FFNet-VIC3D

**MonoUNI: A Unified Vehicle and Infrastructure-side Monocular 3D Object Detection Network with Sufficient Depth Clues**

- Paper: https://openreview.net/attachment?id=v2oGdhbKxi&name=pdf

## Point Cloud

**Segment Any Point Cloud Sequences By Distilling Vision Foundation Models**

- Paper: https://openreview.net/attachment?id=i39yXaUKuF&name=pdf
- Code: https://github.com/youquanl/Segment-Any-Point-Cloud

**ConDaFormer: Disassembled Transformer with Local Structure Enhancement for 3D Point Cloud Understanding**

- Paper: https://openreview.net/attachment?id=kKXJkiniOx&name=pdf

## Visual Grounding

**CityRefer: Geography-aware 3D Visual Grounding Dataset on City-scale Point Cloud Data**

- Paper: https://openreview.net/attachment?id=fr3OT4rosO&name=pdf
- Code: https://github.com/ATR-DBI/CityRefer

## NeRF

**DäRF: Boosting Radiance Fields from Sparse Input Views with Monocular Depth Adaptation**

- Paper: https://openreview.net/attachment?id=rsrfEIdawr&name=pdf
- Code: https://github.com/KU-CVLAB/DaRF

## Radar

**Echoes Beyond Points: Unleashing The Power of Raw Radar Data in Multi-modality Fusion**

- Paper: https://openreview.net/attachment?id=LZzsn51DPr&name=pdf
- Code: https://github.com/tusen-ai/EchoFusion

## Depth Estimation

**RoboDepth: Robust Out-of-Distribution Depth Estimation Under Corruptions**

- Paper: https://openreview.net/attachment?id=SNznC08OOO&name=pdf
- Code: https://github.com/ldkong1205/RoboDepth

**The Surprising Effectiveness of Diffusion Models for Optical Flow and Monocular Depth Estimation**

- Paper: https://openreview.net/attachment?id=jDIlzSU8wJ&name=pdf
- Code: https://diffusion-vision.github.io/

**Dynamo-Depth: Fixing Unsupervised Depth Estimation for Dynamical Scenes**

- Paper: https://openreview.net/attachment?id=R6qMmdl4qP&name=pdf
- Code: https://github.com/YihongSun/Dynamo-Depth

## Motion Planning

**GraphMP: Graph Neural Network-based Motion Planning with Efficient Graph Search**

- Paper: https://openreview.net/attachment?id=cQdc9Dyk4i&name=pdf

**Accelerating Motion Planning Via Optimal Transport**

- Paper: https://openreview.net/attachment?id=9B9J8X23LK&name=pdf

**Thinker: Learning to Plan and Act**

- Paper: https://openreview.net/attachment?id=mumEBl0arj&name=pdf
- Code: https://github.com/stephen-chung-mh/thinker

## Map Vectorization

**Online Map Vectorization for Autonomous Driving: A Rasterization Perspective**

- Paper: https://openreview.net/attachment?id=YvO5yTVv5Y&name=pdf
- Code: https://github.com/ZhangGongjie/MapVR

## Topology Reasoning

**OpenLane-V2: A Topology Reasoning Benchmark for Scene Understanding in Autonomous Driving**

- Paper: https://openreview.net/attachment?id=OMOOO3ls6g&name=pdf
- Code: https://github.com/OpenDriveLab/OpenLane-V2

## Simulator

**Waymax: An Accelerated, Data-Driven Simulator for Large-Scale Autonomous Driving Research**

- Paper: https://openreview.net/attachment?id=7VSBaP2OXN&name=pdf
- Code: https://github.com/waymo-research/waymax

## Trajectory Prediction

**What Truly Matters in Trajectory Prediction for Autonomous Driving?**

- Paper: https://openreview.net/attachment?id=nG35q8pNL9&name=pdf

**BCDiff: Bidirectional Consistent Diffusion for Instantaneous Trajectory Prediction**

- Paper: https://openreview.net/attachment?id=FOFJmR1oxt&name=pdf

**SiT Dataset: Socially Interactive Pedestrian Trajectory Dataset for Social Navigation Robots**

- Paper: https://openreview.net/attachment?id=gMYsxTin4x&name=pdf
- Code: https://github.com/SPALaboratory/SiT-Dataset

## Navigation

**A Diffusion-Model of Joint Interactive Navigation**

- Paper: https://openreview.net/attachment?id=2yXExAl0FW&name=pdf

**NeRF-IBVS: Visual Servo Based on NeRF for Visual Localization and Navigation**

- Paper: https://openreview.net/attachment?id=9pLaDXX8m3&name=pdf

## Postscript

This repository was compiled by [Rujia Wang](https://github.com/shenxiaowrj).

If you have any questions about the paper list, please do not hesitate to email [me](mailto:rujiawang329@gmail.com) or open an issue on GitHub.
--------------------------------------------------------------------------------