# CVPR-MIA

Recent papers on medical image analysis published at CVPR. [[Github](https://github.com/MedAIerHHL/CVPR-MIA/)]

🌟🌟🌟 To add to or correct this list (highlights, orals, and so on), please contact me at **1729766533 [at] qq [dot] com** or **send a pull request**.

Last updated: 2025/04/09

# CVPR2025

## Image Generation (图像生成)

* Latent Drifting in Diffusion Models for Counterfactual Medical Image Synthesis. [[Paper](https://arxiv.org/abs/2412.20651)][[Code](https://latentdrifting.github.io/)]
* Blood Flow Speed Estimation with Optical Coherence Tomography Angiography Images. [[Paper](https://www3.cs.stonybrook.edu/~hling/publication/octa-flow-cvpr25.pdf)][[Code](https://github.com/Spritea/OCTA-Flow)]

## Image Segmentation (图像分割)

- nnWNet: Rethinking the Use of Transformers in Biomedical Image Segmentation and Calling for a Unified Evaluation Benchmark. [Paper][Code]
- Interactive Medical Image Segmentation: A Benchmark Dataset and Baseline. [[Paper](https://arxiv.org/abs/2411.12814)][[Code](https://github.com/uni-medical/IMIS-Bench)]
- Steady Progress Beats Stagnation: Mutual Aid of Foundation and Conventional Models in Mixed Domain Semi-Supervised Medical Image Segmentation. [[Paper](https://export.arxiv.org/abs/2503.16997)][[Code](https://dycon25.github.io/)]
- DyCON: Dynamic Uncertainty-aware Consistency and Contrastive Learning for Semi-supervised Medical Image Segmentation. [[Paper](https://arxiv.org/abs/2504.04566)][[Code](https://dycon25.github.io/)]
- LesionLocator: Zero-Shot Universal Tumor Segmentation and Tracking in 3D Whole-Body Imaging. [[Paper](https://arxiv.org/pdf/2502.20985)][[Code](https://github.com/MIC-DKFZ/LesionLocator)]
- EffiDec3D: An Optimized Decoder for High-Performance and Efficient 3D Medical Image Segmentation.
[Paper][Code]
- Advancing Generalizable Tumor Segmentation with Anomaly-Aware Open-Vocabulary Attention Maps and Frozen Foundation Diffusion Models. [Paper][Code]
- Enhancing SAM with Efficient Prompting and Preference Optimization for Semi-supervised Medical Image Segmentation. [[Paper](https://arxiv.org/pdf/2503.04639)][Code]
- Boost the Inference with Co-training: A Depth-guided Mutual Learning Framework for Semi-supervised Medical Polyp Segmentation (RD-Net). [Paper][[Code](https://github.com/pingchuan/RD-Net)]
- Test-Time Domain Generalization via Universe Learning: A Multi-Graph Matching Approach for Medical Image Segmentation. [[Paper](https://arxiv.org/abs/2503.13012)][[Code](https://github.com/Yore0/TTDG-MGM)]

## Medical Pre-training & Foundation Model (预训练&基础模型)

* Multi-modal Vision Pre-training for Medical Image Analysis. [[Paper](https://arxiv.org/abs/2410.10604)][[Code](https://github.com/shaoao011/BrainMVP)]
* CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning. [[Paper](https://arxiv.org/abs/2504.13820)][[Code](https://github.com/LeapLabTHU/CheXWorld)]
* EchoWorld: Learning Motion-Aware World Models for Echocardiography Probe Guidance. [[Paper](https://arxiv.org/abs/2504.13065)][[Code](https://github.com/LeapLabTHU/EchoWorld)]

## Vision-Language Model (视觉-语言)

* VILA-M3: Enhancing Vision-Language Models with Medical Expert Knowledge.
[[Paper](https://arxiv.org/abs/2411.12915)][[Code](https://github.com/Project-MONAI/VLM-Radiology-Agent-Framework)]
* BIOMEDICA: An Open Biomedical Image-Caption Archive with Vision-Language Models derived from Scientific Literature. [[Paper](https://arxiv.org/abs/2501.07171v3)][[Project](https://minwoosun.github.io/biomedica-website/)]
* BiomedCoOp: Learning to Prompt for Biomedical Vision-Language Models. [[Paper](https://arxiv.org/abs/2411.15232)][[Code](https://github.com/HealthX-Lab/BiomedCoOp)]
* MIMO: A medical vision language model with visual referring multimodal input and pixel grounding multimodal output. [Paper][Code]
* Bringing CLIP to the Clinic: Dynamic Soft Labels and Negation-Aware Learning for Medical Analysis. [Paper][Code]
* Alignment, Mining and Fusion: Representation Alignment with Hard Negative Mining and Selective Knowledge Fusion for Medical Visual Question Answering. [Paper][Code]
* Enhanced Contrastive Learning with Multi-view Longitudinal Data for Chest X-ray Report Generation. [[Paper](https://arxiv.org/abs/2502.20056)][[Code](https://github.com/mk-runner/MLRG)]
* FOCUS: Knowledge-enhanced Adaptive Visual Compression for Few-shot Whole Slide Image Classification. [[Paper](https://arxiv.org/pdf/2411.14743)][[Code](https://github.com/dddavid4real/FOCUS)]
* MedUnifier: Unifying Vision-and-Language Pre-training on Medical Data with Vision Generation Task using Discrete Visual Representations. [[Paper](https://arxiv.org/pdf/2503.01019)][Code]

## Computational Pathology (计算病理)

- FOCUS: Knowledge-enhanced Adaptive Visual Compression for Few-shot Whole Slide Image Classification.
[[Paper](https://arxiv.org/abs/2411.14743)][[Code](https://github.com/dddavid4real/FOCUS)][[WeChat Article](https://mp.weixin.qq.com/s/1MYkitZ3btZUBOMcBg_ryw)]
- Distilled Prompt Learning for Incomplete Multimodal Survival Prediction. [[Paper](https://arxiv.org/pdf/2503.01653)][Code]
- Fast and Accurate Gigapixel Pathological Image Classification with Hierarchical Distillation Multi-Instance Learning. [[Paper](https://arxiv.org/abs/2502.21130)][[Code](https://github.com/JiuyangDong/HDMIL)]
- SlideChat: A Large Vision-Language Assistant for Whole-Slide Pathology Image Understanding. [[Paper](https://arxiv.org/abs/2410.11761)][Code]
- 2DMamba: Efficient State Space Model for Image Representation with Applications on Giga-Pixel Whole Slide Image Classification. [[Paper](https://arxiv.org/abs/2412.00678)][Code]
- CPath-Omni: A Unified Multimodal Foundation Model for Patch and Whole Slide Image Analysis in Computational Pathology. [[Paper](https://arxiv.org/abs/2412.12077)][Code]
- MERGE: Multi-faceted Hierarchical Graph-based GNN for Gene Expression Prediction from Whole Slide Histopathology Images. [[Paper](https://arxiv.org/html/2412.02601v1)][Code]
- HistoFS: Non-IID Histopathologic Whole Slide Image Classification via Federated Style Transfer with RoI-Preserving. [Paper][Code]
- M3amba: Memory Mamba is All You Need for Whole Slide Image Classification. [Paper][Code]
- Advancing Multiple Instance Learning with Continual Learning for Whole Slide Imaging. [Paper][Code]
- BioX-CPath: Biologically-driven Explainable Diagnostics for Multistain IHC Computational Pathology. [Paper][Code]
- Multi-Resolution Pathology-Language Pre-training Model with Text-Guided Visual Representation. [Paper][Code]
- TopoCellGen: Generating Histopathology Cell Topology with a Diffusion Model.
[[Paper](https://arxiv.org/abs/2412.06011)][Code]
- Multi-modal Topology-embedded Graph Learning for Spatially Resolved Genes Prediction from Pathology Images with Prior Gene Similarity Information. [Paper][Code]
- Robust Multimodal Survival Prediction with the Latent Differentiation Conditional Variational AutoEncoder. [Paper][Code]
- MExD: An Expert-Infused Diffusion Model for Whole-Slide Image Classification. [Paper][Code]
- Learning Heterogeneous Tissues with Mixture of Experts for Gigapixel Whole Slide Images. [Paper][Code]
- Unsupervised Foundation Model-Agnostic Slide-Level Representation Learning. [[Paper](https://arxiv.org/abs/2411.13623)][Code]
- WISE: A Framework for Gigapixel Whole-Slide-Image Lossless Compression. [Paper][Code]

## Others

* Q-PART: Quasi-Periodic Adaptive Regression with Test-time Training for Pediatric Left Ventricular Ejection Fraction Regression.
* Towards All-in-One Medical Image Re-Identification. [[Paper](https://arxiv.org/abs/2503.08173)][[Code](https://github.com/tianyuan168326/All-in-One-MedReID-Pytorch)]

# CVPR2024

## Image Reconstruction (图像重建)

- QN-Mixer: A Quasi-Newton MLP-Mixer Model for Sparse-View CT Reconstruction. [[Paper](https://arxiv.org/abs/2402.17951v1)][Code][[Project](https://towzeur.github.io/QN-Mixer/)]
- Fully Convolutional Slice-to-Volume Reconstruction for Single-Stack MRI. [[Paper](https://arxiv.org/abs/2312.03102)][[Code](http://github.com/seannz/svr)]
- Structure-Aware Sparse-View X-ray 3D Reconstruction. [[Paper](https://arxiv.org/abs/2311.10959)][[Code](https://github.com/caiyuanhao1998/SAX-NeRF)]
- Progressive Divide-and-Conquer via Subsampling Decomposition for Accelerated MRI. [[Paper](https://arxiv.org/abs/2403.10064)][[Code](https://github.com/ChongWang1024/PDAC)]

## Image Super-Resolution (图像超分)

- Learning Large-Factor EM Image Super-Resolution with Generative Priors.
[[Paper](https://openaccess.thecvf.com/content/CVPR2024/papers/Shou_Learning_Large-Factor_EM_Image_Super-Resolution_with_Generative_Priors_CVPR_2024_paper.pdf)][[Code](https://github.com/jtshou/GPEMSR)][[Video](https://youtu.be/LNSLQM5-YcM)]
- CycleINR: Cycle Implicit Neural Representation for Arbitrary-Scale Volumetric Super-Resolution of Medical Data. [[Paper](https://arxiv.org/abs/2404.04878v1)][Code]

## Image Registration (图像配准)

- Modality-Agnostic Structural Image Representation Learning for Deformable Multi-Modality Medical Image Registration. [[Paper](https://arxiv.org/abs/2402.18933)]
- **[Oral & Best Paper Candidate!!!] Correlation-aware Coarse-to-fine MLPs for Deformable Medical Image Registration. [[Paper](https://arxiv.org/abs/2406.00123)][[Code](https://github.com/MungoMeng/Registration-CorrMLP)]**

## Image Segmentation (图像分割)

- PrPSeg: Universal Proposition Learning for Panoramic Renal Pathology Segmentation. [[Paper](https://arxiv.org/abs/2402.19286)]
- Versatile Medical Image Segmentation Learned from Multi-Source Datasets via Model Self-Disambiguation. [[Paper](https://arxiv.org/abs/2311.10696)]
- Each Test Image Deserves A Specific Prompt: Continual Test-Time Adaptation for 2D Medical Image Segmentation. [[Paper](https://arxiv.org/abs/2311.18363)][[Code](https://github.com/Chen-Ziyang/VPTTA)]
- One-Prompt to Segment All Medical Images. [[Paper](https://arxiv.org/abs/2305.10300)][[Code](https://github.com/WuJunde/PromptUNet/tree/main)]
- Modality-agnostic Domain Generalizable Medical Image Segmentation by Multi-Frequency in Multi-Scale Attention. [[Paper](https://arxiv.org/abs/2405.06284)][Code][[Project](https://skawngus1111.github.io/MADGNet_project/)]
- Diversified and Personalized Multi-rater Medical Image Segmentation.
[[Paper](https://arxiv.org/pdf/2212.00601)][[Code](https://github.com/ycwu1997/D-Persona)]
- MAPSeg: Unified Unsupervised Domain Adaptation for Heterogeneous Medical Image Segmentation Based on 3D Masked Autoencoding and Pseudo-Labeling. [[Paper](https://arxiv.org/abs/2303.09373)][Code]
- Adaptive Bidirectional Displacement for Semi-Supervised Medical Image Segmentation. [[Paper](https://arxiv.org/abs/2405.00378)][[Code](https://github.com/chy-upc/ABD)]
- Cross-dimension Affinity Distillation for 3D EM Neuron Segmentation. [[Paper](https://openaccess.thecvf.com/content/CVPR2024/papers/Liu_Cross-Dimension_Affinity_Distillation_for_3D_EM_Neuron_Segmentation_CVPR_2024_paper.pdf)][[Code](https://github.com/liuxy1103/CAD)]
- ToNNO: Tomographic Reconstruction of a Neural Network’s Output for Weakly Supervised Segmentation of 3D Medical Images. [[Paper](https://arxiv.org/abs/2404.13103)][Code]
- Teeth-SEG: An Efficient Instance Segmentation Framework for Orthodontic Treatment based on Anthropic Prior Knowledge. [[Paper](https://arxiv.org/abs/2404.01013)][Code]
- Tyche: Stochastic In-Context Learning for Universal Medical Image Segmentation. [[Paper](https://arxiv.org/abs/2401.13650)][[Code](https://github.com/mariannerakic/tyche/)]
- Constructing and Exploring Intermediate Domains in Mixed Domain Semi-supervised Medical Image Segmentation. [[Paper](https://arxiv.org/abs/2404.08951)][[Code](https://github.com/MQinghe/MiDSS)]
- S2VNet: Universal Multi-Class Medical Image Segmentation via Clustering-based Slice-to-Volume Propagation.
[[Paper](https://arxiv.org/abs/2403.16646)][[Code](https://github.com/dyh127/S2VNet)]
- EMCAD: Efficient Multi-scale Convolutional Attention Decoding for Medical Image Segmentation. [[Paper](https://arxiv.org/abs/2405.06880)][[Code](https://github.com/SLDGroup/EMCAD)]
- Training Like a Medical Resident: Context-Prior Learning Toward Universal Medical Image Segmentation. [[Paper](https://arxiv.org/abs/2306.02416)][[Code](https://github.com/yhygao/universal-medical-image-segmentation)]
- ZePT: Zero-Shot Pan-Tumor Segmentation via Query-Disentangling and Self-Prompting. [[Paper](https://arxiv.org/abs/2312.04964)][Code]
- **[Oral!!!] MemSAM: Taming Segment Anything Model for Echocardiography Video Segmentation. [[Paper](https://github.com/dengxl0520/MemSAM/blob/main/paper.pdf)][[Code](https://github.com/dengxl0520/MemSAM/tree/main)]**
- PH-Net: Semi-Supervised Breast Lesion Segmentation via Patch-wise Hardness. [[Paper](https://openaccess.thecvf.com/content/CVPR2024/papers/Jiang_PH-Net_Semi-Supervised_Breast_Lesion_Segmentation_via_Patch-wise_Hardness_CVPR_2024_paper.pdf)][[Code](https://github.com/jjjsyyy/PH-Net)][[Video](https://cvpr.thecvf.com/virtual/2024/poster/30539)]

## Image Generation (图像生成)

- Learned representation-guided diffusion models for large-image generation. [[Paper](https://arxiv.org/abs/2312.07330)]
- MedM2G: Unifying Medical Multi-Modal Generation via Cross-Guided Diffusion with Visual Invariant. [[Paper](https://arxiv.org/html/2403.04290v1)]
- Towards Generalizable Tumor Synthesis. [[Paper](https://arxiv.org/abs/2402.19470v1)][[Code](https://github.com/MrGiovanni/DiffTumor)]
- Data-Efficient Unsupervised Interpolation Without Any Intermediate Frame for 4D Medical Images.
[[Paper](https://arxiv.org/abs/2404.01464)][[Code](https://github.com/jungeun122333/UVI-Net)]

## Image Classification (图像分类)

- Systematic comparison of semi-supervised and self-supervised learning for medical image classification. [[Paper](https://arxiv.org/abs/2307.08919v2)][[Code](https://github.com/tufts-ml/SSL-vs-SSL-benchmark)]
- Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images. [[Paper](https://arxiv.org/abs/2403.12570)][[Code](https://github.com/MediaBrain-SJTU/MVFA-AD)]

## Federated Learning (联邦学习)

- Think Twice Before Selection: Federated Evidential Active Learning for Medical Image Analysis with Domain Shifts. [[Paper](https://arxiv.org/abs/2312.02567)]

## Medical Pre-training & Foundation Model (预训练&基础模型)

- VoCo: A Simple-yet-Effective Volume Contrastive Learning Framework for 3D Medical Image Analysis. [[Paper](https://arxiv.org/abs/2402.17300)][[Code](https://github.com/Luffy03/VoCo)]
- MLIP: Enhancing Medical Visual Representation with Divergence Encoder and Knowledge-guided Contrastive Learning. [[Paper](https://arxiv.org/abs/2402.02045)]
- **[Highlight!]** **Continual Self-supervised Learning: Towards Universal Multi-modal Medical Data Representation Learning. [[Paper](https://arxiv.org/abs/2311.17597)][[Code](https://github.com/yeerwen/MedCoSS)]**
- Bootstrapping Chest CT Image Understanding by Distilling Knowledge from X-ray Expert Models. [[Paper](https://arxiv.org/abs/2404.04936v1)][Code]
- Unleashing the Potential of SAM for Medical Adaptation via Hierarchical Decoding. [[Paper](https://arxiv.org/abs/2403.18271)][[Code](https://github.com/Cccccczh404/H-SAM)]
- Low-Rank Knowledge Decomposition for Medical Foundation Models.
[[Paper](https://arxiv.org/abs/2404.17184)][[Code](https://github.com/MediaBrain-SJTU/LoRKD)]

## Vision-Language Model (视觉-语言)

- PairAug: What Can Augmented Image-Text Pairs Do for Radiology? [[Paper](https://arxiv.org/abs/2404.04960)][[Code](https://github.com/YtongXie/PairAug)]
- Decomposing Disease Descriptions for Enhanced Pathology Detection: A Multi-Aspect Vision-Language Matching Framework. [[Paper](https://arxiv.org/abs/2403.07636)][[Code](https://github.com/HieuPhan33/MAVL)]
- Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images. [[Paper](https://arxiv.org/abs/2403.12570)][[Code](https://github.com/MediaBrain-SJTU/MVFA-AD)]
- OmniMedVQA: A New Large-Scale Comprehensive Evaluation Benchmark for Medical LVLM. [[Paper](https://arxiv.org/abs/2402.09181)][Code]
- CARZero: Cross-Attention Alignment for Radiology Zero-Shot Classification. [[Paper](https://arxiv.org/abs/2402.17417)][Code]
- FairCLIP: Harnessing Fairness in Vision-Language Learning. [[Paper](https://arxiv.org/abs/2403.19949)][[Code](https://github.com/Harvard-Ophthalmology-AI-Lab/FairCLIP)][[WeChat Article](https://mp.weixin.qq.com/s/EEe4Z1OrKaKqr5xXr3vipg)]

## Computational Pathology (计算病理)

- Generalizable Whole Slide Image Classification with Fine-Grained Visual-Semantic Interaction. [[Paper](https://arxiv.org/abs/2402.19326)]
- Feature Re-Embedding: Towards Foundation Model-Level Performance in Computational Pathology. [[Paper](https://arxiv.org/abs/2402.17228)][[Code](https://github.com/DearCaat/RRT-MIL)]
- PrPSeg: Universal Proposition Learning for Panoramic Renal Pathology Segmentation. [[Paper](https://arxiv.org/abs/2402.19286)]
- ChAda-ViT: Channel Adaptive Attention for Joint Representation Learning of Heterogeneous Microscopy Images.
[[Paper](https://arxiv.org/abs/2311.15264)][[Code](https://github.com/nicoboou/chada_vit)]
- SI-MIL: Taming Deep MIL for Self-Interpretability in Gigapixel Histopathology. [[Paper](https://arxiv.org/abs/2312.15010)][Code]
- Transcriptomics-guided Slide Representation Learning in Computational Pathology. [[Paper](https://arxiv.org/abs/2405.11618)][[Code](https://github.com/mahmoodlab/TANGLE)]

## Others

- Seeing Unseen: Discover Novel Biomedical Concepts via Geometry-Constrained Probabilistic Modeling. [[Paper](https://arxiv.org/html/2403.01053v2)]
- FocusMAE: Gallbladder Cancer Detection from Ultrasound Videos with Focused Masked Autoencoders. [[Paper](https://arxiv.org/abs/2403.08848)][[Code](https://github.com/sbasu276/FocusMAE)]

# Acknowledgement

* Some CVPR 2025 papers are sourced from [https://github.com/cerishleon/cvpr25_medical_paper](https://github.com/cerishleon/cvpr25_medical_paper?tab=readme-ov-file)