└── README.md /README.md: -------------------------------------------------------------------------------- 1 | ## Transformers For Segmentation [![Awesome](https://awesome.re/badge-flat.svg)](https://awesome.re) 2 | 3 | This list is a compendium of works that use **Transformer-Based Segmentation** techniques for **Semantic and Instance Segmentation** of image or video datasets. 4 | 5 | ## Contribution 6 | Contributions to this repository are welcome. 7 | Please feel free to send [pull requests](https://github.com/Syeda-Farhat/awesome-Transformers-For-Segmentation/pulls). 8 | 9 | Each entry uses the following structure: 10 | - [Paper Name] (link) -**Conference Name and Year** -[github] (link) 11 | 12 | ## Table of Contents 13 | 14 | - [Papers](#papers) 15 | * [Survey Papers](#survey-Papers) 16 | * [2024](#2024) 17 | * [CVPR 2024](#CVPR-2024) 18 | * [IEEE 2024](#IEEE-2024) 19 | * [2023](#2023) 20 | * [ICCV 2023](#ICCV-2023) 21 | * [CVPR 2023](#CVPR-2023) 22 | * [WACV 2023](#WACV-2023) 23 | * [IEEE 2023](#IEEE-2023) 24 | * [MDPI 2023](#MDPI-2023) 25 | * [arXiv 2023](#arXiv-2023) 26 | * [2022](#2022) 27 | * [CVPR 2022](#CVPR-2022) 28 | * [WACV 2022](#WACV-2022) 29 | * [NIPs 2022](#NIPs-2022) 30 | * [IEEE 2022](#IEEE-2022) 31 | * [MDPI 2022](#MDPI-2022) 32 | * [arXiv 2022](#arXiv-2022) 33 | * [2021](#2021) 34 | * [CVPR 2021](#CVPR-2021) 35 | * [ICCV 2021](#ICCV-2021) 36 | * [NIPs 2021](#NIPs-2021) 37 | * [MICCAI 2021](#MICCIA-2021) 38 | * [MDPI 2021](#MDPI-2021) 39 | * [IEEE 2021](#IEEE-2021) 40 | * [arXiv 2021](#arXiv-2021) 41 | * [2020](#2020) 42 | * [CVPR 2020](#CVPR-2020) 43 | * [ECCV 2020](#ECCV-2020) 44 | * [MICCAI 2020](#MICCIA-2020) 45 | * [IEEE 2020](#IEEE-2020) 46 | * [arXiv 2020](#arXiv-2020) 47 | * [2019](#2019) 48 | * [IEEE 2019](#IEEE-2019) 49 | * [arXiv 2019](#arXiv-2019) 50 | * [Others](#Others) 51 | * [Acknowledgements](#Acknowledgements) 52 | * [Citation](#Citation) 53 | 54 | ## Papers 55 | ### Survey Papers 56 | * [A Survey of 
Transformers](https://arxiv.org/pdf/2106.04554.pdf) -**arXiv 2021**. 57 | * [Transformers in Vision: A Survey](https://arxiv.org/pdf/2101.01169.pdf) -**arXiv 2021**. 58 | * [Transformers in computational visual media: A survey](https://link.springer.com/article/10.1007/s41095-021-0247-3) -**SpringerLink 2022**. 59 | * [A Survey on Vision Transformer](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9716741) -**IEEE 2022**. 60 | * [Vision Transformers in Medical Computer Vision - A Contemplative Retrospection](https://arxiv.org/ftp/arxiv/papers/2203/2203.15269.pdf) -**arXiv 2022**. 61 | * [Recent Advances in Vision Transformer: A Survey and Outlook of Recent Work](https://arxiv.org/pdf/2203.01536.pdf) -**arXiv 2022**. 62 | * [3D Vision with Transformers: A Survey](https://arxiv.org/pdf/2208.04309.pdf) -**arXiv 2022**. 63 | * [A Survey on Graph Neural Networks and Graph Transformers in Computer Vision: A Task-Oriented Perspective](https://arxiv.org/pdf/2209.13232.pdf) -**arXiv 2022**. 64 | * [VISION TRANSFORMERS FOR ACTION RECOGNITION: A SURVEY](https://arxiv.org/pdf/2209.05700.pdf) -**arXiv 2022**. 65 | * [Vision transformers for dense prediction: A survey](https://www.sciencedirect.com/science/article/abs/pii/S0950705122007821) -**ELSEVIER 2022**. 66 | * [Semantic segmentation using Vision Transformers: A survey](https://www.sciencedirect.com/science/article/abs/pii/S0952197623008539) -**ELSEVIER 2023**. 67 | * [A Comprehensive Survey of Transformers for Computer Vision ](https://www.mdpi.com/2504-446X/7/5/287) -**MDPI 2023**. 68 | * [Transformers in Remote Sensing: A Survey](https://www.mdpi.com/2072-4292/15/7/1860) -**MDPI 2023**. 69 | * [A Survey of Visual Transformers](https://ieeexplore.ieee.org/abstract/document/10088164) -**IEEE 2023**. 70 | * [Transformer-Based Visual Segmentation: A Survey](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10613466) -**IEEE 2024**. 
71 | ### 2024 72 | #### CVPR 2024 #### 73 | * [SegFormer3D: An Efficient Transformer for 3D Medical Image Segmentation](https://openaccess.thecvf.com/content/CVPR2024W/DEF-AI-MIA/papers/Perera_SegFormer3D_An_Efficient_Transformer_for_3D_Medical_Image_Segmentation_CVPRW_2024_paper.pdf) -**CVPR 2024** -[github](https://github.com/OSUPCVLab/SegFormer3D) 76 | 77 | #### IEEE 2024 #### 78 | ### 2023 79 | #### ICCV 2023 #### 80 | * [Mask-Attention-Free Transformer for 3D Instance Segmentation](https://openaccess.thecvf.com/content/ICCV2023/papers/Lai_Mask-Attention-Free_Transformer_for_3D_Instance_Segmentation_ICCV_2023_paper.pdf) -**ICCV 2023** -[github](https://github.com/dvlab-research/Mask-Attention-Free-Transformer) 81 | * [Query Refinement Transformer for 3D Instance Segmentation](https://openaccess.thecvf.com/content/ICCV2023/papers/Lu_Query_Refinement_Transformer_for_3D_Instance_Segmentation_ICCV_2023_paper.pdf) -**ICCV 2023** -[github] 82 | * [2D-3D Interlaced Transformer for Point Cloud Segmentation with Scene-Level Supervision](https://openaccess.thecvf.com/content/ICCV2023/papers/Yang_2D-3D_Interlaced_Transformer_for_Point_Cloud_Segmentation_with_Scene-Level_Supervision_ICCV_2023_paper.pdf) -**ICCV 2023** -[github](https://github.com/jimmy15923/mit) 83 | * [CDAC: Cross-domain Attention Consistency in Transformer for Domain Adaptive Semantic 
Segmentation](https://openaccess.thecvf.com/content/ICCV2023/papers/Wang_CDAC_Cross-domain_Attention_Consistency_in_Transformer_for_Domain_Adaptive_Semantic_ICCV_2023_paper.pdf) -**ICCV 2023** -[github](https://github.com/wangkaihong/CDAC) 84 | * [A Good Student is Cooperative and Reliable: CNN-Transformer Collaborative Learning for Semantic Segmentation](https://openaccess.thecvf.com/content/ICCV2023/papers/Zhu_A_Good_Student_is_Cooperative_and_Reliable_CNN-Transformer_Collaborative_Learning_ICCV_2023_paper.pdf) -**ICCV 2023** -[github] 85 | * [Efficient 3D Semantic Segmentation with Superpoint Transformer](https://openaccess.thecvf.com/content/ICCV2023/papers/Robert_Efficient_3D_Semantic_Segmentation_with_Superpoint_Transformer_ICCV_2023_paper.pdf) -**ICCV 2023** -[github](https://github.com/drprojects/superpoint_transformer) 86 | * [Adaptive Template Transformer for Mitochondria Segmentation in Electron Microscopy Images](https://openaccess.thecvf.com/content/ICCV2023/papers/Pan_Adaptive_Template_Transformer_for_Mitochondria_Segmentation_in_Electron_Microscopy_Images_ICCV_2023_paper.pdf) -**ICCV 2023** -[github] 87 | * [CVSformer: Cross-View Synthesis Transformer for Semantic Scene Completion](https://openaccess.thecvf.com/content/ICCV2023/papers/Dong_CVSformer_Cross-View_Synthesis_Transformer_for_Semantic_Scene_Completion_ICCV_2023_paper.pdf) -**ICCV 2023** -[github] 88 | #### CVPR 2023 #### 89 | * [VoxFormer: Sparse Voxel Transformer for Camera-Based 3D Semantic Scene Completion](https://openaccess.thecvf.com/content/CVPR2023/papers/Li_VoxFormer_Sparse_Voxel_Transformer_for_Camera-Based_3D_Semantic_Scene_Completion_CVPR_2023_paper.pdf) -**CVPR 2023** -[github](https://github.com/NVlabs/VoxFormer) 90 | * [Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation](https://openaccess.thecvf.com/content/CVPR2023/papers/Li_Mask_DINO_Towards_a_Unified_Transformer-Based_Framework_for_Object_Detection_CVPR_2023_paper.pdf) -**CVPR 
2023** -[github](https://github.com/IDEA-Research/MaskDINO) 91 | * [Heat Diffusion based Multi-scale and Geometric Structure-aware Transformer for Mesh Segmentation](https://openaccess.thecvf.com/content/CVPR2023/papers/Wong_Heat_Diffusion_Based_Multi-Scale_and_Geometric_Structure-Aware_Transformer_for_Mesh_CVPR_2023_paper.pdf) -**CVPR 2023** -[github] 92 | * [CLIP is Also an Efficient Segmenter: A Text-Driven Approach for Weakly Supervised Semantic Segmentation](https://openaccess.thecvf.com/content/CVPR2023/papers/Lin_CLIP_Is_Also_an_Efficient_Segmenter_A_Text-Driven_Approach_for_CVPR_2023_paper.pdf) -**CVPR 2023** -[github](https://github.com/linyq2117/CLIP-ES) 93 | * [MED-VT: Multiscale Encoder-Decoder Video Transformer with Application to Object Segmentation](https://openaccess.thecvf.com/content/CVPR2023/papers/Karim_MED-VT_Multiscale_Encoder-Decoder_Video_Transformer_With_Application_To_Object_Segmentation_CVPR_2023_paper.pdf) -**CVPR 2023** -[github](https://github.com/rkyuca/medvt) 94 | * [Contrastive Grouping with Transformer for Referring Image Segmentation](https://openaccess.thecvf.com/content/CVPR2023/papers/Tang_Contrastive_Grouping_With_Transformer_for_Referring_Image_Segmentation_CVPR_2023_paper.pdf) -**CVPR 2023** -[github](https://github.com/Toneyaya/CGFormer) 95 | * [SemiCVT: Semi-Supervised Convolutional Vision Transformer for Semantic Segmentation](https://openaccess.thecvf.com/content/CVPR2023/papers/Huang_SemiCVT_Semi-Supervised_Convolutional_Vision_Transformer_for_Semantic_Segmentation_CVPR_2023_paper.pdf) -**CVPR 2023** -[github] 96 | * [OneFormer: One Transformer to Rule Universal Image Segmentation](https://openaccess.thecvf.com/content/CVPR2023/papers/Jain_OneFormer_One_Transformer_To_Rule_Universal_Image_Segmentation_CVPR_2023_paper.pdf) -**CVPR 2023** -[github](https://github.com/SHI-Labs/OneFormer) 97 | * [HGFormer: Hierarchical Grouping Transformer for Domain Generalized Semantic 
Segmentation](https://openaccess.thecvf.com/content/CVPR2023/papers/Ding_HGFormer_Hierarchical_Grouping_Transformer_for_Domain_Generalized_Semantic_Segmentation_CVPR_2023_paper.pdf) -**CVPR 2023** -[github](https://github.com/dingjiansw101/HGFormer) 98 | * [Incrementer: Transformer for Class-Incremental Semantic Segmentation with Knowledge Distillation Focusing on Old Class](https://openaccess.thecvf.com/content/CVPR2023/papers/Shang_Incrementer_Transformer_for_Class-Incremental_Semantic_Segmentation_With_Knowledge_Distillation_Focusing_CVPR_2023_paper.pdf) -**CVPR 2023** -[github] 99 | * [MP-Former: Mask-Piloted Transformer for Image Segmentation](https://openaccess.thecvf.com/content/CVPR2023/papers/Zhang_MP-Former_Mask-Piloted_Transformer_for_Image_Segmentation_CVPR_2023_paper.pdf) -**CVPR 2023** -[github](https://github.com/IDEA-Research/MP-Former) 100 | * [Transformer Scale Gate for Semantic Segmentation](https://openaccess.thecvf.com/content/CVPR2023/papers/Shi_Transformer_Scale_Gate_for_Semantic_Segmentation_CVPR_2023_paper.pdf) -**CVPR 2023** -[github] 101 | * [UniDAformer: Unified Domain Adaptive Panoptic Segmentation Transformer via Hierarchical Mask Calibration ](https://openaccess.thecvf.com/content/CVPR2023/papers/Zhang_UniDAformer_Unified_Domain_Adaptive_Panoptic_Segmentation_Transformer_via_Hierarchical_Mask_CVPR_2023_paper.pdf) -**CVPR 2023** -[github] 102 | #### WACV 2023 #### 103 | * [HiFormer: Hierarchical Multi-scale Representations Using Transformers for Medical Image Segmentation](https://openaccess.thecvf.com/content/WACV2023/papers/Heidari_HiFormer_Hierarchical_Multi-Scale_Representations_Using_Transformers_for_Medical_Image_Segmentation_WACV_2023_paper.pdf) -**WACV 2023** -[github](https://github.com/amirhossein-kz/HiFormer) 104 | * [SCTS: Instance Segmentation of Single Cells Using a Transformer-Based Semantic-Aware Model and Space-Filling 
Augmentation](https://openaccess.thecvf.com/content/WACV2023/papers/Zhou_SCTS_Instance_Segmentation_of_Single_Cells_Using_a_Transformer-Based_Semantic-Aware_WACV_2023_paper.pdf) -**WACV 2023** -[github] 105 | * [Full Contextual Attention for Multi-resolution Transformers in Semantic Segmentation](https://openaccess.thecvf.com/content/WACV2023/papers/Themyr_Full_Contextual_Attention_for_Multi-Resolution_Transformers_in_Semantic_Segmentation_WACV_2023_paper.pdf) -**WACV 2023** -[github](https://github.com/themyrl/glam) 106 | * [The Fully Convolutional Transformer for Medical Image Segmentation](https://openaccess.thecvf.com/content/WACV2023/papers/Tragakis_The_Fully_Convolutional_Transformer_for_Medical_Image_Segmentation_WACV_2023_paper.pdf) -**WACV 2023** -[github](https://github.com/Thanos-DB/FullyConvolutionalTransformer) 107 | * [Towards Few-Annotation Learning for Object Detection: Are Transformer-based Models More Efficient?](https://openaccess.thecvf.com/content/WACV2023/papers/Bouniot_Towards_Few-Annotation_Learning_for_Object_Detection_Are_Transformer-Based_Models_More_WACV_2023_paper.pdf) -**WACV 2023** -[github] 108 | * [BEVSegFormer: Bird’s Eye View Semantic Segmentation From Arbitrary Camera Rigs](https://openaccess.thecvf.com/content/WACV2023/papers/Peng_BEVSegFormer_Birds_Eye_View_Semantic_Segmentation_From_Arbitrary_Camera_Rigs_WACV_2023_paper.pdf) -**WACV 2023** -[github] 109 | * [Medical Image Segmentation via Cascaded Attention Decoding](https://openaccess.thecvf.com/content/WACV2023/papers/Rahman_Medical_Image_Segmentation_via_Cascaded_Attention_Decoding_WACV_2023_paper.pdf) -**WACV 2023** -[github] 110 | * [Unsupervised Multi-Object Segmentation Using Attention and Soft-Argmax](https://openaccess.thecvf.com/content/WACV2023/papers/Sauvalle_Unsupervised_Multi-Object_Segmentation_Using_Attention_and_Soft-Argmax_WACV_2023_paper.pdf) -**WACV 2023** -[github](https://github.com/BrunoSauvalle/AST) 111 | #### IEEE 2023 #### 112 | * [The Power of 
Fragmentation: A Hierarchical Transformer Model for Structural Segmentation in Symbolic Music Generation](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10089423) -**IEEE 2023** -[github] 113 | * [Local-Global Context Aware Transformer for Language-Guided Video Segmentation](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10083244) -**IEEE 2023** -[github](https://github.com/leonnnop/Locater) 114 | * [Medical Image Segmentation Based on Transformer and HarDNet Structures](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10042417) -**IEEE 2023** -[github] 115 | * [A Unified Transformer Framework for Group-based Segmentation: Co-Segmentation, Co-Saliency Detection and Video Salient Object Detection](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10093043) -**IEEE 2023** -[github](https://github.com/suyukun666/UFO) 116 | * [The Lighter The Better: Rethinking Transformers in Medical Image Segmentation Through Adaptive Pruning](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10050127) -**IEEE 2023** -[github] 117 | * [RNGDet++: Road Network Graph Detection by Transformer with Instance Segmentation and Multi-scale Features Enhancement](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10093124) -**IEEE 2023** -[github](https://github.com/TonyXuQAQ/RNGDetPlusPlus) 118 | * [RockFormer: A U-Shaped Transformer Network for Martian Rock Segmentation](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10012398) -**IEEE 2023** -[github] 119 | * [Unsupervised Visual Representation Learning Based on Segmentation of Geometric Pseudo-Shapes for Transformer-Based Medical Tasks](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10018448) -**IEEE 2023** -[github] 120 | * [CKD-TransBTS: Clinical Knowledge-Driven Hybrid Transformer with Modality-Correlated Cross-Attention for Brain Tumor Segmentation](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10056308) -**IEEE 2023** -[github] 121 | * [RSSFormer: Foreground 
Saliency Enhancement for Remote Sensing Land-Cover Segmentation](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10026298) -**IEEE 2023** -[github] 122 | * [Normal-Knowledge-Based Pavement Defect Segmentation Using Relevance-Aware and Cross-Reasoning Mechanisms](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10013944) -**IEEE 2023** -[github] 123 | #### MDPI 2023 #### 124 | * [High-Resolution Swin Transformer for Automatic Medical Image Segmentation](https://www.mdpi.com/1424-8220/23/7/3420) -**MDPI 2023** -[github] 125 | * [Multi-Swin Mask Transformer for Instance Segmentation of Agricultural Field Extraction](https://www.mdpi.com/2072-4292/15/3/549) -**MDPI 2023** -[github] 126 | * [Enhancing Mask Transformer with Auxiliary Convolution Layers for Semantic Segmentation](https://www.mdpi.com/1424-8220/23/2/581) -**MDPI 2023** -[github] 127 | * [Efficient Lung Cancer Image Classification and Segmentation Algorithm Based on an Improved Swin Transformer](https://www.mdpi.com/2079-9292/12/4/1024) -**MDPI 2023** -[github] 128 | * [Transformer-Based Weed Segmentation for Grass Management](https://www.mdpi.com/1424-8220/23/1/65) -**MDPI 2023** -[github] 129 | * [RCCT-ASPPNet: Dual-Encoder Remote Image Segmentation Based on Transformer and ASPP](https://www.mdpi.com/2072-4292/15/2/379) -**MDPI 2023** -[github] 130 | * [MCANet: A Multi-Branch Network for Cloud/Snow Segmentation in High-Resolution Remote Sensing Images](https://www.mdpi.com/2072-4292/15/4/1055) -**MDPI 2023** -[github] 131 | * [Muscle Cross-Sectional Area Segmentation in Transverse Ultrasound Images Using Vision Transformers](https://www.mdpi.com/2075-4418/13/2/217) -**MDPI 2023** -[github] 132 | * [MCAFNet: A Multiscale Channel Attention Fusion Network for Semantic Segmentation of Remote Sensing Images](https://www.mdpi.com/2072-4292/15/2/361) -**MDPI 2023** -[github] 133 | #### arXiv 2023 #### 134 | * [Temporal Segment Transformer for Action 
Segmentation](https://arxiv.org/pdf/2302.13074.pdf) -**arXiv 2023** -[github] 135 | * [SeaFormer: Squeeze-enhanced Axial Transformer for Mobile Semantic Segmentation](https://arxiv.org/pdf/2301.13156.pdf) -**arXiv 2023** -[github] 136 | * [MP-Former: Mask-Piloted Transformer for Image Segmentation](https://arxiv.org/pdf/2303.07336.pdf) -**arXiv 2023** -[github] 137 | * [MedSegDiff-V2: Diffusion based Medical Image Segmentation with Transformer](https://arxiv.org/pdf/2301.11798.pdf) -**arXiv 2023** -[github] 138 | * [SwinVFTR: A Novel Volumetric Feature-learning Transformer for 3D OCT Fluid Segmentation](https://arxiv.org/pdf/2303.09233.pdf) -**arXiv 2023** -[github] 139 | * [Towards Robust Video Instance Segmentation with Temporal-Aware Transformer](https://arxiv.org/pdf/2301.09416.pdf) -**arXiv 2023** -[github] 140 | * [Head-Free Lightweight Semantic Segmentation with Linear Transformer](https://arxiv.org/pdf/2301.04648.pdf) -**arXiv 2023** -[github] 141 | * [FullStop: Punctuation and Segmentation Prediction for Dutch with Transformers](https://arxiv.org/pdf/2301.03319.pdf) -**arXiv 2023** -[github] 142 | * [Cooperation Learning Enhanced Colonic Polyp Segmentation Based on Transformer-CNN Fusion](https://arxiv.org/ftp/arxiv/papers/2301/2301.06892.pdf) -**arXiv 2023** -[github] 143 | * [SAT: Size-Aware Transformer for 3D Point Cloud Semantic Segmentation](https://arxiv.org/pdf/2301.06869.pdf) -**arXiv 2023** -[github] 144 | * [Effects of Architectures on Continual Semantic Segmentation](https://arxiv.org/pdf/2302.10718.pdf) -**arXiv 2023** -[github] 145 | * [MECPformer: Multi-estimations Complementary Patch with CNN-Transformers for Weakly Supervised Semantic Segmentation](https://arxiv.org/pdf/2303.10689.pdf) -**arXiv 2023** -[github](https://github.com/ChunmengLiu1/MECPformer) 146 | * [PSST! 
Prosodic Speech Segmentation with Transformers](https://arxiv.org/pdf/2302.01984.pdf) -**arXiv 2023** -[github] 147 | * [TRANSADAPT: A TRANSFORMATIVE FRAMEWORK FOR ONLINE TEST TIME ADAPTIVE SEMANTIC SEGMENTATION](https://arxiv.org/pdf/2302.14611.pdf) -**arXiv 2023** -[github] 148 | ### 2022 149 | #### CVPR 2022 #### 150 | * [Multi-class Token Transformer for Weakly Supervised Semantic Segmentation](https://openaccess.thecvf.com/content/CVPR2022/papers/Xu_Multi-Class_Token_Transformer_for_Weakly_Supervised_Semantic_Segmentation_CVPR_2022_paper.pdf) -**CVPR 2022** -[github] 151 | * [TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation](https://openaccess.thecvf.com/content/CVPR2022/papers/Zhang_TopFormer_Token_Pyramid_Transformer_for_Mobile_Semantic_Segmentation_CVPR_2022_paper.pdf) -**CVPR 2022** -[github](https://github.com/hustvl/TopFormer) 152 | * [Masked-attention Mask Transformer for Universal Image Segmentation](https://openaccess.thecvf.com/content/CVPR2022/papers/Cheng_Masked-Attention_Mask_Transformer_for_Universal_Image_Segmentation_CVPR_2022_paper.pdf) -**CVPR 2022** -[github](https://github.com/facebookresearch/Mask2Former) 153 | * [Temporally Efficient Vision Transformer for Video Instance Segmentation](https://openaccess.thecvf.com/content/CVPR2022/papers/Yang_Temporally_Efficient_Vision_Transformer_for_Video_Instance_Segmentation_CVPR_2022_paper.pdf) -**CVPR 2022** -[github](https://github.com/hustvl/TeViT) 154 | * [An MIL-Derived Transformer for Weakly Supervised Point Cloud Segmentation](https://openaccess.thecvf.com/content/CVPR2022/papers/Yang_An_MIL-Derived_Transformer_for_Weakly_Supervised_Point_Cloud_Segmentation_CVPR_2022_paper.pdf) -**CVPR 2022** -[github] 155 | * [Multi-Scale High-Resolution Vision Transformer for Semantic Segmentation](https://openaccess.thecvf.com/content/CVPR2022/papers/Gu_Multi-Scale_High-Resolution_Vision_Transformer_for_Semantic_Segmentation_CVPR_2022_paper.pdf) -**CVPR 2022** 
-[github](https://github.com/facebookresearch/HRViT) 156 | * [MPViT: Multi-Path Vision Transformer for Dense Prediction](https://openaccess.thecvf.com/content/CVPR2022/papers/Lee_MPViT_Multi-Path_Vision_Transformer_for_Dense_Prediction_CVPR_2022_paper.pdf) -**CVPR 2022** -[github] 157 | 158 | #### WACV 2022 #### 159 | * [UNETR: Transformers for 3D Medical Image Segmentation](https://openaccess.thecvf.com/content/WACV2022/papers/Hatamizadeh_UNETR_Transformers_for_3D_Medical_Image_Segmentation_WACV_2022_paper.pdf) -**WACV 2022** -[github](https://github.com/tamasino52/UNETR) 160 | * [AFTer-UNet: Axial Fusion Transformer UNet for Medical Image Segmentation](https://openaccess.thecvf.com/content/WACV2022/papers/Yan_AFTer-UNet_Axial_Fusion_Transformer_UNet_for_Medical_Image_Segmentation_WACV_2022_paper.pdf) -**WACV 2022** -[github] 161 | * [Spatial-Temporal Transformer for 3D Point Cloud Sequences](https://openaccess.thecvf.com/content/WACV2022/papers/Wei_Spatial-Temporal_Transformer_for_3D_Point_Cloud_Sequences_WACV_2022_paper.pdf) -**WACV 2022** -[github] 162 | 163 | #### NIPs 2022 #### 164 | * [SegViT: Semantic Segmentation with Plain Vision Transformers](https://proceedings.neurips.cc/paper_files/paper/2022/file/20189b1aaa8edbb6d8bd6c1067ab5f3f-Paper-Conference.pdf) -**NIPs 2022** -[github](https://github.com/zbwxp/SegVit) 165 | * [Intermediate Prototype Mining Transformer for Few-Shot Semantic Segmentation](https://proceedings.neurips.cc/paper_files/paper/2022/hash/f7fef21d1fb3e950b12b50ad7f395e31-Abstract-Conference.html) -**NIPs 2022** -[github] 166 | * [RTFormer: Efficient Design for Real-Time Semantic Segmentation with Transformer](https://proceedings.neurips.cc/paper_files/paper/2022/file/30e10e671c5e43edb67eb257abb6c3ea-Paper-Conference.pdf) -**NIPs 2022** -[github] 167 | 168 | #### IEEE 2022 #### 169 | * [Swin Transformer Embedding UNet for Remote Sensing Image Semantic Segmentation](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9686686) -**IEEE 
2022** -[github] 170 | * [Transformer and CNN Hybrid Deep Neural Network for Semantic Segmentation of Very-high-resolution Remote Sensing Imagery](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9686732) -**IEEE 2022** -[github] 171 | * [A Novel Transformer Based Semantic Segmentation Scheme for Fine-Resolution Remote Sensing Images](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9681903) -**IEEE 2022** -[github] 172 | * [LFT-Net: Local Feature Transformer Network for Point Clouds Analysis](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9700748) -**IEEE 2022** -[github] 173 | * [Transformer-based Efficient Salient Instance Segmentation Networks with Orientative Query](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9678049) -**IEEE 2022** -[github](https://github.com/ssecv/OQTR) 174 | * [Bird's-Eye-View Panoptic Segmentation Using Monocular Frontal View Images](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9681287) -**IEEE 2022** -[github] 175 | * [Looking Outside the Window: Wide-Context Transformer for the Semantic Segmentation of High-Resolution Remote Sensing Images](https://ieeexplore.ieee.org/document/9759447) -**IEEE 2022** -[github] 176 | #### MDPI 2022 #### 177 | * [Enhanced Feature Pyramid Vision Transformer for Semantic Segmentation on Thailand Landsat-8 Corpus](https://www.mdpi.com/2078-2489/13/5/259) -**MDPI 2022** -[github] 178 | #### arXiv 2022 #### 179 | * [Pyramid Fusion Transformer for Semantic Segmentation](https://arxiv.org/pdf/2201.04019.pdf) -**arXiv 2022** -[github] 180 | * [TransBTSV2: Wider Instead of Deeper Transformer for Medical Image Segmentation](https://arxiv.org/pdf/2201.12785.pdf) -**arXiv 2022** -[github](https://github.com/Wenxuan-1119/TransBTS) 181 | * [Swin UNETR: Swin Transformers for Semantic Segmentation of Brain Tumors in MRI Images](https://link.springer.com/chapter/10.1007/978-3-031-08999-2_22) -**arXiv 2022** -[github] 182 | * [Task-Adaptive Feature Transformer with Semantic Enrichment for 
Few-Shot Segmentation](https://arxiv.org/pdf/2202.06498.pdf) -**arXiv 2022** -[github](https://github.com/istarjun/TAFT-SE) 183 | * [Inverted Pyramid Multi-task Transformer for Dense Scene Understanding](https://arxiv.org/pdf/2203.07997.pdf) -**arXiv 2022** -[github](https://github.com/prismformore/InvPT) 184 | ### 2021 185 | #### CVPR 2021 #### 186 | * [MaX-DeepLab: End-to-End Panoptic Segmentation With Mask Transformers](https://openaccess.thecvf.com/content/CVPR2021/papers/Wang_MaX-DeepLab_End-to-End_Panoptic_Segmentation_With_Mask_Transformers_CVPR_2021_paper.pdf) -**CVPR 2021** -[github](https://github.com/mattdeitke/cvpr-buzz/blob/992c23b72584342f8621d3d272dc60077766b002/paper-data/Wang_MaX-DeepLab_End-to-End_Panoptic_Segmentation_With_Mask_Transformers.json) 187 | * [End-to-End Video Instance Segmentation With Transformers](https://openaccess.thecvf.com/content/CVPR2021/papers/Wang_End-to-End_Video_Instance_Segmentation_With_Transformers_CVPR_2021_paper.pdf) -**CVPR 2021** -[github](https://github.com/Epiphqny/VisTR) 188 | * [Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective 189 | with Transformers](https://openaccess.thecvf.com/content/CVPR2021/papers/Zheng_Rethinking_Semantic_Segmentation_From_a_Sequence-to-Sequence_Perspective_With_Transformers_CVPR_2021_paper.pdf) -**CVPR 2021** -[github](https://github.com/fudan-zvg/SETR) 190 | * [Sstvos: Sparse spatiotemporal transformers for video object segmentation](https://openaccess.thecvf.com/content/CVPR2021/papers/Duke_SSTVOS_Sparse_Spatiotemporal_Transformers_for_Video_Object_Segmentation_CVPR_2021_paper.pdf) -**CVPR 2021** -[github](https://github.com/dukebw/SSTVOS) 191 | * [Locate then Segment: A Strong Pipeline for Referring Image Segmentation](https://openaccess.thecvf.com/content/CVPR2021/papers/Jing_Locate_Then_Segment_A_Strong_Pipeline_for_Referring_Image_Segmentation_CVPR_2021_paper.pdf) -**CVPR 2021** -[github] 192 | #### ICCV 2021 #### 193 | * [Pyramid Vision Transformer: A 
Versatile Backbone for Dense Prediction without Convolutions](https://openaccess.thecvf.com/content/ICCV2021/papers/Wang_Pyramid_Vision_Transformer_A_Versatile_Backbone_for_Dense_Prediction_Without_ICCV_2021_paper.pdf) -**ICCV 2021** -[github](https://github.com/wangermeng2021/PVT-tensorflow2) 194 | * [Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions — Supplemental Materials](https://openaccess.thecvf.com/content/ICCV2021/supplemental/Wang_Pyramid_Vision_Transformer_ICCV_2021_supplemental.pdf) -**ICCV 2021** -[github] 195 | * [Joint Inductive and Transductive Learning for Video Object Segmentation](https://openaccess.thecvf.com/content/ICCV2021/papers/Mao_Joint_Inductive_and_Transductive_Learning_for_Video_Object_Segmentation_ICCV_2021_paper.pdf) -**ICCV 2021** -[github](https://github.com/maoyunyao/JOINT) 196 | * [Swin Transformer: Hierarchical Vision Transformer using Shifted Windows](https://openaccess.thecvf.com/content/ICCV2021/papers/Liu_Swin_Transformer_Hierarchical_Vision_Transformer_Using_Shifted_Windows_ICCV_2021_paper.pdf) -**ICCV 2021** -[github](https://github.com/microsoft/Swin-Transformer) 197 | * [Self-supervised Video Object Segmentation by Motion Grouping](https://openaccess.thecvf.com/content/ICCV2021/papers/Yang_Self-Supervised_Video_Object_Segmentation_by_Motion_Grouping_ICCV_2021_paper.pdf) -**ICCV 2021** -[github](https://github.com/charigyang/motiongrouping) 198 | * [Vision Transformers for Dense Prediction](https://openaccess.thecvf.com/content/ICCV2021/papers/Ranftl_Vision_Transformers_for_Dense_Prediction_ICCV_2021_paper.pdf) -**ICCV 2021** -[github](https://github.com/czczup/ViT-Adapter) 199 | * [Point Transformer](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhao_Point_Transformer_ICCV_2021_paper.pdf) -**ICCV 2021** -[github](https://github.com/qq456cvb/Point-Transformers) 200 | * [SOTR: Segmenting Objects with 
Transformers](https://openaccess.thecvf.com/content/ICCV2021/papers/Guo_SOTR_Segmenting_Objects_With_Transformers_ICCV_2021_paper.pdf) -**ICCV 2021** -[github](https://github.com/easton-cau/SOTR) 201 | * [A Unified Efficient Pyramid Transformer for Semantic Segmentation](https://openaccess.thecvf.com/content/ICCV2021W/VSPW/papers/Zhu_A_Unified_Efficient_Pyramid_Transformer_for_Semantic_Segmentation_ICCVW_2021_paper.pdf) -**ICCV 2021** -[github](https://github.com/amazon-research/unified-ept) 202 | * [Multi-Scale Vision Longformer: A New Vision Transformer for High-Resolution Image Encoding](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhang_Multi-Scale_Vision_Longformer_A_New_Vision_Transformer_for_High-Resolution_Image_ICCV_2021_paper.pdf) -**ICCV 2021** -[github](https://github.com/microsoft/vision-longformer) 203 | * [Simpler is Better: Few-shot Semantic Segmentation with Classifier Weight Transformer](https://openaccess.thecvf.com/content/ICCV2021/papers/Lu_Simpler_Is_Better_Few-Shot_Semantic_Segmentation_With_Classifier_Weight_Transformer_ICCV_2021_paper.pdf) -**ICCV 2021** -[github](https://github.com/zhiheLu/CWT-for-FSS) 204 | * [Trans4Trans: Efficient Transformer for Transparent Object Segmentation to Help Visually Impaired People Navigate in the Real World](https://openaccess.thecvf.com/content/ICCV2021W/ACVR/papers/Zhang_Trans4Trans_Efficient_Transformer_for_Transparent_Object_Segmentation_To_Help_Visually_ICCVW_2021_paper.pdf) -**ICCV 2021** -[github] 205 | * [Vision-Language Transformer and Query Generation for Referring Segmentation](https://openaccess.thecvf.com/content/ICCV2021/papers/Ding_Vision-Language_Transformer_and_Query_Generation_for_Referring_Segmentation_ICCV_2021_paper.pdf) -**ICCV 2021** -[github](https://github.com/henghuiding/Vision-Language-Transformer) 206 | * [Segmenter: Transformer for Semantic 
Segmentation](https://openaccess.thecvf.com/content/ICCV2021/papers/Strudel_Segmenter_Transformer_for_Semantic_Segmentation_ICCV_2021_paper.pdf) -**ICCV 2021** -[github](https://github.com/rstrudel/segmenter) 207 | #### NIPs 2021 #### 208 | * [Twins: Revisiting the Design of Spatial Attention in Vision Transformers](https://proceedings.neurips.cc/paper/2021/file/4e0928de075538c593fbdabb0c5ef2c3-Paper.pdf) -**NIPs 2021** -[github](https://github.com/EarthNets/RSI-Classification/blob/1a858a80881757fc2114305f15c1ae26be2c2169/configs/twins/README.md) 209 | * [HRFormer: High-Resolution Transformer for Dense Prediction](https://proceedings.neurips.cc/paper/2021/file/3bbfdde8842a5c44a0323518eec97cbe-Paper.pdf) -**NIPs 2021** -[github](https://github.com/HRNet/HRFormer) 210 | * [SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers](https://proceedings.neurips.cc/paper/2021/file/64f1f27bf1b4ec22924fd0acb550c235-Paper.pdf) -**NIPs 2021** -[github](https://github.com/NVlabs/SegFormer) 211 | * [Per-Pixel Classification is Not All You Need for Semantic Segmentation](https://proceedings.neurips.cc/paper/2021/file/950a4152c2b4aa3ad78bdd6b366cc179-Paper.pdf) -**NIPs 2021** -[github](https://github.com/facebookresearch/MaskFormer) 212 | * [Associating Objects with Transformers for Video Object Segmentation](https://proceedings.neurips.cc/paper/2021/file/147702db07145348245dc5a2f2fe5683-Paper.pdf) -**NIPs 2021** -[github](https://github.com/yoxu515/aot-benchmark) 213 | * [Video Instance Segmentation using Inter-Frame Communication Transformers](https://proceedings.neurips.cc/paper/2021/file/6f2688a5fce7d48c8d19762b88c32c3b-Paper.pdf) -**NIPs 2021** -[github](https://github.com/sukjunhwang/IFC) 214 | * [Few-Shot Segmentation via Cycle-Consistent Transformer](https://proceedings.neurips.cc/paper/2021/file/b8b12f949378552c21f28deff8ba8eb6-Paper.pdf) -**NIPs 2021** -[github](https://github.com/GengDavid/CyCTR) 215 | 216 | #### MICCIA 2021 #### 217 | * 
[Medical Transformer: Gated Axial-Attention for Medical Image Segmentation](https://arxiv.org/pdf/2102.10662.pdf) -**MICCIA 2021** -[github](https://github.com/jeya-maria-jose/Medical-Transformer) 218 | * [UTNet: A Hybrid Transformer Architecture for Medical Image Segmentation](https://arxiv.org/pdf/2107.00781.pdf) -**MICCIA 2021** -[github](https://github.com/yhygao/UTNet) 219 | * [Transbts: Multimodal brain tumor segmentation using transformer](https://arxiv.org/pdf/2103.04430.pdf) -**MICCIA 2021** -[github](https://github.com/Wenxuan-1119/TransBTS) 220 | * [Multi-compound transformer for accurate biomedical image segmentation](https://arxiv.org/pdf/2106.14385.pdf) -**MICCIA 2021** -[github] 221 | * [A multi-branch hybrid transformer network for corneal endothelial cell segmentation](https://arxiv.org/pdf/2106.07557.pdf) -**MICCIA 2021** -[github] 222 | * [DC-Net: Dual Context Network for 2D Medical Image Segmentation](https://link.springer.com/chapter/10.1007/978-3-030-87193-2_48) -**MICCIA 2021** -[github] 223 | * [Transfuse: Fusing transformers and cnns for medical image segmentation](https://arxiv.org/pdf/2102.08005.pdf) -**MICCIA 2021** -[github](https://github.com/Rayicer/TransFuse) 224 | * [Teds-net: Enforcing diffeomorphisms in spatial transformers to guarantee topology preservation in segmentations](https://arxiv.org/pdf/2107.13542.pdf) -**MICCIA 2021** -[github] 225 | * [Cotr: Efficiently bridging cnn and transformer for 3d medical image segmentation](https://arxiv.org/pdf/2103.03024.pdf) -**MICCIA 2021** -[github](https://github.com/YtongXie/CoTr) 226 | * [Boundary-aware transformers for skin lesion segmentation](https://arxiv.org/pdf/2110.03864.pdf) -**MICCIA 2021** -[github](https://github.com/jcwang123/BA-Transformer) 227 | * [Convolution-Free Medical Image Segmentation using Transformers](https://arxiv.org/pdf/2102.13645.pdf) -**MICCIA 2021** -[github] 228 | #### MDPI 2021 #### 229 | * [Transformer Meets Convolution: A Bilateral Awareness Network 
for Semantic Segmentation of Very Fine Resolution Urban Scene Images](https://www.mdpi.com/2072-4292/13/16/3065) -**MDPI 2021** -[github]
* [Wildfire Segmentation Using Deep Vision Transformers](https://www.mdpi.com/2072-4292/13/17/3527) -**MDPI 2021** -[github]
* [Transformer-Based Decoder Designs for Semantic Segmentation on Remotely Sensed Images](https://www.mdpi.com/2072-4292/13/24/5100) -**MDPI 2021** -[github](https://github.com/kaopanboonyuen/transformer-based-decoder-designs)
* [Efficient Transformer for Remote Sensing Image Segmentation](https://www.mdpi.com/2072-4292/13/18/3585/htm) -**MDPI 2021** -[github](https://github.com/Syeda-Farhat/Efficient-Transformer)
#### IEEE 2021 ####
* [Segmentation applying TAG type label data and Transformer](https://ieeexplore.ieee.org/document/9650042) -**IEEE 2021** -[github]
* [Local Memory Attention for Fast Video Semantic Segmentation](https://ieeexplore.ieee.org/document/9636192) -**IEEE 2021** -[github]
* [A Transformer-Based Feature Segmentation and Region Alignment Method For UAV-View Geo-Localization](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9648201) -**IEEE 2021** -[github]
* [STransFuse: Fusing Swin Transformer and Convolutional Neural Network for Remote Sensing Image Semantic Segmentation](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9573374) -**IEEE 2021** -[github]
* [Swin-Spectral Transformer for Cholangiocarcinoma Hyperspectral Image Segmentation](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9624405) -**IEEE 2021** -[github]
* [ECT-NAS: Searching Efficient CNN-Transformers Architecture for Medical Image Segmentation](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9669734) -**IEEE 2021** -[github]
* [3D Deep Attentive U-Net with Transformer for Breast Tumor Segmentation from Automated Breast Volume Scanner](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9629523) -**IEEE 2021**
-[github]
* [Visual-Semantic Transformer for Face Forgery Detection](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9484407) -**IEEE 2021** -[github]
* [MaAST: Map Attention with Semantic Transformers for Efficient Visual Navigation](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9561058) -**IEEE 2021** -[github]
* [Multi-scale Hierarchical Transformer structure for 3D medical image segmentation](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9669799) -**IEEE 2021** -[github]
* [A Temporary Transformer Network for Guide-Wire Segmentation](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9624350) -**IEEE 2021** -[github]
* [A Transformer-Based Network for Anisotropic 3D Medical Image Segmentation](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9411990) -**IEEE 2021** -[github]
#### arXiv 2021 ####
* [OffRoadTranSeg: Semi-Supervised Segmentation using Transformers on OffRoad environments](https://arxiv.org/pdf/2106.13963.pdf) -**arXiv 2021** -[github]
* [Multi-Scale High-Resolution Vision Transformer for Semantic Segmentation](https://arxiv.org/pdf/2111.01236.pdf) -**arXiv 2021** -[github](https://github.com/facebookresearch/HRViT)
* [Self-Supervised Learning with Swin Transformers](https://arxiv.org/pdf/2105.04553.pdf) -**arXiv 2021** -[github]
* [GT U-Net: A U-Net Like Group Transformer Network for Tooth Root Segmentation](https://link.springer.com/chapter/10.1007/978-3-030-87589-3_40) -**arXiv 2021** -[github]
* [SpecTr: Spectral Transformer for Hyperspectral Pathology Image Segmentation](https://arxiv.org/pdf/2103.03604.pdf) -**arXiv 2021** -[github]
* [Satellite Image Semantic Segmentation](https://arxiv.org/pdf/2110.05812.pdf) -**arXiv 2021** -[github](https://github.com/YudeWang/UNet-Satellite-Image-Segmentation)
* [Boosting Few-shot Semantic Segmentation with Transformers](https://arxiv.org/pdf/2108.02266.pdf) -**arXiv 2021**
-[github]
* [A Robust Volumetric Transformer for Accurate 3D Tumor Segmentation](https://arxiv.org/pdf/2111.13300.pdf) -**arXiv 2021** -[github](https://github.com/himashi92/VT-UNet)
* [Dynamic Convolution for 3D Point Cloud Instance Segmentation](https://arxiv.org/pdf/2107.08392.pdf) -**arXiv 2021** -[github]
* [Fast Point Transformer](https://openaccess.thecvf.com/content/CVPR2022/papers/Park_Fast_Point_Transformer_CVPR_2022_paper.pdf) -**arXiv 2021** -[github](https://github.com/POSTECH-CVLab/FastPointTransformer)
* [ViTBIS: Vision Transformer for Biomedical Image Segmentation](https://link.springer.com/chapter/10.1007/978-3-030-90874-4_4) -**arXiv 2021** -[github]
* [Fully Transformer Networks for Semantic Image Segmentation](https://arxiv.org/pdf/2106.04108.pdf) -**arXiv 2021** -[github]
* [UNetFormer: A UNet-like Transformer for Efficient Semantic Segmentation of Remote Sensing Urban Scene Imagery](https://arxiv.org/ftp/arxiv/papers/2109/2109.08937.pdf) -**arXiv 2021** -[github]
* [Unsupervised Brain Anomaly Detection and Segmentation with Transformers](https://arxiv.org/pdf/2102.11650.pdf) -**arXiv 2021** -[github]
* [Few-Shot Temporal Action Localization with Query Adaptive Transformer](https://arxiv.org/pdf/2110.10552.pdf) -**arXiv 2021** -[github](https://github.com/sauradip/fewshotQA)
* [Cost Aggregation Is All You Need for Few-Shot Segmentation](https://arxiv.org/pdf/2112.11685.pdf) -**arXiv 2021** -[github]
* [Polyp-PVT: Polyp Segmentation with Pyramid Vision Transformers](https://arxiv.org/pdf/2108.06932.pdf) -**arXiv 2021** -[github](https://github.com/DengPingFan/Polyp-PVT)
* [TransAttUnet: Multi-level Attention-guided U-Net with Transformer for Medical Image
Segmentation](https://arxiv.org/pdf/2107.05274.pdf) -**arXiv 2021** -[github]
* [ASFormer: Transformer for Action Segmentation](https://arxiv.org/pdf/2110.08568.pdf) -**arXiv 2021** -[github](https://github.com/ChinaYi/ASFormer)
* [TransClaw U-Net: Claw U-Net with Transformers for Medical Image Segmentation](https://arxiv.org/pdf/2107.05188.pdf) -**arXiv 2021** -[github]
* [SeqFormer: Sequential Transformer for Video Instance Segmentation](https://arxiv.org/pdf/2112.08275.pdf) -**arXiv 2021** -[github](https://github.com/wjf5203/SeqFormer)
* [Mask2Former for Video Instance Segmentation](https://arxiv.org/pdf/2112.10764.pdf) -**arXiv 2021** -[github](https://github.com/facebookresearch/Mask2Former)
* [Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation](https://arxiv.org/pdf/2105.05537.pdf) -**arXiv 2021** -[github](https://github.com/HuCaoFighting/Swin-Unet)
* [LeViT-UNet: Make Faster Encoders with Transformer for Medical Image Segmentation](https://arxiv.org/pdf/2107.08623.pdf) -**arXiv 2021** -[github]
* [ISTR: End-to-End Instance Segmentation with Transformers](https://arxiv.org/pdf/2105.00637.pdf) -**arXiv 2021** -[github](https://github.com/hujiecpp/ISTR)
* [P2T: Pyramid Pooling Transformer for Scene Understanding](https://arxiv.org/pdf/2106.12011.pdf) -**arXiv 2021** -[github]
* [Medical Transformer: Universal Brain Encoder for 3D MRI Analysis](https://arxiv.org/pdf/2104.13633.pdf) -**arXiv 2021** -[github]
* [nnFormer: Interleaved Transformer for Volumetric Segmentation](https://arxiv.org/pdf/2109.03201.pdf) -**arXiv 2021** -[github]
* [MISSFormer: An Effective Medical Image Segmentation Transformer](https://arxiv.org/pdf/2109.07162.pdf) -**arXiv 2021** -[github]
* [ViT-V-Net: Vision Transformer for Unsupervised Volumetric Medical Image Registration](https://arxiv.org/pdf/2104.06468.pdf) -**arXiv 2021** -[github]
* [Pyramid Medical Transformer for Medical Image
Segmentation](https://arxiv.org/ftp/arxiv/papers/2104/2104.14702.pdf) -**arXiv 2021** -[github]
* [U-Net Transformer: Self and Cross Attention for Medical Image Segmentation](https://link.springer.com/chapter/10.1007/978-3-030-87589-3_28) -**arXiv 2021** -[github]
* [DS-TransUNet: Dual swin transformer u-net for medical image segmentation](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9785614) -**arXiv 2021** -[github]
* [TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation](https://arxiv.org/pdf/2102.04306.pdf) -**arXiv 2021** -[github](https://github.com/Beckschen/TransUNet)
* [TransVOS: Video Object Segmentation with Transformers](https://arxiv.org/pdf/2106.00588.pdf) -**arXiv 2021** -[github]
### 2020
#### CVPR 2020 ####
* [PolyTransform: Deep polygon transformer for instance segmentation](https://openaccess.thecvf.com/content_CVPR_2020/papers/Liang_PolyTransform_Deep_Polygon_Transformer_for_Instance_Segmentation_CVPR_2020_paper.pdf) -**CVPR 2020** -[github]
* [SCT: Set constrained temporal transformer for set supervised action segmentation](https://openaccess.thecvf.com/content_CVPR_2020/papers/Fayyaz_SCT_Set_Constrained_Temporal_Transformer_for_Set_Supervised_Action_Segmentation_CVPR_2020_paper.pdf) -**CVPR 2020** -[github](https://github.com/MohsenFayyaz89/SCT)
#### ECCV 2020 ####
* [Feature pyramid transformer](https://arxiv.org/pdf/2007.09451.pdf) -**ECCV 2020** -[github](https://github.com/dongzhang89/FPT)
* [End-to-end object detection with transformers](https://arxiv.org/pdf/2005.12872.pdf) -**ECCV 2020** -[github](https://github.com/facebookresearch/detr)
#### MICCIA 2020 ####
* [Multi-task Dynamic Transformer Network for Concurrent Bone Segmentation and Large-Scale Landmark Localization with Dental CBCT](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8687703/) -**MICCIA 2020** -[github]
#### IEEE 2020 ####
* [Attention-Based
Transformers for Instance Segmentation of Cells in Microstructures](https://ieeexplore.ieee.org/document/9313305) -**IEEE 2020** -[github](https://github.com/ChristophReich1996/Cell-DETR)
* [Detecting lane and road markings at a distance with perspective transformer layers](https://arxiv.org/pdf/2003.08550.pdf) -**IEEE 2020** -[github]
* [Efficient aortic valve multilabel segmentation using a spatial transformer network](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9098378) -**IEEE 2020** -[github]
#### arXiv 2020 ####
* [Visual transformers: Token-based image representation and processing for computer vision](https://arxiv.org/pdf/2006.03677.pdf) -**arXiv 2020** -[github](https://github.com/tahmid0007/VisualTransformers)
* [Task-adaptive feature transformer for few-shot segmentation](https://arxiv.org/pdf/2010.11437.pdf) -**arXiv 2020** -[github](https://github.com/istarjun/TAFT-SE)
### 2019
#### IEEE 2019 ####
* [TETRIS: Template transformer networks for image segmentation with shape priors](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8672808) -**IEEE 2019** -[github]
#### arXiv 2019 ####
* [Iterative transformer network for 3d point cloud](https://arxiv.org/pdf/1811.11209.pdf) -**arXiv 2019** -[github](https://github.com/wentaoyuan/it-net)
* [Segmentation transformer: Object-contextual representations for semantic segmentation](https://arxiv.org/pdf/1909.11065.pdf) -**arXiv 2019** -[github]
### Others
* [TrSeg: Transformer for semantic
segmentation](https://pdf.sciencedirectassets.com/271524/1-s2.0-S0167865521X00074/1-s2.0-S016786552100163X/main.pdf?X-Amz-Security-Token=IQoJb3JpZ2luX2VjEO3%2F%2F%2F%2F%2F%2F%2F%2F%2F%2FwEaCXVzLWVhc3QtMSJHMEUCIFlNu5rMdOhBm6xZ9JqePuOI4dXGYbZEATfhaQlbBLrlAiEAj4bE9oSC5k6xSVA9lLSk9kE5EPBLQ0Gp7YJa8oduODQq0gQIVRAFGgwwNTkwMDM1NDY4NjUiDAmrnEwr%2BNntVOsrWiqvBN16AKDRXC%2Byc6u4JUuIbuLxzM%2F4lYkoM6UwiA6zDq92OmgFeHuI2wQLDBZGZ6s3Qh9ZubxhcSEnIrRY8LRSxfSb9xXCpy7LG4bMkdl1qQoGUN8rGB2lk%2BBfRT8ar7dXSGpoanKgoLVHvORbp1nrPXPhv08S6udDxxDDyF3n7hpll2E%2BiOGHf0iIrJNV11KpowVxbhMJLepoBayXYyXEALRfbFBLtDRUTv3xH%2BdNaAAe%2BiUL3v7DLKuDOam7EFHnSL70zCo6UsftEnG%2Byzzo7KxVlUJNY7Dp2vMEw0PDmhgcV17On%2BflhTdrIWT5ouoEavYbt4Si093fFWYiyRBGDDFInT010X%2BPXg3VM3WVXPez5cQl6Zvym4GUiTwEAm6yGLepoJ4gbEurq8tSLcZ5mHwrviBsbI4a3RgFpQfbKo16TvNr1VPXMIkgecIWOsD5M%2FMCa5aHMypvyK69jkIOgVoeoP1MUK2S5aX8FpVo6Fv5xyQngHPMuDS2LTANbQG7H1ZANOpQ%2FY%2B5OviwIAj7QXwF0ulIPJFqzbo%2BUeoX7eWC4qjzX6ZCfZuneeyxeDpyhLBCMw8UkrfsB%2FBGb3JF%2BGKZ0VEjIMdy9innc8ovqmhVQDoRDMKCpiPikc6wfyILV46MFsoDw5ZEOi9TFMmT9iWldq67HRzcBZhRbjVi9cM5Mb6GrDgGPpUJiCme5IqwWMwe6uVAn0SoTaBTAPy200ZD7cT7%2Bc0hXt53V5ZzPDswloTSlwY6qQGxaQm55B6%2FeL6424W%2BCVNou5VFr2cenMpiBej21Qb1Ldjnz8ycP8v0kG7mBKWqvXzFdCbRgQFk3G1%2FGhFZyy%2FynTlXbCwnipxIhI6qFV0WpqTV7DxmM2XCeyju8N5%2Btgf2gYxXvzi8nxU018Vf%2FjfePePahuDjmkMHQtHgi8J7BLReqYP0kEXgpmPq%2BOXcHbD71LX%2BJ1LJChcb8v2BkdxZkiTeMk9AZqwB&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20220811T051212Z&X-Amz-SignedHeaders=host&X-Amz-Expires=300&X-Amz-Credential=ASIAQ3PHCVTYWCIX64SK%2F20220811%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Signature=a9e1160c1edb9fac03b1c9610adcdefbfce6f4aed9d295011ba253ceb82eaf69&hash=08500206c3b0e4ca9db6b86bfc8c0e7650f8c8b461b8046fc174940308c8d0e0&host=68042c943591013ac2b2430a89b270f6af2c76d8dfd086a07176afe7c76c2c61&pii=S016786552100163X&tid=spdf-15e8c3c6-591f-4da0-b719-533e86352601&sid=e74741ce47d48945d12bc644efe2a56d70c7gxrqb&type=client&ua=51535f06555c56510200&rr=738e7992bb77ca8c) -**Pattern Recognition Letters 2021** 
-[github](https://github.com/youngsjjn/TrSeg)
* [Video Semantic Segmentation via Sparse Temporal Transformer](https://dl.acm.org/doi/abs/10.1145/3474085.3475409) -**ACM 2021** -[github]

### Acknowledgements
We appreciate the excellent work of the authors mentioned above.
### Citation