├── imgs
│   ├── MobileUtr.png
│   └── performance.png
├── LICENSE
└── README.md

--------------------------------------------------------------------------------
/imgs/MobileUtr.png:
(binary image; raw file at https://raw.githubusercontent.com/FengheTan9/MobileUtr/HEAD/imgs/MobileUtr.png)
--------------------------------------------------------------------------------
/imgs/performance.png:
(binary image; raw file at https://raw.githubusercontent.com/FengheTan9/MobileUtr/HEAD/imgs/performance.png)
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2023 Fenghe Tang

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

# MobileUtr: Revisiting the relationship between light-weight CNN and Transformer for efficient medical image segmentation

Official PyTorch code for "MobileUtr: Revisiting the relationship between light-weight CNN and Transformer for efficient medical image segmentation"

- [x] Code released, please visit [this new repo](https://github.com/FengheTan9/Mobile-U-ViT) 🤓 !
- [x] Paper released

## Introduction

Because medical images are scarce and have distinctive imaging characteristics, building light-weight Vision Transformers (ViTs) for efficient medical image segmentation is a significant challenge, and one that current studies have not yet paid attention to. This work revisits the relationship between CNNs and Transformers in lightweight universal networks for medical image segmentation, aiming to integrate the advantages of both worlds at the infrastructure design level. To leverage the inductive bias inherent in CNNs, we abstract a Transformer-like lightweight CNN block (ConvUtr) as the patch embedding of ViTs, feeding the Transformer denoised, non-redundant, and highly condensed semantic information. Moreover, an adaptive Local-Global-Local (LGL) block is introduced to facilitate efficient local-to-global information exchange, maximizing the Transformer's capacity for extracting global context. Finally, we build an efficient medical image segmentation model (MobileUtr) based on CNNs and Transformers. Extensive experiments on five public medical image datasets spanning three modalities demonstrate the superiority of MobileUtr over state-of-the-art methods, while boasting lighter weights and lower computational cost.
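To make the design described above concrete, here is a minimal, illustrative sketch of the general idea of a convolutional stem acting as the patch embedding that feeds a Transformer encoder. All module names, dimensions, and layer choices here are assumptions for illustration only; this is not the paper's actual ConvUtr or LGL implementation.

```python
# Illustrative sketch (NOT the official implementation): a lightweight
# CNN stem used as the patch embedding of a Transformer, mirroring the
# "CNN block as patch embedding" idea described above.
import torch
import torch.nn as nn

class ConvPatchEmbed(nn.Module):
    """Hypothetical depthwise-separable conv stem standing in for ConvUtr."""
    def __init__(self, in_ch=3, embed_dim=64, stride=4):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(in_ch, embed_dim, 3, stride=stride, padding=1),
            nn.BatchNorm2d(embed_dim),
            nn.GELU(),
            # depthwise + pointwise convs keep the stem lightweight
            nn.Conv2d(embed_dim, embed_dim, 3, padding=1, groups=embed_dim),
            nn.Conv2d(embed_dim, embed_dim, 1),
        )

    def forward(self, x):
        x = self.stem(x)                     # (B, C, H/stride, W/stride)
        return x.flatten(2).transpose(1, 2)  # (B, N, C) token sequence

embed = ConvPatchEmbed()
encoder = nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
tokens = embed(torch.randn(1, 3, 64, 64))  # -> (1, 256, 64): 16x16 tokens
out = encoder(tokens)                      # global self-attention over tokens
```

The point of the sketch is only the data flow: convolutions condense the input into a short token sequence before any attention is applied, so the Transformer operates on fewer, semantically richer tokens.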
### MobileUtr:

![framework](imgs/MobileUtr.png)

## Performance Comparison

![performance](imgs/performance.png)

## Datasets

Please place the [BUSI](https://www.kaggle.com/aryashah2k/breast-ultrasound-images-dataset) dataset, or your own dataset, in the following structure:

```
└── MobileUtr
    ├── data
    │   ├── busi
    │   │   ├── images
    │   │   │   ├── benign (10).png
    │   │   │   ├── malignant (17).png
    │   │   │   ├── ...
    │   │   └── masks
    │   │       └── 0
    │   │           ├── benign (10).png
    │   │           ├── malignant (17).png
    │   │           ├── ...
    │   └── your dataset
    │       ├── images
    │       │   ├── 0a7e06.png
    │       │   ├── ...
    │       └── masks
    │           └── 0
    │               ├── 0a7e06.png
    │               ├── ...
    ├── dataloader
    ├── network
    ├── utils
    ├── main.py
    └── split.py
```

## Environment

- GPU: NVIDIA GeForce RTX 4090
- PyTorch: 1.13.0 (CUDA 11.7)
- cudatoolkit: 11.7.1
- scikit-learn: 1.0.2

## Training and Validation

First, split your dataset:

```bash
python split.py --dataset_name busi --dataset_root ./data
```

Then train and validate, passing either `MobileUtr` or `MobileUtr-L` to `--model`:

```bash
python main.py --model MobileUtr --base_dir ./data/busi --train_file_dir busi_train.txt --val_file_dir busi_val.txt
```

## Citation

If you use our code, please cite our paper:

```
@misc{tang2023mobileutr,
      title={MobileUtr: Revisiting the relationship between light-weight CNN and Transformer for efficient medical image segmentation},
      author={Fenghe Tang and Bingkun Nian and Jianrui Ding and Quan Quan and Jie Yang and Wei Liu and S. Kevin Zhou},
      year={2023},
      eprint={2312.01740},
      archivePrefix={arXiv},
      primaryClass={eess.IV}
}
```

--------------------------------------------------------------------------------
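For reference, the `images`/`masks/0` layout and the split files (e.g. `busi_train.txt`) described in the Datasets and Training sections can be read with a minimal PyTorch-style `Dataset`. This is an illustrative sketch only, not the repo's actual `dataloader` code; the class name, grayscale conversion, and normalization are assumptions.

```python
# Illustrative loader (NOT the repo's dataloader) for the
# data/<name>/{images, masks/0} layout described above.
import os
import numpy as np
from PIL import Image
from torch.utils.data import Dataset

class BinarySegDataset(Dataset):
    """Hypothetical loader: pairs each image with its mask in masks/0."""
    def __init__(self, base_dir, file_list):
        # file_list: one file name per line, e.g. "benign (10).png",
        # as produced by a split file such as busi_train.txt
        self.base_dir = base_dir
        with open(file_list) as f:
            self.names = [line.strip() for line in f if line.strip()]

    def __len__(self):
        return len(self.names)

    def __getitem__(self, idx):
        name = self.names[idx]
        img = Image.open(os.path.join(self.base_dir, "images", name)).convert("L")
        mask = Image.open(os.path.join(self.base_dir, "masks", "0", name)).convert("L")
        img = np.asarray(img, dtype=np.float32) / 255.0   # scale to [0, 1]
        mask = (np.asarray(mask) > 127).astype(np.float32)  # binarize
        return img, mask
```

In this layout the image and its mask share the same file name, so a single name list is enough to pair them; resizing and augmentation would normally be added in `__getitem__`.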