├── imgs
│   ├── MobileUtr.png
│   └── performance.png
├── LICENSE
└── README.md

--------------------------------------------------------------------------------
/imgs/MobileUtr.png:
(binary image; raw file at https://raw.githubusercontent.com/FengheTan9/MobileUtr/HEAD/imgs/MobileUtr.png)
--------------------------------------------------------------------------------
/imgs/performance.png:
(binary image; raw file at https://raw.githubusercontent.com/FengheTan9/MobileUtr/HEAD/imgs/performance.png)
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2023 Fenghe Tang

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

# MobileUtr: Revisiting the relationship between light-weight CNN and Transformer for efficient medical image segmentation

Official PyTorch code for "MobileUtr: Revisiting the relationship between light-weight CNN and Transformer for efficient medical image segmentation"

- [x] Code released, please visit [this new repo](https://github.com/FengheTan9/Mobile-U-ViT) 🤓 !
- [x] Paper released

## Introduction

Because medical images are scarce and have distinctive imaging characteristics, building light-weight Vision Transformers (ViTs) for efficient medical image segmentation is a significant challenge, and one that current studies have not yet paid attention to. This work revisits the relationship between CNNs and Transformers in lightweight universal networks for medical image segmentation, aiming to integrate the advantages of both worlds at the infrastructure design level. To leverage the inductive bias inherent in CNNs, we abstract a Transformer-like lightweight CNN block (ConvUtr) as the patch embedding of ViTs, feeding the Transformer denoised, non-redundant, and highly condensed semantic information. Moreover, an adaptive Local-Global-Local (LGL) block is introduced to facilitate efficient local-to-global information exchange, maximizing the Transformer's capacity for extracting global context. Finally, we build an efficient medical image segmentation model (MobileUtr) based on CNNs and Transformers. Extensive experiments on five public medical image datasets spanning three modalities demonstrate the superiority of MobileUtr over state-of-the-art methods, while boasting lighter weights and lower computational cost.
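To make the design described above concrete, here is a minimal, illustrative sketch of the general idea of a convolutional stem acting as the patch embedding that feeds a Transformer encoder. All module names, dimensions, and layer choices here are assumptions for illustration only; this is not the paper's actual ConvUtr or LGL implementation.

```python
# Illustrative sketch (NOT the official implementation): a lightweight
# CNN stem used as the patch embedding of a Transformer, mirroring the
# "CNN block as patch embedding" idea described above.
import torch
import torch.nn as nn

class ConvPatchEmbed(nn.Module):
    """Hypothetical depthwise-separable conv stem standing in for ConvUtr."""
    def __init__(self, in_ch=3, embed_dim=64, stride=4):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(in_ch, embed_dim, 3, stride=stride, padding=1),
            nn.BatchNorm2d(embed_dim),
            nn.GELU(),
            # depthwise + pointwise convs keep the stem lightweight
            nn.Conv2d(embed_dim, embed_dim, 3, padding=1, groups=embed_dim),
            nn.Conv2d(embed_dim, embed_dim, 1),
        )

    def forward(self, x):
        x = self.stem(x)                     # (B, C, H/stride, W/stride)
        return x.flatten(2).transpose(1, 2)  # (B, N, C) token sequence

embed = ConvPatchEmbed()
encoder = nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
tokens = embed(torch.randn(1, 3, 64, 64))  # -> (1, 256, 64): 16x16 tokens
out = encoder(tokens)                      # global self-attention over tokens
```

The point of the sketch is only the data flow: convolutions condense the input into a short token sequence before any attention is applied, so the Transformer operates on fewer, semantically richer tokens.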
### MobileUtr:

![framework](imgs/MobileUtr.png)

## Performance Comparison

![performance](imgs/performance.png)

## Datasets

Please place the [BUSI](https://www.kaggle.com/aryashah2k/breast-ultrasound-images-dataset) dataset, or your own dataset, in the following structure:

```
└── MobileUtr
    ├── data
    │   ├── busi
    │   │   ├── images
    │   │   │   ├── benign (10).png
    │   │   │   ├── malignant (17).png
    │   │   │   ├── ...
    │   │   └── masks
    │   │       └── 0
    │   │           ├── benign (10).png
    │   │           ├── malignant (17).png
    │   │           ├── ...
    │   └── your dataset
    │       ├── images
    │       │   ├── 0a7e06.png
    │       │   ├── ...
    │       └── masks
    │           └── 0
    │               ├── 0a7e06.png
    │               ├── ...
    ├── dataloader
    ├── network
    ├── utils
    ├── main.py
    └── split.py
```

## Environment

- GPU: NVIDIA GeForce RTX 4090
- PyTorch: 1.13.0 (CUDA 11.7)
- cudatoolkit: 11.7.1
- scikit-learn: 1.0.2

## Training and Validation

First, split your dataset:

```bash
python split.py --dataset_name busi --dataset_root ./data
```

Then train and validate, passing either `MobileUtr` or `MobileUtr-L` to `--model`:

```bash
python main.py --model MobileUtr --base_dir ./data/busi --train_file_dir busi_train.txt --val_file_dir busi_val.txt
```

## Citation

If you use our code, please cite our paper:

```
@misc{tang2023mobileutr,
      title={MobileUtr: Revisiting the relationship between light-weight CNN and Transformer for efficient medical image segmentation},
      author={Fenghe Tang and Bingkun Nian and Jianrui Ding and Quan Quan and Jie Yang and Wei Liu and S. Kevin Zhou},
      year={2023},
      eprint={2312.01740},
      archivePrefix={arXiv},
      primaryClass={eess.IV}
}
```

--------------------------------------------------------------------------------
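For reference, the `images`/`masks/0` layout and the split files (e.g. `busi_train.txt`) described in the Datasets and Training sections can be read with a minimal PyTorch-style `Dataset`. This is an illustrative sketch only, not the repo's actual `dataloader` code; the class name, grayscale conversion, and normalization are assumptions.

```python
# Illustrative loader (NOT the repo's dataloader) for the
# data/<name>/{images, masks/0} layout described above.
import os
import numpy as np
from PIL import Image
from torch.utils.data import Dataset

class BinarySegDataset(Dataset):
    """Hypothetical loader: pairs each image with its mask in masks/0."""
    def __init__(self, base_dir, file_list):
        # file_list: one file name per line, e.g. "benign (10).png",
        # as produced by a split file such as busi_train.txt
        self.base_dir = base_dir
        with open(file_list) as f:
            self.names = [line.strip() for line in f if line.strip()]

    def __len__(self):
        return len(self.names)

    def __getitem__(self, idx):
        name = self.names[idx]
        img = Image.open(os.path.join(self.base_dir, "images", name)).convert("L")
        mask = Image.open(os.path.join(self.base_dir, "masks", "0", name)).convert("L")
        img = np.asarray(img, dtype=np.float32) / 255.0   # scale to [0, 1]
        mask = (np.asarray(mask) > 127).astype(np.float32)  # binarize
        return img, mask
```

In this layout the image and its mask share the same file name, so a single name list is enough to pair them; resizing and augmentation would normally be added in `__getitem__`.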