├── .gitignore
├── LICENSE.txt
├── README.md
├── app.py
├── davis2017
│   ├── __init__.py
│   ├── davis.py
│   ├── evaluation.py
│   ├── metrics.py
│   ├── results.py
│   └── utils.py
├── eval_miou.py
├── eval_video.py
├── examples
│   ├── cat_00.jpg
│   ├── cat_00.png
│   ├── cat_01.jpg
│   ├── cat_01.png
│   ├── cat_02.jpg
│   ├── cat_02.png
│   ├── colorful_sneaker_00.jpg
│   ├── colorful_sneaker_00.png
│   ├── colorful_sneaker_01.jpg
│   ├── colorful_sneaker_01.png
│   ├── colorful_sneaker_02.jpg
│   ├── colorful_sneaker_02.png
│   ├── duck_toy_00.jpg
│   ├── duck_toy_00.png
│   ├── duck_toy_01.jpg
│   ├── duck_toy_01.png
│   ├── duck_toy_02.jpg
│   └── duck_toy_02.png
├── figs
│   ├── fig_db.png
│   └── fig_persam.png
├── per_segment_anything
│   ├── __init__.py
│   ├── automatic_mask_generator.py
│   ├── build_sam.py
│   ├── modeling
│   │   ├── __init__.py
│   │   ├── common.py
│   │   ├── image_encoder.py
│   │   ├── mask_decoder.py
│   │   ├── prompt_encoder.py
│   │   ├── sam.py
│   │   ├── tiny_vit_sam.py
│   │   └── transformer.py
│   ├── predictor.py
│   └── utils
│       ├── __init__.py
│       ├── amg.py
│       ├── onnx.py
│       └── transforms.py
├── persam.py
├── persam_f.py
├── persam_f_multi_obj.py
├── persam_video.py
├── persam_video_f.py
├── prepare_coco.py
├── requirements.txt
├── show.py
└── weights
    └── mobile_sam.pt
/.gitignore:
--------------------------------------------------------------------------------
1 | # compilation and distribution
2 | __pycache__
3 | _ext
4 | *.pyc
5 | *.pyd
6 | *.so
7 | *.dll
8 | *.egg-info/
9 | build/
10 | dist/
11 | wheels/
12 |
13 | # pytorch/python/numpy formats
14 | *.pth
15 | *.pkl
16 | *.npy
17 | *.ts
18 | model_ts*.txt
19 |
20 | # onnx models
21 | *.onnx
22 |
23 | # ipython/jupyter notebooks
24 | **/.ipynb_checkpoints/
25 |
26 | # Editor temporaries
27 | *.swn
28 | *.swo
29 | *.swp
30 | *~
31 |
32 | # editor settings
33 | .idea
34 | .vscode
35 | _darcs
36 |
37 | # output
38 | data
39 | work_dirs
40 |
41 |
--------------------------------------------------------------------------------
/LICENSE.txt:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2022 Renrui Zhang
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Personalize Segment Anything with 1 Shot in 10 Seconds
2 |
3 | [Papers with Code: Personalized Segmentation on PerSeg](https://paperswithcode.com/sota/personalized-segmentation-on-perseg?p=personalize-segment-anything-model-with-one)
4 |
5 | Official implementation of ['Personalize Segment Anything Model with One Shot'](https://arxiv.org/pdf/2305.03048.pdf).
6 |
7 | 💥 Try out the [web demo](https://huggingface.co/spaces/justin-zk/Personalize-SAM) 🤗 of PerSAM and PerSAM-F.
8 |
9 |
10 | 🎉 Try out the [tutorial notebooks](https://github.com/NielsRogge/Transformers-Tutorials/tree/master/PerSAM) in Colab on your own dataset. Many thanks to [@NielsRogge](https://github.com/NielsRogge)!
11 |
12 | 🎆 Try out the online web demo of PerSAM on OpenXLab:
13 | [OpenXLab demo](https://openxlab.org.cn/apps/detail/RenRuiZhang/Personalize-SAM)
14 |
15 |
16 | ## News
17 | * We now support [MobileSAM](https://github.com/ChaoningZhang/MobileSAM) 🔥 for a significant efficiency improvement. Thanks to the authors for their wonderful work!
18 | * **TODO**: Release the PerSAM-assisted [Dreambooth](https://arxiv.org/pdf/2208.12242.pdf) for better fine-tuning [Stable Diffusion](https://github.com/CompVis/stable-diffusion) 📌.
19 | * We release the code of PerSAM and PerSAM-F 🔥. Check out our [video](https://www.youtube.com/watch?v=QlunvXpYQXM)!
20 | * We release a new dataset for personalized segmentation, [PerSeg](https://drive.google.com/file/d/18TbrwhZtAPY5dlaoEqkPa5h08G9Rjcio/view?usp=sharing) 🔥.
21 |
22 | ## Introduction
23 | *How to customize SAM to automatically segment your pet dog in a photo album?*
24 |
25 | In this project, we propose a training-free **Per**sonalization approach for the [Segment Anything Model (SAM)](https://ai.facebook.com/research/publications/segment-anything/), termed **PerSAM**. Given only a single image with a reference mask, PerSAM can segment the specified visual concept, e.g., your pet dog, in other images or videos without any training.
26 | For better performance, we further present an efficient one-shot fine-tuning variant, **PerSAM-F**. We freeze the entire SAM and introduce two learnable mask weights, training only **2 parameters** within **10 seconds**.
27 |
28 | <div align="center">
29 |   <img src="figs/fig_persam.png"/>
30 | </div>
31 |
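For intuition, below is a minimal, hypothetical sketch of the two-parameter idea described above (not the repository's actual training code): SAM stays frozen, its three multi-scale mask logits are blended by two learnable scalars (the third weight is implied by normalization), and only those two scalars receive gradients.

```python
import torch
import torch.nn as nn

class MaskBlend(nn.Module):
    """Hypothetical sketch: blend SAM's three multi-scale mask logits
    with two learnable scalars; the third weight is 1 - w1 - w2."""
    def __init__(self):
        super().__init__()
        # The only 2 trainable parameters in the whole pipeline.
        self.w = nn.Parameter(torch.full((2,), 1.0 / 3))

    def forward(self, mask_logits: torch.Tensor) -> torch.Tensor:
        # mask_logits: (3, H, W), one logit map per SAM output scale
        w3 = (1.0 - self.w.sum()).unsqueeze(0)
        weights = torch.cat([self.w, w3])                      # (3,)
        return (weights[:, None, None] * mask_logits).sum(0)   # (H, W)

blend = MaskBlend()
optimizer = torch.optim.AdamW(blend.parameters(), lr=1e-3)  # SAM itself stays frozen
```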
32 | Besides, our approach can be used to assist [DreamBooth](https://arxiv.org/pdf/2208.12242.pdf) in fine-tuning [Stable Diffusion](https://github.com/CompVis/stable-diffusion) for better personalized image synthesis. We adopt PerSAM to segment the target object in the user-provided few-shot images, which eliminates **background disturbance** and benefits the learning of the target representation.
33 |
34 | <div align="center">
35 |   <img src="figs/fig_db.png"/>
36 | </div>
37 |
38 | ## Requirements
39 | ### Installation
40 | Clone the repo and create a conda environment:
41 | ```bash
42 | git clone https://github.com/ZrrSkywalker/Personalize-SAM.git
43 | cd Personalize-SAM
44 |
45 | conda create -n persam python=3.8
46 | conda activate persam
47 |
48 | pip install -r requirements.txt
49 | ```
50 |
51 | Similar to Segment Anything, our code requires `pytorch>=1.7` and `torchvision>=0.8`. Please follow the instructions [here](https://pytorch.org/get-started/locally/) to install both PyTorch and TorchVision dependencies.
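For example, a CUDA 11.8 build could be installed as below; this is only an illustration, so please take the exact command for your platform from the PyTorch page linked above.

```bash
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118
```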
52 |
53 |
54 |
55 | ### Preparation
56 | Please download our constructed dataset **PerSeg** for personalized segmentation from [Google Drive](https://drive.google.com/file/d/18TbrwhZtAPY5dlaoEqkPa5h08G9Rjcio/view?usp=sharing) or [Baidu Yun](https://pan.baidu.com/s/1X-czD-FYW0ELlk2x90eTLg) (code `222k`), and the pre-trained SAM weights from [here](https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth). Then, unzip the dataset file and organize the files as follows:
57 | ```
58 | data/
59 | |–– Annotations/
60 | |–– Images/
61 | sam_vit_h_4b8939.pth
62 | ```
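A possible shell sketch of these steps (the archive name `PerSeg.zip` is an assumption; use the filename you actually downloaded):

```bash
mkdir -p data
unzip PerSeg.zip -d data  # assumed archive name; adjust if it unpacks into a top-level folder
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth  # SAM ViT-H weights
```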
63 | Please download the 480p [TrainVal](https://data.vision.ee.ethz.ch/csergi/share/davis/DAVIS-2017-trainval-480p.zip) split of DAVIS 2017. Then, decompress the file into `DAVIS/2017` and organize it as follows:
64 | ```
65 | DAVIS/
66 | |–– 2017/
67 |     |–– Annotations/
68 |     |–– ImageSets/
69 |     |–– JPEGImages/
70 | ```
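A possible shell sketch of these steps (the extracted layout may need rearranging to match the tree above):

```bash
wget https://data.vision.ee.ethz.ch/csergi/share/davis/DAVIS-2017-trainval-480p.zip
unzip DAVIS-2017-trainval-480p.zip
# Arrange the extracted folders so that Annotations/, ImageSets/ and
# JPEGImages/ end up under DAVIS/2017/ as shown above.
```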
71 |
72 | ## Getting Started
73 |
74 | ### Personalized Segmentation
75 |
76 | For the training-free 🧊 **PerSAM**, just run:
77 | ```bash
78 | python persam.py --outdir