├── .gitignore
├── LICENSE
├── README.md
├── SETUP.md
├── data
│   └── mvtec_ad
│       └── .gitkeep
├── docker
│   ├── README.md
│   ├── cpu
│   │   ├── Dockerfile
│   │   ├── entrypoint.sh
│   │   └── installer.sh
│   └── gpu
│       ├── Dockerfile
│       ├── entrypoint.sh
│       └── installer.sh
├── docs
│   ├── MVTecAD_scores.md
│   └── figures
│       ├── .gitkeep
│       └── roc_curve.svg
├── experiments
│   ├── README.md
│   ├── figures
│   │   ├── MVTecAD_Breakdown_of_inference_time_on_CPU.svg
│   │   ├── MVTecAD_ResNet50_with_different_pretrainings.svg
│   │   ├── MVTecAD_averaged_image-level_roc_auc_score.svg
│   │   ├── MVTecAD_averaged_pixel-level_roc_auc_score.svg
│   │   ├── MVTecAD_image-level_roc_auc_score.svg
│   │   ├── MVTecAD_image-level_roc_auc_score_backbones.svg
│   │   ├── MVTecAD_pixel-level_roc_auc_score.svg
│   │   ├── MVTecAD_pixel-level_roc_auc_score_backbones.svg
│   │   ├── anomalous_area_visualization.jpg
│   │   ├── patchcore_sketch.jpg
│   │   └── samples_mvtec_ad.jpg
│   ├── summary_comparison_pretraining.md
│   ├── summary_comparison_with_backbones.md
│   └── summary_comparison_with_the_paper.md
├── main.py
├── main_mvtecad.py
├── patchcore
│   ├── dataset.py
│   ├── extractor.py
│   ├── knnsearch.py
│   ├── patchcore.py
│   └── utils.py
├── requirements.txt
└── run.sh
/.gitignore:
--------------------------------------------------------------------------------
1 | # Ignore dataset.
2 | data/mvtec_ad/*
3 | !data/mvtec_ad/.gitkeep
4 |
5 | # Ignore output data.
6 | *.jpg
7 | *.npy
8 | *.faiss
9 | log.txt
10 |
11 | # Ignore Python cache.
12 | __pycache__
13 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2022 Tetsuya Ishikawa
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | PatchCore Anomaly Detection
2 | ================================================================================
3 |
4 | This repository provides an unofficial PyTorch implementation
5 | of the PatchCore anomaly detection model [1] and several additional experiments.
6 |
7 | PatchCore is an anomaly detection algorithm that has the following features:
8 |
9 | * uses a memory bank of nominal features extracted from a pre-trained
10 |    backbone network (like SPADE and PaDiM), where the memory bank is
11 |    coreset-subsampled to keep inference cost low at high performance,
12 | * uses an approximate nearest neighbor search to evaluate pixel-wise
13 |    anomaly scores at inference time (see the sketch below),
14 | * shows state-of-the-art performance on the MVTec AD dataset (as of June 2021).
15 |
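Below is a minimal, self-contained sketch of the two ideas above: greedy k-center coreset subsampling of a nominal memory bank, and nearest-neighbor anomaly scoring against it. All shapes and feature values are hypothetical, and an exact flat faiss index stands in for the approximate search used in practice:

```python
import faiss
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical memory bank of nominal (defect-free) patch features.
features = rng.standard_normal((2000, 512)).astype(np.float32)

def greedy_coreset(x, ratio=0.01):
    """Greedy k-center subsampling (the idea behind the sampling ratio)."""
    n_select = max(1, int(len(x) * ratio))
    selected = [0]                            # start from an arbitrary point
    dists = np.linalg.norm(x - x[0], axis=1)  # distance to the selected set
    for _ in range(n_select - 1):
        idx = int(np.argmax(dists))           # farthest point joins the coreset
        selected.append(idx)
        dists = np.minimum(dists, np.linalg.norm(x - x[idx], axis=1))
    return x[selected]

memory_bank = greedy_coreset(features, ratio=0.01)

# Score a query patch by its distance to the nearest nominal patches.
index = faiss.IndexFlatL2(memory_bank.shape[1])
index.add(memory_bank)
query = rng.standard_normal((1, 512)).astype(np.float32)
squared_dists, _ = index.search(query, 3)     # 3 nearest neighbors
print(float(np.sqrt(squared_dists[0, 0])))    # patch-level anomaly score
```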
16 |
17 |
18 |
19 |
20 |
21 |
22 | Usage
23 | --------------------------------------------------------------------------------
24 |
25 | ### Installation
26 |
27 | The author recommends using Docker to keep your environment clean.
28 | For example, you can create a new Docker container and enter it
29 | by running the following command in the root directory of this repository:
30 |
31 | ```console
32 | docker run --rm -it -v `pwd`:/workspace -w /workspace --name patchcore tiskw/patchcore:cpu-2022-03-01
33 | ```
34 |
35 | If you need GPU support, please use the Docker image with CUDA libraries:
36 |
37 | ```console
38 | docker run --rm -it -v `pwd`:/workspace -w /workspace --name patchcore tiskw/patchcore:gpu-2022-03-01
39 | ```
40 |
41 | See [this document](SETUP.md) for more details.
42 |
43 | ### Dataset
44 |
45 | You need to get the MVTec AD dataset [2] if you want to reproduce [our experiments](experiments/README.md).
46 | If you don't plan to use the dataset, you can skip this subsection.
47 | You can download the MVTec AD dataset from
48 | [the official website](https://www.mvtec.com/company/research/datasets/mvtec-ad)
49 | (or [direct link to the data file](https://www.mydrive.ch/shares/38536/3830184030e49fe74747669442f0f282/download/420938113-1629952094/mvtec_anomaly_detection.tar.xz))
50 | and put it under the `data/mvtec_ad/` directory.
51 |
52 | First, move to the `data/mvtec_ad/` directory:
53 |
54 | ```console
55 | cd data/mvtec_ad/
56 | ```
57 |
58 | Then, run the following command to download the MVTec AD dataset:
59 |
60 | ```console
61 | wget "https://www.mydrive.ch/shares/38536/3830184030e49fe74747669442f0f282/download/420938113-1629952094/mvtec_anomaly_detection.tar.xz"
62 | ```
63 |
64 | Finally, extract the downloaded data:
65 |
66 | ```console
67 | tar xfJ mvtec_anomaly_detection.tar.xz
68 | ```
69 |
70 | See [this document](SETUP.md) for more details.
71 |
72 |
73 | ### Train and predict on your dataset
74 |
75 | You can train and run predictions on your dataset using `main.py`.
76 | In the following, we assume:
77 |
78 | - your training images (good images) are stored under the `data_train/` directory,
79 | - your test images (good or anomalous images) are stored under the `data_test/` directory,
80 | - the training result will be stored at `./index.faiss`,
81 | - the test results will be dumped under the `output_test/` directory.
82 |
83 | You can train your model by the following command:
84 |
85 | ```console
86 | python3 main.py train -i data_train -o ./index.faiss
87 | ```
88 |
89 | Then, you can predict anomaly scores for the test images by the following command:
90 |
91 | ```console
92 | python3 main.py predict -i data_test -o output_test
93 | ```
94 |
95 | In the output directory `output_test/`, two types of files will be dumped:
96 | - `.jpg`: the anomaly heatmap overlaid on the input image,
97 | - `.npy`: the matrix of the anomaly heatmap with shape `(height, width)` (see the loading sketch below).
98 |
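For downstream processing, you can load a dumped `.npy` heatmap directly. A minimal sketch (the file name below is hypothetical):

```python
import numpy as np

# Load one anomaly heatmap dumped by `main.py predict`.
anomaly_map = np.load("output_test/sample_000.npy")  # shape: (height, width)

# For example, inspect the peak patch score of the image.
print(anomaly_map.shape, anomaly_map.max())
```
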
99 | ### Replicate the experiments
100 |
101 | If you want to replicate [the experiments](experiments/README.md),
102 | run the following commands in the root directory of this repository:
103 |
104 | ```console
105 | python3 main_mvtecad.py runall
106 | python3 main_mvtecad.py summ
107 | ```
108 |
109 | The `python3 main_mvtecad.py runall` command takes quite a long time,
110 | so it is a good idea to wrap it with `nohup` (e.g. `nohup python3 main_mvtecad.py runall &`).
111 |
112 | ### Anomalous area visualization
113 |
114 | If you want a visualization of the anomalous area for each sample like
115 | the following figure, you can try the `--contour` option in the prediction
116 | step. The following are the detailed steps to generate the anomalous
117 | area visualization.
118 |
119 |
120 |
121 |
122 |
123 | First, we assume that the training data is located under the `data_train/`
124 | directory and that you have completed the training of the model:
125 |
126 | ```console
127 | # Train the PatchCore model.
128 | python3 main.py train -i data_train -o ./index.faiss
129 | ```
130 |
131 | Next, we need to compute the threshold for determining the anomalous area
132 | for each sample. You can compute the threshold by the following command:
133 |
134 | ```console
135 | # Compute threshold.
136 | python3 main.py thresh -i data_train
137 | ```
138 |
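Judging from the `--coef_sigma` option of `main.py`, the threshold is derived from the per-image maxima of the training heatmaps as `mean + coef_sigma * std`. The following is a minimal sketch of that rule (the actual implementation lives in `patchcore/utils.py`, which is not shown here):

```python
import numpy as np

def auto_threshold_sketch(max_scores, coef_sigma=5.0):
    """Sketch of the threshold rule of main.py: mean + coef_sigma * std."""
    scores = np.asarray(max_scores, dtype=float)
    return scores.mean() + coef_sigma * scores.std()
```
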
139 | Finally, you can get the anomalous area visualization by the following command,
140 | where we assume that the test data is located under the `data_test/` directory
141 | and `THRESH` is the threshold value computed in the previous step:
142 |
143 | ```console
144 | # Visualize the contour map using the threshold value obtained above.
145 | python3 main.py predict --contour THRESH -i data_test -o output_test
146 | ```
147 |
148 |
149 | Experiments
150 | --------------------------------------------------------------------------------
151 |
152 | ### Experiment 1: Comparison with the original paper on MVTec AD dataset
153 |
154 | The following figures summarize the comparison of the anomaly detection
155 | scores on the MVTec AD dataset [2] with the original PatchCore paper [1].
156 | The performance of our implementation is quite close to the paper's scores,
157 | therefore our implementation likely has no serious issue.
158 |
159 |
160 |
161 |
162 |
163 |
164 | See [this document](experiments/summary_comparison_with_the_paper.md)
165 | for more details.
166 |
167 | ### Experiment 2: Comparison of backbone networks
168 |
169 | We compared the image/pixel-level scores on the MVTec AD dataset with
170 | different backbone networks. Some networks show a better speed/performance
171 | tradeoff than Wide ResNet50 x2, which is used as the default backbone network
172 | in the original paper.
173 |
174 |
175 |
176 |
177 |
178 |
179 | See [this document](experiments/summary_comparison_with_backbones.md)
180 | for more details.
181 |
182 | ### Experiment 3: Comparison of pre-trainings
183 |
184 | We compared several differently pre-trained ResNet50 models as the backbone of PatchCore.
185 | We hypothesize that a well-trained neural network achieves higher performance.
186 | We tried the normal ImageNet pre-trained ResNet50, a DeepLabV3 ResNet50 pre-trained
187 | on COCO, and the ResNet50-SSL/SWSL models that are pre-trained on ImageNet [3]
188 | in a semi-supervised or semi-weakly supervised manner. The result is quite interesting;
189 | however, basically, we can say that the normal ImageNet pre-trained model is
190 | good enough for PatchCore purposes.
191 |
192 |
193 |
194 |
195 |
196 | See [this document](experiments/summary_comparison_pretraining.md)
197 | for more details.
198 |
199 |
209 |
210 | Notes
211 | --------------------------------------------------------------------------------
212 |
213 | * This implementation refers to another PatchCore implementation [6], which
214 |   is released under the Apache 2.0 license. The author has learned a lot
215 |   from that implementation.
216 |
217 |
218 | References
219 | --------------------------------------------------------------------------------
220 |
221 | [1] K. Roth, L. Pemula, J. Zepeda, B. Scholkopf, T. Brox, and P. Gehler,
222 | "Towards Total Recall in Industrial Anomaly Detection",
223 | arXiv, 2021.
224 | [PDF](https://arxiv.org/pdf/2106.08265.pdf)
225 |
226 | [2] P. Bergmann, M. Fauser, D. Sattlegger, and C. Steger,
227 | "MVTec AD - A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection",
228 | CVPR, 2019.
229 | [PDF](https://openaccess.thecvf.com/content_CVPR_2019/papers/Bergmann_MVTec_AD_--_A_Comprehensive_Real-World_Dataset_for_Unsupervised_Anomaly_CVPR_2019_paper.pdf)
230 |
231 | [3] I. Yalniz, H. Jegou, K. Chen, M. Paluri and D. Mahajan,
232 | "Billion-scale semi-supervised learning for image classification",
233 | arXiv, 2019.
234 | [PDF](https://arxiv.org/pdf/1905.00546.pdf)
235 |
236 | [4] E. Fix and J. Hodges,
237 |     "Discriminatory Analysis. Nonparametric Discrimination: Consistency Properties",
238 | USAF School of Aviation Medicine, Randolph Field, Texas, 1951.
239 |
240 | [5] B. Scholkopf, R. Williamson, A. Smola, J. Shawe-Taylor and J. Platt,
241 | "Support Vector Method for Novelty Detection",
242 | NIPS, 1999.
243 | [PDF](https://proceedings.neurips.cc/paper/1999/file/8725fb777f25776ffa9076e44fcfd776-Paper.pdf)
244 |
245 | [6] [hcw-00/PatchCore_anomaly_detection](https://github.com/hcw-00/PatchCore_anomaly_detection), GitHub.
246 |
--------------------------------------------------------------------------------
/SETUP.md:
--------------------------------------------------------------------------------
1 | Setting up
2 | ================================================================================
3 |
4 | Installation
5 | --------------------------------------------------------------------------------
6 |
7 | ### Using Docker (recommended)
8 |
9 | If you prefer not to pollute your development environment, it is a good idea
10 | to run everything inside a Docker container. Our code runs on the following
11 | Docker image. First, download the Docker image to your computer with
12 | the following command:
13 |
14 | ```console
15 | docker pull tiskw/patchcore:cpu-2022-01-29
16 | ```
17 |
18 | You can create your Docker container by the following command:
19 |
20 | ```console
21 | cd ROOT_DIRECTORY_OF_THIS_REPO
22 | docker run --rm -it -v `pwd`:/work -w /work -u `id -u`:`id -g` --name patchcore tiskw/patchcore:cpu-2022-01-29
23 | ```
24 |
25 | If you need GPU support, use `tiskw/patchcore:gpu-2022-01-29` image instead,
26 | and add `--gpus all` option to the above `docker run` command.
27 |
28 | ### Installing on your environment (easier, but pollutes your development environment)
29 |
30 | If you don't mind polluting your environment
31 | (or you are already inside a Docker container),
32 | just run the following command to install the required packages:
33 |
34 | ```console
35 | cd ROOT_DIRECTORY_OF_THIS_REPO
36 | pip3 install -r requirements.txt
37 | ```
38 |
39 | If you need GPU support, open `requirements.txt`
40 | and replace `faiss-cpu` with `faiss-gpu`.
41 |
42 |
43 | Dataset
44 | --------------------------------------------------------------------------------
45 |
46 | Download the MVTec AD dataset from
47 | [the official website](https://www.mvtec.com/company/research/datasets/mvtec-ad)
48 | (or [direct link to the data file](https://www.mydrive.ch/shares/38536/3830184030e49fe74747669442f0f282/download/420938113-1629952094/mvtec_anomaly_detection.tar.xz))
49 | and put the downloaded file (`mvtec_anomaly_detection.tar.xz`) under the `data/mvtec_ad`
50 | directory. The following is an example of downloading the dataset from your terminal:
51 |
52 | ```console
53 | cd ROOT_DIRECTORY_OF_THIS_REPO
54 | cd data/mvtec_ad
55 | wget "https://www.mydrive.ch/shares/38536/3830184030e49fe74747669442f0f282/download/420938113-1629952094/mvtec_anomaly_detection.tar.xz"
56 | ```
57 |
58 | Then, extract the downloaded data in the `data/mvtec_ad` directory:
59 |
60 | ```console
61 | cd ROOT_DIRECTORY_OF_THIS_REPO
62 | cd data/mvtec_ad
63 | tar xfJ mvtec_anomaly_detection.tar.xz
64 | ```
65 |
66 | You have successfully extracted the dataset if the directory structure of your
67 | `data/mvtec_ad/` looks like the following:
68 |
69 | ```console
70 | data/mvtec_ad/
71 | |-- bottle
72 | | |-- ground_truth
73 | | | |-- broken_large
74 | | | |-- broken_small
75 | | | `-- contamination
76 | | |-- test
77 | | | |-- broken_large
78 | | | |-- broken_small
79 | | | |-- contamination
80 | | | `-- good
81 | | `-- train
82 | | `-- good
83 | |-- cable
84 | | |-- ground_truth
85 | | | |-- bent_wire
86 | | | |-- cable_swap
87 | | | |-- combined
88 | | | |-- cut_inner_insulation
89 | | | |-- cut_outer_insulation
90 | ...
91 | ```
92 |
93 | The above is the output of the `tree -d --charset unicode data/mvtec_ad`
94 | command on the author's environment.
95 |
--------------------------------------------------------------------------------
/data/mvtec_ad/.gitkeep:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tiskw/patchcore-ad/de22d18a1bfb444558abcc922326b6ca17460a89/data/mvtec_ad/.gitkeep
--------------------------------------------------------------------------------
/docker/README.md:
--------------------------------------------------------------------------------
1 | Docker images
2 | ================================================================================
3 |
4 | This directory contains files for building Docker images used in
5 | this repository. All images are available from
6 | [Docker Hub](https://hub.docker.com/r/tiskw/patchcore).
7 |
8 | Build docker images
9 | --------------------------------------------------------------------------------
10 |
11 | ### Docker image for CPU
12 |
13 | ```console
14 | cd ROOT_DIRECTORY_OF_THIS_REPO/docker/cpu
15 | docker build -t `date +"tiskw/patchcore:cpu-%Y-%m-%d"` .
16 | ```
17 |
18 | ### Docker image for GPU
19 |
20 | ```console
21 | cd ROOT_DIRECTORY_OF_THIS_REPO/docker/gpu
22 | docker build -t `date +"tiskw/patchcore:gpu-%Y-%m-%d"` .
23 | ```
24 |
--------------------------------------------------------------------------------
/docker/cpu/Dockerfile:
--------------------------------------------------------------------------------
1 | FROM ubuntu:focal
2 |
3 | LABEL maintainer="Tetsuya Ishikawa"
4 |
5 | # Set environment variables.
6 | ENV DEBIAN_FRONTEND=noninteractive
7 | ENV USERNAME=developer
8 |
9 | # Copy and run the installer.
10 | COPY installer.sh /installer.sh
11 | RUN sh installer.sh
12 |
13 | # Copy a shell script for dynamic user creation.
14 | COPY entrypoint.sh /entrypoint.sh
15 |
16 | # Unlock permissions for the above "entrypoint.sh".
17 | RUN chmod u+s /usr/sbin/useradd /usr/sbin/groupadd
18 |
19 | # Set locales.
20 | ENV LANG en_US.UTF-8
21 | ENV LANGUAGE en_US:en
22 | ENV LC_ALL en_US.UTF-8
23 |
24 | ENTRYPOINT ["sh", "/entrypoint.sh"]
25 | CMD ["/bin/bash"]
26 |
--------------------------------------------------------------------------------
/docker/cpu/entrypoint.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash -e
2 |
3 | USERNAME="developer"
4 |
5 | USRID=$(id -u)
6 | GRPID=$(id -g)
7 |
8 | # Create group.
9 | if [ x"$GRPID" != x"0" ]; then
10 | groupadd -g ${GRPID} ${USERNAME}
11 | fi
12 |
13 | # Create user.
14 | if [ x"$USRID" != x"0" ]; then
15 | useradd -d /home/${USERNAME} -m -s /bin/bash -u ${USRID} -g ${GRPID} ${USERNAME}
16 | fi
17 |
18 | # Restore permissions.
19 | sudo chmod u-s /usr/sbin/useradd /usr/sbin/groupadd
20 |
21 | export HOME="/home/${USERNAME}"
22 |
23 | exec "$@"
24 |
--------------------------------------------------------------------------------
/docker/cpu/installer.sh:
--------------------------------------------------------------------------------
1 | #!/bin/sh
2 |
3 | # Update and upgrade installed packages.
4 | apt-get update
5 | apt-get upgrade -y
6 | apt-get install -y apt-utils
7 |
8 | # Install necessary packages.
9 | apt-get install -y sudo locales
10 |
11 | # Install Python3.
12 | apt-get install -y python3 python3-dev python3-distutils python3-pip
13 |
14 | # Install OpenCV.
15 | apt-get install -y libopencv-dev
16 |
17 | # Install Python packages.
18 | pip3 install opencv-python==4.5.5.62 scikit-learn==1.0.2 rich==11.1.0 faiss-cpu==1.7.2 \
19 | torch==1.10.2+cu113 torchvision==0.11.3+cu113 thop==0.0.31-2005241907 \
20 | -f https://download.pytorch.org/whl/cu113/torch_stable.html
21 |
22 | # Set locale to UTF8.
23 | locale-gen en_US.UTF-8
24 |
25 | # Clear package cache.
26 | apt-get clean
27 | rm -rf /var/lib/apt/lists/*
28 |
29 | # Enable sudo without password.
30 | mkdir -p /etc/sudoers.d
31 | echo "${USERNAME} ALL=NOPASSWD: ALL" >> /etc/sudoers.d/${USERNAME}
32 |
33 | # Unlock permissions.
34 | chmod u+s /usr/sbin/useradd && chmod u+s /usr/sbin/groupadd
35 |
--------------------------------------------------------------------------------
/docker/gpu/Dockerfile:
--------------------------------------------------------------------------------
1 | FROM nvidia/cuda:11.3.1-devel-ubuntu20.04
2 |
3 | LABEL maintainer="Tetsuya Ishikawa"
4 |
5 | # Set environment variables.
6 | ENV DEBIAN_FRONTEND=noninteractive
7 | ENV USERNAME=developer
8 |
9 | # Copy and run the installer.
10 | COPY installer.sh /installer.sh
11 | RUN sh installer.sh
12 |
13 | # Copy a shell script for dynamic user creation.
14 | COPY entrypoint.sh /entrypoint.sh
15 |
16 | # Unlock permissions for the above "entrypoint.sh".
17 | RUN chmod u+s /usr/sbin/useradd /usr/sbin/groupadd
18 |
19 | # Set locales.
20 | ENV LANG en_US.UTF-8
21 | ENV LANGUAGE en_US:en
22 | ENV LC_ALL en_US.UTF-8
23 |
24 | ENTRYPOINT ["sh", "/entrypoint.sh"]
25 | CMD ["/bin/bash"]
26 |
--------------------------------------------------------------------------------
/docker/gpu/entrypoint.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash -e
2 |
3 | USERNAME="developer"
4 |
5 | USRID=$(id -u)
6 | GRPID=$(id -g)
7 |
8 | # Create group.
9 | if [ x"$GRPID" != x"0" ]; then
10 | groupadd -g ${GRPID} ${USERNAME}
11 | fi
12 |
13 | # Create user.
14 | if [ x"$USRID" != x"0" ]; then
15 | useradd -d /home/${USERNAME} -m -s /bin/bash -u ${USRID} -g ${GRPID} ${USERNAME}
16 | fi
17 |
18 | # Restore permissions.
19 | sudo chmod u-s /usr/sbin/useradd /usr/sbin/groupadd
20 |
21 | export HOME="/home/${USERNAME}"
22 |
23 | exec "$@"
24 |
--------------------------------------------------------------------------------
/docker/gpu/installer.sh:
--------------------------------------------------------------------------------
1 | #!/bin/sh
2 |
3 | # Update and upgrade installed packages.
4 | apt-get update
5 | apt-get upgrade -y
6 | apt-get install -y apt-utils
7 |
8 | # Install necessary packages.
9 | apt-get install -y sudo locales
10 |
11 | # Install Python3.
12 | apt-get install -y python3 python3-dev python3-distutils python3-pip
13 |
14 | # Install OpenCV.
15 | apt-get install -y libopencv-dev
16 |
17 | # Install Python packages.
18 | pip3 install opencv-python==4.5.5.62 scikit-learn==1.0.2 rich==11.1.0 faiss-gpu==1.7.2 \
19 | torch==1.10.2+cu113 torchvision==0.11.3+cu113 thop==0.0.31-2005241907 \
20 | -f https://download.pytorch.org/whl/cu113/torch_stable.html
21 |
22 | # Set locale to UTF8.
23 | locale-gen en_US.UTF-8
24 |
25 | # Clear package cache.
26 | apt-get clean
27 | rm -rf /var/lib/apt/lists/*
28 |
29 | # Enable sudo without password.
30 | mkdir -p /etc/sudoers.d
31 | echo "${USERNAME} ALL=NOPASSWD: ALL" >> /etc/sudoers.d/${USERNAME}
32 |
33 | # Unlock permissions.
34 | chmod u+s /usr/sbin/useradd && chmod u+s /usr/sbin/groupadd
35 |
--------------------------------------------------------------------------------
/docs/MVTecAD_scores.md:
--------------------------------------------------------------------------------
1 | Scores used in the MVTec AD dataset
2 | ================================================================================
3 |
4 | Image-level ROC AUC score
5 | --------------------------------------------------------------------------------
6 |
7 | The image-level ROC AUC score is a metric of image-wise anomaly detection
8 | performance. This score is not mentioned in the MVTec AD dataset paper;
9 | however, from an application point of view, it is quite important to measure
10 | the image-wise performance of anomaly detection algorithms. Therefore,
11 | this score is commonly used in image anomaly detection papers,
12 | such as SPADE and PaDiM.
13 |
14 | The definition of the image-level ROC AUC score is quite simple.
15 | Let's assume that the target anomaly detection algorithm can compute an anomaly
16 | value for each image. The image-level ROC AUC score is defined as the
17 | AUC (Area Under the Curve) of the ROC (Receiver Operating Characteristic)
18 | curve of the anomaly values computed per image by the target algorithm.
19 |
20 | ### What's ROC curve?
21 |
22 | Assume that the target model outputs an anomaly score (a real value) for each
23 | sample that has a ground truth label (for example, 0: good, 1: anomaly).
24 | The ROC curve is a plot of the TPR (True Positive Rate) versus the
25 | FPR (False Positive Rate) over all thresholds of the anomaly score.
26 |
27 |
28 |
29 |
30 |
31 | ### What's AUC?
32 |
33 | The AUC is the normalized area under the ROC curve.
34 | The minimum and maximum AUC scores are 0 and 1, respectively.
35 |
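Both the curve and its area can be computed with scikit-learn, which is already a dependency of this repository. The labels and scores below are hypothetical:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

# Hypothetical per-image labels (0: good, 1: anomaly) and anomaly scores.
labels = np.array([0, 0, 1, 1, 0, 1])
scores = np.array([0.12, 0.30, 0.85, 0.64, 0.22, 0.91])

fpr, tpr, thresholds = roc_curve(labels, scores)  # points of the ROC curve
print(roc_auc_score(labels, scores))              # image-level ROC AUC
```
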
36 | ### Pros and Cons
37 |
38 | * Quite intuitive for many applications.
39 | * Insensitive to the anomaly location.
40 |
41 |
42 | Pixel-level ROC AUC score
43 | --------------------------------------------------------------------------------
44 |
45 | The score computation is almost the same as for the image-level ROC AUC score,
46 | but the underlying table is pixel-level rather than image-level.
47 |
48 | | Sample | Anomaly score |
49 | |:---------------------------:|:-------------:|
50 | | pixel (0, 0) of the image 0 | a(0, 0, 0) |
51 | | pixel (1, 0) of the image 0 | a(1, 0, 0) |
52 | | ... | ... |
53 |
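In other words, every pixel of every test image contributes one row, so flattening the ground truth masks and anomaly heatmaps reduces the computation to the same scikit-learn call. All shapes and values below are hypothetical:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Hypothetical ground truth masks (0/1) and anomaly heatmaps of shape (N, H, W).
rng = np.random.default_rng(0)
gt_masks = rng.integers(0, 2, size=(4, 224, 224))
heatmaps = rng.random(size=(4, 224, 224))

# Each pixel becomes one sample of the table above.
print(roc_auc_score(gt_masks.ravel(), heatmaps.ravel()))
```
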
54 | ### Pros and Cons
55 |
56 | * Sensitive to the anomaly location.
57 | * Not intuitive for some applications.
58 |
--------------------------------------------------------------------------------
/docs/figures/.gitkeep:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tiskw/patchcore-ad/de22d18a1bfb444558abcc922326b6ca17460a89/docs/figures/.gitkeep
--------------------------------------------------------------------------------
/experiments/README.md:
--------------------------------------------------------------------------------
1 | Experiments
2 | ================================================================================
3 |
4 | This directory contains experiment results and their summaries.
5 |
6 | Experiment 1: Comparison with the original paper
7 | --------------------------------------------------------------------------------
8 |
9 | - **Purpose**: Check that our implementation correctly reproduces
10 |   the PatchCore algorithm.
11 |
12 | - **What we've done**: Evaluate our implementation on the MVTec AD dataset
13 |   and compare it with the results reported in the original paper.
14 |
15 | - **Conclusion**: The scores of our implementation are quite close to
16 |   the paper's scores. Therefore, our implementation may not have a serious issue.
17 |
18 | See [this document](summary_comparison_with_the_paper.md) for details.
19 |
20 |
21 |
22 |
23 |
24 |
25 |
26 | Experiment 2: Comparison of backbone networks
27 | --------------------------------------------------------------------------------
28 |
29 | - **Purpose**: It's quite easy to swap the backbone network in the PatchCore
30 | algorithm (default: Wide ResNet50 x2). It's meaningful to find a good
31 | backbone network that shows a good performance-speed tradeoff from
32 | an application viewpoint.
33 |
34 | - **What we've done**: Try several backbone networks and evaluate their
35 |   average image/pixel-level scores on the MVTec AD dataset.
36 |
37 | - **Conclusion**: The smaller ResNets (ResNet18, ResNet34) show good
38 |   scores despite their small computational cost. On the other hand, very
39 |   deep ResNets (ResNet101, ResNet152) show lower performance than ResNet50.
40 |   A current tentative hypothesis is that the features used in the PatchCore
41 |   algorithm are too deep (too far from the input) and don't retain enough
42 |   high-resolution (raw) information. In other words, we should
43 |   add shallower features in the case of very deep neural networks
44 |   like ResNet101/ResNet152 to exceed ResNet50's score.
45 |
46 |
47 |
48 |
49 |
50 |
51 | See [this document](summary_comparison_with_backbones.md) for details.
52 |
53 |
54 | Experiment 3: Comparison of pre-trainings
55 | --------------------------------------------------------------------------------
56 |
57 | - **Purpose**: We hypothesize that a well-trained neural network achieves
58 | higher performance on PatchCore anomaly detection.
59 |
60 | - **What we've done**: We used several networks that are pre-trained on
61 |   different datasets or with different methods as the backbone of PatchCore,
62 |   and evaluated their performance on the MVTec AD dataset.
63 |
64 | - **Conclusion**: We found quite interesting observations; however,
65 |   we cannot draw a firm conclusion at this moment. We would say that the normal
66 |   ImageNet pre-trained model seems to be good enough for PatchCore purposes.
67 |
68 |
69 |
70 |
71 |
--------------------------------------------------------------------------------
/experiments/figures/MVTecAD_averaged_pixel-level_roc_auc_score.svg:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/experiments/figures/anomalous_area_visualization.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tiskw/patchcore-ad/de22d18a1bfb444558abcc922326b6ca17460a89/experiments/figures/anomalous_area_visualization.jpg
--------------------------------------------------------------------------------
/experiments/figures/patchcore_sketch.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tiskw/patchcore-ad/de22d18a1bfb444558abcc922326b6ca17460a89/experiments/figures/patchcore_sketch.jpg
--------------------------------------------------------------------------------
/experiments/figures/samples_mvtec_ad.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tiskw/patchcore-ad/de22d18a1bfb444558abcc922326b6ca17460a89/experiments/figures/samples_mvtec_ad.jpg
--------------------------------------------------------------------------------
/experiments/summary_comparison_pretraining.md:
--------------------------------------------------------------------------------
1 | Comparison of pre-trainings
2 | ================================================================================
3 |
4 | This directory contains experiment results of ResNet50 models that are
5 | pre-trained on different datasets or with different methods.
6 |
7 |
8 | Purpose
9 | --------------------------------------------------------------------------------
10 |
11 | We hypothesize that a well-trained neural network achieves higher performance.
12 |
13 |
14 | What we've done
15 | --------------------------------------------------------------------------------
16 |
17 | We tried ResNet50 with ImageNet pre-training, DeepLabV3 pre-trained on COCO,
18 | and the SSL/SWSL models pre-trained on ImageNet [3].
19 | We used these networks as the backbone of PatchCore and evaluated their
20 | performance on the MVTec AD dataset.
21 |
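For reference, these backbones can be pulled from `torch.hub`. The following is a minimal sketch, assuming the hub entry names of the semi/weakly-supervised models released with [3]:

```python
import torch

# Assumed hub entry names for the SSL/SWSL ResNet50 models of [3].
hub = "facebookresearch/semi-supervised-ImageNet1K-models"
resnet50_ssl = torch.hub.load(hub, "resnet50_ssl")
resnet50_swsl = torch.hub.load(hub, "resnet50_swsl")
```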
22 |
23 | Conclusion
24 | --------------------------------------------------------------------------------
25 |
26 | We found quite interesting observations:
27 |
28 | - The simple ImageNet pre-trained ResNet50 (ResNet50 in the following figure)
29 |   shows lower performance in the image-level ROC AUC score; on the other hand,
30 |   it shows higher performance in the pixel-level ROC AUC score.
31 | - One possible explanation is that the simple ImageNet pre-trained model
32 |   keeps more high-resolution (raw) information in its features than the other
33 |   models, which were trained on a more complex dataset or by a more complex
34 |   method. High-resolution information may bring a higher pixel-level score
35 |   (and a lower image-level score), which would explain
36 |   this phenomenon.
37 |
38 | We cannot draw a firm conclusion at this moment and need more experiments; however,
39 | I would say that the normal ImageNet pre-trained model seems to be good
40 | enough for PatchCore purposes.
41 |
42 |
43 |
44 |
45 |
--------------------------------------------------------------------------------
/experiments/summary_comparison_with_backbones.md:
--------------------------------------------------------------------------------
1 | Comparison of backbone networks
2 | ================================================================================
3 |
4 | This directory contains experiment results of different backbone networks
5 | on the MVTec AD dataset.
6 |
7 |
8 | Purpose
9 | --------------------------------------------------------------------------------
10 |
11 | It's quite easy to swap the backbone network in the PatchCore algorithm
12 | (default: Wide ResNet50 x2). It's meaningful to find a good backbone network
13 | which shows a good performance-speed tradeoff from an application viewpoint.
14 |
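For reference, swapping the backbone only requires a different `torch.hub` model name (exposed as the `--model` and `--repo` options of this repository's scripts). A minimal sketch using model names available in `pytorch/vision:v0.11.3`:

```python
import torch

# Default backbone used in this repository.
wide = torch.hub.load("pytorch/vision:v0.11.3", "wide_resnet50_2", pretrained=True)

# A smaller candidate with a better speed/performance tradeoff.
small = torch.hub.load("pytorch/vision:v0.11.3", "resnet18", pretrained=True)
```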
15 |
16 | What we've done
17 | --------------------------------------------------------------------------------
18 |
19 | We tried several backbone networks and evaluated their average
20 | image/pixel-level scores on the MVTec AD dataset.
21 |
22 | ### Test Environment
23 |
24 | - CPU: Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz (56 cores)
25 | - RAM: 128 GB
26 |
27 |
28 | Conclusion
29 | --------------------------------------------------------------------------------
30 |
31 | The smaller ResNets (ResNet18, ResNet34) show good
32 | scores despite their small computational cost. On the other hand, very deep
33 | ResNets (ResNet101, ResNet152) show lower performance than ResNet50.
34 | Our current tentative hypothesis is that the features used in the PatchCore
35 | algorithm are too deep (too far from the input) and don't retain enough
36 | high-resolution (raw) input information. In other words, we should
37 | add shallower features in the case of very deep neural networks
38 | like ResNet101 and ResNet152 to exceed ResNet50's score.
39 |
40 |
41 |
42 |
43 |
44 |
45 |
--------------------------------------------------------------------------------
/experiments/summary_comparison_with_the_paper.md:
--------------------------------------------------------------------------------
1 | Comparison with the original paper
2 | ================================================================================
3 |
4 | We reproduced the same experiments as the original paper,
5 | namely the image-level and pixel-level ROC AUC scores on the MVTec AD dataset
6 | (see Tables S1 and S2 on page 15 of the paper),
7 | to check that our code correctly implements the PatchCore algorithm.
8 |
9 | Conclusion
10 | --------------------------------------------------------------------------------
11 |
12 | The average image-level and pixel-level ROC AUC scores of our code are quite
13 | close to the paper's scores when the sampling ratio is 1%; therefore, the author
14 | thinks our code may not have a serious issue.
15 |
16 | |           | Sampling ratio | Average image-level ROC AUC score | Average pixel-level ROC AUC score |
17 | |:---------:|:--------------:|:---------------------------------:|:---------------------------------:|
18 | | The paper | 1.0 % | 99.0 % | 98.0 % |
19 | | This repo | 1.0 % | 98.8 % | 97.9 % |
20 |
21 | Results
22 | --------------------------------------------------------------------------------
23 |
24 |
25 |
26 |
27 |
28 |
--------------------------------------------------------------------------------
/main.py:
--------------------------------------------------------------------------------
1 | """
2 | PatchCore anomaly detection: main script for user custom dataset
3 | """
4 |
5 | # Import standard libraries.
6 | import argparse
7 |
8 | # Import third-party packages.
9 | import numpy as np
10 | import rich
11 | import rich.progress
12 | import torch
13 |
14 | # Import custom modules.
15 | from patchcore.dataset import MVTecADImageOnly
16 | from patchcore.patchcore import PatchCore
17 | from patchcore.utils import auto_threshold
18 |
19 |
20 | def parse_args():
21 | """
22 | Parse command line arguments.
23 | """
24 | fmtcls = lambda prog: argparse.HelpFormatter(prog,max_help_position=50)
25 | parser = argparse.ArgumentParser(description=__doc__, add_help=False, formatter_class=fmtcls)
26 |
27 | # Required arguments.
28 | parser.add_argument("mode", choices=["train", "predict", "thresh"], help="running mode")
29 |
30 | # Optional arguments for dataset configuration.
31 | group1 = parser.add_argument_group("dataset options")
32 | group1.add_argument("-i", "--input", metavar="PATH", default="data/mvtec_ad/bottle/train",
33 | help="input file/directory path")
34 | group1.add_argument("-o", "--output", metavar="PATH", default="output",
35 | help="output file/directory path")
36 | group1.add_argument("-t", "--trained", metavar="PATH", default="index.faiss",
37 | help="training results")
38 |     group1.add_argument("-b", "--batch_size", metavar="INT", type=int, default=16,
39 |                         help="training batch size")
40 |     group1.add_argument("-l", "--load_size", metavar="INT", type=int, default=224,
41 |                         help="size of loaded images")
42 |     group1.add_argument("-n", "--input_size", metavar="INT", type=int, default=224,
43 |                         help="size of images passed to NN model")
44 |
45 | # Optional arguments for neural network configuration.
46 | group2 = parser.add_argument_group("network options")
47 | group2.add_argument("-m", "--model", metavar="STR", default="wide_resnet50_2",
48 | help="name of a neural network model")
49 | group2.add_argument("-r", "--repo", metavar="STR", default="pytorch/vision:v0.11.3",
50 | help="repository of the neural network model")
51 |
52 | # Optional arguments for anomaly detection algorithm.
53 | group3 = parser.add_argument_group("algorithm options")
54 | group3.add_argument("-k", "--n_neighbors", metavar="INT", type=int, default=3,
55 | help="number of neighbors to be searched")
56 |     group3.add_argument("-s", "--sampling_ratio", metavar="FLT", type=float, default=0.01,
57 |                         help="ratio of coreset sub-sampling")
58 |
59 | # Optional arguments for thresholding.
60 | group4 = parser.add_argument_group("thresholding options")
61 | group4.add_argument("-e", "--coef_sigma", metavar="FLT", type=float, default=5.0,
62 | help="coefficient of sigma when computing threshold (= mean + coef * sigma)")
63 |
64 | # Optional arguments for visualization.
65 | group5 = parser.add_argument_group("visualization options")
66 | group5.add_argument("-c", "--contour", metavar="FLT", type=float, default=None,
67 | help="visualize contour map instead of heatmap using the given threshold")
68 |
69 | # Other optional arguments.
70 | group6 = parser.add_argument_group("other options")
71 | group6.add_argument("-d", "--device", metavar="STR", default="auto",
72 | help="device name (e.g. 'cuda')")
73 | group6.add_argument("-w", "--num_workers", metavar="INT", type=int, default=1,
74 | help="number of available CPUs")
75 | group6.add_argument("-h", "--help", action="help",
76 | help="show this help message and exit")
77 |
78 | return parser.parse_args()
79 |
80 |
81 | def main(args):
82 | """
83 | Main function for running training/test procedure.
84 |
85 | Args:
86 | args (argparse.Namespace): Parsed command line arguments.
87 | """
88 | rich.print(r"[yellow][Command line arguments][/yellow]")
89 | rich.print(vars(args))
90 |
91 | if args.device == "auto":
92 | args.device = "cuda" if torch.cuda.is_available() else "cpu"
93 |
94 | # Create PatchCore model instance.
95 | model = PatchCore(args.model, args.repo, args.device, args.sampling_ratio)
96 |
97 | # Arguments required for dataset creation.
98 | # These arguments are mainly used for the transformations applied to
99 | # the input images and ground truth images. Details of the transformations
100 | # are written in the MVTecAD dataset class (see patchcore/dataset.py).
101 | dataset_args = {
102 | "load_shape" : (args.load_size, args.load_size),
103 | "input_shape": (args.input_size, args.input_size),
104 | "im_mean" : (0.485, 0.456, 0.406),
105 | "im_std" : (0.229, 0.224, 0.225),
106 |         # The above mean and standard deviation are the values of the ImageNet dataset.
107 | # These values are required because the NN models pre-trained with ImageNet
108 | # assume that the input image is normalized in terms of ImageNet dataset.
109 | }
110 |
111 | if args.mode == "train":
112 |
113 | # Prepare dataset.
114 | dataset = MVTecADImageOnly(args.input, **dataset_args)
115 |
116 | # Train model.
117 | model.fit(dataset, args.batch_size, args.num_workers)
118 |
119 | # Save training result.
120 | model.save(args.trained)
121 |
122 | elif args.mode == "predict":
123 |
124 | # Load trained model.
125 | model.load(args.trained)
126 |
127 | # Prepare dataset.
128 | dataset = MVTecADImageOnly(args.input, **dataset_args)
129 | dloader = torch.utils.data.DataLoader(dataset, batch_size=1, num_workers=args.num_workers, pin_memory=True)
130 |
131 | for x, gt, label, filepath, x_type in rich.progress.track(dloader, description="Processing..."):
132 |
133 | # Run prediction and get anomaly heatmap.
134 | anomaly_map_rw = model.predict(x, args.n_neighbors)
135 |
136 | # Save anomaly heatmap (JPG image and NPY file).
137 | model.save_anomaly_map(args.output, anomaly_map_rw, filepath[0], x_type[0], contour=args.contour)
138 |
139 | elif args.mode == "thresh":
140 |
141 | # Load trained model.
142 | model.load(args.trained)
143 |
144 | # Prepare dataset.
145 | dataset = MVTecADImageOnly(args.input, **dataset_args)
146 | dloader = torch.utils.data.DataLoader(dataset, batch_size=1, num_workers=args.num_workers, pin_memory=True)
147 |
148 | # Initialize the anomaly scores.
149 | scores = list()
150 |
151 | # Compute max value of the anomaly heatmaps.
152 | for x, gt, label, filepath, x_type in rich.progress.track(dloader, description="Processing..."):
153 |
154 | # Run prediction and get anomaly heatmap.
155 | anomaly_map_rw = model.predict(x, args.n_neighbors)
156 |
157 | # Append the anomaly score.
158 | scores.append(np.max(anomaly_map_rw))
159 |
160 | # Compute threshold.
161 | thresh, score_mean, score_std = auto_threshold(scores, args.coef_sigma)
162 |
163 | print("Anomaly threshold = %f" % thresh)
164 | print(" - score_mean = %f" % score_mean)
165 | print(" - score_std = %f" % score_std)
166 |
167 |
168 | if __name__ == "__main__":
169 | main(parse_args())
170 |
--------------------------------------------------------------------------------
/main_mvtecad.py:
--------------------------------------------------------------------------------
1 | """
2 | PatchCore anomaly detection: main script for MVTec AD dataset
3 | """
4 |
5 | # Import standard libraries.
6 | import argparse
7 | import os
8 | import pathlib
9 | import subprocess
10 |
11 | # Import third-party packages.
12 | import rich
13 | import rich.progress
14 | import torch
15 |
16 | # Import custom modules.
17 | from patchcore.dataset import MVTecAD
18 | from patchcore.patchcore import PatchCore
19 |
20 |
21 | def parse_args():
22 | """
23 | Parse command line arguments.
24 | """
25 | fmtcls = lambda prog: argparse.HelpFormatter(prog,max_help_position=50)
26 | parser = argparse.ArgumentParser(description=__doc__, add_help=False, formatter_class=fmtcls)
27 |
28 | # Required arguments.
29 | parser.add_argument("mode", choices=["train", "test", "runall", "summ"],
30 | help="running mode")
31 |
32 | # Common optional arguments.
33 | group0 = parser.add_argument_group("common options")
34 | group0.add_argument("--datadir", default="data/mvtec_ad",
35 | help="path to MVTec AD dataset")
36 |
37 | # Optional arguments for training and test configuration.
38 | group1 = parser.add_argument_group("train/test options")
39 | group1.add_argument("--category", default="hazelnut",
40 | help="data category (e.g. 'hazelnut')")
41 | group1.add_argument("--device", metavar="STR", default="auto",
42 | help="device name (e.g. 'cuda')")
43 | group1.add_argument("--model", metavar="STR", default="wide_resnet50_2",
44 | help="name of a neural network model")
45 | group1.add_argument("--repo", metavar="STR", default="pytorch/vision:v0.11.3",
46 | help="repository of the neural network model")
47 | group1.add_argument("--n_neighbors", metavar="INT", type=int, default=3,
48 | help="number of neighbors to be searched")
49 |     group1.add_argument("--sampling_ratio", metavar="FLT", type=float, default=0.01,
50 |                         help="ratio of coreset sub-sampling")
51 |     group1.add_argument("--outdir", metavar="PATH", default="output",
52 |                         help="output file/directory path")
53 |     group1.add_argument("--batch_size", metavar="INT", type=int, default=16,
54 |                         help="training batch size")
55 |     group1.add_argument("--load_size", metavar="INT", type=int, default=256,
56 |                         help="size of loaded images")
57 |     group1.add_argument("--input_size", metavar="INT", type=int, default=224,
58 |                         help="size of images passed to NN model")
59 | group1.add_argument("--num_workers", metavar="INT", type=int, default=1,
60 | help="number of available CPUs")
61 |
62 | # Optional arguments for running experiments configuration.
63 | group2 = parser.add_argument_group("runall options")
64 | group2.add_argument("--dryrun", action="store_true",
65 | help="only dump the commands and do nothing")
66 | group2.add_argument("--test_only", action="store_true",
67 | help="run only test procedure")
68 | group2.add_argument("--no_redirect", action="store_true",
69 | help="do not redirect dump messages")
70 |
71 | # Other optional arguments.
72 | group3 = parser.add_argument_group("other options")
73 | group3.add_argument("-h", "--help", action="help",
74 | help="show this help message and exit")
75 |
76 | return parser.parse_args()
77 |
78 |
79 | def main_traintest(args):
80 | """
81 | Main function for running training/test procedure.
82 |
83 | Args:
84 | args (argparse.Namespace): Parsed command line arguments.
85 | """
86 | rich.print("[yellow]Mode[/yellow]: [green]" + args.mode + "[/green]")
87 | rich.print(r"[yellow][Command line arguments][/yellow]")
88 | rich.print(vars(args))
89 |
90 | if args.device == "auto":
91 | args.device = "cuda" if torch.cuda.is_available() else "cpu"
92 |
93 | # Create a path to input dataset.
94 | dirpath_dataset = os.path.join(args.datadir, args.category)
95 |
96 | # Create PatchCore model instance.
97 | model = PatchCore(args.model, args.repo, args.device, args.sampling_ratio)
98 |
99 | # Arguments required for dataset creation.
100 | # These arguments are mainly used for the transformations applied to
101 | # the input images and ground truth images. Details of the transformations
102 | # are written in the MVTecAD dataset class (see patchcore/dataset.py).
103 | dataset_args = {
104 | "load_shape" : (args.load_size, args.load_size),
105 | "input_shape": (args.input_size, args.input_size),
106 | "im_mean" : (0.485, 0.456, 0.406),
107 | "im_std" : (0.229, 0.224, 0.225),
108 |         # The above mean and standard deviation are the values of the ImageNet dataset.
109 | # These values are required because the NN models pre-trained with ImageNet
110 | # assume that the input image is normalized in terms of ImageNet dataset.
111 | }
112 |
113 | # In training mode, run both training and test.
114 | if args.mode == "train":
115 | dataset = MVTecAD(dirpath_dataset, "train", **dataset_args)
116 | model.fit(dataset, args.batch_size, args.num_workers)
117 | model.save(os.path.join(args.outdir, "index.faiss"))
118 |
119 | dataset = MVTecAD(dirpath_dataset, "test", **dataset_args)
120 | model.score(dataset, args.n_neighbors, args.outdir, args.num_workers)
121 |
122 | # In test mode, run test only.
123 | elif args.mode == "test":
124 | dataset = MVTecAD(dirpath_dataset, "test", **dataset_args)
125 | model.load(os.path.join(args.outdir, "index.faiss"))
126 | model.score(dataset, args.n_neighbors, args.outdir, args.num_workers)
127 |
128 |
129 | def main_runall(args):
130 | """
131 | Run all experiments.
132 |
133 | Args:
134 | args (argparse.Namespace): Parsed command line arguments.
135 | """
136 | rich.print("[yellow]Mode[/yellow]: [green]" + args.mode + "[/green]")
137 | rich.print("[yellow]Command line arguments:[/yellow]")
138 | rich.print(vars(args))
139 |
140 | dirpaths = [dirpath for dirpath in pathlib.Path(args.datadir).glob("*") if dirpath.is_dir()]
141 |
142 | for dirpath in sorted(dirpaths):
143 |
144 | program = "python3 main_mvtecad.py " + ("test" if args.test_only else "train")
145 | category = dirpath.name
146 | model = args.model
147 | repo = args.repo
148 | outdir = f"experiments/data_{model}/{category}"
149 | outfile = outdir + "/log.txt"
150 | redirect = "" if args.no_redirect else f" > {outfile}"
151 | command = f"{program} --category {category} --repo {repo} --model {model} --outdir {outdir} {redirect}"
152 |
153 | rich.print("[yellow]Running[/yellow]: " + command)
154 |
155 | # Run command.
156 | if not args.dryrun:
157 | os.makedirs(outdir, exist_ok=True)
158 | subprocess.run(command, shell=True)
159 |
160 |
161 | def main_summarize(args):
162 | """
163 | Summarize experiment results.
164 |
165 | Args:
166 | args (argparse.Namespace): Parsed command line arguments.
167 | """
168 | def glob_dir(path, pattern):
169 | """
170 | Returns only directory.
171 | """
172 | for target in path.glob(pattern):
173 | if target.is_dir():
174 | yield target
175 |
176 | def get_value(line):
177 | """
178 | Get score value from a line string.
179 | """
180 | return float(line.strip().split(":")[-1].split()[0])
181 |
182 | def get_scores(filepath):
183 | """
184 | Get all scores from the given file and returns it as a dict.
185 | """
186 | scores = dict()
187 | for line in filepath.open():
188 | if line.startswith("Total pixel-level") : scores["pixel-level"] = get_value(line)
189 | elif line.startswith("Total image-level") : scores["image-level"] = get_value(line)
190 | elif line.startswith("Feature extraction"): scores["time-featex"] = get_value(line)
191 | elif line.startswith("Anomaly map search"): scores["time-anmaps"] = get_value(line)
192 | elif line.startswith("Total infer time") : scores["time-itotal"] = get_value(line)
193 | return scores
194 |
195 | def get_results(root):
196 | """
197 |         Create a dictionary which contains experiment results
198 | where the key order is `results[network][category][score]`.
199 | """
200 | results = dict()
201 | for dirpath in glob_dir(pathlib.Path(root), "data_*"):
202 | results[dirpath.name] = dict()
203 | for dirpath_cat in sorted(glob_dir(dirpath, "*")):
204 | results[dirpath.name][dirpath_cat.name] = get_scores(dirpath_cat / "log.txt")
205 | return results
206 |
207 | def print_results(results):
208 | """
209 | Print summary table to STDOUT.
210 | """
211 | networks = list(results.keys())
212 | categories = list(results[networks[0]].keys())
213 | scores = list(results[networks[0]][categories[0]].keys())
214 |
215 |     # Print table (scores for each network).
216 | for score in scores:
217 |
218 | header = [score] + categories
219 | print(",".join(header))
220 |
221 |         # Print a row (scores) for each network.
222 | for network in networks:
223 | row = [network] + [results[network][c][score] for c in categories]
224 | print(",".join(map(str, row)))
225 |
226 |     # Get the results and print them.
227 | print_results(get_results("experiments"))
228 |
229 |
230 | def main(args):
231 | """
232 | Entry point of this script.
233 |
234 | Args:
235 | args (argparse.Namespace): Parsed command line arguments.
236 | """
237 | if args.mode in ["train", "test"]: main_traintest(args)
238 | elif args.mode in ["runall"] : main_runall(args)
239 | elif args.mode in ["summ"] : main_summarize(args)
240 | else : raise ValueError("unknown mode: " + args.mode)
241 |
242 |
243 | if __name__ == "__main__":
244 | main(parse_args())
245 |
--------------------------------------------------------------------------------
/patchcore/dataset.py:
--------------------------------------------------------------------------------
1 | """
2 | This module provides a PyTorch implementation of the MVTec AD dataset.
3 | """
4 |
5 | # Import standard libraries.
6 | import pathlib
7 |
8 | # Import third-party packages.
9 | import numpy as np
10 | import PIL.Image
11 | import torch
12 | import torchvision
13 |
14 |
15 | class MVTecAD(torch.utils.data.Dataset):
16 | """
17 | PyTorch implementation of the MVTec AD dataset [1].
18 |
19 | Reference:
20 |     [1] P. Bergmann et al., "MVTec AD - A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection", CVPR, 2019.
21 | """
22 | def __init__(self, root="data", split="train", transform_im=None, transform_gt=None,
23 | load_shape=(256, 256), input_shape=(224, 224),
24 | im_mean=(0.485, 0.456, 0.406), im_std=(0.229, 0.224, 0.225)):
25 | """
26 | Constructor of MVTec AD dataset.
27 |
28 | Args:
29 | root (str) : Dataset directory.
30 | split (str) : The dataset split, supports `train`, or `test`.
31 | transform_im (func) : Transform for the input image.
32 | transform_gt (func) : Transform for the ground truth image.
33 | load_shape (tuple): Shape of the loaded image.
34 | input_shape (tuple): Shape of the input image.
35 | im_mean (tuple): Mean of image (3 channels) for image normalization.
36 | im_std (tuple): Standard deviation of image (3 channels) for image normalization.
37 |
38 | Notes:
39 | The arguments `load_shape`, `input_shape`, `im_mean`, and `im_std` are used
40 | only if the `transform_im` or `transform_gt` is None.
41 | """
42 | self.root = pathlib.Path(root)
43 |
44 | # Set directory path of input and ground truth images.
45 | if split == "train":
46 | self.dir_im = self.root / "train"
47 | self.dir_gt = None
48 | elif split == "test":
49 | self.dir_im = self.root / "test"
50 | self.dir_gt = self.root / "ground_truth"
51 |
52 | # The value of `split` should be either of "train" or "test".
53 | else: raise ValueError("Error: argument `split` should be 'train' or 'test'.")
54 |
55 | # Use default transform if no transform specified.
56 | args = (load_shape, input_shape, im_mean, im_std)
57 | self.transform_im = self.default_transform_im(*args) if transform_im is None else transform_im
58 | self.transform_gt = self.default_transform_gt(*args) if transform_gt is None else transform_gt
59 |
60 | self.paths_im, self.paths_gt, self.labels, self.anames = self.load_dataset(self.dir_im, self.dir_gt)
61 |
62 | def __getitem__(self, idx):
63 | """
64 | Returns idx-th data.
65 |
66 | Args:
67 | idx (int): Index of image to be returned.
68 | """
69 | path_im = self.paths_im[idx] # Image file path.
70 | path_gt = self.paths_gt[idx] # Ground truth file path.
71 | flag_an = self.labels[idx] # Anomaly flag (good : 0, anomaly : 1).
72 | name_an = self.anames[idx] # Anomaly name.
73 |
74 | # Load input image.
75 | img = PIL.Image.open(str(path_im)).convert("RGB")
76 |
77 | # If good data, use zeros as a ground truth image.
78 | if flag_an == 0:
79 | igt = PIL.Image.fromarray(np.zeros(img.size[::-1], dtype=np.uint8))
80 |
81 | # Otherwise, load ground truth data.
82 | elif flag_an == 1:
83 | igt = PIL.Image.open(str(path_gt)).convert("L")
84 |
85 | # Anomaly flag should be either of 0 or 1.
86 | else: raise ValueError("Error: value of `flag_an` should be 0 or 1.")
87 |
88 | # Size of the input and ground truth image should be the same.
89 | assert img.size == igt.size, "image.size != igt.size !!!"
90 |
91 | # Apply transforms.
92 | img = self.transform_im(img)
93 | igt = self.transform_gt(igt)
94 |
95 | return (img, igt, flag_an, str(path_im), name_an)
96 |
97 | def __len__(self):
98 | """
99 | Returns number of data.
100 | """
101 | return len(self.paths_im)
102 |
103 | def load_dataset(self, dir_im, dir_gt):
104 | """
105 | Load dataset.
106 |
107 | Args:
108 | dir_im (pathlib.Path): Path to the input image directory.
109 | dir_gt (pathlib.Path): Path to the ground truth image directory.
110 | """
111 | paths_im = list() # List of image file paths.
112 | paths_gt = list() # List of ground truth file paths.
113 | flags_an = list() # List of anomaly flags (good : 0, anomaly : 1).
114 | names_an = list() # List of anomaly names.
115 |
116 | for subdir in sorted(dir_im.iterdir()):
117 |
118 | # Name of the sub directory is the same as the anomaly name.
119 | defect_name = subdir.name
120 |
121 | # Case 1: good data which have only input image.
122 | if defect_name == "good":
123 |
124 | # Get input image paths (good data doesn't have ground truth image).
125 | paths = sorted((dir_im / defect_name).glob("*.png"))
126 |
127 | # Update attributes.
128 | paths_im += paths
129 | paths_gt += len(paths) * [None]
130 | flags_an += len(paths) * [0]
131 | names_an += len(paths) * [defect_name]
132 |
133 |             # Case 2: anomalous data, which has both input and ground truth images.
134 | else:
135 |
136 | # Get input and ground truth image paths.
137 | paths1 = sorted((dir_im / defect_name).glob("*.png"))
138 | paths2 = sorted((dir_gt / defect_name).glob("*.png"))
139 |
140 | # Update attributes.
141 | paths_im += paths1
142 | paths_gt += paths2
143 | flags_an += len(paths1) * [1]
144 |                 names_an += len(paths1) * [defect_name]
145 |
146 |         # The number of input images and ground truth images should be the same.
147 | assert len(paths_im) == len(paths_gt), "Something wrong with test and ground truth pair!"
148 |
149 | return (paths_im, paths_gt, flags_an, names_an)
150 |
151 | def default_transform_im(self, load_shape, input_shape, im_mean, im_std):
152 | """
153 | Returns default transform for the input image of MVTec AD dataset.
154 | """
155 | return torchvision.transforms.Compose([
156 | torchvision.transforms.Resize(load_shape),
157 | torchvision.transforms.ToTensor(),
158 | torchvision.transforms.CenterCrop(input_shape),
159 | torchvision.transforms.Normalize(mean=im_mean, std=im_std),
160 | ])
161 |
162 | def default_transform_gt(self, load_shape, input_shape, im_mean, im_std):
163 | """
164 | Returns default transform for the ground truth image of MVTec AD dataset.
165 | """
166 | return torchvision.transforms.Compose([
167 | torchvision.transforms.Resize(load_shape),
168 | torchvision.transforms.ToTensor(),
169 | torchvision.transforms.CenterCrop(input_shape),
170 | ])
171 |
172 |
173 | class MVTecADImageOnly(torch.utils.data.Dataset):
174 | """
175 |     Dataset class that mirrors MVTecAD but works even when no ground
176 |     truth image is available. This class is intended for user-defined
177 |     datasets.
178 | """
179 | def __init__(self, root="data", transform=None,
180 | load_shape=(256, 256), input_shape=(224, 224),
181 | im_mean=(0.485, 0.456, 0.406), im_std=(0.229, 0.224, 0.225)):
182 | """
183 |         Constructor of the MVTecADImageOnly dataset.
184 |
185 | Args:
186 | root (str) : Image directory.
187 | transform (func) : Transform for the input image.
188 | load_shape (tuple): Shape of the loaded image.
189 | input_shape (tuple): Shape of the input image.
190 | im_mean (tuple): Mean of image (3 channels) for image normalization.
191 | im_std (tuple): Standard deviation of image (3 channels) for image normalization.
192 |
193 | Notes:
194 | The arguments `load_shape`, `input_shape`, `im_mean`, and `im_std` are used
195 |             only if `transform` is None.
196 | """
197 | self.root = pathlib.Path(root)
198 |
199 | # Use default transform if no transform specified.
200 | args = (load_shape, input_shape, im_mean, im_std)
201 | self.transform = self.default_transform(*args) if transform is None else transform
202 |
203 |         self.paths = sorted(path for path in self.root.glob("**/*") if path.suffix in [".jpg", ".png"])  # Sorted for deterministic order.
204 |
205 | def __getitem__(self, idx):
206 | """
207 | Returns idx-th data.
208 |
209 | Args:
210 | idx (int): Index of image to be returned.
211 | """
212 |
213 | # Load input image.
214 | path = self.paths[idx]
215 | img = PIL.Image.open(str(path)).convert("RGB")
216 |
217 | # Apply transforms.
218 | img = self.transform(img)
219 |
220 |         # Returns only the image while keeping the same interface as the MVTecAD class.
221 | return (img, 0, 0, str(path), 0)
222 |
223 | def __len__(self):
224 | """
225 | Returns number of data.
226 | """
227 | return len(self.paths)
228 |
229 | def default_transform(self, load_shape, input_shape, im_mean, im_std):
230 | """
231 | Returns default transform for the input image of MVTec AD dataset.
232 | """
233 | return torchvision.transforms.Compose([
234 | torchvision.transforms.Resize(load_shape),
235 | torchvision.transforms.ToTensor(),
236 | torchvision.transforms.CenterCrop(input_shape),
237 | torchvision.transforms.Normalize(mean=im_mean, std=im_std),
238 | ])
239 |
240 |
241 | if __name__ == "__main__":
242 |
243 | # Test the training data.
244 |     dataset = MVTecAD("data/mvtec_ad/hazelnut", "train")
245 | print(dataset[0])
246 |
247 | # Test the test data.
248 |     dataset = MVTecAD("data/mvtec_ad/hazelnut", "test")
249 | print(dataset[0])
250 |
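251 | # A minimal usage sketch of the image-only dataset (kept as comments; the
252 | # directory below is a placeholder for any folder of *.jpg / *.png files):
253 | #
254 | # dataset = MVTecADImageOnly("data/my_images")
255 | # img, _, _, path, _ = dataset[0]
256 | # print(img.shape, path)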
--------------------------------------------------------------------------------
/patchcore/extractor.py:
--------------------------------------------------------------------------------
1 | """
2 | This module provides a feature extractor which extracts intermediate features
3 | from a neural network model and reshapes them to a convenient format for PatchCore.
4 | """
5 |
6 | # Import standard libraries.
7 | import warnings
8 |
9 | # Import third-party packages.
10 | import numpy as np
11 | import rich
12 | import rich.progress
13 | import torch
14 | import thop
15 |
16 |
17 | # Ignore "UserWarning" because many UserWarnings will be raised
18 | # when calling `thop.profile` function (because of custom layers).
19 | warnings.simplefilter("ignore", UserWarning)
20 |
21 |
22 | class FeatureExtractor:
23 | """
24 | A class to extract intermediate features from a NN model
25 |     and reshape them to a convenient format for PatchCore.
26 |
27 | Example:
28 |         >>> repo, model = ("pytorch/vision:v0.10.0", "resnet50")  # Or your custom model.
29 |         >>> dataset = MVTecAD("data/mvtec_ad/hazelnut", "train")
30 |         >>> extractor = FeatureExtractor(model, repo, "cpu")
31 |         >>>
32 |         >>> extractor.transform(dataset)
33 | """
34 | def __init__(self, model, repo, device):
35 | """
36 |         Constructor of the FeatureExtractor class.
37 |
38 | Args:
39 | model (str/torch.nn.Module): A base NN model.
40 | repo (str) : Repository name which provides the model.
41 | device (str) : Device name (e.g. "cuda").
42 | """
43 | def load_model(repo, model):
44 | try : return torch.hub.load(repo, model, verbose=False, pretrained=True)
45 |             except Exception: return torch.hub.load(repo, model, verbose=False)  # Some models do not accept `pretrained`.
46 |
47 | if isinstance(model, str):
48 | self.model = load_model(repo, model)
49 | self.name = model
50 | macs, pars = thop.profile(self.model, inputs=(torch.randn(1, 3, 224, 224),), verbose=False)
51 | rich.print("Model summary (assume 3x224x224 input):")
52 | rich.print(" - [green]repo[/green]: [magenta]{:s}[/magenta]".format(repo))
53 | rich.print(" - [green]name[/green]: [magenta]{:s}[/magenta]".format(model))
54 | rich.print(" - [green]pars[/green]: [magenta]{:,}[/magenta]".format(int(pars)))
55 | rich.print(" - [green]macs[/green]: [magenta]{:,}[/magenta]".format(int(macs)))
56 | else:
57 | rich.print("Custom model specified")
58 | self.model = model
59 | self.name = model.__class__.__name__
60 |
61 | # Send model to the device.
62 | self.device = device
63 | self.model.to(device)
64 |
65 | # Freeze the model.
66 | self.model.eval()
67 | for param in self.model.parameters():
68 | param.requires_grad = False
69 |
70 |         # Embed hook functions into the model for extracting intermediate features.
71 | self.model = self.embed_hooks(self.model)
72 |
73 | def forward(self, x):
74 | """
75 | Extract intermediate feature from a single batch data.
76 |         Note that the output is still a rank-4 tensor (not reshaped).
77 |
78 | Args:
79 | x (torch.Tensor): Input tensor of shape (N, C_in, H_in, W_in).
80 |
81 | Returns:
82 | embeddings (torch.Tensor): Embedding representation of shape (N, C_out, H_out, W_out).
83 | """
84 | def concat_embeddings(*xs):
85 | """
86 | Concatenate the given intermediate features with resizing.
87 |
88 | Args:
89 | x[i] (torch.Tensor): Input tensor of i-th argument with shape (N, C_i, H_i, W_i).
90 |
91 | Returns:
92 | z (torch.Tensor): Concatenated tensor with shape (N, sum(C_i), max(H_i), max(W_i)).
93 | """
94 | # Compute maximum shape.
95 | H_max = max([x.shape[2] for x in xs])
96 | W_max = max([x.shape[3] for x in xs])
97 |
98 | # Create resize function instance.
99 | resizer = torch.nn.Upsample(size=(H_max, W_max), mode="nearest")
100 |
101 | # Apply resize function for all input tensors.
102 | zs = [resizer(x) for x in xs]
103 |
104 |             # Concatenate in the channel dimension and return it.
105 | return torch.cat(zs, dim=1)
106 |
107 | # Extract features using hook mechanism.
108 | self.features = []
109 | _ = self.model(x.to(self.device))
110 |
111 | # Apply smoothing (3x3 average pooling) to the features.
112 | smoothing = torch.nn.AvgPool2d(kernel_size=3, stride=1, padding=1)
113 | features = [smoothing(feature).cpu() for feature in self.features]
114 |
115 | # Concatenate intermediate features.
116 | embedding = concat_embeddings(*features)
117 |
118 | return embedding
119 |
120 | def transform(self, data, description="Extracting...", **kwargs):
121 | """
122 | Extract features from the given data.
123 | This function can handle 2 types of input:
124 | (1) dataset (torch.utils.data.Dataset),
125 | (2) tensor (torch.Tensor),
126 | where the shape of the tensor is (N, C, H, W).
127 |
128 | Args:
129 | data (Dataset/Tensor): Input data.
130 | description (str) : Message shown on the progress bar.
131 | kwargs (dict) : Keyword arguments for DataLoader class constructor.
132 |
133 | Returns:
134 | embeddings (torch.Tensor): Embedding representations of shape (N*H*W, C).
135 | """
136 | def flatten_NHW(tensor):
137 | """
138 |             Flatten the given rank-4 tensor of shape (N, C, H, W) over
139 |             the N, H and W axes, and return a matrix of shape (N*H*W, C).
140 |
141 | Args:
142 | tensor (torch.Tensor): Tensor of shape (N, C, H, W).
143 |
144 | Returns:
145 | matrix (torch.Tensor): Tensor of shape (N*H*W, C).
146 | """
147 | return tensor.permute((0, 2, 3, 1)).flatten(start_dim=0, end_dim=2)
148 |
149 | # Case 1: input data is a dataset.
150 | if isinstance(data, torch.utils.data.Dataset):
151 |
152 | # Create data loader.
153 | dataloader = torch.utils.data.DataLoader(data, **kwargs, pin_memory=True)
154 |
155 | # Extract features for each batch.
156 | embeddings_list = list()
157 | for x, _, _, _, _ in rich.progress.track(dataloader, description=description):
158 | embedding = self.forward(x)
159 | embeddings_list.append(flatten_NHW(embedding).cpu().numpy())
160 |
161 |             # Concatenate the results of all batches and return.
162 | return np.concatenate(embeddings_list, axis=0)
163 |
164 | # Case 2: input data is a single tensor (i.e. single batch).
165 | elif isinstance(data, torch.Tensor):
166 | embedding = self.forward(data)
167 | return flatten_NHW(embedding).cpu().numpy()
168 |
169 | def embed_hooks(self, model):
170 | """
171 | Embed hook functions to a NN model for extracting intermediate features.
172 | """
173 | # Hook function for capturing intermediate features.
174 | def hook(module, input, output):
175 | self.features.append(output)
176 |
177 | RESNET_FAMILIES = ["resnet18", "resnet34", "resnet50", "resnet101", "resnet152",
178 | "resnext101_32x8d", "resnext50_32x4d",
179 | "wide_resnet50_2", "wide_resnet101_2"]
180 |
181 | DEEPLAB_RESNET = ["deeplabv3_resnet50", "deeplabv3_resnet101"]
182 |
183 | DENSENET_FAMILIES = ["densenet121", "densenet161", "densenet169", "densenet201"]
184 |
185 | VGG_FAMILIES = ["vgg11", "vgg11_bn", "vgg13", "vgg13_bn",
186 | "vgg16", "vgg16_bn", "vgg19", "vgg19_bn"]
187 |
188 | if self.name in RESNET_FAMILIES:
189 | model.layer2[-1].register_forward_hook(hook)
190 | model.layer3[-1].register_forward_hook(hook)
191 |
192 | elif self.name in DEEPLAB_RESNET:
193 | model.backbone.layer2[-1].register_forward_hook(hook)
194 | model.backbone.layer3[-1].register_forward_hook(hook)
195 |
196 | elif self.name in DENSENET_FAMILIES:
197 | model.features.denseblock2.register_forward_hook(hook)
198 | model.features.denseblock3.register_forward_hook(hook)
199 |
200 | # In the case of VGG, register the 2nd and 3rd MaxPool counted from the bottom.
201 | elif self.name in VGG_FAMILIES:
202 | num_maxpool = 0
203 | for idx, module in reversed(list(enumerate(model.features))):
204 | if module.__class__.__name__ == "MaxPool2d":
205 | num_maxpool += 1
206 | if num_maxpool in [2, 3]:
207 | model.features[idx].register_forward_hook(hook)
208 |
209 |         # Network proposed by the following paper:
210 |         #   I. Zeki Yalniz, H. Jegou, K. Chen, M. Paluri, and D. Mahajan,
211 |         #   "Billion-scale semi-supervised learning for image classification", arXiv, 2019.
212 |         #
213 |         # GitHub repo (also the torch.hub repository used in run.sh):
214 |         #   https://github.com/facebookresearch/semi-supervised-ImageNet1K-models
215 |         #
216 | elif self.name in ["resnet18_swsl", "resnet50_swsl", "resnet18_ssl", "resnet50_ssl"]:
217 | model.layer2.register_forward_hook(hook)
218 | model.layer3.register_forward_hook(hook)
219 |
220 | # Raise an error if the given network is unknown.
221 |         else: raise RuntimeError("unknown neural network: no hooks registered")
222 |
223 | return model
224 |
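225 | # A minimal usage sketch (assumes a torchvision hub model listed above and a
226 | # dataset yielding (img, gt, label, path, name) tuples, such as MVTecAD):
227 | #
228 | # extractor = FeatureExtractor("resnet50", "pytorch/vision:v0.10.0", "cpu")
229 | # embeddings = extractor.transform(dataset, batch_size=16)
230 | # print(embeddings.shape)  # (N*H*W, C) matrix of patch features.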
--------------------------------------------------------------------------------
/patchcore/knnsearch.py:
--------------------------------------------------------------------------------
1 | """
2 | This module provides a k-NN searcher class using the faiss backend,
3 | and a sampling algorithm which returns a set of points that minimizes
4 | the maximum distance of any point to a center.
5 | """
6 |
7 | # Import third-party packages.
8 | import faiss
9 | import numpy as np
10 | import rich
11 | import sklearn.metrics
12 | import sklearn.random_projection
13 |
14 |
15 | class KCenterGreedy:
16 | """
17 | Python implementation of the k-Center-Greedy method in [1].
18 |
19 | Distance metric defaults to L2 distance. Features used to calculate distance
20 |     are either raw features or, if the model has a transform method, the
21 |     output of model.transform(X).
22 |
23 | This algorithm can be extended to a robust k centers algorithm that ignores
24 |     a certain number of outlier datapoints. The resulting centers are a
25 |     solution to a mixed integer programming problem.
26 |
27 | Reference:
28 | [1] O. Sener and S. Savarese, "A Geometric Approach to Active Learning for
29 | Convolutional Neural Networks", arXiv, 2017.
30 |
31 |
32 | Notes:
33 |         This code originally comes from the k-center-greedy implementation in
34 |         Google's active-learning repository, which is released under the
35 |         Apache License 2.0 (as of Jan 25, 2022):
36 |
37 |             https://github.com/google/active-learning
38 |
39 |         The following is the license applied to the original code.
40 | ---
41 | Copyright 2017 Google Inc.
42 |
43 | Licensed under the Apache License, Version 2.0 (the "License");
44 | you may not use this file except in compliance with the License.
45 | You may obtain a copy of the License at
46 |
47 | http://www.apache.org/licenses/LICENSE-2.0
48 |
49 | Unless required by applicable law or agreed to in writing, software
50 | distributed under the License is distributed on an "AS IS" BASIS,
51 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
52 | See the License for the specific language governing permissions and
53 | limitations under the License.
54 | """
55 | def __init__(self, X, y, seed, metric="euclidean"):
56 | self.X = X
57 | self.y = y
58 | self.name = "kcenter"
59 | self.metric = metric
60 | self.min_distances = None
61 | self.n_obs = self.X.shape[0]
62 | self.already_selected = []
63 |
64 | def update_distances(self, cluster_centers, only_new=True, reset_dist=False):
65 | """
66 | Update min distances given cluster centers.
67 |
68 | Args:
69 | cluster_centers (list): indices of cluster centers
70 | only_new (bool): only calculate distance for newly selected points
71 | and update min_distances.
72 |             reset_dist (bool): whether to reset the min_distances variable.
73 | """
74 | if reset_dist:
75 | self.min_distances = None
76 |
77 | if only_new:
78 | cluster_centers = [d for d in cluster_centers if d not in self.already_selected]
79 |
80 | if cluster_centers:
81 |
82 | # Update min_distances for all examples given new cluster center.
83 | x = self.features[cluster_centers]
84 | dist = sklearn.metrics.pairwise_distances(self.features, x, metric=self.metric)
85 |
86 | if self.min_distances is None:
87 | self.min_distances = np.min(dist, axis=1).reshape(-1,1)
88 | else:
89 | self.min_distances = np.minimum(self.min_distances, dist)
90 |
91 | def select_batch(self, model, already_selected, N, **kwargs):
92 | """
93 | Diversity promoting active learning method that greedily forms a batch
94 | to minimize the maximum distance to a cluster center among all unlabeled
95 | datapoints.
96 |
97 | Args:
98 | model: model with scikit-like API with decision_function implemented
99 | already_selected: index of datapoints already selected
100 | N: batch size
101 |
102 | Returns:
103 | indices of points selected to minimize distance to cluster centers
104 | """
105 | # Assumes that the transform function takes in original data and not flattened data.
106 | if model is not None: self.features = model.transform(self.X)
107 | else : self.features = self.X.reshape((self.X.shape[0], -1))
108 |
109 | # Compute distances.
110 | self.update_distances(already_selected, only_new=False, reset_dist=True)
111 |
112 | # Initialize sampling results.
113 | new_batch = []
114 |
115 | for _ in rich.progress.track(range(N), description="Sampling..."):
116 |
117 | # Initialize centers with a randomly selected datapoint
118 |             if self.min_distances is None:
119 | ind = np.random.choice(np.arange(self.n_obs))
120 |
121 | # Otherwise, use the index of minimum distance.
122 | else:
123 | ind = np.argmax(self.min_distances)
124 |
125 | # New examples should not be in already selected since those points
126 | # should have min_distance of zero to a cluster center.
127 | assert ind not in already_selected
128 |
129 | self.update_distances([ind], only_new=True, reset_dist=False)
130 |
131 | new_batch.append(ind)
132 |
133 | # Memorize the already selected indices.
134 | self.already_selected = already_selected
135 |
136 | # Print summaries.
137 | rich.print("Maximum distance from cluster centers: [magenta]%0.2f[/magenta]" % max(self.min_distances))
138 | rich.print("Initial number of features: [magenta]%d[/magenta]" % self.X.shape[0])
139 | rich.print("Sampled number of features: [magenta]%d[/magenta]" % len(new_batch))
140 |
141 | return new_batch
142 |
143 |
144 | class KNNSearcher:
145 | """
146 |     A class for k-NN search with dimension reduction (random projection)
147 |     and subsampling (k-center greedy method) capabilities.
148 | """
149 | def __init__(self, projection=True, subsampling=True, sampling_ratio=0.01):
150 | """
151 | Constructor of the KNNSearcher class.
152 |
153 | Args:
154 | projection (bool) : Enable random projection if true.
155 | subsampling (bool) : Enable subsampling if true.
156 | sampling_ratio (float): Ratio of subsampling.
157 | """
158 | self.projection = projection
159 | self.subsampling = subsampling
160 | self.sampling_ratio = sampling_ratio
161 |
162 | def fit(self, x):
163 | """
164 | Train k-NN search model.
165 |
166 | Args:
167 | x (np.ndarray): Training data of shape (n_samples, n_features).
168 | """
169 |         # Apply random projection if specified. Random projection reduces the feature
170 |         # dimension while roughly preserving distances, making k-center greedy faster.
171 | if self.projection:
172 |
173 | rich.print("Sparse random projection")
174 |
175 |             # If the number of features is smaller than the dimension required by the
176 |             # Johnson-Lindenstrauss lemma for this number of samples, the projection fails.
177 |             # In that case, increase the parameter `eps`, or just skip the random projection.
178 | projector = sklearn.random_projection.SparseRandomProjection(n_components="auto", eps=0.90)
179 | projector.fit(x)
180 |
181 | # Print the shape of random matrix: (n_features_after, n_features_before).
182 | shape = projector.components_.shape
183 | rich.print(" - [green]random matrix shape[/green]: [cyan]%s[/cyan]" % str(shape))
184 |
185 |         # Set None if random projection is not specified.
186 | else: projector = None
187 |
188 | # Execute coreset subsampling.
189 | if self.subsampling:
190 | rich.print("Coreset subsampling")
191 | n_select = int(x.shape[0] * self.sampling_ratio)
192 |             selector = KCenterGreedy(x, 0, 0)  # The `y` and `seed` arguments are unused.
193 | indices = selector.select_batch(projector, [], n_select)
194 | x = x[indices, :]
195 |
196 | # Setup nearest neighbour finder using Faiss library.
197 | self.index = faiss.IndexFlatL2(x.shape[1])
198 | self.index.add(x)
199 |
200 | def predict(self, x, k=3):
201 | """
202 | Run k-NN search prediction.
203 |
204 | Args:
205 | x (np.ndarray): Query data of shape (n_samples, n_features)
206 | k (int) : Number of neighbors to be searched.
207 |
208 | Returns:
209 | dists (np.ndarray): Distance between the query and searched data,
210 | where the shape is (n_samples, n_neighbors).
211 | indices (np.ndarray): List of indices to be searched of shape
212 | (n_samples, n_neighbors).
213 | """
214 | # Faiss searcher requires C-contiguous array as input,
215 | # therefore forcibly convert the input data to contiguous array.
216 | x = np.ascontiguousarray(x)
217 |
218 | # Run k-NN search.
219 |         dists, indices = self.index.search(x, k=k)
220 |
221 | return (dists, indices)
222 |
223 | def load(self, filepath):
224 | self.index = faiss.read_index(filepath)
225 | # if torch.cuda.is_available():
226 | # res = faiss.StandardGpuResources()
227 | # self.index = faiss.index_cpu_to_gpu(res, 0, self.index)
228 |
229 | def save(self, filepath):
230 | if hasattr(self, "index"): faiss.write_index(self.index, filepath)
231 |         else : raise RuntimeError("this model is not trained yet")
232 |
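233 | # A minimal usage sketch with random data (faiss expects float32 arrays;
234 | # the sizes below are illustrative only):
235 | #
236 | # import numpy as np
237 | # searcher = KNNSearcher(projection=False, subsampling=True, sampling_ratio=0.1)
238 | # searcher.fit(np.random.rand(1000, 128).astype(np.float32))
239 | # dists, indices = searcher.predict(np.random.rand(5, 128).astype(np.float32), k=3)
240 | # print(dists.shape, indices.shape)  # (5, 3) and (5, 3).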
--------------------------------------------------------------------------------
/patchcore/patchcore.py:
--------------------------------------------------------------------------------
1 | """
2 | This module provides a class for PatchCore algorithm.
3 | """
4 |
5 | # Import standard libraries.
6 | import os
7 | import pathlib
8 |
9 | # Import third-party packages.
10 | import cv2 as cv
11 | import numpy as np
12 | import rich
13 | import rich.progress
14 | import sklearn.metrics
15 | import scipy.ndimage
16 | import torch
17 |
18 | # Import custom modules.
19 | from patchcore.extractor import FeatureExtractor
20 | from patchcore.knnsearch import KNNSearcher
21 | from patchcore.utils import Timer
22 |
23 |
24 | class PatchCore:
25 | """
26 | PyTorch implementation of the PatchCore anomaly detection [1].
27 |
28 |     The PatchCore algorithm can be divided into 2 steps: (1) feature extraction from
29 |     a NN model, and (2) k-NN search including coreset subsampling. The procedure of
30 |     step (1) is written in `extractor.FeatureExtractor`, and step (2) in `knnsearch.KNNSearcher`.
31 |
32 | Reference:
33 | [1] K. Roth, L. Pemula, J. Zepeda, B. Scholkopf, T. Brox, and P. Gehler,
34 | "Towards Total Recall in Industrial Anomaly Detection", arXiv, 2021.
35 |
36 | """
37 | def __init__(self, model, repo, device, sampling_ratio=0.001):
38 | """
39 | Constructor of the PatchCore class.
40 |
41 | Args:
42 | model (str/torch.nn.Module): A base NN model.
43 | repo (str) : Repository name which provides the model.
44 | device (str) : Device type used for NN inference.
45 |             sampling_ratio (float) : Ratio of the coreset subsampling.
46 |
47 | Notes:
48 | The arguments `model` and `repo` are passed to `torch.hub.load`
49 | function if the `model` is not a `torch.Module` instance.
50 | """
51 | self.device = device
52 |
53 | # Create feature extractor instance.
54 | self.extractor = FeatureExtractor(model, repo, device)
55 |
56 | # Create k-NN searcher instance.
57 | self.searcher = KNNSearcher(sampling_ratio=sampling_ratio)
58 |
59 | def fit(self, dataset, batch_size, num_workers=0):
60 | """
61 | Args:
62 |             dataset (torch.utils.data.Dataset): Training dataset (`batch_size` and `num_workers` are passed to the DataLoader).
63 | """
64 | # Step (1): feature extraction from the NN model.
65 | rich.print("\n[yellow][Training 1/2: feature extraction][/yellow]")
66 | embeddings = self.extractor.transform(dataset, batch_size=batch_size, num_workers=num_workers)
67 | rich.print("Embeddings dimentions: [magenta]%s[/magenta]" % str(embeddings.shape))
68 |
69 | # Step (2): preparation for k-NN search.
70 | rich.print("\n[yellow][Training 2/2: preparation for k-NN search][/yellow]")
71 | self.searcher.fit(embeddings)
72 |
73 | def score(self, dataset, n_neighbors, dirpath_out, num_workers=0):
74 | """
75 | Args:
76 |             dataset (torch.utils.data.Dataset): Test dataset.
77 | n_neighbors (int): Number of neighbors to be computed in k-NN search.
78 | dirpath_out (str): Directory path to dump detection results.
79 |             num_workers (int): Number of DataLoader worker processes.
80 | """
81 | dataloader = torch.utils.data.DataLoader(dataset, batch_size=1, num_workers=num_workers, pin_memory=True)
82 |
83 | # Create timer instances.
84 | timer_feature_ext = Timer()
85 | timer_k_nn_search = Timer()
86 |
87 | # Initialize results.
88 | list_true_px_lvl, list_true_im_lvl = (list(), list())
89 | list_pred_px_lvl, list_pred_im_lvl = (list(), list())
90 |
91 |         # Create the output directory if it does not exist.
92 | dirpath_images = pathlib.Path(dirpath_out) / "samples"
93 | dirpath_images.mkdir(parents=True, exist_ok=True)
94 |
95 | rich.print("\n[yellow][Test: anomaly detection inference][/yellow]")
96 | for x, gt, label, filepath, x_type in rich.progress.track(dataloader, description="Processing..."):
97 |
98 | # Extract embeddings.
99 | with timer_feature_ext:
100 | embedding = self.extractor.transform(x)
101 |
102 |             # Compute nearest neighbor points and their scores (L2 distances).
103 | with timer_k_nn_search:
104 | score_patches, _ = self.searcher.predict(embedding, k=n_neighbors)
105 |
106 | anomaly_map_rw, score = self.compute_anomaly_scores(score_patches, x.shape)
107 |
108 | # Add pixel level scores.
109 | list_true_px_lvl.extend(gt.cpu().numpy().astype(int).ravel())
110 | list_pred_px_lvl.extend(anomaly_map_rw.ravel())
111 |
112 | # Add image level scores.
113 | list_true_im_lvl.append(label.cpu().numpy()[0])
114 | list_pred_im_lvl.append(score)
115 |
116 | # Save anomaly maps as images.
117 | self.save_anomaly_map(dirpath_images, anomaly_map_rw, filepath[0], x_type[0])
118 |
119 | rich.print("\n[yellow][Test: score calculation][/yellow]")
120 | image_auc = sklearn.metrics.roc_auc_score(list_true_im_lvl, list_pred_im_lvl)
121 | pixel_auc = sklearn.metrics.roc_auc_score(list_true_px_lvl, list_pred_px_lvl)
122 | rich.print("Total image-level auc-roc score: [magenta]%.6f[/magenta]" % image_auc)
123 | rich.print("Total pixel-level auc-roc score: [magenta]%.6f[/magenta]" % pixel_auc)
124 |
125 | rich.print("\n[yellow][Test: inference time][/yellow]")
126 | t1 = timer_feature_ext.mean()
127 | t2 = timer_k_nn_search.mean()
128 | rich.print("Feature extraction: [magenta]%.4f sec/image[/magenta]" % t1)
129 | rich.print("Anomaly map search: [magenta]%.4f sec/image[/magenta]" % t2)
130 | rich.print("Total infer time : [magenta]%.4f sec/image[/magenta]" % (t1 + t2))
131 |
132 | def predict(self, x, n_neighbors):
133 | """
134 |         Returns the re-weighted anomaly map for a single input image.
135 |
136 | Args:
137 |             x (torch.Tensor): Input image tensor of shape (1, C, H, W).
138 | n_neighbors (int) : Number of neighbors to be returned.
139 | """
140 | # Extract embeddings.
141 | embedding = self.extractor.transform(x)
142 |
143 |         # Compute nearest neighbor points and their scores (L2 distances).
144 | score_patches, _ = self.searcher.predict(embedding, k=n_neighbors)
145 |
146 |         # Compute the anomaly map and its re-weighting.
147 | anomaly_map_rw, _ = self.compute_anomaly_scores(score_patches, x.shape)
148 |
149 | return anomaly_map_rw
150 |
151 | def load(self, filepath):
152 | """
153 | Load trained model.
154 |
155 | Args:
156 | filepath (str): path to the trained file.
157 | """
158 | self.searcher.load(filepath)
159 |
160 | def save(self, filepath):
161 | """
162 | Save trained model.
163 |
164 | Args:
165 | filepath (str): path to the trained file.
166 | """
167 | self.searcher.save(filepath)
168 |
169 |     def compute_anomaly_scores(self, score_patches, x_shape):
170 | """
171 | Returns anomaly index from the results of k-NN search.
172 |
173 | Args:
174 |             score_patches (np.ndarray): Results of k-NN search with shape (h*w, n_neighbors).
175 | x_shape (tuple) : Shape of input image with shape (1, c, H, W).
176 | """
177 |         # The anomaly map is defined as a map of L2 distances from the nearest neighbours.
178 | # NOTE: The magic number (28, 28) should be removed!
179 | anomaly_score = score_patches.reshape((28, 28, -1))
180 |
181 | # Refine anomaly map.
182 | anomaly_maps = [anomaly_score[:, :, n] for n in range(anomaly_score.shape[2])]
183 | anomaly_maps = [cv.resize(amap, (x_shape[3], x_shape[2])) for amap in anomaly_maps]
184 | anomaly_maps = [scipy.ndimage.gaussian_filter(amap, sigma=4) for amap in anomaly_maps]
185 | anomaly_maps = np.array(anomaly_maps, dtype=np.float64)
186 | anomaly_map = anomaly_maps[0, :, :]
187 |
188 |         # Anomaly map re-weighting.
189 |         # We apply softmax-like processing (subtracting the max before exponentiation)
190 |         # to the scale factor computation to avoid floating-point overflow.
191 | normalized_exp = np.exp(anomaly_maps - np.max(anomaly_maps, axis=0))
192 | anomaly_map_rw = anomaly_map * (1.0 - normalized_exp[0, :, :] / np.sum(normalized_exp, axis=0))
193 |
194 | # Compute image level score.
195 | i, j = np.unravel_index(np.argmax(anomaly_map), anomaly_map.shape)
196 | score = anomaly_map_rw[i, j]
197 |
198 | return (anomaly_map_rw, score)
199 |
200 | def save_anomaly_map(self, dirpath, anomaly_map, filepath, x_type, contour=None):
201 | """
202 | Args:
203 | dirpath (str) : Output directory path.
204 | anomaly_map (np.ndarray): Anomaly map with the same size as the input image.
205 | filepath (str) : Path of the input image.
206 | x_type (str) : Anomaly type (e.g. "good", "crack", etc).
207 | contour (float) : Threshold of contour, or None.
208 | """
209 | def min_max_norm(image):
210 | a_min, a_max = image.min(), image.max()
211 | return (image - a_min) / (a_max - a_min)
212 |
213 | def cvt2heatmap(gray):
214 | return cv.applyColorMap(np.uint8(gray), cv.COLORMAP_JET)
215 |
216 | # Get output directory.
217 | dirpath = pathlib.Path(dirpath)
218 | dirpath.mkdir(parents=True, exist_ok=True)
219 |
220 | # Get the image file name.
221 | filename = os.path.basename(filepath)
222 |
223 | # Load the image file and resize.
224 | original_image = cv.imread(filepath)
225 |         original_image = cv.resize(original_image, anomaly_map.shape[:2][::-1])  # cv.resize expects (width, height).
226 |
227 | # Visualize heat map.
228 | if contour is None:
229 |
230 | # Normalize anomaly map for easier visualization.
231 | anomaly_map_norm = cvt2heatmap(255 * min_max_norm(anomaly_map))
232 |
233 |             # Overlay the anomaly map on the original image.
234 | output_image = (anomaly_map_norm / 2 + original_image / 2).astype(np.uint8)
235 |
236 | # Visualize contour map.
237 | else:
238 |
239 | # Additional smoothing for better contour visualization.
240 | anomaly_map = cv.GaussianBlur(anomaly_map, (5, 5), 3)
241 |
242 | # Compute binary map to compute contour.
243 | binary_map = np.where(anomaly_map >= contour, 255, 0).astype(np.uint8)
244 |
245 | # Compute contour.
246 | contour_coord, hierarchy = cv.findContours(binary_map, cv.RETR_LIST, cv.CHAIN_APPROX_SIMPLE)
247 |
248 | # Ignore small anomalies.
249 |             h, w = original_image.shape[0:2]  # OpenCV image shape is (height, width, channels).
250 | areas = [cv.contourArea(contour_pts) / w / h for contour_pts in contour_coord]
251 | contour_coord = [contour_pts for contour_pts, area in zip(contour_coord, areas) if area > 0.005]
252 |
253 | # Initialize contour image.
254 | contour_map = original_image.copy()
255 |
256 | # Fill inside of all contours.
257 | for idx in range(len(contour_coord)):
258 | contour_map = cv.fillPoly(contour_map, [contour_coord[idx][:,0,:]], (0,165,255))
259 |
260 |             # Overlay the filled image on the original image.
261 | contour_map = (0.6 * original_image + 0.4 * contour_map).astype(np.uint8)
262 |
263 | # Draw contour edges.
264 | output_image = cv.drawContours(contour_map, contour_coord, -1, (0,165,255), 2)
265 |
266 | # Save the normalized anomaly map as image.
267 | cv.imwrite(str(dirpath / f"{x_type}_{filename}.jpg"), output_image)
268 |
269 | # Save raw anomaly score as npy file.
270 | np.save(str(dirpath / f"{x_type}_{filename}.npy"), anomaly_map)
271 |
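272 | # An end-to-end usage sketch (model name, paths, and hyperparameters are
273 | # illustrative; `train_dataset` and `test_dataset` are MVTecAD instances):
274 | #
275 | # patchcore = PatchCore("wide_resnet50_2", "pytorch/vision:v0.10.0", "cuda", sampling_ratio=0.01)
276 | # patchcore.fit(train_dataset, batch_size=32)
277 | # patchcore.save("index.faiss")
278 | # patchcore.score(test_dataset, n_neighbors=9, dirpath_out="output")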
--------------------------------------------------------------------------------
/patchcore/utils.py:
--------------------------------------------------------------------------------
1 | """
2 | Utility functions for the PatchCore implementation.
3 | """
4 |
5 | # Import standard libraries.
6 | import time
7 |
8 | # Import third-party packages.
9 | import numpy as np
10 |
11 |
12 | class Timer:
13 | """
14 |     A class for measuring elapsed time via the "with" statement.
15 |
16 | Example:
17 | >>> # Create Timer instance.
18 | >>> timer = Timer()
19 | >>>
20 | >>> # Repeat some procedure 100 times.
21 | >>> for _ in range(100):
22 | >>> with timer:
23 | >>> some_procedure()
24 | >>>
25 | >>> # Print mean elapsed time.
26 | >>> print(timer.mean())
27 | """
28 | def __init__(self):
29 | self.times = list()
30 |
31 | def __enter__(self):
32 | self.time_start = time.time()
33 |
34 | def __exit__(self, exc_type, exc_value, traceback):
35 | self.time_end = time.time()
36 | self.times.append(self.time_end - self.time_start)
37 |
38 | def mean(self):
39 | return sum(self.times) / len(self.times)
40 |
41 |
42 | def auto_threshold(scores_good, coef_sigma):
43 | """
44 | Compute threshold value from the given good scores.
45 |
46 | Args:
47 |         scores_good (list) : List of anomaly scores for good samples.
48 | coef_sigma (float): Hyperparameter of the thresholding.
49 | """
50 | # Compute mean/std of the anomaly scores.
51 | score_mean = np.mean(scores_good)
52 | score_std = np.std(scores_good)
53 |
54 | # Compute threshold.
55 | thresh = score_mean + coef_sigma * score_std
56 |
57 | return (thresh, score_mean, score_std)
58 |
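59 | # Usage sketch: a sample is flagged as anomalous when its score exceeds
60 | # mean + coef_sigma * std of the good-sample scores (values illustrative):
61 | #
62 | # thresh, mean, std = auto_threshold([0.12, 0.15, 0.11, 0.14], coef_sigma=3.0)
63 | # is_anomaly = new_score > thresh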
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | faiss-cpu==1.7.1
2 | opencv-python==4.5.2.52
3 | rich==11.0.0
4 | scikit-learn==0.24.2
5 | torch==1.8.1
6 | torchvision==0.9.1
7 |
--------------------------------------------------------------------------------
/run.sh:
--------------------------------------------------------------------------------
1 | #!/bin/sh
2 |
3 | python3 main_mvtecad.py runall --model resnet18
4 | python3 main_mvtecad.py runall --model resnet34
5 | python3 main_mvtecad.py runall --model resnet50
6 | python3 main_mvtecad.py runall --model resnet101
7 | python3 main_mvtecad.py runall --model resnet152
8 | python3 main_mvtecad.py runall --model wide_resnet50_2
9 |
10 | python3 main_mvtecad.py runall --model resnext50_32x4d
11 | python3 main_mvtecad.py runall --model resnext101_32x8d
12 |
13 | python3 main_mvtecad.py runall --model densenet121
14 | python3 main_mvtecad.py runall --model densenet161
15 | python3 main_mvtecad.py runall --model densenet169
16 | python3 main_mvtecad.py runall --model densenet201
17 |
18 | python3 main_mvtecad.py runall --model vgg11_bn
19 | python3 main_mvtecad.py runall --model vgg13_bn
20 | python3 main_mvtecad.py runall --model vgg16_bn
21 | python3 main_mvtecad.py runall --model vgg19_bn
22 |
23 | python3 main_mvtecad.py runall --model deeplabv3_resnet50
24 | python3 main_mvtecad.py runall --test_only --repo facebookresearch/semi-supervised-ImageNet1K-models --model resnet50_swsl
25 | python3 main_mvtecad.py runall --test_only --repo facebookresearch/semi-supervised-ImageNet1K-models --model resnet50_ssl
26 |
--------------------------------------------------------------------------------