├── .gitignore
├── README.md
├── Skin Cancer - EDA.ipynb
├── Skin Cancer Prediction Baseline.ipynb
├── Skin Cancer Prediction DenseNet121.ipynb
├── Skin Cancer Prediction EfficientNetB4.ipynb
├── Skin Cancer Prediction GradCam Visualization.ipynb
├── Skin Cancer Prediction InceptionResNetV2.ipynb
├── Skin Cancer Prediction ResNet50.ipynb
├── Skin Cancer Prediction VGG16.ipynb
└── docs
    ├── model_architecture.png
    └── results.png
/.gitignore:
--------------------------------------------------------------------------------
1 | .ipynb_checkpoints
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 |
2 |
3 | # Skin Cancer Classification using Transfer Learning
4 |
5 |
6 |
7 | [Paper (DOI: 10.1007/978-3-031-22405-8_27)](https://doi.org/10.1007/978-3-031-22405-8_27)
11 | 
12 |
17 |
18 |
19 |
20 | ### 📌 Introduction
21 |
22 | According to Skin Cancer Foundation statistics, skin cancer is the most common cancer in the United States and worldwide. By the age of seventy, about twenty percent of Americans will have developed skin cancer due to exposure to radiation. Of all the types of skin cancer, melanoma is particularly deadly and is responsible for most skin cancer deaths, so early detection is the key to survival. An automatic skin lesion diagnosis system can assist dermatologists, since it is challenging to differentiate between the different classes of skin lesions. In this paper, we propose a transfer-learning-based deep learning system using deep convolutional neural networks that leverage residual connections to perform this task with high accuracy. The HAM10000 dataset was used to train and test the model and to compare its performance with other pre-trained models. This kind of automated classification system can be integrated into a computer-aided diagnosis (CAD) pipeline to assist in the early detection of skin cancer.
23 |
24 |
25 |
26 |
27 | ### 🗃️ Dataset
28 |
29 | This work uses HAM10000, a public dataset from the International Skin Imaging Collaboration (ISIC) archive. It consists of 10015 multi-source dermatoscopic images of common pigmented skin lesions.
30 |
31 | Get the data [here](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/DBW86T)
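
HAM10000 is heavily imbalanced: the baseline notebook in this repository counts 6705 NV images against only 115 DF images. The baseline notebook compensates by passing scikit-learn's `'balanced'` class weights to `model.fit`. As a minimal sketch, the same weights can be reproduced with the formula `n_samples / (n_classes * class_count)`. The counts below are copied from the baseline notebook's output; the helper name `balanced_class_weights` is ours and is not part of the repository.

```python
# Class counts as printed by `df_labels['label'].value_counts()` in the baseline notebook.
CLASS_COUNTS = {
    "AKIEC": 327,
    "BCC": 514,
    "BKL": 1099,
    "DF": 115,
    "MEL": 1113,
    "NV": 6705,
    "VASC": 142,
}


def balanced_class_weights(counts):
    """Return n_samples / (n_classes * count) for every class label."""
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


if __name__ == "__main__":
    # Rare classes (DF, VASC) get large weights, the dominant NV class a small one.
    for label, weight in balanced_class_weights(CLASS_COUNTS).items():
        print(f"{label:5s} {weight:.4f}")
```

Keras expects the weights keyed by class index rather than by name, so in practice the labels would be mapped through the generator's `class_indices` before being passed to `model.fit`.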
32 |
33 |
34 |
35 | ## Modified Inception-ResNetV2 Model Architecture
36 |
37 |
38 | We modified the Inception-ResNetV2 architecture, which belongs to the Inception family but uses residual connections in place of the conventional filter concatenation stage. It combines the Inception and residual network architectures to obtain stronger performance while keeping computational cost relatively low. It consists of a stem block followed by three sets of residual Inception modules, with [5, 10, 5] blocks of Inception-ResNetA, Inception-ResNetB, and Inception-ResNetC modules respectively, and a pooling layer after each set, all connected sequentially. With a total of 164 layers, this deep convolutional network is capable of learning rich feature representations for broad categories of image data.
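
The sketch below shows one way a pre-trained Inception-ResNetV2 backbone could be combined with a new classification head in Keras. It is an illustration under stated assumptions, not the paper's exact model: the head (Flatten, Dense(128), Dropout(0.5), 7-way softmax), the 224x224 input size, and fine-tuning of all layers are borrowed from the DenseNet121 notebook in this repository, and `build_model` is a hypothetical helper name.

```python
from tensorflow.keras.applications import InceptionResNetV2
from tensorflow.keras.layers import Dense, Dropout, Flatten
from tensorflow.keras.models import Model


def build_model(num_classes=7, input_shape=(224, 224, 3), weights="imagenet"):
    """Inception-ResNetV2 backbone plus a small classification head (assumed layout)."""
    # Pre-trained convolutional backbone without its ImageNet classifier.
    base_model = InceptionResNetV2(weights=weights, include_top=False, input_shape=input_shape)
    base_model.trainable = True  # fine-tune every layer, as the DenseNet121 notebook does

    # Assumed head, mirroring the DenseNet121 notebook.
    x = Flatten()(base_model.output)
    x = Dense(128, activation="relu")(x)
    x = Dropout(0.5)(x)
    outputs = Dense(num_classes, activation="softmax")(x)
    return Model(inputs=base_model.input, outputs=outputs)


if __name__ == "__main__":
    # weights=None skips the ImageNet download; keep the default for transfer learning.
    model = build_model(weights=None)
    print(model.output_shape)
```

The model would then be compiled with a categorical cross-entropy loss, as in the other notebooks.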
39 |
40 |
41 |
42 |
43 | ## Conclusion
44 |
45 |
46 | In this paper, we took up the task of multi-class classification of skin lesions from dermatoscopic images in the HAM10000 dataset using deep convolutional neural networks, alleviating the need for complex feature engineering. We leveraged transfer learning by using pre-trained models and adapted the Inception-ResNetV2 architecture to this problem. The modified model achieved an accuracy of 90.08% with an F1-score of 89%, outperforming the other pre-trained models we compared.
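
Accuracy and F1 figures like the ones quoted above are derived from a confusion matrix. As a small worked example, the sketch below recomputes accuracy and macro-averaged F1 from the confusion matrix printed in the baseline notebook. The numbers are those of the baseline CNN, not of the final Inception-ResNetV2 model, and the helper name `accuracy_and_macro_f1` is ours.

```python
# Confusion matrix from the baseline notebook (rows = true class, columns = predicted class).
CONFUSION = [
    [12, 16, 7, 1, 5, 8, 0],
    [2, 46, 6, 4, 9, 10, 0],
    [6, 23, 67, 2, 30, 34, 2],
    [3, 4, 2, 6, 1, 1, 0],
    [3, 4, 29, 0, 78, 51, 1],
    [8, 27, 44, 6, 78, 836, 6],
    [0, 3, 1, 0, 0, 2, 15],
]


def accuracy_and_macro_f1(cm):
    """Return (accuracy, macro-averaged F1) for a square confusion matrix."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(n)) / total

    f1_scores = []
    for i in range(n):
        true_positives = cm[i][i]
        actual = sum(cm[i])                    # images that really belong to class i
        predicted = sum(row[i] for row in cm)  # images the model assigned to class i
        # F1 = 2*TP / (actual + predicted); taken as 0 if the class never occurs.
        denominator = actual + predicted
        f1_scores.append(2 * true_positives / denominator if denominator else 0.0)

    return accuracy, sum(f1_scores) / n


if __name__ == "__main__":
    acc, macro_f1 = accuracy_and_macro_f1(CONFUSION)
    print(f"accuracy={acc:.2f} macro_f1={macro_f1:.2f}")
```

Macro averaging weights every class equally, so on an imbalanced dataset such as HAM10000 it is a stricter summary than accuracy alone.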
47 |
48 |
49 |
50 |
51 | ### 📝 Citation
52 | If you found this code helpful, please consider citing:
53 | ```
54 | Kollipara, V.N.H., Kollipara, V.N.D.P. (2022). Residual Learning Based Approach for Multi-class Classification of Skin Lesion Using Deep Convolutional Neural Network. In: Guru, D.S., Y. H., S.K., K., B., Agrawal, R.K., Ichino, M. (eds) Cognition and Recognition. ICCR 2021. Communications in Computer and Information Science, vol 1697. Springer, Cham. https://doi.org/10.1007/978-3-031-22405-8_27
55 | ```
56 |
57 | ### ⚖️ License
58 | Copyright © 2023 Hemanth Kollipara, Pavithra Kollipara
59 |
--------------------------------------------------------------------------------
/Skin Cancer Prediction Baseline.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "id": "6fa5a720",
6 | "metadata": {},
7 | "source": [
8 | "## Skin Cancer Classification baseline model"
9 | ]
10 | },
11 | {
12 | "cell_type": "code",
13 | "execution_count": 1,
14 | "id": "fdf93ec8",
15 | "metadata": {},
16 | "outputs": [],
17 | "source": [
18 | "import os\n",
19 | "import shutil\n",
20 | "import numpy as np\n",
21 | "import pandas as pd\n",
22 | "import matplotlib.pyplot as plt\n",
23 | "import seaborn as sns\n",
24 | "import cv2\n",
25 | "import sklearn"
26 | ]
27 | },
28 | {
29 | "cell_type": "markdown",
30 | "id": "79ab8ac8",
31 | "metadata": {},
32 | "source": [
33 | "### Loading Dataset"
34 | ]
35 | },
36 | {
37 | "cell_type": "code",
38 | "execution_count": 2,
39 | "id": "34c3ceb8",
40 | "metadata": {},
41 | "outputs": [
42 | {
43 | "name": "stdout",
44 | "output_type": "stream",
45 | "text": [
46 | "['BKL', 'NV', 'AKIEC', 'MEL', 'DF', 'BCC', 'VASC']\n"
47 | ]
48 | }
49 | ],
50 | "source": [
51 | "os.mkdir(\"HAM_Dataset\")\n",
52 | "base = \"HAM_Dataset\"\n",
53 | "\n",
54 | "os.mkdir(os.path.join(base, \"MEL\"))\n",
55 | "os.mkdir(os.path.join(base, \"NV\"))\n",
56 | "os.mkdir(os.path.join(base, \"BCC\"))\n",
57 | "os.mkdir(os.path.join(base, \"AKIEC\"))\n",
58 | "os.mkdir(os.path.join(base, \"BKL\"))\n",
59 | "os.mkdir(os.path.join(base, \"DF\"))\n",
60 | "os.mkdir(os.path.join(base, \"VASC\"))\n",
61 | "\n",
62 | "print(os.listdir(base))"
63 | ]
64 | },
65 | {
66 | "cell_type": "code",
67 | "execution_count": null,
68 | "id": "e754affa",
69 | "metadata": {},
70 | "outputs": [],
71 | "source": [
72 | "for image in os.listdir('ISIC2018_Task3_Training_Input'):\n",
73 | "    if \"jpg\" not in image:\n",
74 | "        os.remove('ISIC2018_Task3_Training_Input/'+image)\n",
75 | "\n",
76 | "for image in os.listdir('ISIC2018_Task3_Training_Input'):\n",
77 | "    if \"jpg\" not in image:\n",
78 | "        print(image)"
79 | ]
80 | },
81 | {
82 | "cell_type": "code",
83 | "execution_count": null,
84 | "id": "b71a7814",
85 | "metadata": {},
86 | "outputs": [],
87 | "source": [
88 | "mapping = {0:\"MEL\", 1:\"NV\", 2:\"BCC\", 3:\"AKIEC\", 4:\"BKL\", 5:\"DF\", 6:\"VASC\"}\n",
89 | "\n",
90 | "df_labels = pd.read_csv(\"../input/isictruth/ISIC2018GroundTruth.csv\")\n",
91 | "for i in range(len(df_labels)):\n",
92 | "    labels = df_labels.iloc[i,1:]\n",
93 | "    df_labels.loc[i,\"label\"] = mapping[list(labels).index(1)]\n",
94 | "\n",
95 | "#df_labels[\"label\"]=df_labels[\"label\"].astype(int)\n",
96 | "df_labels.set_index('image', inplace=True)"
97 | ]
98 | },
99 | {
100 | "cell_type": "code",
101 | "execution_count": 5,
102 | "id": "083c0229",
103 | "metadata": {},
104 | "outputs": [
105 | {
106 | "name": "stdout",
107 | "output_type": "stream",
108 | "text": [
109 | "NV 6705\n",
110 | "MEL 1113\n",
111 | "BKL 1099\n",
112 | "BCC 514\n",
113 | "AKIEC 327\n",
114 | "VASC 142\n",
115 | "DF 115\n",
116 | "Name: label, dtype: int64\n"
117 | ]
118 | }
119 | ],
120 | "source": [
121 | "df_labels['label'].value_counts()"
122 | ]
123 | },
124 | {
125 | "cell_type": "markdown",
126 | "id": "fe973dfb",
127 | "metadata": {},
128 | "source": [
129 | "### Computing class weights"
130 | ]
131 | },
132 | {
133 | "cell_type": "code",
134 | "execution_count": 10,
135 | "id": "da687f41",
136 | "metadata": {},
137 | "outputs": [
138 | {
139 | "name": "stdout",
140 | "output_type": "stream",
141 | "text": [
142 | "{0: 4.375273044997815,\n",
143 | " 1: 2.78349082823791,\n",
144 | " 2: 1.301832835044846,\n",
145 | " 3: 12.440993788819876,\n",
146 | " 4: 1.2854575792581184,\n",
147 | " 5: 0.21338020666879728,\n",
148 | " 6: 10.075452716297788}\n"
149 | ]
150 | }
151 | ],
152 | "source": [
153 | "from sklearn.utils import class_weight\n",
154 | "\n",
155 | "class_weights = class_weight.compute_class_weight('balanced',\n",
156 | " classes=['AKIEC', 'BCC', 'BKL', 'DF', 'MEL', 'NV', 'VASC'],\n",
157 | " y=df_labels[\"label\"])\n",
158 | "class_wt_dict=dict(enumerate(class_weights))\n",
159 | "class_wt_dict"
160 | ]
161 | },
162 | {
163 | "cell_type": "code",
164 | "execution_count": 18,
165 | "id": "e6dc59ec",
166 | "metadata": {},
167 | "outputs": [
168 | {
169 | "name": "stdout",
170 | "output_type": "stream",
171 | "text": [
172 | "\n",
173 | "100%|██████████| 10015/10015 [00:06<00:00, 1625.40it/s]\n",
174 | "\n"
175 | ]
176 | }
177 | ],
178 | "source": [
179 | "from tqdm import tqdm\n",
180 | "\n",
181 | "images = os.listdir('ISIC2018_Task3_Training_Input')\n",
182 | "\n",
183 | "for image in tqdm(images):\n",
184 | "    fname = image[:-4]\n",
185 | "    label = df_labels.loc[fname, \"label\"]\n",
186 | "    src = os.path.join('ISIC2018_Task3_Training_Input', image)\n",
187 | "    dst = os.path.join('HAM_Dataset', label, image)\n",
188 | "    shutil.copyfile(src, dst)"
189 | ]
190 | },
191 | {
192 | "cell_type": "markdown",
193 | "id": "7db46759",
194 | "metadata": {},
195 | "source": [
196 | "### Data Augmentation"
197 | ]
198 | },
199 | {
200 | "cell_type": "code",
201 | "execution_count": 11,
202 | "id": "fc89f32c",
203 | "metadata": {},
204 | "outputs": [
205 | {
206 | "name": "stdout",
207 | "output_type": "stream",
208 | "text": [
209 | "Found 8516 images belonging to 7 classes.\n",
210 | " Found 1499 images belonging to 7 classes.\n",
211 | " {'AKIEC': 0, 'BCC': 1, 'BKL': 2, 'DF': 3, 'MEL': 4, 'NV': 5, 'VASC': 6}\n"
212 | ]
213 | }
214 | ],
215 | "source": [
216 | "from tensorflow.keras.preprocessing.image import ImageDataGenerator\n",
217 | "from tensorflow.keras.applications.resnet_v2 import preprocess_input as base_preprocess\n",
218 | "\n",
219 | "image_gen = ImageDataGenerator(rotation_range=30,\n",
220 | " width_shift_range=0.1,\n",
221 | " height_shift_range=0.1,\n",
222 | " shear_range=0.1,\n",
223 | " zoom_range=0.2,\n",
224 | " horizontal_flip=True,\n",
225 | " fill_mode='nearest',\n",
226 | " rescale=1/255,\n",
227 | " validation_split=0.15)\n",
228 | "\n",
229 | "data_dir = 'HAM_Dataset'\n",
230 | "batch_size = 128\n",
231 | "target_size = (224,224)\n",
232 | "train_image_gen = image_gen.flow_from_directory(data_dir, \n",
233 | " target_size=target_size,\n",
234 | " color_mode='rgb',\n",
235 | " batch_size=batch_size,\n",
236 | " class_mode='categorical',\n",
237 | " subset=\"training\")\n",
238 | "\n",
239 | "test_image_gen = image_gen.flow_from_directory(data_dir, \n",
240 | " target_size=target_size, \n",
241 | " color_mode='rgb',\n",
242 | " batch_size=batch_size,\n",
243 | " class_mode='categorical',\n",
244 | " shuffle=False,\n",
245 | " subset=\"validation\")\n",
246 | "\n",
247 | "print(test_image_gen.class_indices)"
248 | ]
249 | },
250 | {
251 | "cell_type": "markdown",
252 | "id": "b9d19125",
253 | "metadata": {},
254 | "source": [
255 | "### Baseline CNN Model"
256 | ]
257 | },
258 | {
259 | "cell_type": "code",
260 | "execution_count": 14,
261 | "id": "eb4948e5",
262 | "metadata": {},
263 | "outputs": [
264 | {
265 | "name": "stdout",
266 | "output_type": "stream",
267 | "text": [
268 | "\n",
269 | "Model: \"sequential_3\"\n",
270 | "_________________________________________________________________\n",
271 | "Layer (type) Output Shape Param # \n",
272 | "=================================================================\n",
273 | "conv2d_12 (Conv2D) (None, 222, 222, 64) 1792 \n",
274 | "_________________________________________________________________\n",
275 | "max_pooling2d_12 (MaxPooling (None, 111, 111, 64) 0 \n",
276 | "_________________________________________________________________\n",
277 | "conv2d_13 (Conv2D) (None, 109, 109, 64) 36928 \n",
278 | "_________________________________________________________________\n",
279 | "max_pooling2d_13 (MaxPooling (None, 54, 54, 64) 0 \n",
280 | "_________________________________________________________________\n",
281 | "conv2d_14 (Conv2D) (None, 52, 52, 128) 73856 \n",
282 | "_________________________________________________________________\n",
283 | "max_pooling2d_14 (MaxPooling (None, 26, 26, 128) 0 \n",
284 | "_________________________________________________________________\n",
285 | "conv2d_15 (Conv2D) (None, 24, 24, 256) 295168 \n",
286 | "_________________________________________________________________\n",
287 | "max_pooling2d_15 (MaxPooling (None, 12, 12, 256) 0 \n",
288 | "_________________________________________________________________\n",
289 | "flatten_3 (Flatten) (None, 36864) 0 \n",
290 | "_________________________________________________________________\n",
291 | "dense_6 (Dense) (None, 128) 4718720 \n",
292 | "_________________________________________________________________\n",
293 | "dropout_3 (Dropout) (None, 128) 0 \n",
294 | "_________________________________________________________________\n",
295 | "dense_7 (Dense) (None, 7) 903 \n",
296 | "=================================================================\n",
297 | "Total params: 5,127,367\n",
298 | "Trainable params: 5,127,367\n",
299 | "Non-trainable params: 0\n",
300 | "_________________________________________________________________\n",
302 | "\n"
303 | ]
304 | }
305 | ],
306 | "source": [
307 | "from tensorflow.keras.models import Sequential\n",
308 | "from tensorflow.keras.layers import Dense, Conv2D, MaxPool2D, Flatten, Dropout\n",
309 | "\n",
310 | "model = Sequential()\n",
311 | "\n",
312 | "model.add( Conv2D(filters=64, kernel_size=(3,3), input_shape=(224,224,3), activation=\"relu\") )\n",
313 | "model.add( MaxPool2D(pool_size=(2,2)) )\n",
314 | "\n",
315 | "model.add( Conv2D(filters=64, kernel_size=(3,3), activation=\"relu\") )\n",
316 | "model.add( MaxPool2D(pool_size=(2,2)) )\n",
317 | "\n",
318 | "model.add( Conv2D(filters=128, kernel_size=(3,3), activation=\"relu\") )\n",
319 | "model.add( MaxPool2D(pool_size=(2,2)) )\n",
320 | "\n",
321 | "model.add( Conv2D(filters=256, kernel_size=(3,3), activation=\"relu\") )\n",
322 | "model.add( MaxPool2D(pool_size=(2,2)) )\n",
323 | "\n",
324 | "model.add(Flatten())\n",
325 | "model.add(Dense(128, activation=\"relu\"))\n",
326 | "model.add(Dropout(0.5))\n",
327 | "\n",
328 | "model.add(Dense(7, activation=\"softmax\"))\n",
329 | "\n",
330 | "model.compile(loss=\"categorical_crossentropy\", optimizer=\"adam\", metrics=[\"accuracy\"])\n",
331 | "model.summary()"
332 | ]
333 | },
334 | {
335 | "cell_type": "markdown",
336 | "id": "c62414f2",
337 | "metadata": {},
338 | "source": [
339 | "### Training and Validation"
340 | ]
341 | },
342 | {
343 | "cell_type": "code",
344 | "execution_count": null,
345 | "id": "6b387e8b",
346 | "metadata": {},
347 | "outputs": [],
348 | "source": [
349 | "from tensorflow.keras.callbacks import ReduceLROnPlateau\n",
350 | "from tensorflow.keras.callbacks import EarlyStopping\n",
351 | "from tensorflow.keras.callbacks import ModelCheckpoint\n",
352 | "\n",
353 | "lr_reduce = ReduceLROnPlateau(monitor='val_accuracy', factor=0.5, patience=1,mode='max', min_lr=0.00001,verbose=1)\n",
354 | "early_stop = EarlyStopping(monitor=\"val_loss\", patience=2, verbose=1)\n",
355 | "model_chkpt = ModelCheckpoint('best_model_dn121.hdf5',save_best_only=True, monitor='val_accuracy',verbose=1)\n",
356 | "\n",
357 | "callback_list = [model_chkpt,lr_reduce]"
358 | ]
359 | },
360 | {
361 | "cell_type": "code",
362 | "execution_count": 19,
363 | "id": "39ff5ca2",
364 | "metadata": {},
365 | "outputs": [
366 | {
367 | "name": "stdout",
368 | "output_type": "stream",
369 | "text": [
370 | "\n",
371 | "Epoch 42/45\n",
372 | "67/67 [==============================] - 126s 2s/step - loss: 0.3013 - accuracy: 0.7550 - val_loss: 1.3435 - val_accuracy: 0.6758\n",
373 | "\n",
374 | "Epoch 00001: val_accuracy improved from 0.61107 to 0.67578, saving model to best_model_dn121.hdf5\n",
375 | "Epoch 43/45\n",
376 | "67/67 [==============================] - 124s 2s/step - loss: 0.2686 - accuracy: 0.7810 - val_loss: 1.5529 - val_accuracy: 0.6851\n",
377 | "\n",
378 | "Epoch 00002: val_accuracy improved from 0.67578 to 0.68512, saving model to best_model_dn121.hdf5\n",
379 | "Epoch 44/45\n",
380 | "67/67 [==============================] - 124s 2s/step - loss: 0.2371 - accuracy: 0.8033 - val_loss: 1.4009 - val_accuracy: 0.7111\n",
381 | "\n",
382 | "Epoch 00003: val_accuracy improved from 0.68512 to 0.71114, saving model to best_model_dn121.hdf5\n",
383 | "Epoch 45/45\n",
384 | "60/67 [=========================>....] - ETA: 11s - loss: 0.2297 - accuracy: 0.8042\n",
385 | "\n",
386 | "\n"
387 | ]
388 | }
389 | ],
390 | "source": [
391 | "history = model.fit(train_image_gen,\n",
392 | " epochs=15, \n",
393 | " validation_data = test_image_gen,\n",
394 | " class_weight=class_wt_dict,\n",
395 | " callbacks=callback_list)"
396 | ]
397 | },
398 | {
399 | "cell_type": "markdown",
400 | "id": "78865b82",
401 | "metadata": {},
402 | "source": [
403 | "### Model Evaluation"
404 | ]
405 | },
406 | {
407 | "cell_type": "code",
408 | "execution_count": 16,
409 | "id": "336bcbf0",
410 | "metadata": {},
411 | "outputs": [
412 | {
413 | "name": "stdout",
414 | "output_type": "stream",
415 | "text": [
416 | "\n",
417 | "array([[ 12, 16, 7, 1, 5, 8, 0],\n",
418 | " [ 2, 46, 6, 4, 9, 10, 0],\n",
419 | " [ 6, 23, 67, 2, 30, 34, 2],\n",
420 | " [ 3, 4, 2, 6, 1, 1, 0],\n",
421 | " [ 3, 4, 29, 0, 78, 51, 1],\n",
422 | " [ 8, 27, 44, 6, 78, 836, 6],\n",
423 | " [ 0, 3, 1, 0, 0, 2, 15]])\n",
424 | "\n"
425 | ]
426 | }
427 | ],
428 | "source": [
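429 | "from sklearn.metrics import classification_report, confusion_matrix\n",
430 | "test_labels = test_image_gen.classes  # assumed definition: true class indices (order is fixed since shuffle=False)\n",
431 | "predictions = np.argmax(model.predict(test_image_gen), axis=1)  # assumed definition: predicted class indices\n",
432 | "cm = confusion_matrix(test_labels, predictions)\n",
433 | "cm"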
434 | ]
435 | },
436 | {
437 | "cell_type": "code",
438 | "execution_count": 17,
439 | "id": "5c15da6a",
440 | "metadata": {},
441 | "outputs": [
442 | {
443 | "name": "stdout",
444 | "output_type": "stream",
445 | "text": [
446 | "\n",
447 | "precision recall f1-score support\n",
448 | "\n",
449 | " 0 0.35 0.24 0.29 49\n",
450 | " 1 0.37 0.60 0.46 77\n",
451 | " 2 0.43 0.41 0.42 164\n",
452 | " 3 0.32 0.35 0.33 17\n",
453 | " 4 0.39 0.47 0.43 166\n",
454 | " 5 0.89 0.83 0.86 1005\n",
455 | " 6 0.62 0.71 0.67 21\n",
456 | "\n",
457 | " accuracy 0.71 1499\n",
458 | " macro avg 0.48 0.52 0.49 1499\n",
459 | "weighted avg 0.73 0.71 0.71 1499\n",
460 | "\n"
461 | ]
462 | }
463 | ],
464 | "source": [
465 | "print(classification_report(test_labels, predictions))"
466 | ]
467 | },
468 | {
469 | "cell_type": "code",
470 | "execution_count": null,
471 | "id": "ceef50c8",
472 | "metadata": {},
473 | "outputs": [],
474 | "source": []
475 | }
476 | ],
477 | "metadata": {
478 | "kernelspec": {
479 | "display_name": "Python 3",
480 | "language": "python",
481 | "name": "python3"
482 | },
483 | "language_info": {
484 | "codemirror_mode": {
485 | "name": "ipython",
486 | "version": 3
487 | },
488 | "file_extension": ".py",
489 | "mimetype": "text/x-python",
490 | "name": "python",
491 | "nbconvert_exporter": "python",
492 | "pygments_lexer": "ipython3",
493 | "version": "3.8.8"
494 | }
495 | },
496 | "nbformat": 4,
497 | "nbformat_minor": 5
498 | }
499 |
--------------------------------------------------------------------------------
/Skin Cancer Prediction DenseNet121.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "## Skin Cancer Classification Transfer Learning DenseNet121"
8 | ]
9 | },
10 | {
11 | "cell_type": "code",
12 | "execution_count": 2,
13 | "metadata": {
14 | "_cell_guid": "b1076dfc-b9ad-4769-8c92-a6c4dae69d19",
15 | "_uuid": "8f2839f25d086af736a60e9eeb907d3b93b6e0e5",
16 | "execution": {
17 | "iopub.execute_input": "2021-09-20T09:53:25.793096Z",
18 | "iopub.status.busy": "2021-09-20T09:53:25.792824Z",
19 | "iopub.status.idle": "2021-09-20T09:53:25.801106Z",
20 | "shell.execute_reply": "2021-09-20T09:53:25.800410Z",
21 | "shell.execute_reply.started": "2021-09-20T09:53:25.793067Z"
22 | }
23 | },
24 | "outputs": [],
25 | "source": [
26 | "import os\n",
27 | "import shutil\n",
28 | "from tqdm import tqdm\n",
29 | "\n",
30 | "import cv2\n",
31 | "import sklearn\n",
32 | "import numpy as np\n",
33 | "import pandas as pd\n",
34 | "import matplotlib.pyplot as plt\n",
35 | "import seaborn as sns\n",
36 | "plt.style.use(\"classic\")"
37 | ]
38 | },
39 | {
40 | "cell_type": "markdown",
41 | "metadata": {},
42 | "source": [
43 | "### Creating Dataset"
44 | ]
45 | },
46 | {
47 | "cell_type": "code",
48 | "execution_count": 2,
49 | "metadata": {
50 | "execution": {
51 | "iopub.execute_input": "2021-09-20T05:53:07.683746Z",
52 | "iopub.status.busy": "2021-09-20T05:53:07.682980Z",
53 | "iopub.status.idle": "2021-09-20T05:53:07.709720Z",
54 | "shell.execute_reply": "2021-09-20T05:53:07.708962Z",
55 | "shell.execute_reply.started": "2021-09-20T05:53:07.683709Z"
56 | }
57 | },
58 | "outputs": [
59 | {
60 | "name": "stdout",
61 | "output_type": "stream",
62 | "text": [
63 | "['MEL', 'VASC', 'DF', 'NV', 'BKL', 'AKIEC', 'BCC']\n"
64 | ]
65 | }
66 | ],
67 | "source": [
68 | "print(os.listdir('../input/ham-dataset/HAM_Dataset'))"
69 | ]
70 | },
71 | {
72 | "cell_type": "markdown",
73 | "metadata": {},
74 | "source": [
75 | "### Train Test Split"
76 | ]
77 | },
78 | {
79 | "cell_type": "code",
80 | "execution_count": 77,
81 | "metadata": {
82 | "execution": {
83 | "iopub.execute_input": "2021-09-20T07:40:53.011177Z",
84 | "iopub.status.busy": "2021-09-20T07:40:53.010316Z",
85 | "iopub.status.idle": "2021-09-20T07:40:53.019292Z",
86 | "shell.execute_reply": "2021-09-20T07:40:53.018318Z",
87 | "shell.execute_reply.started": "2021-09-20T07:40:53.011131Z"
88 | }
89 | },
90 | "outputs": [],
91 | "source": [
92 | "from tensorflow.keras.preprocessing.image import ImageDataGenerator\n",
93 | "from tensorflow.keras.applications.densenet import preprocess_input as base_preprocess\n",
94 | "\n",
95 | "\n",
96 | "image_gen = ImageDataGenerator(preprocessing_function=base_preprocess,\n",
97 | " rotation_range=20,\n",
98 | " width_shift_range=0.1,\n",
99 | " height_shift_range=0.1,\n",
100 | " shear_range=0.1,\n",
101 | " zoom_range=0.1,\n",
102 | " horizontal_flip=True,\n",
103 | " fill_mode='nearest',\n",
104 | " #rescale=1/255,\n",
105 | " validation_split=0.20)"
106 | ]
107 | },
108 | {
109 | "cell_type": "code",
110 | "execution_count": 78,
111 | "metadata": {
112 | "execution": {
113 | "iopub.execute_input": "2021-09-20T07:40:53.907860Z",
114 | "iopub.status.busy": "2021-09-20T07:40:53.907598Z",
115 | "iopub.status.idle": "2021-09-20T07:40:54.802207Z",
116 | "shell.execute_reply": "2021-09-20T07:40:54.801390Z",
117 | "shell.execute_reply.started": "2021-09-20T07:40:53.907829Z"
118 | }
119 | },
120 | "outputs": [
121 | {
122 | "name": "stdout",
123 | "output_type": "stream",
124 | "text": [
125 | "Found 12085 images belonging to 7 classes.\n",
126 | "Found 3019 images belonging to 7 classes.\n",
127 | "{'AKIEC': 0, 'BCC': 1, 'BKL': 2, 'DF': 3, 'MEL': 4, 'NV': 5, 'VASC': 6}\n"
128 | ]
129 | }
130 | ],
131 | "source": [
132 | "data_dir = '../input/ham-dataset/HAM_Dataset'\n",
133 | "batch_size = 64\n",
134 | "train_image_gen = image_gen.flow_from_directory(data_dir, \n",
135 | " target_size=(224,224), \n",
136 | " color_mode='rgb',\n",
137 | " batch_size=batch_size,\n",
138 | " class_mode='categorical',\n",
139 | " subset=\"training\")\n",
140 | "\n",
141 | "test_image_gen = image_gen.flow_from_directory(data_dir, \n",
142 | " target_size=(224,224), \n",
143 | " color_mode='rgb',\n",
144 | " batch_size=batch_size,\n",
145 | " class_mode='categorical',\n",
146 | " shuffle=False,\n",
147 | " subset=\"validation\")\n",
148 | "\n",
149 | "print(test_image_gen.class_indices)"
150 | ]
151 | },
152 | {
153 | "cell_type": "code",
154 | "execution_count": 80,
155 | "metadata": {
156 | "execution": {
157 | "iopub.execute_input": "2021-09-20T07:40:55.225838Z",
158 | "iopub.status.busy": "2021-09-20T07:40:55.225619Z",
159 | "iopub.status.idle": "2021-09-20T07:40:56.697774Z",
160 | "shell.execute_reply": "2021-09-20T07:40:56.697080Z",
161 | "shell.execute_reply.started": "2021-09-20T07:40:55.225813Z"
162 | }
163 | },
164 | "outputs": [
165 | {
166 | "data": {
167 | "text/plain": [
168 | ""
169 | ]
170 | },
171 | "execution_count": 80,
172 | "metadata": {},
173 | "output_type": "execute_result"
174 | },
175 | {
176 | "data": {
177 | "image/png": "\n",
178 | "text/plain": [
179 | ""
180 | ]
181 | },
182 | "metadata": {},
183 | "output_type": "display_data"
184 | }
185 | ],
186 | "source": [
187 | "plt.figure(figsize=(3,3))\n",
188 | "plt.imshow(train_image_gen[0][0][0])"
189 | ]
190 | },
191 | {
192 | "cell_type": "markdown",
193 | "metadata": {},
194 | "source": [
195 | "## Transfer Learning using DenseNet121 Model"
196 | ]
197 | },
198 | {
199 | "cell_type": "code",
200 | "execution_count": 1,
201 | "metadata": {},
202 | "outputs": [
203 | {
204 | "name": "stdout",
205 | "output_type": "stream",
206 | "text": [
207 | "\n",
208 | "Model: \"model_12\"\n",
209 | "__________________________________________________________________________________________________\n",
210 | "Layer (type) Output Shape Param # Connected to \n",
211 | "==================================================================================================\n",
212 | "input_13 (InputLayer) [(None, 224, 224, 3) 0 \n",
213 | "__________________________________________________________________________________________________\n",
214 | "zero_padding2d_8 (ZeroPadding2D (None, 230, 230, 3) 0 input_13[0][0] \n",
215 | "__________________________________________________________________________________________________\n",
216 | "conv1/conv (Conv2D) (None, 112, 112, 64) 9408 zero_padding2d_8[0][0] \n",
217 | "__________________________________________________________________________________________________\n",
218 | "conv1/bn (BatchNormalization) (None, 112, 112, 64) 256 conv1/conv[0][0] \n",
219 | "__________________________________________________________________________________________________\n",
220 | "conv1/relu (Activation) (None, 112, 112, 64) 0 conv1/bn[0][0] \n",
221 | "__________________________________________________________________________________________________\n",
222 | "zero_padding2d_9 (ZeroPadding2D (None, 114, 114, 64) 0 conv1/relu[0][0] \n",
223 | "__________________________________________________________________________________________________\n",
224 | "pool1 (MaxPooling2D) (None, 56, 56, 64) 0 zero_padding2d_9[0][0] \n",
225 | "__________________________________________________________________________________________________\n",
226 | "conv2_block1_0_bn (BatchNormali (None, 56, 56, 64) 256 pool1[0][0] \n",
227 | "__________________________________________________________________________________________________\n",
228 | "conv2_block1_0_relu (Activation (None, 56, 56, 64) 0 conv2_block1_0_bn[0][0] \n",
229 | "__________________________________________________________________________________________________\n",
230 | "conv2_block1_1_conv (Conv2D) (None, 56, 56, 128) 8192 conv2_block1_0_relu[0][0] \n",
231 | "__________________________________________________________________________________________________\n",
232 | "conv2_block1_1_bn (BatchNormali (None, 56, 56, 128) 512 conv2_block1_1_conv[0][0] \n",
233 | "__________________________________________________________________________________________________\n",
234 | "conv2_block1_1_relu (Activation (None, 56, 56, 128) 0 conv2_block1_1_bn[0][0] \n",
235 | "__________________________________________________________________________________________________\n",
236 | "conv2_block1_2_conv (Conv2D) (None, 56, 56, 32) 36864 conv2_block1_1_relu[0][0] \n",
237 | "__________________________________________________________________________________________________\n",
238 | "conv2_block1_concat (Concatenat (None, 56, 56, 96) 0 pool1[0][0] \n",
239 | " conv2_block1_2_conv[0][0] \n",
240 | ".\n",
241 | ".\n",
242 | ".\n",
243 | ".\n",
244 | "\n",
245 | "__________________________________________________________________________________________________\n",
246 | "bn (BatchNormalization) (None, 7, 7, 1024) 4096 conv5_block16_concat[0][0] \n",
247 | "__________________________________________________________________________________________________\n",
248 | "relu (Activation) (None, 7, 7, 1024) 0 bn[0][0] \n",
249 | "__________________________________________________________________________________________________\n",
250 | "flatten_12 (Flatten) (None, 50176) 0 relu[0][0] \n",
251 | "__________________________________________________________________________________________________\n",
252 | "dense_24 (Dense) (None, 128) 6422656 flatten_12[0][0] \n",
253 | "__________________________________________________________________________________________________\n",
254 | "dropout_12 (Dropout) (None, 128) 0 dense_24[0][0] \n",
255 | "__________________________________________________________________________________________________\n",
256 | "dense_25 (Dense) (None, 7) 903 dropout_12[0][0] \n",
257 | "==================================================================================================\n",
258 | "Total params: 13,461,063\n",
259 | "Trainable params: 13,377,415\n",
260 | "Non-trainable params: 83,648\n",
261 | "__________________________________________________________________________________________________\n",
262 | "\n"
263 | ]
264 | }
265 | ],
266 | "source": [
267 | "from tensorflow.keras.applications import *\n",
268 | "from tensorflow.keras.layers import Flatten, Dense, Input, Dropout\n",
269 | "from tensorflow.keras.models import Model\n",
270 | "from tensorflow.keras.optimizers import Adam\n",
271 | "\n",
272 | "base_model = DenseNet121(weights='imagenet', include_top=False, input_shape=(224, 224, 3))\n",
273 | "\n",
274 | "for layer in base_model.layers:\n",
275 | "    layer.trainable = True\n",
276 | "\n",
277 | "x = base_model.output\n",
278 | "x = Flatten()(x)\n",
279 | "# x = Dense(4096, activation='relu')(x)\n",
280 | "# x = Dropout(0.5)(x)\n",
281 | "x = Dense(128, activation='relu')(x)\n",
282 | "x = Dropout(0.5)(x)\n",
283 | "x = Dense(7, activation='softmax')(x)\n",
284 | "tl_model = Model(inputs=base_model.input, outputs=x)\n",
285 | "\n",
286 | "\n",
287 | "tl_model.summary()\n",
288 | "optimizer = Adam(0.0001)\n",
289 | "tl_model.compile(loss=\"categorical_crossentropy\", optimizer=optimizer, metrics=[\"accuracy\"])"
290 | ]
291 | },
292 | {
293 | "cell_type": "code",
294 | "execution_count": 84,
295 | "metadata": {
296 | "execution": {
297 | "iopub.execute_input": "2021-09-20T07:41:32.803559Z",
298 | "iopub.status.busy": "2021-09-20T07:41:32.802865Z",
299 | "iopub.status.idle": "2021-09-20T07:41:32.809583Z",
300 | "shell.execute_reply": "2021-09-20T07:41:32.808787Z",
301 | "shell.execute_reply.started": "2021-09-20T07:41:32.803522Z"
302 | }
303 | },
304 | "outputs": [],
305 | "source": [
306 | "from tensorflow.keras.callbacks import ReduceLROnPlateau\n",
307 | "from tensorflow.keras.callbacks import EarlyStopping\n",
308 | "from tensorflow.keras.callbacks import ModelCheckpoint\n",
309 | "\n",
310 | "lr_reduce = ReduceLROnPlateau(monitor='val_accuracy', factor=0.5, patience=2,mode='max', min_lr=0.00001,verbose=1)\n",
311 | "early_stop = EarlyStopping(monitor=\"val_loss\", patience=2, verbose=1)\n",
312 | "model_chkpt = ModelCheckpoint('best_model_aug.hdf5',save_best_only=True, monitor='val_accuracy',verbose=1)\n",
313 | "\n",
314 | "callback_list = [model_chkpt,lr_reduce]"
315 | ]
316 | },
317 | {
318 | "cell_type": "markdown",
319 | "metadata": {},
320 | "source": [
321 | "### Model Training"
322 | ]
323 | },
324 | {
325 | "cell_type": "code",
326 | "execution_count": 85,
327 | "metadata": {
328 | "execution": {
329 | "iopub.execute_input": "2021-09-20T07:41:33.998735Z",
330 | "iopub.status.busy": "2021-09-20T07:41:33.998425Z",
331 | "iopub.status.idle": "2021-09-20T09:37:55.639900Z",
332 | "shell.execute_reply": "2021-09-20T09:37:55.639133Z",
333 | "shell.execute_reply.started": "2021-09-20T07:41:33.998702Z"
334 | }
335 | },
336 | "outputs": [
337 | {
338 | "name": "stdout",
339 | "output_type": "stream",
340 | "text": [
341 | "Epoch 1/20\n",
342 | "189/189 [==============================] - 352s 2s/step - loss: 1.7499 - accuracy: 0.4074 - val_loss: 0.9937 - val_accuracy: 0.6615\n",
343 | "\n",
344 | "Epoch 00001: val_accuracy improved from -inf to 0.66148, saving model to best_model_aug.hdf5\n",
345 | "Epoch 2/20\n",
346 | "189/189 [==============================] - 340s 2s/step - loss: 0.7136 - accuracy: 0.7340 - val_loss: 0.8230 - val_accuracy: 0.7284\n",
347 | "\n",
348 | "Epoch 00002: val_accuracy improved from 0.66148 to 0.72839, saving model to best_model_aug.hdf5\n",
349 | "Epoch 3/20\n",
350 | "189/189 [==============================] - 339s 2s/step - loss: 0.5030 - accuracy: 0.8223 - val_loss: 0.8826 - val_accuracy: 0.7479\n",
351 | "\n",
352 | "Epoch 00003: val_accuracy improved from 0.72839 to 0.74793, saving model to best_model_aug.hdf5\n",
353 | "Epoch 4/20\n",
354 | "189/189 [==============================] - 339s 2s/step - loss: 0.3668 - accuracy: 0.8721 - val_loss: 1.0351 - val_accuracy: 0.7440\n",
355 | "\n",
356 | "Epoch 00004: val_accuracy did not improve from 0.74793\n",
357 | "Epoch 5/20\n",
358 | "189/189 [==============================] - 340s 2s/step - loss: 0.3209 - accuracy: 0.8906 - val_loss: 0.7385 - val_accuracy: 0.8019\n",
359 | "\n",
360 | "Epoch 00005: val_accuracy improved from 0.74793 to 0.80192, saving model to best_model_aug.hdf5\n",
361 | "Epoch 6/20\n",
362 | "189/189 [==============================] - 340s 2s/step - loss: 0.2600 - accuracy: 0.9136 - val_loss: 0.8868 - val_accuracy: 0.7943\n",
363 | "\n",
364 | "Epoch 00006: val_accuracy did not improve from 0.80192\n",
365 | "Epoch 7/20\n",
366 | "189/189 [==============================] - 338s 2s/step - loss: 0.2152 - accuracy: 0.9310 - val_loss: 0.8217 - val_accuracy: 0.7764\n",
367 | "\n",
368 | "Epoch 00007: val_accuracy did not improve from 0.80192\n",
369 | "\n",
370 | "Epoch 00007: ReduceLROnPlateau reducing learning rate to 4.999999873689376e-05.\n",
371 | "Epoch 8/20\n",
372 | "189/189 [==============================] - 338s 2s/step - loss: 0.1439 - accuracy: 0.9501 - val_loss: 0.9309 - val_accuracy: 0.8066\n",
373 | "\n",
374 | "Epoch 00008: val_accuracy improved from 0.80192 to 0.80656, saving model to best_model_aug.hdf5\n",
375 | "Epoch 9/20\n",
376 | "189/189 [==============================] - 338s 2s/step - loss: 0.1155 - accuracy: 0.9583 - val_loss: 1.0752 - val_accuracy: 0.7870\n",
377 | "\n",
378 | "Epoch 00009: val_accuracy did not improve from 0.80656\n",
379 | "Epoch 10/20\n",
380 | "189/189 [==============================] - 343s 2s/step - loss: 0.0943 - accuracy: 0.9686 - val_loss: 1.0865 - val_accuracy: 0.7900\n",
381 | "\n",
382 | "Epoch 00010: val_accuracy did not improve from 0.80656\n",
383 | "\n",
384 | "Epoch 00010: ReduceLROnPlateau reducing learning rate to 2.499999936844688e-05.\n",
385 | "Epoch 11/20\n",
386 | "189/189 [==============================] - 349s 2s/step - loss: 0.0774 - accuracy: 0.9745 - val_loss: 1.0638 - val_accuracy: 0.8089\n",
387 | "\n",
388 | "Epoch 00011: val_accuracy improved from 0.80656 to 0.80888, saving model to best_model_aug.hdf5\n",
389 | "Epoch 12/20\n",
390 | "189/189 [==============================] - 371s 2s/step - loss: 0.0580 - accuracy: 0.9834 - val_loss: 1.1004 - val_accuracy: 0.8115\n",
391 | "\n",
392 | "Epoch 00012: val_accuracy improved from 0.80888 to 0.81153, saving model to best_model_aug.hdf5\n",
393 | "Epoch 13/20\n",
394 | "189/189 [==============================] - 356s 2s/step - loss: 0.0491 - accuracy: 0.9837 - val_loss: 1.3549 - val_accuracy: 0.7917\n",
395 | "\n",
396 | "Epoch 00013: val_accuracy did not improve from 0.81153\n",
397 | "Epoch 14/20\n",
398 | "189/189 [==============================] - 355s 2s/step - loss: 0.0460 - accuracy: 0.9837 - val_loss: 1.2851 - val_accuracy: 0.7976\n",
399 | "\n",
400 | "Epoch 00014: val_accuracy did not improve from 0.81153\n",
401 | "\n",
402 | "Epoch 00014: ReduceLROnPlateau reducing learning rate to 1.249999968422344e-05.\n",
403 | "Epoch 15/20\n",
404 | "189/189 [==============================] - 357s 2s/step - loss: 0.0422 - accuracy: 0.9839 - val_loss: 1.2144 - val_accuracy: 0.8119\n",
405 | "\n",
406 | "Epoch 00015: val_accuracy improved from 0.81153 to 0.81186, saving model to best_model_aug.hdf5\n",
407 | "Epoch 16/20\n",
408 | "189/189 [==============================] - 353s 2s/step - loss: 0.0345 - accuracy: 0.9874 - val_loss: 1.3847 - val_accuracy: 0.7950\n",
409 | "\n",
410 | "Epoch 00016: val_accuracy did not improve from 0.81186\n",
411 | "Epoch 17/20\n",
412 | "189/189 [==============================] - 353s 2s/step - loss: 0.0345 - accuracy: 0.9870 - val_loss: 1.2980 - val_accuracy: 0.7996\n",
413 | "\n",
414 | "Epoch 00017: val_accuracy did not improve from 0.81186\n",
415 | "\n",
416 | "Epoch 00017: ReduceLROnPlateau reducing learning rate to 1e-05.\n",
417 | "Epoch 18/20\n",
418 | "189/189 [==============================] - 361s 2s/step - loss: 0.0302 - accuracy: 0.9890 - val_loss: 1.3043 - val_accuracy: 0.8062\n",
419 | "\n",
420 | "Epoch 00018: val_accuracy did not improve from 0.81186\n",
421 | "Epoch 19/20\n",
422 | "189/189 [==============================] - 351s 2s/step - loss: 0.0285 - accuracy: 0.9912 - val_loss: 1.3658 - val_accuracy: 0.8062\n",
423 | "\n",
424 | "Epoch 00019: val_accuracy did not improve from 0.81186\n",
425 | "Epoch 20/20\n",
426 | "189/189 [==============================] - 354s 2s/step - loss: 0.0262 - accuracy: 0.9906 - val_loss: 1.3751 - val_accuracy: 0.8162\n",
427 | "\n",
428 | "Epoch 00020: val_accuracy improved from 0.81186 to 0.81616, saving model to best_model_aug.hdf5\n"
429 | ]
430 | }
431 | ],
432 | "source": [
433 | "history = tl_model.fit(train_image_gen,\n",
434 | "                       epochs=20,\n",
435 | "                       validation_data=test_image_gen,  # the test generator doubles as the validation set\n",
436 | "                       callbacks=callback_list)"
437 | ]
438 | },
439 | {
440 | "cell_type": "markdown",
441 | "metadata": {},
442 | "source": [
443 | "### Model Evaluation"
444 | ]
445 | },
446 | {
447 | "cell_type": "code",
448 | "execution_count": 87,
449 | "metadata": {
450 | "execution": {
451 | "iopub.execute_input": "2021-09-20T09:40:42.140577Z",
452 | "iopub.status.busy": "2021-09-20T09:40:42.139902Z",
453 | "iopub.status.idle": "2021-09-20T09:40:42.381795Z",
454 | "shell.execute_reply": "2021-09-20T09:40:42.381131Z",
455 | "shell.execute_reply.started": "2021-09-20T09:40:42.140540Z"
456 | }
457 | },
458 | "outputs": [
459 | {
460 | "data": {
461 | "text/plain": [
462 | ""
463 | ]
464 | },
465 | "execution_count": 87,
466 | "metadata": {},
467 | "output_type": "execute_result"
468 | },
469 | {
470 | "data": {
471 | "image/png": "\n",
472 | "text/plain": [
473 | ""
474 | ]
475 | },
476 | "metadata": {},
477 | "output_type": "display_data"
478 | }
479 | ],
480 | "source": [
481 | "metrics = pd.DataFrame(history.history)  # per-epoch loss and accuracy returned by fit()\n",
482 | "metrics[[\"loss\",\"val_loss\"]].plot()"
483 | ]
484 | },
485 | {
486 | "cell_type": "code",
487 | "execution_count": 88,
488 | "metadata": {
489 | "execution": {
490 | "iopub.execute_input": "2021-09-20T09:40:53.282394Z",
491 | "iopub.status.busy": "2021-09-20T09:40:53.282130Z",
492 | "iopub.status.idle": "2021-09-20T09:40:53.514401Z",
493 | "shell.execute_reply": "2021-09-20T09:40:53.513688Z",
494 | "shell.execute_reply.started": "2021-09-20T09:40:53.282368Z"
495 | }
496 | },
497 | "outputs": [
498 | {
499 | "data": {
500 | "text/plain": [
501 | ""
502 | ]
503 | },
504 | "execution_count": 88,
505 | "metadata": {},
506 | "output_type": "execute_result"
507 | },
508 | {
509 | "data": {
510 | "image/png": "\n",
511 | "text/plain": [
512 | ""
513 | ]
514 | },
515 | "metadata": {},
516 | "output_type": "display_data"
517 | }
518 | ],
519 | "source": [
520 | "metrics[[\"accuracy\",\"val_accuracy\"]].plot()"
521 | ]
522 | },
523 | {
524 | "cell_type": "code",
525 | "execution_count": 90,
526 | "metadata": {
527 | "execution": {
528 | "iopub.execute_input": "2021-09-20T09:44:16.187572Z",
529 | "iopub.status.busy": "2021-09-20T09:44:16.186664Z",
530 | "iopub.status.idle": "2021-09-20T09:45:25.831164Z",
531 | "shell.execute_reply": "2021-09-20T09:45:25.830423Z",
532 | "shell.execute_reply.started": "2021-09-20T09:44:16.187530Z"
533 | }
534 | },
535 | "outputs": [
536 | {
537 | "name": "stdout",
538 | "output_type": "stream",
539 | "text": [
540 | "48/48 [==============================] - 68s 1s/step\n"
541 | ]
542 | }
543 | ],
544 | "source": [
545 | "predictions = tl_model.predict(test_image_gen, verbose=1)  # per-class probabilities; assumes test_image_gen is unshuffled\n",
546 | "predictions = predictions.argmax(axis=1)  # index of the most probable class"
547 | ]
548 | },
549 | {
550 | "cell_type": "code",
551 | "execution_count": 91,
552 | "metadata": {
553 | "execution": {
554 | "iopub.execute_input": "2021-09-20T09:45:42.454735Z",
555 | "iopub.status.busy": "2021-09-20T09:45:42.454469Z",
556 | "iopub.status.idle": "2021-09-20T09:45:42.459930Z",
557 | "shell.execute_reply": "2021-09-20T09:45:42.457985Z",
558 | "shell.execute_reply.started": "2021-09-20T09:45:42.454706Z"
559 | }
560 | },
561 | "outputs": [],
562 | "source": [
563 | "test_labels = test_image_gen.classes"
564 | ]
565 | },
566 | {
567 | "cell_type": "code",
568 | "execution_count": 1,
569 | "metadata": {
570 | "execution": {
571 | "iopub.execute_input": "2021-09-20T09:46:20.935434Z",
572 | "iopub.status.busy": "2021-09-20T09:46:20.935181Z",
573 | "iopub.status.idle": "2021-09-20T09:46:21.000189Z",
574 | "shell.execute_reply": "2021-09-20T09:46:20.999547Z",
575 | "shell.execute_reply.started": "2021-09-20T09:46:20.935406Z"
576 | }
577 | },
578 | "outputs": [],
579 | "source": [
580 | "from sklearn.metrics import confusion_matrix\n",
581 | "\n",
582 | "# Rows of cm are true classes; columns are predicted classes\n",
583 | "cm = confusion_matrix(test_labels, predictions)"
584 | ]
585 | },
586 | {
587 | "cell_type": "code",
588 | "execution_count": 7,
589 | "metadata": {},
590 | "outputs": [
591 | {
592 | "name": "stdout",
593 | "output_type": "stream",
594 | "text": [
595 | "Normalized confusion matrix\n",
596 | "[[0.67346939 0.09183673 0.1505102 0.00510204 0.06632653 0.0127551\n",
597 | " 0. ]\n",
598 | " [0.02919708 0.86861314 0.04136253 0. 0.01459854 0.03892944\n",
599 | " 0.00729927]\n",
600 | " [0.02277904 0.00683371 0.79498861 0.00455581 0.07061503 0.10022779\n",
601 | " 0. ]\n",
602 | " [0.05949657 0.02517162 0.03432494 0.77116705 0.01830664 0.09153318\n",
603 | " 0. ]\n",
604 | " [0.00898876 0.01573034 0.10786517 0. 0.63820225 0.21348315\n",
605 | " 0.01573034]\n",
606 | " [0. 0.00226757 0.01587302 0. 0.02040816 0.96145125\n",
607 | " 0. ]\n",
608 | " [0. 0. 0.0154185 0. 0. 0.01101322\n",
609 | " 0.97356828]]\n"
610 | ]
611 | },
612 | {
613 | "data": {
614 | "image/png": "\n",
615 | "text/plain": [
616 | ""
617 | ]
618 | },
619 | "metadata": {
620 | "needs_background": "light"
621 | },
622 | "output_type": "display_data"
623 | }
624 | ],
625 | "source": [
626 | "import itertools\n",
627 | "\n",
628 | "def plot_confusion_matrix(cm, classes,\n",
629 | "                          normalize=True,\n",
630 | "                          title='Confusion matrix',\n",
631 | "                          cmap=plt.cm.Blues):\n",
632 | "    if normalize:\n",
633 | "        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]\n",
634 | "        print(\"Normalized confusion matrix\")\n",
635 | "    else:\n",
636 | "        print('Confusion matrix, without normalization')\n",
637 | "\n",
638 | "    print(cm)\n",
639 | "\n",
640 | "    plt.figure(figsize=(7,7))\n",
641 | "    plt.imshow(cm, interpolation='nearest', cmap=cmap)\n",
642 | "    plt.title(title)\n",
643 | "    plt.colorbar()\n",
644 | "    tick_marks = np.arange(len(classes))\n",
645 | "    plt.xticks(tick_marks, classes, rotation=45)\n",
646 | "    plt.yticks(tick_marks, classes)\n",
647 | "\n",
648 | "    fmt = '.2f' if normalize else 'd'\n",
649 | "    thresh = cm.max() / 2.\n",
650 | "    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):\n",
651 | "        plt.text(j, i, format(cm[i, j], fmt),\n",
652 | "                 horizontalalignment=\"center\",\n",
653 | "                 color=\"white\" if cm[i, j] > thresh else \"black\")\n",
654 | "    plt.ylabel('True label')\n",
655 | "    plt.xlabel('Predicted label')\n",
656 | "    plt.tight_layout()\n",
657 | "\n",
658 | "\n",
659 | "cm_plot_labels = ['AKIEC','BCC','BKL','DF','MEL','NV','VASC']\n",
660 | "\n",
661 | "plot_confusion_matrix(cm, cm_plot_labels, title='Confusion Matrix', normalize=True)"
662 | ]
663 | },
664 | {
665 | "cell_type": "code",
666 | "execution_count": 5,
667 | "metadata": {},
668 | "outputs": [
669 | {
670 | "data": {
848 | "text/plain": [
849 | " AKIEC BCC BKL DF MEL \\\n",
850 | "accuracy 0.940378 0.962902 0.919510 0.965552 0.920172 \n",
851 | "f1 0.745763 0.864407 0.741764 0.866324 0.702101 \n",
852 | "false_discovery_rate 0.164557 0.139759 0.304781 0.011730 0.219780 \n",
853 | "false_negative_rate 0.326531 0.131387 0.205011 0.228833 0.361798 \n",
854 | "false_positive_rate 0.019794 0.022239 0.059302 0.001549 0.031080 \n",
855 | "negative_predictive_value 0.952645 0.979263 0.964243 0.962659 0.939360 \n",
856 | "positive_predictive_value 0.835443 0.860241 0.695219 0.988270 0.780220 \n",
857 | "precision 0.835443 0.860241 0.695219 0.988270 0.780220 \n",
858 | "recall 0.673469 0.868613 0.794989 0.771167 0.638202 \n",
859 | "sensitivity 0.673469 0.868613 0.794989 0.771167 0.638202 \n",
860 | "specificity 0.980206 0.977761 0.940698 0.998451 0.968920 \n",
861 | "true_negative_rate 0.980206 0.977761 0.940698 0.998451 0.968920 \n",
862 | "true_positive_rate 0.673469 0.868613 0.794989 0.771167 0.638202 \n",
863 | "\n",
864 | " NV VASC micro-average \n",
865 | "accuracy 0.926466 0.992713 0.946813 \n",
866 | "f1 0.792523 0.975717 0.813846 \n",
867 | "false_discovery_rate 0.325914 0.022124 0.186154 \n",
868 | "false_negative_rate 0.038549 0.026432 0.186154 \n",
869 | "false_positive_rate 0.079519 0.003899 0.031026 \n",
870 | "negative_predictive_value 0.992887 0.995325 0.968974 \n",
871 | "positive_predictive_value 0.674086 0.977876 0.813846 \n",
872 | "precision 0.674086 0.977876 0.813846 \n",
873 | "recall 0.961451 0.973568 0.813846 \n",
874 | "sensitivity 0.961451 0.973568 0.813846 \n",
875 | "specificity 0.920481 0.996101 0.968974 \n",
876 | "true_negative_rate 0.920481 0.996101 0.968974 \n",
877 | "true_positive_rate 0.961451 0.973568 0.813846 "
878 | ]
879 | },
880 | "execution_count": 5,
881 | "metadata": {},
882 | "output_type": "execute_result"
883 | }
884 | ],
885 | "source": [
886 | "import disarray  # registers the `.da` accessor on pandas DataFrames\n",
887 | "\n",
888 | "# Wrap the confusion matrix in a labelled DataFrame (rows: true class, columns: predicted class)\n",
889 | "class_names = ['AKIEC','BCC','BKL','DF','MEL','NV','VASC']\n",
890 | "df = pd.DataFrame(cm, index=class_names, columns=class_names)\n",
891 | "df.da.export_metrics()"
892 | ]
893 | },
894 | {
895 | "cell_type": "code",
896 | "execution_count": null,
897 | "metadata": {},
898 | "outputs": [],
899 | "source": []
900 | }
901 | ],
902 | "metadata": {
903 | "kernelspec": {
904 | "display_name": "Python 3 (ipykernel)",
905 | "language": "python",
906 | "name": "python3"
907 | },
908 | "language_info": {
909 | "codemirror_mode": {
910 | "name": "ipython",
911 | "version": 3
912 | },
913 | "file_extension": ".py",
914 | "mimetype": "text/x-python",
915 | "name": "python",
916 | "nbconvert_exporter": "python",
917 | "pygments_lexer": "ipython3",
918 | "version": "3.9.7"
919 | }
920 | },
921 | "nbformat": 4,
922 | "nbformat_minor": 4
923 | }
924 |
--------------------------------------------------------------------------------
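The per-class table at the end of the DenseNet121 notebook scores each lesion class one-vs-rest. As a rough sketch of how such figures can be derived from a confusion matrix, the snippet below computes them with the standard library only. The helper name `one_vs_rest_metrics` and the 3x3 toy matrix are illustrative assumptions; they are not part of this repository or of disarray's API, and the sketch is not claimed to match disarray's implementation.

```python
def one_vs_rest_metrics(cm, labels):
    """Per-class metrics from a square confusion matrix.

    Assumes rows are true classes and columns are predicted classes,
    the same layout sklearn's confusion_matrix returns.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    nan = float("nan")
    report = {}
    for k, label in enumerate(labels):
        tp = cm[k][k]                                # class k predicted as k
        fn = sum(cm[k]) - tp                         # class k predicted as something else
        fp = sum(cm[r][k] for r in range(n)) - tp    # other classes predicted as k
        tn = total - tp - fn - fp                    # everything not involving class k
        report[label] = {
            "accuracy": (tp + tn) / total,
            "precision": tp / (tp + fp) if tp + fp else nan,
            "recall": tp / (tp + fn) if tp + fn else nan,
            "specificity": tn / (tn + fp) if tn + fp else nan,
            "f1": 2 * tp / (2 * tp + fp + fn) if 2 * tp + fp + fn else nan,
        }
    return report


# Hypothetical 3-class matrix, used only to exercise the function.
toy_cm = [
    [5, 1, 0],
    [2, 6, 2],
    [0, 1, 8],
]
toy_report = one_vs_rest_metrics(toy_cm, ["a", "b", "c"])
for name, scores in toy_report.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

Because one-vs-rest accuracy also counts true negatives, it can sit well above recall for the same class. This is the pattern in the notebook's table, where MEL has an accuracy of 0.920 but a recall of 0.638.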
/docs/model_architecture.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Defcon27/Skin-Cancer-Classification-using-Transfer-Learning/0d216c18f9528cdf1e07562296f3e1b571970edc/docs/model_architecture.png
--------------------------------------------------------------------------------
/docs/results.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Defcon27/Skin-Cancer-Classification-using-Transfer-Learning/0d216c18f9528cdf1e07562296f3e1b571970edc/docs/results.png
--------------------------------------------------------------------------------