├── CONTRIBUTING.md ├── LICENSE ├── README.md └── sessions ├── ct-body-part ├── inference.ipynb └── train.ipynb ├── data-curation ├── Data_Processing_&_Curation_for_Deep_Learning.ipynb ├── README.md └── Sample_DICOM.zip ├── dicom-seg ├── README.md └── RSNA_2021_DICOM_IN_DICOM_OUT_Segmentation.ipynb ├── dicom-wrangling ├── DataWrangling2021RSNA16.ipynb └── README.md ├── gans ├── README.md └── RSNA2021_DL_Lab_GAN.ipynb ├── mednist-monai ├── MedNIST_Classification_MONAI.ipynb ├── README.md └── RSNA21_DLL_mednist-monai.pdf ├── multi-modal-pe ├── Copy_of_Multimodal_Fusion_for_PE_Detection.ipynb ├── Multimodal Fusion for PE Detection (Clean).ipynb ├── Multimodal Fusion for PE Detection.ipynb ├── README.md └── figs │ ├── .DS_Store │ ├── UserAgreement.png │ ├── fusion_strategies.png │ ├── late_fusion_mean_agg.png │ ├── other_fusion_strategies.png │ └── workflow.png ├── nlp-basics ├── DLL52_Basics_NLP_Radiology.ipynb └── README.md ├── nlp-text-classification ├── README.md ├── RSNA21_DLL_NLP_RNNs.ipynb ├── RSNA21_DLL_NLP_Transformers.ipynb └── RSNA21_DLL_RNNs_with_Tensorboard.ipynb ├── object-detection-seg ├── README.md └── segmentation.ipynb ├── pneumonia-detection └── README.md ├── tcga-gbm ├── README.md └── RSNA_2021_TCGA_GBM_radiogenomics.ipynb ├── tcia-idc ├── README.md └── RSNA_2021_IDC_and_TCIA.ipynb └── yolo ├── README.md ├── Train_YOLOv5_Complete_Notebook.ipynb ├── Train_YOLOv5_Practice_Notebook.ipynb └── YOLO_RSNA2021.pdf /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing Guidelines for Presenters 2 | 3 | Please add any files required for your session to the appropriate subdirectory of `sessions`. E.g. if you're uploading files for the **Pneumonia Detection Model Building** session, you should add them to the folder `sessions/pneumonia-detection`. 4 | 5 | If you need storage for large files (>100 MB per file) such as model weights, please email Walter Wiggins at [walter.wiggins@duke.edu](mailto:walter.wiggins@duke.edu). 6 | 7 | > Please ensure you have a preliminary version of your Colab notebook submitted to this repository by **November 15, 2021**. -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2021 RSNA 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 | # RSNA AI Deep Learning Lab 2021 3 | 4 | ## Intro 5 | 6 | Welcome Deep Learners! 7 | 8 | This document provides all the information you need to participate in the RSNA AI Deep Learning Lab. This set of classes provides a hands-on opportunity to engage with deep learning tools, write basic algorithms, learn how to organize data to implement deep learning and improve your understanding of AI technology. 9 | 10 | The classes will be held in the RSNA AI Deep Learning Lab classroom, which is located in the Lakeside Learning Center, Level 3. Here's the schedule of [classes](#class-schedule). CME credit is available for each session. 11 | 12 | 13 | ## Requirements 14 | 15 | All lessons are designed to run in Google Colab, which is a free web-based version of Jupyter hosted by Google. You will need a Google account (eg, gmail) to use Colab. If you don't already have a Google account, please create one in advance at the [account sign-up page](https://accounts.google.com/signup/v2/webcreateaccount?flowName=GlifWebSignIn&flowEntry=SignUp). You can [delete the account](https://support.google.com/accounts/answer/32046?hl=en) when you complete the lessons if you wish. 16 | 17 | We recommend that you use a computer with a recent vintage processor running the [Chrome browser](https://www.google.com/chrome/). 18 | 19 | ## Lessons 20 | 21 | Lesson : [Pneumonia Detection Model Building (Beginner friendly)](https://colab.research.google.com/gist/georgezero/8f7a8f3463fa7db8f89a7c7bb4c1b6cc/rsna-2021-deep-learning-lab-pneumonia-detection-model-building.ipynb) 22 | 23 | Lesson : [MedNIST Exam Classification with MONAI (Beginner friendly)](https://colab.research.google.com/github/RSNA/AI-Deep-Learning-Lab-2021/blob/main/sessions/mednist-monai/MedNIST_Classification_MONAI.ipynb) 24 | 25 | Lesson : [DICOM Data Wrangling with Python (Beginner friendly)](https://colab.research.google.com/github/RSNA/AI-Deep-Learning-Lab-2021/blob/main/sessions/dicom-wrangling/DataWrangling2021RSNA16.ipynb) 26 | 27 | Lesson : CT Body Part Classification (Beginner friendly): [Notebook #1](https://colab.research.google.com/github/RSNA/AI-Deep-Learning-Lab-2021/blob/main/sessions/ct-body-part/train.ipynb), [Notebook #2](https://colab.research.google.com/github/RSNA/AI-Deep-Learning-Lab-2021/blob/main/sessions/ct-body-part/inference.ipynb) 28 | 29 | Lesson : YOLO: Bounding Box Segmentation & Classification: [Practice Notebook](https://colab.research.google.com/github/RSNA/AI-Deep-Learning-Lab-2021/blob/main/sessions/yolo/Train_YOLOv5_Practice_Notebook.ipynb), [Complete Notebook](https://colab.research.google.com/github/RSNA/AI-Deep-Learning-Lab-2021/blob/main/sessions/yolo/Train_YOLOv5_Complete_Notebook.ipynb) 30 | 31 | Lesson : [Integrating Genomic and Imaging Data with TCGA-GBM](https://colab.research.google.com/github/RSNA/AI-Deep-Learning-Lab-2021/blob/main/sessions/tcga-gbm/RSNA_2021_TCGA_GBM_radiogenomics.ipynb) 32 | 33 | Lesson : [Generative Adversarial Networks](https://colab.research.google.com/github/RSNA/AI-Deep-Learning-Lab-2021/blob/main/sessions/gans/RSNA2021_DL_Lab_GAN.ipynb) 34 | 35 | Lesson : [Object Detection & Segmentation (Beginner friendly)](https://colab.research.google.com/github/RSNA/AI-Deep-Learning-Lab-2021/blob/main/sessions/object-detection-seg/segmentation.ipynb) 36 | 37 | Lesson : [Working with Public Datasets: 38 | TCIA & 
IDC (Beginner friendly)](https://colab.research.google.com/github/RSNA/AI-Deep-Learning-Lab-2021/blob/main/sessions/tcia-idc/RSNA_2021_IDC_and_TCIA.ipynb) 39 | 40 | Lesson : NLP: Text Classification with RNNs & Transformers: [Notebook #1](https://colab.research.google.com/github/RSNA/AI-Deep-Learning-Lab-2021/blob/main/sessions/nlp-text-classification/RSNA21_DLL_NLP_RNNs.ipynb), [Notebook #2](https://colab.research.google.com/github/RSNA/AI-Deep-Learning-Lab-2021/blob/main/sessions/nlp-text-classification/RSNA21_DLL_NLP_Transformers.ipynb) 41 | 42 | Lesson : [Multimodal Fusion for Pulmonary Embolism Detection Using CTs and Patient EMR](https://colab.research.google.com/github/RSNA/AI-Deep-Learning-Lab-2021/blob/main/sessions/multi-modal-pe/Multimodal%20Fusion%20for%20PE%20Detection%20(Clean).ipynb) 43 | 44 | Lesson : [Data Processing & Curation for Deep Learning (Beginner friendly)](https://colab.research.google.com/github/RSNA/AI-Deep-Learning-Lab-2021/blob/main/sessions/data-curation/Data_Processing_%26_Curation_for_Deep_Learning.ipynb) 45 | 46 | Lesson : [Basics of NLP in Radiology (Beginner friendly)](https://colab.research.google.com/github/RSNA/AI-Deep-Learning-Lab-2021/blob/main/sessions/nlp-basics/DLL52_Basics_NLP_Radiology.ipynb) 47 | 48 | 49 | ## Class Schedule 50 | 51 | | Date / Time | Class | 52 | | --- | --- | 53 | | Sun 10:30-11:30 am | MedNIST Exam Classification with MONAI - Beginner friendly | 54 | | Sun 1:00-2:00 pm | DICOM Data Wrangling with Python - Beginner friendly | 55 | | Sun 2:30-3:30 pm | CT Body Part Classification - Beginner friendly | 56 | | Mon 9:30-10:30 am | YOLO: Bounding Box Segmentation & Classification | 57 | | Mon 11:00 am-12:00 pm | Integrating Genomic and Imaging Data with TCGA-GBM | 58 | | Mon 1:30-2:30 pm | Generative Adversarial Networks | 59 | | Mon 3:00-4:00 pm | Object Detection & Segmentation | 60 | | Mon 4:30-5:30 pm | Pneumonia Detection Model Building - Beginner friendly | 61 | | Tue 11:00 am-12:00 pm| Working with Public Datasets: TCIA & IDC - Beginner friendly | 62 | | Tue 3:00-4:00 pm| NLP: Text Classification with RNNs & Transformers | 63 | | Wed 9:30-10:30 am | Pneumonia Detection Model Building - Beginner friendly; Repeat | 64 | | Wed 11:00 am-12:00 pm | Working with Public Datasets: TCIA & IDC - Beginner friendly; Repeat | 65 | | Wed 1:30-2:30 pm | Multimodal Fusion for Pulmonary Embolism Detection Using CTs and Patient EMR | 66 | | Wed 4:30-5:30 pm | Data Processing & Curation for Deep Learning - Beginner friendly | 67 | | Thu 11:00 am-12:00 pm| Basics of NLP in Radiology - Beginner friendly | 68 | 69 | 70 | -------------------------------------------------------------------------------- /sessions/ct-body-part/inference.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "accelerator": "TPU", 6 | "colab": { 7 | "name": "inference.ipynb", 8 | "provenance": [], 9 | "collapsed_sections": [] 10 | }, 11 | "kernelspec": { 12 | "display_name": "Python 3", 13 | "language": "python", 14 | "name": "python3" 15 | }, 16 | "language_info": { 17 | "codemirror_mode": { 18 | "name": "ipython", 19 | "version": 3 20 | }, 21 | "file_extension": ".py", 22 | "mimetype": "text/x-python", 23 | "name": "python", 24 | "nbconvert_exporter": "python", 25 | "pygments_lexer": "ipython3", 26 | "version": "3.7.5" 27 | } 28 | }, 29 | "cells": [ 30 | { 31 | "cell_type": "markdown", 32 | "metadata": { 33 | "id": "tKYuZWZ0gP9z" 34 | }, 35 | "source": [ 36 | "# 
Deep Learning for Automatic Labeling of CT Images\n", 37 | "## By: Ian Pan, MD.ai modified by Anouk Stein, MD.ai and Ross Filice MD, MedStar Georgetown University Hospital to predict chest, abdomen, or pelvic slices. Note lower chest/upper abdomen may have labels for both chest and abdomen." 38 | ] 39 | }, 40 | { 41 | "cell_type": "code", 42 | "metadata": { 43 | "id": "h7RP28IXPGvG" 44 | }, 45 | "source": [ 46 | "!git clone https://github.com/rwfilice/bodypart.git" 47 | ], 48 | "execution_count": null, 49 | "outputs": [] 50 | }, 51 | { 52 | "cell_type": "code", 53 | "metadata": { 54 | "id": "tZPTc-wGP82S" 55 | }, 56 | "source": [ 57 | "!pip install pydicom" 58 | ], 59 | "execution_count": null, 60 | "outputs": [] 61 | }, 62 | { 63 | "cell_type": "code", 64 | "metadata": { 65 | "id": "KHp444ni3J7i" 66 | }, 67 | "source": [ 68 | "from scipy.ndimage.interpolation import zoom\n", 69 | "\n", 70 | "import matplotlib.pyplot as plt\n", 71 | "import pydicom\n", 72 | "import pandas as pd \n", 73 | "import numpy as np \n", 74 | "import glob\n", 75 | "import os \n", 76 | "import re \n", 77 | "import json\n", 78 | "from pathlib import Path\n", 79 | "\n", 80 | "from keras.applications.imagenet_utils import preprocess_input\n", 81 | "from keras.applications.mobilenet_v2 import MobileNetV2\n", 82 | "from keras.callbacks import EarlyStopping, ReduceLROnPlateau\n", 83 | "from keras import Model\n", 84 | "from keras.layers import Dropout, Dense, GlobalAveragePooling2D\n", 85 | "from keras import optimizers\n", 86 | "from keras.models import model_from_json\n", 87 | "\n", 88 | "import tensorflow as tf \n", 89 | "\n", 90 | "# Set seed for reproducibility\n", 91 | "tf.random.set_seed(88) ; np.random.seed(88) \n", 92 | "\n", 93 | "# For data augmentation\n", 94 | "from albumentations import (\n", 95 | " Compose, OneOf, HorizontalFlip, Blur, RandomGamma, RandomContrast, RandomBrightness\n", 96 | ")" 97 | ], 98 | "execution_count": null, 99 | "outputs": [] 100 | }, 101 | { 102 | "cell_type": "code", 103 | "metadata": { 104 | "id": "xSK1Br4Gn9Ma" 105 | }, 106 | "source": [ 107 | "testPath = Path('bodypart/testnpy')\n", 108 | "testList = list(sorted(testPath.glob('**/*.npy'), key=lambda fn: int(re.search('-([0-9]*)', str(fn)).group(1))))" 109 | ], 110 | "execution_count": null, 111 | "outputs": [] 112 | }, 113 | { 114 | "cell_type": "code", 115 | "metadata": { 116 | "id": "3VUw6wgTOtrz" 117 | }, 118 | "source": [ 119 | "testList" 120 | ], 121 | "execution_count": null, 122 | "outputs": [] 123 | }, 124 | { 125 | "cell_type": "code", 126 | "metadata": { 127 | "id": "yYc2ALDRUn3j" 128 | }, 129 | "source": [ 130 | "def get_dicom_and_uid(path_to_npy):\n", 131 | " '''\n", 132 | " Given a filepath, return the npy file and corresponding SOPInstanceUID. \n", 133 | " '''\n", 134 | " path_to_npy = str(path_to_npy)\n", 135 | " dicom_file = np.load(path_to_npy)\n", 136 | " uid = path_to_npy.split('/')[-1].replace('.npy', '')\n", 137 | " return dicom_file, uid" 138 | ], 139 | "execution_count": null, 140 | "outputs": [] 141 | }, 142 | { 143 | "cell_type": "code", 144 | "metadata": { 145 | "id": "6HKsxs_3Otr1" 146 | }, 147 | "source": [ 148 | "def convert_dicom_to_8bit(npy_file, width, level, imsize=(224.,224.), clip=True): \n", 149 | " '''\n", 150 | " Given a DICOM file, window specifications, and image size, \n", 151 | " return the image as a Numpy array scaled to [0,255] of the specified size. 
\n", 152 | " '''\n", 153 | " array = npy_file.copy() \n", 154 | " #array = array + int(dicom_file.RescaleIntercept) #we did this on preprocess\n", 155 | " #array = array * int(dicom_file.RescaleSlope) #we did this on preprocess\n", 156 | " array = np.clip(array, level - width / 2, level + width / 2)\n", 157 | " # Rescale to [0, 255]\n", 158 | " array -= np.min(array) \n", 159 | " array /= np.max(array) \n", 160 | " array *= 255.\n", 161 | " array = array.astype('uint8')\n", 162 | " \n", 163 | " if clip:\n", 164 | " # Sometimes there is dead space around the images -- let's get rid of that\n", 165 | " nonzeros = np.nonzero(array) \n", 166 | " x1 = np.min(nonzeros[0]) ; x2 = np.max(nonzeros[0])\n", 167 | " y1 = np.min(nonzeros[1]) ; y2 = np.max(nonzeros[1])\n", 168 | " array = array[x1:x2,y1:y2]\n", 169 | "\n", 170 | " # Resize image if necessary\n", 171 | " resize_x = float(imsize[0]) / array.shape[0] \n", 172 | " resize_y = float(imsize[1]) / array.shape[1] \n", 173 | " if resize_x != 1. or resize_y != 1.:\n", 174 | " array = zoom(array, [resize_x, resize_y], order=1, prefilter=False)\n", 175 | " return np.expand_dims(array, axis=-1)" 176 | ], 177 | "execution_count": null, 178 | "outputs": [] 179 | }, 180 | { 181 | "cell_type": "code", 182 | "metadata": { 183 | "id": "am3DCy-9Otr4" 184 | }, 185 | "source": [ 186 | "json_file = open('bodypart/model.json', 'r')\n", 187 | "loaded_model_json = json_file.read()\n", 188 | "json_file.close()\n", 189 | "model = model_from_json(loaded_model_json)" 190 | ], 191 | "execution_count": null, 192 | "outputs": [] 193 | }, 194 | { 195 | "cell_type": "markdown", 196 | "metadata": { 197 | "id": "AxN2UuhSH0to" 198 | }, 199 | "source": [ 200 | "## Predict on test set\n", 201 | "\n", 202 | "\n" 203 | ] 204 | }, 205 | { 206 | "cell_type": "markdown", 207 | "metadata": { 208 | "id": "A3j8okxMHule" 209 | }, 210 | "source": [ 211 | "Now let's evaluate predictions on the test data. Note that the CNN predicts labels for each image but labels exist at an exam-level. Thus we need to combine predictions for each image in an exam to produce an exam-level label. We will do so by simply averaging the prediction scores across all images in an exam." 
212 | ] 213 | }, 214 | { 215 | "cell_type": "code", 216 | "metadata": { 217 | "id": "8fMOhlKtOtr5" 218 | }, 219 | "source": [ 220 | "model.load_weights('bodypart/tcga-mguh-multilabel.h5') #federated" 221 | ], 222 | "execution_count": null, 223 | "outputs": [] 224 | }, 225 | { 226 | "cell_type": "code", 227 | "metadata": { 228 | "id": "auoBQzUKOXxh" 229 | }, 230 | "source": [ 231 | "#Inference\n", 232 | "IMSIZE = 256\n", 233 | "WINDOW_LEVEL, WINDOW_WIDTH = 50, 500\n", 234 | "def predict(model, images, imsize):\n", 235 | " '''\n", 236 | " Small modifications to data generator to allow for prediction on test data.\n", 237 | " '''\n", 238 | " test_arrays = [] \n", 239 | " \n", 240 | " test_probas = [] \n", 241 | " test_uids = []\n", 242 | " for im in images: \n", 243 | " dicom_file, uid = get_dicom_and_uid(im) \n", 244 | " try:\n", 245 | " array = convert_dicom_to_8bit(dicom_file, WINDOW_WIDTH, WINDOW_LEVEL, \n", 246 | " imsize=(imsize,imsize))\n", 247 | " except: \n", 248 | " continue\n", 249 | " \n", 250 | " array = preprocess_input(array, mode='tf')\n", 251 | " test_arrays.append(array) \n", 252 | "\n", 253 | " test_probas.append(model.predict(np.expand_dims(array, axis=0)))\n", 254 | " test_uids.append(uid)\n", 255 | " return test_uids, test_arrays, test_probas\n", 256 | " \n", 257 | "uids, X, y_prob = predict(model, testList, IMSIZE)\n", 258 | "\n", 259 | "test_pred_df = pd.DataFrame({'uid': uids, 'X': X, 'y_prob': y_prob})" 260 | ], 261 | "execution_count": null, 262 | "outputs": [] 263 | }, 264 | { 265 | "cell_type": "code", 266 | "metadata": { 267 | "id": "5e7ihGMVOtr7" 268 | }, 269 | "source": [ 270 | "test_pred_df.apply(lambda row: row['y_prob'], axis=1)" 271 | ], 272 | "execution_count": null, 273 | "outputs": [] 274 | }, 275 | { 276 | "cell_type": "code", 277 | "metadata": { 278 | "id": "M5BsiokCOtr8" 279 | }, 280 | "source": [ 281 | "chest = np.stack(test_pred_df['y_prob'])[:,0][:,0]\n", 282 | "abd = np.stack(test_pred_df['y_prob'])[:,0][:,1]\n", 283 | "pelv = np.stack(test_pred_df['y_prob'])[:,0][:,2]" 284 | ], 285 | "execution_count": null, 286 | "outputs": [] 287 | }, 288 | { 289 | "cell_type": "code", 290 | "metadata": { 291 | "scrolled": true, 292 | "id": "5ZH4V3BXOtr8" 293 | }, 294 | "source": [ 295 | "plt.plot(chest)\n", 296 | "plt.plot(abd)\n", 297 | "plt.plot(pelv)" 298 | ], 299 | "execution_count": null, 300 | "outputs": [] 301 | }, 302 | { 303 | "cell_type": "code", 304 | "metadata": { 305 | "id": "xkn3ZTpHOtr_" 306 | }, 307 | "source": [ 308 | "numaveslices = 5\n", 309 | "avepreds = []\n", 310 | "allpreds = np.stack(test_pred_df['y_prob'])[:,0]\n", 311 | "for idx,arr in enumerate(allpreds):\n", 312 | " low = int(max(0,idx-(numaveslices-1)/2))\n", 313 | " high = int(min(len(allpreds),idx+(numaveslices+1)/2))\n", 314 | " avepreds.append(np.mean(allpreds[low:high],axis=0))\n", 315 | " \n", 316 | "chest = np.stack(avepreds)[:,0]\n", 317 | "abd = np.stack(avepreds)[:,1]\n", 318 | "pelv = np.stack(avepreds)[:,2]" 319 | ], 320 | "execution_count": null, 321 | "outputs": [] 322 | }, 323 | { 324 | "cell_type": "code", 325 | "metadata": { 326 | "scrolled": true, 327 | "id": "9lrJuRAQOtsA" 328 | }, 329 | "source": [ 330 | "#averaged over 5 slices\n", 331 | "plt.plot(chest)\n", 332 | "plt.plot(abd)\n", 333 | "plt.plot(pelv)" 334 | ], 335 | "execution_count": null, 336 | "outputs": [] 337 | }, 338 | { 339 | "cell_type": "code", 340 | "metadata": { 341 | "id": "F9NdpyYzOtsA" 342 | }, 343 | "source": [ 344 | "def displayImages(imgs,labels):\n", 345 | " numimgs = len(imgs)\n", 346 | " 
plt.figure(figsize=(20,10))\n", 347 | " for idx,img in enumerate(imgs):\n", 348 | " dicom_file, uid = get_dicom_and_uid(img)\n", 349 | " img = convert_dicom_to_8bit(dicom_file, WINDOW_WIDTH, WINDOW_LEVEL, clip=False)\n", 350 | " plt.subplot(\"1%i%i\" % (numimgs,idx+1))\n", 351 | " plt.imshow(img[...,0],cmap='gray')\n", 352 | " plt.title(labels[idx])\n", 353 | " plt.axis('off')" 354 | ], 355 | "execution_count": null, 356 | "outputs": [] 357 | }, 358 | { 359 | "cell_type": "code", 360 | "metadata": { 361 | "id": "UXEYTPSoOtsB" 362 | }, 363 | "source": [ 364 | "#averaged over 5 slices\n", 365 | "fig, ax1 = plt.subplots(figsize=(17,10))\n", 366 | "ax1.set_xlabel(\"Slice Number\", fontsize=20)\n", 367 | "ax1.set_ylabel(\"Confidence\", fontsize=20)\n", 368 | "plt.xticks([0,30,60,90,120,150,180,210],fontsize=12)\n", 369 | "plt.yticks(fontsize=12)\n", 370 | "ax1.axvline(30,color='gray',ymax=0.1)\n", 371 | "ax1.axvline(82,color='gray',ymax=0.1)\n", 372 | "ax1.axvline(120,color='gray',ymax=0.1)\n", 373 | "ax1.axvline(172,color='gray',ymax=0.1)\n", 374 | "ax1.axvline(195,color='gray',ymax=0.1)\n", 375 | "plt.plot(chest,linewidth=2,label=\"Chest\")\n", 376 | "plt.plot(abd,linewidth=2,label=\"Abdomen\")\n", 377 | "plt.plot(pelv,linewidth=2,label=\"Pelvis\")\n", 378 | "plt.legend(fontsize=16)" 379 | ], 380 | "execution_count": null, 381 | "outputs": [] 382 | }, 383 | { 384 | "cell_type": "code", 385 | "metadata": { 386 | "id": "sfJIHO1tOtsC" 387 | }, 388 | "source": [ 389 | "displayImages([testList[30],testList[82],testList[120],testList[172],testList[195]],[30,82,120,172,195])" 390 | ], 391 | "execution_count": null, 392 | "outputs": [] 393 | }, 394 | { 395 | "cell_type": "code", 396 | "metadata": { 397 | "id": "cUzCh9OsOtsC" 398 | }, 399 | "source": [ 400 | "" 401 | ], 402 | "execution_count": null, 403 | "outputs": [] 404 | } 405 | ] 406 | } -------------------------------------------------------------------------------- /sessions/ct-body-part/train.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "name": "train.ipynb", 7 | "provenance": [], 8 | "collapsed_sections": [] 9 | }, 10 | "kernelspec": { 11 | "display_name": "Python 3", 12 | "language": "python", 13 | "name": "python3" 14 | }, 15 | "language_info": { 16 | "codemirror_mode": { 17 | "name": "ipython", 18 | "version": 3 19 | }, 20 | "file_extension": ".py", 21 | "mimetype": "text/x-python", 22 | "name": "python", 23 | "nbconvert_exporter": "python", 24 | "pygments_lexer": "ipython3", 25 | "version": "3.7.5" 26 | }, 27 | "accelerator": "GPU" 28 | }, 29 | "cells": [ 30 | { 31 | "cell_type": "markdown", 32 | "metadata": { 33 | "id": "tKYuZWZ0gP9z" 34 | }, 35 | "source": [ 36 | "# Deep Learning for Automatic Labeling of CT Images\n", 37 | "## By: Ian Pan, MD.ai modified by Anouk Stein, MD.ai to predict chest, abdomen, or pelvic slices. 
Note lower chest/upper abdomen may have labels for both chest and abdomen.\n", 38 | "\n" 39 | ] 40 | }, 41 | { 42 | "cell_type": "code", 43 | "metadata": { 44 | "id": "cyd2W4nWTOCv" 45 | }, 46 | "source": [ 47 | "!git clone https://github.com/rwfilice/bodypart.git" 48 | ], 49 | "execution_count": null, 50 | "outputs": [] 51 | }, 52 | { 53 | "cell_type": "markdown", 54 | "metadata": { 55 | "id": "KtSWfduT3CoO" 56 | }, 57 | "source": [ 58 | "## Import Python packages" 59 | ] 60 | }, 61 | { 62 | "cell_type": "code", 63 | "metadata": { 64 | "id": "dksRlUUOS-GM" 65 | }, 66 | "source": [ 67 | "!pip install pydicom" 68 | ], 69 | "execution_count": null, 70 | "outputs": [] 71 | }, 72 | { 73 | "cell_type": "code", 74 | "metadata": { 75 | "id": "KHp444ni3J7i" 76 | }, 77 | "source": [ 78 | "from scipy.ndimage.interpolation import zoom\n", 79 | "\n", 80 | "import matplotlib.pyplot as plt\n", 81 | "import pydicom\n", 82 | "import pandas as pd \n", 83 | "import numpy as np \n", 84 | "import glob\n", 85 | "import os \n", 86 | "import re \n", 87 | "import json\n", 88 | "from pathlib import Path\n", 89 | "\n", 90 | "from keras.applications.imagenet_utils import preprocess_input\n", 91 | "from keras.applications.mobilenet_v2 import MobileNetV2\n", 92 | "from keras.callbacks import EarlyStopping, ReduceLROnPlateau\n", 93 | "from keras import Model\n", 94 | "from keras.layers import Dropout, Dense, GlobalAveragePooling2D\n", 95 | "from keras import optimizers\n", 96 | "\n", 97 | "import tensorflow as tf \n", 98 | "\n", 99 | "# Set seed for reproducibility\n", 100 | "tf.random.set_seed(88) ; np.random.seed(88) \n", 101 | "\n", 102 | "# For data augmentation\n", 103 | "from albumentations import (\n", 104 | " Compose, OneOf, HorizontalFlip, Blur, RandomGamma, RandomContrast, RandomBrightness\n", 105 | ")" 106 | ], 107 | "execution_count": null, 108 | "outputs": [] 109 | }, 110 | { 111 | "cell_type": "code", 112 | "metadata": { 113 | "id": "hEbNsZ4zSpLT" 114 | }, 115 | "source": [ 116 | "tf.compat.v1.enable_eager_execution()\n", 117 | "print(tf.matmul([[1., 2.],[3., 4.]], [[1., 2.],[3., 4.]]))" 118 | ], 119 | "execution_count": null, 120 | "outputs": [] 121 | }, 122 | { 123 | "cell_type": "code", 124 | "metadata": { 125 | "id": "xSK1Br4Gn9Ma" 126 | }, 127 | "source": [ 128 | "imagesPath = Path('bodypart/npy/')\n", 129 | "imageList = list(imagesPath.glob('**/*.npy'))\n", 130 | "testList = list(sorted(imagesPath.glob('**/5fd4ea78053ef3b10aace7cbf9d70b65*.npy'), key=lambda fn: int(re.search('-([0-9]*)', str(fn)).group(1))))" 131 | ], 132 | "execution_count": null, 133 | "outputs": [] 134 | }, 135 | { 136 | "cell_type": "code", 137 | "metadata": { 138 | "id": "m1_MvTfISpLW" 139 | }, 140 | "source": [ 141 | "testPath = Path('bodypart/testnpy')\n", 142 | "testList = list(sorted(testPath.glob('**/*.npy'), key=lambda fn: int(re.search('-([0-9]*)', str(fn)).group(1))))" 143 | ], 144 | "execution_count": null, 145 | "outputs": [] 146 | }, 147 | { 148 | "cell_type": "code", 149 | "metadata": { 150 | "id": "cPh6ihMaSpLY" 151 | }, 152 | "source": [ 153 | "testList" 154 | ], 155 | "execution_count": null, 156 | "outputs": [] 157 | }, 158 | { 159 | "cell_type": "code", 160 | "metadata": { 161 | "id": "MoaCWETfSpLb" 162 | }, 163 | "source": [ 164 | "df = pd.read_csv(\"bodypart/labels-overlap.csv\")\n", 165 | "df" 166 | ], 167 | "execution_count": null, 168 | "outputs": [] 169 | }, 170 | { 171 | "cell_type": "markdown", 172 | "metadata": { 173 | "id": "E8C_G60bn-nf" 174 | }, 175 | "source": [ 176 | "## Locate DICOM images and 
split data into: training, validation, test" 177 | ] 178 | }, 179 | { 180 | "cell_type": "markdown", 181 | "metadata": { 182 | "id": "dC5ho-bmhQ5p" 183 | }, 184 | "source": [ 185 | "Let's locate all of the images we will use during training. In the previous code block, we kept track of images that the annotator excluded. We remove those images here. The data is structured as: exam (study) > series > images. That is, an exam can have multiple series and a series can have multiple images. The labels are assigned at the exam-level, so we can assume that all images in a series in an exam share the same labels. We split the data based on exams to prevent images from the same patient being distributed across the training, validation, and test data. We split the data into 80% training, 10% validation, 10% test. " 186 | ] 187 | }, 188 | { 189 | "cell_type": "code", 190 | "metadata": { 191 | "id": "9z3OFYEgQHe5" 192 | }, 193 | "source": [ 194 | "# Define a function to construct training/validation/test splits \n", 195 | "# Split data based on exams to prevent data leak \n", 196 | "# i.e. images from the same patient exist across the splits\n", 197 | "\n", 198 | "def get_train_val_test_split(images, train_frac, val_frac, seed=88):\n", 199 | " '''\n", 200 | " Test fraction will equal 1 - train_frac - val_frac.\n", 201 | " This function splits data based on exams, extracts image file paths,\n", 202 | " and removes images that cannot be read by pydicom.\n", 203 | " '''\n", 204 | " np.random.seed(seed) \n", 205 | " \n", 206 | " train_images = np.random.choice(images, int(train_frac*len(images)), replace=False)\n", 207 | " not_train_images = list(set(images) - set(train_images)) \n", 208 | " valid_images = np.random.choice(not_train_images, int(val_frac*len(images)), replace=False)\n", 209 | " test_images = list(set(not_train_images) - set(valid_images)) \n", 210 | " # Remove images that can't be read by pydicom\n", 211 | " for im in train_images: \n", 212 | " try: \n", 213 | " _ = np.load(str(im)) \n", 214 | " except:\n", 215 | " train_images.remove(im) \n", 216 | " for im in valid_images: \n", 217 | " try: \n", 218 | " _ = np.load(str(im)) \n", 219 | " except:\n", 220 | " valid_images.remove(im) \n", 221 | " for im in test_images: \n", 222 | " try: \n", 223 | " _ = np.load(str(im)) \n", 224 | " except:\n", 225 | " test_images.remove(im) \n", 226 | " return train_images, valid_images, test_images \n", 227 | " \n", 228 | "# Let's do 3 random train/val/test splits 80%/10%/10%\n", 229 | "train0, val0, test0 = get_train_val_test_split(imageList, 0.8, 0.1, seed=0)\n", 230 | "train1, val1, test1 = get_train_val_test_split(imageList, 0.8, 0.1, seed=1)\n", 231 | "train2, val2, test2 = get_train_val_test_split(imageList, 0.8, 0.1, seed=2)" 232 | ], 233 | "execution_count": null, 234 | "outputs": [] 235 | }, 236 | { 237 | "cell_type": "code", 238 | "metadata": { 239 | "id": "JrYva5bISpLh" 240 | }, 241 | "source": [ 242 | "len(train0),len(val0),len(test0)" 243 | ], 244 | "execution_count": null, 245 | "outputs": [] 246 | }, 247 | { 248 | "cell_type": "code", 249 | "metadata": { 250 | "id": "mgphczZ8UqYq" 251 | }, 252 | "source": [ 253 | "labels_dict = {'Chest': 0, \n", 254 | " 'Abdomen': 1,\n", 255 | " 'Pelvis': 2}\n", 256 | "N_CLASSES = len(labels_dict)" 257 | ], 258 | "execution_count": null, 259 | "outputs": [] 260 | }, 261 | { 262 | "cell_type": "markdown", 263 | "metadata": { 264 | "id": "APe2YcAX5rJp" 265 | }, 266 | "source": [ 267 | "## Set up data generation and augmentation" 268 | ] 269 | }, 270 | { 271 | 
"cell_type": "markdown", 272 | "metadata": { 273 | "id": "pbCXFROvigx6" 274 | }, 275 | "source": [ 276 | "Data generators are an efficient and effective way to load and augment data as it is being passed to the CNN. We convert the DICOM image array into an 8-bit image using a window width of 500 and level of 50. We had previously assigned an integer label to each label in our dataset (e.g., chest, abdomen, pelvis), but the CNN expects binary labels. Thus, for our 7 labels, we convert each integer into a length-7 vector, where each element in the vector is 1 if the image contains that label, 0 otherwise. \n", 277 | "\n", 278 | "We use simple data augmentation consisting of horizontal flips, random changes to brightness and contrast, and random levels of image blurring to help prevent the CNN from overfitting on the training data. The user should select data augmentations which represent the variability that could occur in a real setting. \n", 279 | "\n", 280 | "We also examine the class imbalance in our dataset by calculating the frequency of each label in the training data." 281 | ] 282 | }, 283 | { 284 | "cell_type": "code", 285 | "metadata": { 286 | "id": "yYc2ALDRUn3j" 287 | }, 288 | "source": [ 289 | "def get_dicom_and_uid(path_to_npy):\n", 290 | " '''\n", 291 | " Given a filepath, return the npy file and corresponding SOPInstanceUID. \n", 292 | " '''\n", 293 | " path_to_npy = str(path_to_npy)\n", 294 | " dicom_file = np.load(path_to_npy)\n", 295 | " uid = path_to_npy.split('/')[-1].replace('.npy', '')\n", 296 | " return dicom_file, uid\n", 297 | " \n", 298 | "def convert_dicom_to_8bit(npy_file, width, level, imsize=(224.,224.), clip=True): \n", 299 | " '''\n", 300 | " Given a DICOM file, window specifications, and image size, \n", 301 | " return the image as a Numpy array scaled to [0,255] of the specified size. \n", 302 | " '''\n", 303 | " array = npy_file.copy() \n", 304 | " #array = array + int(dicom_file.RescaleIntercept) #we did this on preprocess\n", 305 | " #array = array * int(dicom_file.RescaleSlope) #we did this on preprocess\n", 306 | " array = np.clip(array, level - width / 2, level + width / 2)\n", 307 | " # Rescale to [0, 255]\n", 308 | " array -= np.min(array) \n", 309 | " array /= np.max(array) \n", 310 | " array *= 255.\n", 311 | " array = array.astype('uint8')\n", 312 | " \n", 313 | " if clip:\n", 314 | " # Sometimes there is dead space around the images -- let's get rid of that\n", 315 | " nonzeros = np.nonzero(array) \n", 316 | " x1 = np.min(nonzeros[0]) ; x2 = np.max(nonzeros[0])\n", 317 | " y1 = np.min(nonzeros[1]) ; y2 = np.max(nonzeros[1])\n", 318 | " array = array[x1:x2,y1:y2]\n", 319 | "\n", 320 | " # Resize image if necessary\n", 321 | " resize_x = float(imsize[0]) / array.shape[0] \n", 322 | " resize_y = float(imsize[1]) / array.shape[1] \n", 323 | " if resize_x != 1. or resize_y != 1.:\n", 324 | " array = zoom(array, [resize_x, resize_y], order=1, prefilter=False)\n", 325 | " return np.expand_dims(array, axis=-1)\n", 326 | "\n", 327 | "def get_label_from_sop_id(df, uid):\n", 328 | " '''\n", 329 | " Given the annotations dataframe and a study ID, return a one-hot encoded\n", 330 | " vector with labels for that study ID. 
\n", 331 | " '''\n", 332 | " df = df[df.npyid == uid] \n", 333 | " labels = np.zeros((N_CLASSES,))\n", 334 | " for rownum, row in df.iterrows():\n", 335 | " lbls = row.labels.split(\"-\")\n", 336 | " for lbl in lbls:\n", 337 | " label_index = labels_dict[lbl] \n", 338 | " labels[label_index] += 1 \n", 339 | " return labels\n", 340 | "\n", 341 | "# Data augmentation involves perturbing the images in your training set \n", 342 | "# to prevent overfitting\n", 343 | "def augment(p=0.5):\n", 344 | " return Compose([\n", 345 | " HorizontalFlip(p=0.5),\n", 346 | " Blur(p=0.5),\n", 347 | " OneOf([\n", 348 | " RandomGamma(),\n", 349 | " RandomContrast(),\n", 350 | " RandomBrightness(),\n", 351 | " ], p=0.5)\n", 352 | " ], p=p)\n", 353 | "\n", 354 | "aug = augment(p=0.5)\n", 355 | "\n", 356 | "def ScoutDataGenerator(df, images, imsize, batchsize, augment=True):\n", 357 | " '''\n", 358 | " Data generator to use with Keras when training. \n", 359 | " '''\n", 360 | " while True:\n", 361 | " # Shuffle images\n", 362 | " images = np.random.permutation(images) \n", 363 | " for index in range(0, len(images), batchsize): \n", 364 | " # Get images \n", 365 | " image_batch = images[index:(index+batchsize)]\n", 366 | " dicom_and_uids = [get_dicom_and_uid(im) for im in image_batch]\n", 367 | " dicom_files = [_[0] for _ in dicom_and_uids]\n", 368 | " uids = [_[1] for _ in dicom_and_uids] \n", 369 | " array_list = [] ; uids_list = []\n", 370 | " for ind, dcm in enumerate(dicom_files): \n", 371 | " try:\n", 372 | " array_list.append(convert_dicom_to_8bit(dcm, WINDOW_WIDTH, \n", 373 | " WINDOW_LEVEL, \n", 374 | " imsize=(imsize,imsize)))\n", 375 | " uids_list.append(uids[ind])\n", 376 | " except: \n", 377 | " continue\n", 378 | " if augment: array_list = [aug(image=arr)['image'] for arr in array_list]\n", 379 | " arrays = np.asarray(array_list)\n", 380 | " # Data are labeled by studies\n", 381 | " # All images in a study share the same label \n", 382 | " arrays = preprocess_input(arrays, mode='tf')\n", 383 | " labels = np.asarray([get_label_from_sop_id(df, _) for _ in uids_list])\n", 384 | " yield arrays, labels\n", 385 | " \n", 386 | "\n", 387 | "# Let's look at the distribution of labels in the training data\n", 388 | "dicom_and_uids = [get_dicom_and_uid(im) for im in train0] \n", 389 | "uids = [_[1] for _ in dicom_and_uids] \n", 390 | "labels = np.asarray([get_label_from_sop_id(df, _) for _ in uids]) \n", 391 | "class_frequencies = np.mean(labels, axis=0) \n", 392 | "\n", 393 | "category_list = ['Chest', 'Abdomen', 'Pelvis']\n", 394 | "\n", 395 | "for cat_index, cat in enumerate(category_list):\n", 396 | " pct_frequency = round(class_frequencies[cat_index] * 100., 1)\n", 397 | " print('Frequency of {} : {}%'.format(cat, pct_frequency))" 398 | ], 399 | "execution_count": null, 400 | "outputs": [] 401 | }, 402 | { 403 | "cell_type": "markdown", 404 | "metadata": { 405 | "id": "aqeeJOqN5xcP" 406 | }, 407 | "source": [ 408 | "## Set up Keras model" 409 | ] 410 | }, 411 | { 412 | "cell_type": "markdown", 413 | "metadata": { 414 | "id": "z21oN8L5kLaE" 415 | }, 416 | "source": [ 417 | "Now we can set up a basic Keras CNN model. We will use the lightweight MobileNetV2 model since our classification problem is relatively simple. 
Important parameters that can affect model performance include: the initial learning rate (how aggressively the model makes changes to its weights), dropout probability (a method to prevent overfitting by randomly turning neurons in the CNN off), and batch size (the number of images at each iteration to adjust the CNN weights). We use the Adam optimizer, which is a popular optimizer that performs well in most cases. We use an image size of 256 x 256. Oftentimes we would see better performance with larger image sizes up to a point, after which increases in image size do not improve or even worsen performance. However, the tradeoff is that you require more GPU memory and training time. \n", 418 | "\n", 419 | "Implementing early stopping and a learning rate annealing schedule can help improve performance. A model can quickly overfit and perfectly predict the training data, especially if the model is large and the dataset is small. Early stopping uses validation performance to determine when to stop training: if a model is no longer making improvements on the validation dataset, then stop training. How long we wait and how much progress is considered improvement are both tunable parameters. Reducing the learning rate when validation performance stagnates can also improve model performance. While the initial learning rate should be relatively large to speed up convergence, training using smaller learning rates later on in the process can help the model make smaller adjustments to better fit the specific classification task. In this example, we choose to monitor the validation loss and track changes in the loss to determine when to reduce the learning rate and stop training.\n", 420 | "\n", 421 | "We initialize our model with ImageNet pretrained weights. Our dataset is relatively small so training from scratch (randomly initialized weights) will likely result in worse performance and unstable model training. Even with larger datasets, initializing with pretrained weights can speed up model convergence. Because ImageNet is composed of 3-channel RGB natural color images and our CT scout images are 1-channel grayscale images, we need to slightly modify the first layer of the ImageNet pretrained weights.\n", 422 | "\n", 423 | "Note that we use binary crossentropy as our loss function versus categorical crossentropy. We previously discussed how our problem is multi-label in that a single image can have multiple labels (e.g., chest AND abdomen). Categorical crossentropy is better suited for multi-class problems where an image has only one label. 
" 424 | ] 425 | }, 426 | { 427 | "cell_type": "code", 428 | "metadata": { 429 | "id": "UT-XqAcItusp" 430 | }, 431 | "source": [ 432 | "#######################\n", 433 | "# TRAINING PARAMETERS #\n", 434 | "#######################\n", 435 | "\n", 436 | "INITIAL_LR = 1e-4\n", 437 | "N_CLASSES = 3\n", 438 | "BATCH_SIZE = 4\n", 439 | "DROPOUT = 0.5\n", 440 | "IMSIZE = 256\n", 441 | "# Max number of epochs to train for\n", 442 | "EPOCHS = 50 \n", 443 | "# Pick a performance metric to determine whether the model is improving\n", 444 | "MONITOR = 'val_loss' \n", 445 | "# Define a minimum improvement threshold\n", 446 | "# The model must improve validation performance by at least this amount \n", 447 | "# to be considered improving\n", 448 | "MIN_DELTA = 0.001 \n", 449 | "# If the model is not improving, we should reduce the learning rate \n", 450 | "ANNEAL_BY = 0.5 \n", 451 | "# A model may not improve after 1 epoch but could improve after the next epoch\n", 452 | "# without making changes to the learning rate. How many epochs should we wait?\n", 453 | "PATIENCE = 2 \n", 454 | "# It can be a good idea to let the model \"settle in\" to a new learning rate \n", 455 | "# after decreasing it. How long should we wait?\n", 456 | "COOLDOWN = 1\n", 457 | "# Stopping model training when the model isn't improving validation performance\n", 458 | "# can help prevent overfitting. \n", 459 | "STOP_AFTER = 5\n", 460 | "\n", 461 | "# Load pretrained MobileNetV2 ImageNet weights\n", 462 | "base_model = MobileNetV2(weights='imagenet', include_top=False, input_shape=(224,224,3))\n", 463 | "imagenet_weights = base_model.get_weights()\n", 464 | "print(imagenet_weights[0].shape) \n", 465 | "# Sum over axis 2 to allow for grayscale (1-channel) input\n", 466 | "imagenet_weights[0] = np.expand_dims(np.sum(imagenet_weights[0], axis=2), axis=2)\n", 467 | "print(imagenet_weights[0].shape)\n", 468 | "base_model = MobileNetV2(weights=None, include_top=False, input_shape=(IMSIZE,IMSIZE,1)) \n", 469 | "base_model.set_weights(imagenet_weights)\n", 470 | "x = GlobalAveragePooling2D()(base_model.output) \n", 471 | "x = Dropout(DROPOUT)(x) \n", 472 | "prediction = Dense(N_CLASSES, activation='sigmoid')(x) \n", 473 | "\n", 474 | "model = Model(inputs=base_model.input, outputs=prediction)\n", 475 | "model.compile(optimizer=tf.optimizers.Adam(learning_rate=INITIAL_LR), \n", 476 | " loss='binary_crossentropy', \n", 477 | " metrics=['accuracy'])\n", 478 | "\n", 479 | "WINDOW_LEVEL, WINDOW_WIDTH = 50, 500\n", 480 | "train_scoutgen = ScoutDataGenerator(df, train1, IMSIZE, BATCH_SIZE, augment=True)\n", 481 | "valid_scoutgen = ScoutDataGenerator(df, val1, IMSIZE, BATCH_SIZE, augment=False)" 482 | ], 483 | "execution_count": null, 484 | "outputs": [] 485 | }, 486 | { 487 | "cell_type": "code", 488 | "metadata": { 489 | "id": "Blw6djmlQ04w" 490 | }, 491 | "source": [ 492 | "# serialize model to JSON\n", 493 | "model_json = model.to_json()\n", 494 | "with open(\"bodypart/model.json\", \"w\") as json_file:\n", 495 | " json_file.write(model_json)" 496 | ], 497 | "execution_count": null, 498 | "outputs": [] 499 | }, 500 | { 501 | "cell_type": "markdown", 502 | "metadata": { 503 | "id": "A7ysPBgxn1Hx" 504 | }, 505 | "source": [ 506 | "Let's take a look at an image produced by our data generator." 
507 | ] 508 | }, 509 | { 510 | "cell_type": "code", 511 | "metadata": { 512 | "id": "HvQtWNQk_BEP" 513 | }, 514 | "source": [ 515 | "# Show example image\n", 516 | "test_image = next(train_scoutgen)[0][0]\n", 517 | "plt.imshow(test_image[..., 0], cmap='gray'); plt.show() " 518 | ], 519 | "execution_count": null, 520 | "outputs": [] 521 | }, 522 | { 523 | "cell_type": "markdown", 524 | "metadata": { 525 | "id": "PUqn1oJCn40y" 526 | }, 527 | "source": [ 528 | "## Training the CNN" 529 | ] 530 | }, 531 | { 532 | "cell_type": "markdown", 533 | "metadata": { 534 | "id": "PhLp4N9dn7yR" 535 | }, 536 | "source": [ 537 | "Once we have all of our training hyperparameters set up, training the model is simple in Keras. We validate on the validation set after every epoch. \n", 538 | "\n", 539 | "When we examined the distribution of class labels previously, we saw that there was an imbalance: the chest, abdomen, and pelvis labels do not occur with equal frequency. Severe class imbalance can cause problems during training as the network will learn to simply predict the more prevalent class. Two common strategies are 1) over-/under-sampling the data so that the distributions of class labels are more similar and 2) using a weighted loss function that gives more weight to less common classes. Though our data is not severely imbalanced, adjusting for class imbalance can still result in small performance boosts. We will use an inverse-frequency weighted loss function during training." 540 | ] 541 | }, 542 | { 543 | "cell_type": "code", 544 | "metadata": { 545 | "id": "ep8iH4OSz7mi" 546 | }, 547 | "source": [ 548 | "device_name = tf.test.gpu_device_name()\n", 549 | "if device_name != '/device:GPU:0':\n", 550 | " raise SystemError('GPU device not found')\n", 551 | "print('Found GPU at: {}'.format(device_name))" 552 | ], 553 | "execution_count": null, 554 | "outputs": [] 555 | }, 556 | { 557 | "cell_type": "code", 558 | "metadata": { 559 | "id": "NvJ6cyB3l-cM" 560 | }, 561 | "source": [ 562 | "# this is on just MGUH images starting from scratch\n", 563 | "callbacks = [\n", 564 | " EarlyStopping(monitor=MONITOR, patience=STOP_AFTER, min_delta=MIN_DELTA,\n", 565 | " restore_best_weights=True),\n", 566 | " ReduceLROnPlateau(monitor=MONITOR, factor=ANNEAL_BY, patience=PATIENCE,\n", 567 | " min_delta=MIN_DELTA, mode='min', cooldown=COOLDOWN, \n", 568 | " verbose=1)\n", 569 | "]\n", 570 | "\n", 571 | "# Let's weight each class in the loss function by the inverse of its frequency\n", 572 | "weights = {} ; total_weight = 0.\n", 573 | "for freq_index, freq in enumerate(class_frequencies): \n", 574 | " weights[freq_index] = 1. 
/ freq\n", 575 | " total_weight += weights[freq_index]\n", 576 | "\n", 577 | "# Scale so that sum of weights equals the number of classes\n", 578 | "for each_class in weights.keys(): \n", 579 | " weights[each_class] = weights[each_class] / total_weight * N_CLASSES\n", 580 | "\n", 581 | "model.fit_generator(train_scoutgen, epochs=EPOCHS, \n", 582 | " steps_per_epoch=len(train1) / BATCH_SIZE, \n", 583 | " validation_data=valid_scoutgen, \n", 584 | " validation_steps=len(val1) / BATCH_SIZE,\n", 585 | " callbacks=callbacks,\n", 586 | " class_weight=weights) " 587 | ], 588 | "execution_count": null, 589 | "outputs": [] 590 | }, 591 | { 592 | "cell_type": "code", 593 | "metadata": { 594 | "id": "6rdAVXFhZIlb" 595 | }, 596 | "source": [ 597 | "title = \"mguh-multilabel.h5\"\n", 598 | "model.save_weights(title)" 599 | ], 600 | "execution_count": null, 601 | "outputs": [] 602 | }, 603 | { 604 | "cell_type": "markdown", 605 | "metadata": { 606 | "id": "vffywqr-sCt1" 607 | }, 608 | "source": [ 609 | "\n", 610 | "\n", 611 | "That concludes the training notebook. We trained a basic CNN to label CT scout exams with their anatomical regions." 612 | ] 613 | }, 614 | { 615 | "cell_type": "code", 616 | "metadata": { 617 | "id": "m1id0Y9NGTxF" 618 | }, 619 | "source": [ 620 | "" 621 | ], 622 | "execution_count": null, 623 | "outputs": [] 624 | } 625 | ] 626 | } -------------------------------------------------------------------------------- /sessions/data-curation/README.md: -------------------------------------------------------------------------------- 1 | # Data Processing & Curation for Deep Learning 2 | 3 | ## Presenters 4 | - Kirti Magudia 5 | - Walter Wiggins 6 | 7 | ## Session Date & Time 8 | Wednesday, December 1, 2021 9 | 4:30 PM -------------------------------------------------------------------------------- /sessions/data-curation/Sample_DICOM.zip: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/RSNA/AI-Deep-Learning-Lab-2021/bf7119ef9426453951dcd4c452e109f466c2f1ec/sessions/data-curation/Sample_DICOM.zip -------------------------------------------------------------------------------- /sessions/dicom-seg/README.md: -------------------------------------------------------------------------------- 1 | # DICOM In, DICOM Out for Segmentation 2 | 3 | ## Presenters 4 | - Thomas Loehfelm 5 | 6 | ## Session Date & Time 7 | Unfortunately, the presenter couldn't make it to present live this year. But feel free to peruse the notebook anyway. 8 | -------------------------------------------------------------------------------- /sessions/dicom-wrangling/README.md: -------------------------------------------------------------------------------- 1 | # DICOM Data Wrangling with Python 2 | 3 | ## Presenters 4 | - Kathy Andriole 5 | 6 | ## Session Date & Time 7 | Sunday, November 28, 2021 8 | 1:00 PM -------------------------------------------------------------------------------- /sessions/gans/README.md: -------------------------------------------------------------------------------- 1 | # Generative Adversarial Networks 2 | 3 | ## Presenters 4 | - 5 | 6 | ## Session Date & Time 7 | Monday, November 29, 2021 8 | 1:30 PM -------------------------------------------------------------------------------- /sessions/mednist-monai/README.md: -------------------------------------------------------------------------------- 1 | # MedNIST Exam Classification with MONAI 2 | 3 | ## Presenters 4 | - Kuan (Kevin) Zhang 5 | - Bradley J. 
Erickson 6 | - Jayashree Kalpathy-Cramer 7 | - Jay Biren Patel 8 | 9 | ## Session Date & Time 10 | Sunday, November 28, 2021 11 | 10:30 AM 12 | -------------------------------------------------------------------------------- /sessions/mednist-monai/RSNA21_DLL_mednist-monai.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/RSNA/AI-Deep-Learning-Lab-2021/bf7119ef9426453951dcd4c452e109f466c2f1ec/sessions/mednist-monai/RSNA21_DLL_mednist-monai.pdf -------------------------------------------------------------------------------- /sessions/multi-modal-pe/Multimodal Fusion for PE Detection (Clean).ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "colab_type": "text", 7 | "id": "view-in-github" 8 | }, 9 | "source": [ 10 | "\"Open" 11 | ] 12 | }, 13 | { 14 | "cell_type": "markdown", 15 | "metadata": { 16 | "id": "sSdKnAJl0Urr" 17 | }, 18 | "source": [ 19 | "# Multimodal Fusion for Pulmonary Embolism Classification\n", 20 | "\n", 21 | "> by **Mars (Shih-Cheng) Huang**\n", 22 | "\n", 23 | "> email: *mschuang@stanford.edu*\n", 24 | "\n", 25 | "In this demonstration, we will recreate the results from our manuscript *Multimodal fusion with deep neural networks for leveraging CT imaging and electronic health record: a case-study in pulmonary embolism detection*. \n", 26 | "\n", 27 | "Specifically, we will build a multimodal fusion model (late fusion) that combines information from both CT scans and Electronic Medical Record (EMR) to automatically diagnose the presence/absence of PE. \n", 28 | "\n", 29 | "![Workflow](https://github.com/marshuang80/AI-Deep-Learning-Lab-2021/blob/multimodal-pe/sessions/multi-modal-pe/figs/workflow.png?raw=1)\n", 30 | "\n", 31 | "### Motivation\n", 32 | "\n", 33 | "**Clinical Motivation** \n", 34 | "\n", 35 | "Pulmonary Embolism (PE) is a serious medical condition† that hospitalizes 300,000 people in the United States every year. The gold standard diagnostic modality for PE is Computed Tomography Pulmonary Angiography (CTPA) which is interpreted by radiologists. Studies have shown that prompt diagnosis and treatment can greatly reduce morbidity and mortality. Strategies to automate accurate interpretation and timely reporting of CTPA examinations may successfully triage urgent cases of PE to the immediate attention of physicians, improving time to diagnosis and treatment.\n", 36 | "\n", 37 | "**Technical Motivation** \n", 38 | "\n", 39 | "Recent advancements in deep learning have led to a resurgence of medical imaging and Electronic Medical Record (EMR) models for a variety of applications, including clinical decision support, automated workflow triage, clinical prediction and more. However, very few models have been developed to integrate both clinical and imaging data, despite that in routine practice clinicians rely on EMR to provide context in medical imaging interpretation.\n", 40 | "\n", 41 | "\n", 42 | "\n", 43 | "### Fusion Strategies\n", 44 | "![Fusion Strategies](https://github.com/marshuang80/AI-Deep-Learning-Lab-2021/blob/multimodal-pe/sessions/multi-modal-pe/figs/fusion_strategies.png?raw=1)\n", 45 | "\n", 46 | "### Data\n", 47 | "We will use a subset of RadFusion, a large-scale multimodal pulmonary embolism detection dataset consisting of 1837 CT imaging studies (comprising 600,000+ 2D CT slices) for 1794 patients and their corresponding EHR summary data. 
The full dataset with CT scans can be access via the following link: \n", 48 | "- https://stanfordaimi.azurewebsites.net/datasets/3a7548a4-8f65-4ab7-85fa-3d68c9efc1bd\n", 49 | "\n", 50 | "### References\n", 51 | "- Huang, Shih-Cheng, et al. \"PENet—a scalable deep-learning model for automated diagnosis of pulmonary embolism using volumetric CT imaging.\" NPJ digital medicine 3.1 (2020): 1-9.\n", 52 | "- Huang, Shih-Cheng, et al. \"Multimodal fusion with deep neural networks for leveraging CT imaging and electronic health record: a case-study in pulmonary embolism detection.\" Scientific reports 10.1 (2020): 1-9.\n", 53 | "- Zhou, Yuyin, et al. \"RadFusion: Benchmarking Performance and Fairness for Multimodal Pulmonary Embolism Detection from CT and EHR.\" arXiv preprint arXiv:2111.11665 (2021)." 54 | ] 55 | }, 56 | { 57 | "cell_type": "markdown", 58 | "metadata": { 59 | "id": "kYqQxeKF0Urv" 60 | }, 61 | "source": [ 62 | "## Research Use Agreement\n", 63 | "\n", 64 | "Before we can proceed to download the data, please agree to this **Research Use Agreement** by registering to download from our website:\n", 65 | "- https://stanfordaimi.azurewebsites.net/datasets/3a7548a4-8f65-4ab7-85fa-3d68c9efc1bd\n", 66 | "\n", 67 | "\n", 68 | "![User Agreement](https://github.com/marshuang80/AI-Deep-Learning-Lab-2021/blob/multimodal-pe/sessions/multi-modal-pe/figs/UserAgreement.png?raw=1)\n" 69 | ] 70 | }, 71 | { 72 | "cell_type": "markdown", 73 | "metadata": { 74 | "id": "JuxRbEun0Urw" 75 | }, 76 | "source": [ 77 | "## System Setup & Downloading the Data" 78 | ] 79 | }, 80 | { 81 | "cell_type": "code", 82 | "execution_count": null, 83 | "metadata": { 84 | "colab": { 85 | "base_uri": "https://localhost:8080/" 86 | }, 87 | "id": "n6KU6ap80Urx", 88 | "outputId": "8ff3429e-8f1a-4d40-c3e5-f8f028039c38" 89 | }, 90 | "outputs": [], 91 | "source": [ 92 | "!pip install numpy pandas scikit-learn matplotlib\n", 93 | "!gdown --id 1w0ocK3br8oqVwn6zK5qgtRaj9Ql37dtd # /content/Demographics.csv\n", 94 | "!gdown --id 1MEhVZ87J2IwFmkgxOi8WjdVKTdwOpDDY # /content/INP_MED.csv\n", 95 | "!gdown --id 1PRgFvQjqEUudeJ0FLR3DbtvqmI7t7sCT # /content/OUT_MED.csv\n", 96 | "!gdown --id 1EDZOYmWrvv6D3XaZrjVous95c9HdiBEx # /content/Vitals.csv\n", 97 | "!gdown --id 1Nlm1ZgibRv6kJBIJkQHkRh8oPqUpELnK # /content/ICD.csv\n", 98 | "!gdown --id 17Y9DJsolaRPyMkk_Xm3w-iCgSOxkQOyf # /content/LABS.csv\n", 99 | "!gdown --id 1JDb5f18uNo2hXXQqcHlRbcjswph1y98h # /content/Vision.csv" 100 | ] 101 | }, 102 | { 103 | "cell_type": "markdown", 104 | "metadata": { 105 | "id": "cAQw3sno0Ury" 106 | }, 107 | "source": [ 108 | "## Data Exploration\n", 109 | "After downloading the data, you should be able to find the following files in your directory: \n", 110 | " \n", 111 | "- Demographics.csv \n", 112 | "- INP_MED.csv\n", 113 | "- OUT_MED.csv\n", 114 | "- Vitals.csv\n", 115 | "- ICD.csv\n", 116 | "- LABS.csv\n", 117 | "- Vision.csv\n", 118 | "\n", 119 | "Let's explore the contents in each file." 120 | ] 121 | }, 122 | { 123 | "cell_type": "code", 124 | "execution_count": null, 125 | "metadata": { 126 | "colab": { 127 | "base_uri": "https://localhost:8080/" 128 | }, 129 | "id": "aqjy8qL_6ikl", 130 | "outputId": "bb8c0760-2d0a-44db-a831-a4aee41a1aea" 131 | }, 132 | "outputs": [], 133 | "source": [ 134 | "! 
ls /content" 135 | ] 136 | }, 137 | { 138 | "cell_type": "code", 139 | "execution_count": null, 140 | "metadata": { 141 | "id": "dR9jz5H20Ury" 142 | }, 143 | "outputs": [], 144 | "source": [ 145 | "# import libraries\n", 146 | "import pandas as pd\n", 147 | "import numpy as np" 148 | ] 149 | }, 150 | { 151 | "cell_type": "markdown", 152 | "metadata": { 153 | "id": "dsKPP3Op0Urz" 154 | }, 155 | "source": [ 156 | "### Patient Demographics\n", 157 | "\n", 158 | "The demographic features consist of one-hot encoded gender, race and smoking habits and the age as a numeric variable." 159 | ] 160 | }, 161 | { 162 | "cell_type": "code", 163 | "execution_count": null, 164 | "metadata": { 165 | "colab": { 166 | "base_uri": "https://localhost:8080/", 167 | "height": 221 168 | }, 169 | "id": "_KbDiSBT0Urz", 170 | "outputId": "e84fd2da-800a-4adb-f28c-f57851d5f10e" 171 | }, 172 | "outputs": [], 173 | "source": [ 174 | "demo_df = pd.read_csv('/content/Demographics.csv')\n", 175 | "print(demo_df.shape)\n", 176 | "demo_df.head(5)" 177 | ] 178 | }, 179 | { 180 | "cell_type": "markdown", 181 | "metadata": { 182 | "id": "Fy2xfJdb0Ur0" 183 | }, 184 | "source": [ 185 | "### Inpatient & Outpatient Medications\n", 186 | "\n", 187 | "641 unique classes of drugs were identified for inpatient & outpatient medication. Each medication was represented as both the frequency within the 12-month window and a binary label of whether the drug was prescribed to the patient." 188 | ] 189 | }, 190 | { 191 | "cell_type": "code", 192 | "execution_count": null, 193 | "metadata": { 194 | "colab": { 195 | "base_uri": "https://localhost:8080/", 196 | "height": 355 197 | }, 198 | "id": "5NRR6u550Ur0", 199 | "outputId": "bc87e231-1c6c-43c7-847b-1b535538c846" 200 | }, 201 | "outputs": [], 202 | "source": [ 203 | "out_med_df = pd.read_csv('/content/OUT_MED.csv')\n", 204 | "print(out_med_df.shape)\n", 205 | "out_med_df.head(5)" 206 | ] 207 | }, 208 | { 209 | "cell_type": "code", 210 | "execution_count": null, 211 | "metadata": { 212 | "colab": { 213 | "base_uri": "https://localhost:8080/", 214 | "height": 355 215 | }, 216 | "id": "ihUT8aoV0Ur1", 217 | "outputId": "8e4ab7b7-5a79-418e-9e76-fa15c6e197b7" 218 | }, 219 | "outputs": [], 220 | "source": [ 221 | "in_med_df = pd.read_csv('/content/INP_MED.csv')\n", 222 | "print(in_med_df.shape)\n", 223 | "in_med_df.head(5)" 224 | ] 225 | }, 226 | { 227 | "cell_type": "markdown", 228 | "metadata": { 229 | "id": "U1AuWQPm0Ur1" 230 | }, 231 | "source": [ 232 | "### ICD Codes\n", 233 | "\n", 234 | "We excluded all ICD codes with less than 1% occurrences in the training dataset and collapsed into top diagnosis categories, which resulted in a total of 141 diagnosis groups. We used a binary presence/absence as well as a frequency to represent diagnosis code as features. All ICD codes recorded with the same encounter number as the patient’s CT exam, or within a 24 hour window prior to their CT examination, were dropped to avoid data leakage." 
235 | ] 236 | }, 237 | { 238 | "cell_type": "code", 239 | "execution_count": null, 240 | "metadata": { 241 | "colab": { 242 | "base_uri": "https://localhost:8080/", 243 | "height": 389 244 | }, 245 | "id": "Q1KbnDsv0Ur1", 246 | "outputId": "7ac3d9d9-c151-41ce-ffae-076505353940" 247 | }, 248 | "outputs": [], 249 | "source": [ 250 | "icd_df = pd.read_csv('/content/ICD.csv')\n", 251 | "print(icd_df.shape)\n", 252 | "icd_df.head(5)" 253 | ] 254 | }, 255 | { 256 | "cell_type": "markdown", 257 | "metadata": { 258 | "id": "7hDUoeBi0Ur2" 259 | }, 260 | "source": [ 261 | "### Lab Tests\n", 262 | "\n", 263 | "We identified 22 lab tests and represented each test as binary presence/absence as well as the latest value of the test." 264 | ] 265 | }, 266 | { 267 | "cell_type": "code", 268 | "execution_count": null, 269 | "metadata": { 270 | "colab": { 271 | "base_uri": "https://localhost:8080/", 272 | "height": 258 273 | }, 274 | "id": "jptLjEkm0Ur2", 275 | "outputId": "60068c37-e98a-4e64-e47e-363def12e9eb" 276 | }, 277 | "outputs": [], 278 | "source": [ 279 | "lab_df = pd.read_csv('/content/LABS.csv')\n", 280 | "print(lab_df.shape)\n", 281 | "lab_df.head(5)" 282 | ] 283 | }, 284 | { 285 | "cell_type": "markdown", 286 | "metadata": { 287 | "id": "B7O3srjL0Ur2" 288 | }, 289 | "source": [ 290 | "### Vitals\n", 291 | "\n", 292 | "For vitals, we included systolic and diastolic blood pressure, height, weight, body mass index (BMI), temperature, respiration rate, pulse oximetry (spO2) and heart rate. The vitals were represented with respect to their sensitivity to change, which was computed by taking the derivative of the vital values along the temporal axis." 293 | ] 294 | }, 295 | { 296 | "cell_type": "code", 297 | "execution_count": null, 298 | "metadata": { 299 | "colab": { 300 | "base_uri": "https://localhost:8080/", 301 | "height": 204 302 | }, 303 | "id": "4lT3evdg0Ur3", 304 | "outputId": "d3400417-705f-4772-cc26-943dcb81442f" 305 | }, 306 | "outputs": [], 307 | "source": [ 308 | "vitals_df = pd.read_csv('/content/Vitals.csv')\n", 309 | "vitals_df.head(5)" 310 | ] 311 | }, 312 | { 313 | "cell_type": "markdown", 314 | "metadata": { 315 | "id": "uFk0oavw0Ur3" 316 | }, 317 | "source": [ 318 | "### CT Scans\n", 319 | "\n", 320 | "The RadFusion dataset includes CTPA scans for each study. Due to time and computational constraints, we have run inference on these CT scans using PENet, and stored the prediction probabilities in **Vision.csv**. Additionally, this CSV file includes the labels (PE positive / PE negative), the type of PE (central, segmental and sub-segmental) and the train/val/test split used to develop PENet. 
For more information about PENet, please refer to: \n", 321 | "- Manuscript: [https://www.nature.com/articles/s41746-020-0266-y](https://www.nature.com/articles/s41746-020-0266-y)\n", 322 | "- GitHub: [https://github.com/marshuang80/penet](https://github.com/marshuang80/penet)" 323 | ] 324 | }, 325 | { 326 | "cell_type": "code", 327 | "execution_count": null, 328 | "metadata": { 329 | "colab": { 330 | "base_uri": "https://localhost:8080/", 331 | "height": 204 332 | }, 333 | "id": "_NdD0AH30Ur3", 334 | "outputId": "18c1217b-b2ad-4ab9-9365-09561e1f9d27" 335 | }, 336 | "outputs": [], 337 | "source": [ 338 | "# TODO, remove pe_type if label = 0\n", 339 | "vision_df = pd.read_csv('/content/Vision.csv')\n", 340 | "vision_df.head(5)" 341 | ] 342 | }, 343 | { 344 | "cell_type": "markdown", 345 | "metadata": { 346 | "id": "W2fjy6NY0Ur3" 347 | }, 348 | "source": [ 349 | "## Process Data\n", 350 | "\n", 351 | "We are going to pre-process the EMR data by: \n", 352 | "- Remove any features with zero variance \n", 353 | "- Normalize all features to be within the same range\n", 354 | "\n", 355 | "Next, we are going to combine all the EMR features into one dataframe" 356 | ] 357 | }, 358 | { 359 | "cell_type": "code", 360 | "execution_count": null, 361 | "metadata": { 362 | "colab": { 363 | "base_uri": "https://localhost:8080/", 364 | "height": 335 365 | }, 366 | "id": "O8ahgXou0Ur4", 367 | "outputId": "9c3891ec-f2eb-441d-a50c-1e12d2d80738" 368 | }, 369 | "outputs": [], 370 | "source": [ 371 | "processed_emr_dfs = []\n", 372 | "for df in [demo_df, out_med_df, in_med_df, icd_df, lab_df, vitals_df]:\n", 373 | " # remove zero variance featurs\n", 374 | " df = df.loc[:,df.apply(pd.Series.nunique) != 1]\n", 375 | " \n", 376 | " # set index \n", 377 | " df = df.set_index('idx')\n", 378 | "\n", 379 | " # normalize features\n", 380 | " df = df.apply(lambda x: (x - x.mean())/(x.std()))\n", 381 | " \n", 382 | " processed_emr_dfs.append(df)\n", 383 | "\n", 384 | "emr_df = pd.concat(processed_emr_dfs, axis=1)\n", 385 | "emr_df.head(5)" 386 | ] 387 | }, 388 | { 389 | "cell_type": "markdown", 390 | "metadata": { 391 | "id": "XKXCAv4b1npr" 392 | }, 393 | "source": [ 394 | "## Create Data Splits\n", 395 | "\n", 396 | "Next, we are going to create training, validation and test splits to develop our models. We want to make sure that we use the same data splits as the vision model (PENet), so we will join our EMR dataframe with the vision dataframe." 
397 | ] 398 | }, 399 | { 400 | "cell_type": "code", 401 | "execution_count": null, 402 | "metadata": { 403 | "id": "zKmtXn2B0Ur4" 404 | }, 405 | "outputs": [], 406 | "source": [ 407 | "# Define columns\n", 408 | "EMR_FEATURE_COLS = emr_df.columns.tolist()\n", 409 | "PE_TYPE_COL = 'pe_type'\n", 410 | "SPLIT_COL = 'split'\n", 411 | "VISION_PRED_COL = 'pred'\n", 412 | "EMR_PRED_COL = 'emr_pred'\n", 413 | "FUSION_PRED_COL = 'late_fusion_pred'\n", 414 | "LABEL_COL = 'label'\n", 415 | "\n", 416 | "# Join vision information with emr dataframe\n", 417 | "vision_df = vision_df.set_index('idx')\n", 418 | "df = pd.concat([vision_df, emr_df], axis=1)\n", 419 | "\n", 420 | "# Create data splits\n", 421 | "df_dev = df[(df[SPLIT_COL] == 'train') | (df[SPLIT_COL] == 'val')] # for gridsearch CV\n", 422 | "df_train = df[df[SPLIT_COL] == 'train']\n", 423 | "df_val = df[df[SPLIT_COL] == 'val']\n", 424 | "df_test = df[df[SPLIT_COL] == 'test']" 425 | ] 426 | }, 427 | { 428 | "cell_type": "markdown", 429 | "metadata": { 430 | "id": "gGksQMbU0Ur5" 431 | }, 432 | "source": [ 433 | "## Train EMR Model\n", 434 | "For our EMR data, we are going to train a simple logistic regression model.\n", 435 | "\n", 436 | "In particular we are going to train the logistic regression model with the elasticnet penalty, which linearly combines the L₁ and L₂ penalties of the lasso and ridge methods.\n", 437 | "\n", 438 | "We will use the **LogisticRegression** class from sklearn for this task." 439 | ] 440 | }, 441 | { 442 | "cell_type": "code", 443 | "execution_count": null, 444 | "metadata": { 445 | "colab": { 446 | "base_uri": "https://localhost:8080/" 447 | }, 448 | "id": "aJUCkviY0Ur5", 449 | "outputId": "c4759d46-384d-4bcc-85a1-1b6cd05872e8" 450 | }, 451 | "outputs": [], 452 | "source": [ 453 | "from sklearn.linear_model import LogisticRegression\n", 454 | "from sklearn.model_selection import GridSearchCV\n", 455 | "\n", 456 | "# Uncomment and run grid search if time permits\n", 457 | "\"\"\"\n", 458 | "# define model\n", 459 | "clf = LogisticRegression(\n", 460 | " penalty='elasticnet', solver='saga', random_state=0\n", 461 | ")\n", 462 | "\n", 463 | "# define grid search\n", 464 | "param_grid = {\n", 465 | " \"C\": [0.01, 0.1, 1.0, 100], \n", 466 | " \"max_iter\": [10, 100, 1000],\n", 467 | " \"l1_ratio\": [0.01, 0.25, 0.5, 0.75, 0.99]\n", 468 | "}\n", 469 | "gsc = GridSearchCV(\n", 470 | " estimator=clf,\n", 471 | " param_grid=param_grid,\n", 472 | " scoring='roc_auc',\n", 473 | " n_jobs=-1,\n", 474 | " verbose=10\n", 475 | ")\n", 476 | "\n", 477 | "# run grid search\n", 478 | "gsc.fit(df_dev[EMR_FEATURE_COLS], df_dev[LABEL_COL])\n", 479 | "print(f\"Best parameters: {gsc.best_params_}\")\n", 480 | "clf = gsc.best_estimator_\n", 481 | "\"\"\"\n", 482 | "\n", 483 | "clf = LogisticRegression(\n", 484 | " penalty='elasticnet', \n", 485 | " solver='saga', \n", 486 | " random_state=0,\n", 487 | " C= 0.1, \n", 488 | " class_weight='balanced', \n", 489 | " l1_ratio= 0.99, \n", 490 | " max_iter= 1000\n", 491 | ")\n", 492 | "clf.fit(df_train[EMR_FEATURE_COLS], df_train[LABEL_COL])" 493 | ] 494 | }, 495 | { 496 | "cell_type": "markdown", 497 | "metadata": { 498 | "id": "ZTtxKD1d0Ur5" 499 | }, 500 | "source": [ 501 | "## Test EMR Model\n", 502 | "\n", 503 | "Using the trained EMR model, we will run inference on our held-out test set and extract the prediction probabilities." 
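As a quick aside on the elastic-net penalty mentioned in the Train EMR Model section above: writing `l1_ratio` as ρ, scikit-learn's elastic-net logistic regression minimizes (roughly, following the scikit-learn documentation)

$$
\min_{w,\,c}\; \frac{1-\rho}{2}\,\lVert w \rVert_2^2 \;+\; \rho\,\lVert w \rVert_1 \;+\; C \sum_{i=1}^{n} \log\!\bigl(1 + \exp\bigl(-y_i\,(x_i^\top w + c)\bigr)\bigr),
$$

so the settings used above (`l1_ratio=0.99`, `C=0.1`) correspond to a heavily L1-dominated, fairly strongly regularized model, which encourages sparse coefficients across the many EMR features.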
504 | ] 505 | }, 506 | { 507 | "cell_type": "code", 508 | "execution_count": null, 509 | "metadata": { 510 | "id": "fVMQeDfC0Ur5" 511 | }, 512 | "outputs": [], 513 | "source": [ 514 | "# test with best model\n", 515 | "emr_prob = clf.predict_proba(df_test[EMR_FEATURE_COLS])\n", 516 | "\n", 517 | "# take probability of positive class \n", 518 | "emr_prob = [p[1] for p in emr_prob]\n", 519 | "\n", 520 | "df_test = df_test.assign(emr_pred = emr_prob)" 521 | ] 522 | }, 523 | { 524 | "cell_type": "markdown", 525 | "metadata": { 526 | "id": "SrDhhMHK0Ur5" 527 | }, 528 | "source": [ 529 | "## Late Fusion (Mean Aggregation)\n", 530 | "\n", 531 | "\n", 532 | "\n", 533 | "Now that we are have prediction probabilities from both the EMR and Vision model, we will apply a simple late fusion strategy with mean aggregation. " 534 | ] 535 | }, 536 | { 537 | "cell_type": "code", 538 | "execution_count": null, 539 | "metadata": { 540 | "id": "tOsWfC4P0Ur5" 541 | }, 542 | "outputs": [], 543 | "source": [ 544 | "# Late fusion by taking the average prediction probability from vision model and emr model\n", 545 | "late_fusion_pred = np.mean(\n", 546 | " [df_test[EMR_PRED_COL], df_test[VISION_PRED_COL]], \n", 547 | " axis=0\n", 548 | ")\n", 549 | "df_test = df_test.assign(late_fusion_pred = late_fusion_pred)" 550 | ] 551 | }, 552 | { 553 | "cell_type": "markdown", 554 | "metadata": { 555 | "id": "A-ag0_5Z0Ur6" 556 | }, 557 | "source": [ 558 | "## Evaluate Performance" 559 | ] 560 | }, 561 | { 562 | "cell_type": "code", 563 | "execution_count": null, 564 | "metadata": { 565 | "colab": { 566 | "base_uri": "https://localhost:8080/", 567 | "height": 34 568 | }, 569 | "id": "XSiBXsbN0Ur6", 570 | "outputId": "c475b3c5-c842-4a3b-da92-9e50e90b6566" 571 | }, 572 | "outputs": [], 573 | "source": [ 574 | "from sklearn import metrics\n", 575 | "import matplotlib.pyplot as plt\n", 576 | "\n", 577 | "plt.style.use('ggplot')\n", 578 | "plt.figure(figsize=(7, 7))\n", 579 | "lw = 2\n", 580 | "\n", 581 | "def plot_auc(df, label):\n", 582 | " # PENet performance\n", 583 | " fpr_v, tpr_v, _ = metrics.roc_curve(\n", 584 | " df[LABEL_COL], \n", 585 | " df[VISION_PRED_COL])\n", 586 | " roc_auc_v = metrics.auc(fpr_v, tpr_v)\n", 587 | " plt.plot(\n", 588 | " fpr_v, \n", 589 | " tpr_v, \n", 590 | " color='darkorange',\n", 591 | " lw=lw, \n", 592 | " label='PENet ROC curve (area = %0.2f)' % roc_auc_v)\n", 593 | "\n", 594 | " # EMR model performance\n", 595 | " fpr_emr, tpr_emr, _ = metrics.roc_curve(\n", 596 | " df[LABEL_COL], \n", 597 | " df[EMR_PRED_COL])\n", 598 | " roc_auc_emr = metrics.auc(fpr_emr, tpr_emr)\n", 599 | " plt.plot(\n", 600 | " fpr_emr, \n", 601 | " tpr_emr,\n", 602 | " lw=lw, \n", 603 | " label='EMR Model ROC curve (area = %0.2f)' % roc_auc_emr)\n", 604 | "\n", 605 | " # Fusion model performance\n", 606 | " fpr_fusion, tpr_fusion, _ = metrics.roc_curve(\n", 607 | " df[LABEL_COL], \n", 608 | " df[FUSION_PRED_COL])\n", 609 | " roc_auc_fusion = metrics.auc(fpr_fusion, tpr_fusion)\n", 610 | " plt.plot(\n", 611 | " fpr_fusion, \n", 612 | " tpr_fusion,\n", 613 | " lw=lw, \n", 614 | " label='Fusion Model ROC curve (area = %0.2f)' % roc_auc_fusion)\n", 615 | "\n", 616 | " plt.plot([0, 1], [0, 1], color='navy', lw=lw, linestyle='--')\n", 617 | " plt.xlim([0.0, 0.95])\n", 618 | " plt.ylim([0.0, 1.05])\n", 619 | " plt.axes().set_aspect('equal', 'datalim')\n", 620 | "\n", 621 | " plt.xlabel('False Positive Rate')\n", 622 | " plt.ylabel('True Positive Rate')\n", 623 | " plt.title(f'Receiver operating characteristic ({label})')\n", 
624 | " plt.legend(loc=\"lower right\")\n", 625 | "\n", 626 | " plt.show()" 627 | ] 628 | }, 629 | { 630 | "cell_type": "code", 631 | "execution_count": null, 632 | "metadata": { 633 | "colab": { 634 | "base_uri": "https://localhost:8080/", 635 | "height": 336 636 | }, 637 | "id": "fEMRmB-b0Ur6", 638 | "outputId": "e88009f5-e6a2-4533-9036-3ad19b273d9f" 639 | }, 640 | "outputs": [], 641 | "source": [ 642 | "# Performance for all cases\n", 643 | "plot_auc(df_test, 'All Cases')" 644 | ] 645 | }, 646 | { 647 | "cell_type": "code", 648 | "execution_count": null, 649 | "metadata": { 650 | "colab": { 651 | "base_uri": "https://localhost:8080/", 652 | "height": 336 653 | }, 654 | "id": "DpFHeBhC0Ur6", 655 | "outputId": "281f25bc-f7f9-440b-a37c-82bb46e36e9c" 656 | }, 657 | "outputs": [], 658 | "source": [ 659 | "# Performance for non-subsegmental cases\n", 660 | "df_test_no_subseg = df_test[\n", 661 | " df_test[PE_TYPE_COL] != 'subsegmental']\n", 662 | "plot_auc(df_test_no_subseg, 'No Subsegmental')" 663 | ] 664 | }, 665 | { 666 | "cell_type": "code", 667 | "execution_count": null, 668 | "metadata": { 669 | "colab": { 670 | "base_uri": "https://localhost:8080/", 671 | "height": 603 672 | }, 673 | "id": "VAG9blrx0Ur6", 674 | "outputId": "c045e41d-a243-4b97-f6f5-61ae02bf35bf" 675 | }, 676 | "outputs": [], 677 | "source": [ 678 | "# Visualize histogram of Predicted Probs\n", 679 | "%matplotlib inline\n", 680 | "import matplotlib\n", 681 | "import matplotlib.pyplot as plt\n", 682 | "from matplotlib.pyplot import figure\n", 683 | "\n", 684 | "# style\n", 685 | "plt.clf()\n", 686 | "plt.style.use('ggplot')\n", 687 | "matplotlib.rc('xtick', labelsize=5) \n", 688 | "matplotlib.rc('ytick', labelsize=5) \n", 689 | "f, (ax1, ax2, ax3) = plt.subplots(1, 3, sharey=True, figsize=(7,3), dpi=150)\n", 690 | "bins = np.linspace(0, 1, 30)\n", 691 | "\n", 692 | "# seperate cases into positive and negative\n", 693 | "positive_cases = df_test_no_subseg[\n", 694 | " df_test_no_subseg[LABEL_COL] == 1]\n", 695 | "negative_cases = df_test_no_subseg[\n", 696 | " df_test_no_subseg[LABEL_COL] == 0]\n", 697 | "\n", 698 | "# PENet\n", 699 | "ax1.hist(\n", 700 | " [positive_cases[VISION_PRED_COL], negative_cases[VISION_PRED_COL]], \n", 701 | " bins, \n", 702 | " label=['positive','negative'], \n", 703 | " width=0.01)\n", 704 | "\n", 705 | "# EMR\n", 706 | "ax2.hist(\n", 707 | " [positive_cases[EMR_PRED_COL], negative_cases[EMR_PRED_COL]], \n", 708 | " bins, \n", 709 | " label=['positive', 'negative'], \n", 710 | " width=0.01)\n", 711 | "\n", 712 | "# Fusion\n", 713 | "ax3.hist(\n", 714 | " [positive_cases[FUSION_PRED_COL], negative_cases[FUSION_PRED_COL]], \n", 715 | " bins, \n", 716 | " label=['positive','negative'], \n", 717 | " width=0.01)\n", 718 | "\n", 719 | "f.tight_layout(pad=0.5)\n", 720 | "plt.legend(loc='upper right')\n", 721 | "ax2.set_xlabel(\"Predicted Probabilities\", fontsize = 10)\n", 722 | "ax1.set_ylabel(\"Count\", fontsize = 10)\n", 723 | "ax1.set_title('Vision Only', fontsize = 10)\n", 724 | "ax2.set_title('EMR Only', fontsize = 10)\n", 725 | "ax3.set_title('Fusion', fontsize = 10)\n", 726 | "plt.show()" 727 | ] 728 | }, 729 | { 730 | "cell_type": "markdown", 731 | "metadata": { 732 | "id": "MRHKH9cC0Ur6" 733 | }, 734 | "source": [ 735 | "# Bonus: Other Fusion Strategies\n", 736 | "\n", 737 | "![OtherFusionStrategies](https://github.com/marshuang80/AI-Deep-Learning-Lab-2021/blob/multimodal-pe/sessions/multi-modal-pe/figs/other_fusion_strategies.png?raw=1)" 738 | ] 739 | }, 740 | { 741 | "cell_type": "code", 
742 | "execution_count": null, 743 | "metadata": { 744 | "id": "D5qcPYsQ0Ur7" 745 | }, 746 | "outputs": [], 747 | "source": [ 748 | "# Try out other fusion strategies here\n", 749 | "\n", 750 | "## Option 1: Use another classifier for our EMR model (SVC, Decision Tree, Neural Networks...) \n", 751 | "## Option 2: Use another aggregator for late fusion (Max, Meta-classifier)\n", 752 | "## Option 3: Train separate classfiers for each type of EMR data before fusion (ICD, Vitals, Demographics ...)" 753 | ] 754 | } 755 | ], 756 | "metadata": { 757 | "colab": { 758 | "collapsed_sections": [], 759 | "include_colab_link": true, 760 | "name": "Multimodal Fusion for PE Detection.ipynb", 761 | "provenance": [] 762 | }, 763 | "kernelspec": { 764 | "display_name": "Python 3.7.6 64-bit", 765 | "language": "python", 766 | "name": "python37664bitec9bac52ca3c411ebc0b7adf9e9ef198" 767 | }, 768 | "language_info": { 769 | "codemirror_mode": { 770 | "name": "ipython", 771 | "version": 3 772 | }, 773 | "file_extension": ".py", 774 | "mimetype": "text/x-python", 775 | "name": "python", 776 | "nbconvert_exporter": "python", 777 | "pygments_lexer": "ipython3", 778 | "version": "3.7.6" 779 | } 780 | }, 781 | "nbformat": 4, 782 | "nbformat_minor": 1 783 | } 784 | -------------------------------------------------------------------------------- /sessions/multi-modal-pe/README.md: -------------------------------------------------------------------------------- 1 | # Multi-Modal Fusion for PE Detection Using CTs & EMR Data 2 | 3 | ## Presenters 4 | - Mars Huang 5 | - Matt Lungren 6 | 7 | ## Session Date & Time 8 | Wednesday, December 1, 2021 9 | 1:30 PM -------------------------------------------------------------------------------- /sessions/multi-modal-pe/figs/.DS_Store: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/RSNA/AI-Deep-Learning-Lab-2021/bf7119ef9426453951dcd4c452e109f466c2f1ec/sessions/multi-modal-pe/figs/.DS_Store -------------------------------------------------------------------------------- /sessions/multi-modal-pe/figs/UserAgreement.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/RSNA/AI-Deep-Learning-Lab-2021/bf7119ef9426453951dcd4c452e109f466c2f1ec/sessions/multi-modal-pe/figs/UserAgreement.png -------------------------------------------------------------------------------- /sessions/multi-modal-pe/figs/fusion_strategies.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/RSNA/AI-Deep-Learning-Lab-2021/bf7119ef9426453951dcd4c452e109f466c2f1ec/sessions/multi-modal-pe/figs/fusion_strategies.png -------------------------------------------------------------------------------- /sessions/multi-modal-pe/figs/late_fusion_mean_agg.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/RSNA/AI-Deep-Learning-Lab-2021/bf7119ef9426453951dcd4c452e109f466c2f1ec/sessions/multi-modal-pe/figs/late_fusion_mean_agg.png -------------------------------------------------------------------------------- /sessions/multi-modal-pe/figs/other_fusion_strategies.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/RSNA/AI-Deep-Learning-Lab-2021/bf7119ef9426453951dcd4c452e109f466c2f1ec/sessions/multi-modal-pe/figs/other_fusion_strategies.png 
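Picking up the **Bonus: Other Fusion Strategies** cell from the notebook above: the snippet below is one possible sketch of Option 2, swapping the mean aggregation for a max aggregation and for a small meta-classifier (stacking) fitted on the validation split. It is not part of the original notebook; it simply reuses the `clf`, `df_val`, `df_test`, and column-name constants defined earlier.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Option 2a: late fusion with max aggregation instead of the mean
df_test = df_test.assign(
    max_fusion_pred=np.max(
        [df_test[EMR_PRED_COL], df_test[VISION_PRED_COL]], axis=0))

# Option 2b: late fusion with a meta-classifier (stacking)
# The EMR model predictions are first needed on the validation split,
# which is then used to fit a small classifier over the two probabilities
df_val = df_val.assign(
    emr_pred=clf.predict_proba(df_val[EMR_FEATURE_COLS])[:, 1])

meta_clf = LogisticRegression()
meta_clf.fit(df_val[[EMR_PRED_COL, VISION_PRED_COL]], df_val[LABEL_COL])

df_test = df_test.assign(
    meta_fusion_pred=meta_clf.predict_proba(
        df_test[[EMR_PRED_COL, VISION_PRED_COL]])[:, 1])
```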
-------------------------------------------------------------------------------- /sessions/multi-modal-pe/figs/workflow.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/RSNA/AI-Deep-Learning-Lab-2021/bf7119ef9426453951dcd4c452e109f466c2f1ec/sessions/multi-modal-pe/figs/workflow.png -------------------------------------------------------------------------------- /sessions/nlp-basics/README.md: -------------------------------------------------------------------------------- 1 | # Basics of NLP in Radiology 2 | 3 | ## Presenters 4 | - Jae Sohn 5 | - Timothy Chen 6 | 7 | ## Session Date & Time 8 | Thursday, December 2, 2021 9 | 11:00 AM -------------------------------------------------------------------------------- /sessions/nlp-text-classification/README.md: -------------------------------------------------------------------------------- 1 | # NLP: Text Classification with RNNs & Transformers 2 | 3 | ## Presenters 4 | - Walter Wiggins 5 | - Kirti Magudia 6 | 7 | ## Session Date & Time 8 | Tuesday, November 30, 2021 9 | 3:00 PM -------------------------------------------------------------------------------- /sessions/object-detection-seg/README.md: -------------------------------------------------------------------------------- 1 | # Object Detection & Segmentation 2 | 3 | ## Presenters 4 | - Peter Chang 5 | - Simukayi Mutasa 6 | 7 | ## Session Date & Time 8 | Monday, November 29, 2021 9 | 3:00 PM 10 | -------------------------------------------------------------------------------- /sessions/object-detection-seg/segmentation.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "kernelspec": { 6 | "display_name": "Python 3", 7 | "language": "python", 8 | "name": "python3" 9 | }, 10 | "language_info": { 11 | "codemirror_mode": { 12 | "name": "ipython", 13 | "version": 3 14 | }, 15 | "file_extension": ".py", 16 | "mimetype": "text/x-python", 17 | "name": "python", 18 | "nbconvert_exporter": "python", 19 | "pygments_lexer": "ipython3", 20 | "version": "3.6.12" 21 | }, 22 | "colab": { 23 | "name": "segmentation.ipynb", 24 | "provenance": [], 25 | "include_colab_link": true 26 | } 27 | }, 28 | "cells": [ 29 | { 30 | "cell_type": "markdown", 31 | "metadata": { 32 | "id": "view-in-github", 33 | "colab_type": "text" 34 | }, 35 | "source": [ 36 | "\"Open" 37 | ] 38 | }, 39 | { 40 | "cell_type": "markdown", 41 | "metadata": { 42 | "id": "Kp_ayf96vpUx" 43 | }, 44 | "source": [ 45 | "# Overview\n", 46 | "\n", 47 | "In this tutorial we will explore how to create a contract-expanding fully convolutional neural network (CNN) for segmentation of pneumonia (lung infection) from chest radiographs, the most common imaging modality used to screen for pulmonary disease. 
For any patient with suspected lung infection, including viral pneumonia such as COVID-19, the initial imaging exam of choice is a chest radiograph.\n", 48 | "\n", 49 | "## Workshop Links\n", 50 | "\n", 51 | "Use the following link to access materials from this workshop: https://github.com/peterchang77/dl_tutor/tree/master/workshops\n", 52 | "\n", 53 | "*Tutorials*\n", 54 | "\n", 55 | "* Introduction to Tensorflow 2.0 and Keras: https://bit.ly/2VSYaop\n", 56 | "* CNN for pneumonia classification: https://bit.ly/2D9ZBrX\n", 57 | "* CNN for pneumonia segmentation: https://bit.ly/2VQMWk9 (**current tutorial**)" 58 | ] 59 | }, 60 | { 61 | "cell_type": "markdown", 62 | "metadata": { 63 | "id": "56d3oMiMw8Wm" 64 | }, 65 | "source": [ 66 | "# Environment\n", 67 | "\n", 68 | "The following lines of code will configure your Google Colab environment for this tutorial." 69 | ] 70 | }, 71 | { 72 | "cell_type": "markdown", 73 | "metadata": { 74 | "id": "uDy5-hb4vpVF" 75 | }, 76 | "source": [ 77 | "### Enable GPU runtime\n", 78 | "\n", 79 | "Use the following instructions to switch the default Colab instance into a GPU-enabled runtime:\n", 80 | "\n", 81 | "```\n", 82 | "Runtime > Change runtime type > Hardware accelerator > GPU\n", 83 | "```" 84 | ] 85 | }, 86 | { 87 | "cell_type": "markdown", 88 | "metadata": { 89 | "id": "kcSbeRlEvpVI" 90 | }, 91 | "source": [ 92 | "### Jarvis library\n", 93 | "\n", 94 | "In this notebook we will use Jarvis, a custom Python package to facilitate data science and deep learning for healthcare. Among other things, this library will be used for low-level data management, stratification and visualization of high-dimensional medical data." 95 | ] 96 | }, 97 | { 98 | "cell_type": "code", 99 | "metadata": { 100 | "id": "wfFNUF82vpVK" 101 | }, 102 | "source": [ 103 | "# --- Install Jarvis library\n", 104 | "% pip install jarvis-md" 105 | ], 106 | "execution_count": null, 107 | "outputs": [] 108 | }, 109 | { 110 | "cell_type": "markdown", 111 | "metadata": { 112 | "id": "ZnL47mGKvpVM" 113 | }, 114 | "source": [ 115 | "### Imports\n", 116 | "\n", 117 | "Use the following lines to import any needed libraries:" 118 | ] 119 | }, 120 | { 121 | "cell_type": "code", 122 | "metadata": { 123 | "id": "FEocrGo-vpVR" 124 | }, 125 | "source": [ 126 | "import numpy as np, pandas as pd\n", 127 | "from tensorflow import losses, optimizers\n", 128 | "from tensorflow.keras import Input, Model, models, layers, metrics\n", 129 | "from jarvis.train import datasets, custom\n", 130 | "from jarvis.utils.display import imshow" 131 | ], 132 | "execution_count": null, 133 | "outputs": [] 134 | }, 135 | { 136 | "cell_type": "markdown", 137 | "metadata": { 138 | "id": "eUkAD42PvpVZ" 139 | }, 140 | "source": [ 141 | "# Data\n", 142 | "\n", 143 | "The data used in this tutorial will consist of (frontal projection) chest radiographs from a subset of the RSNA / Kaggle pneumonia challenge (https://www.kaggle.com/c/rsna-pneumonia-detection-challenge). From the complete cohort, a random subset of 1,000 exams will be used for training and evaluation.\n", 144 | "\n", 145 | "### Download\n", 146 | "\n", 147 | "The custom `datasets.download(...)` method can be used to download a local copy of the dataset. By default the dataset will be archived at `/data/raw/xr_pna`; as needed an alternate location may be specified using `datasets.download(name=..., path=...)`. 
" 148 | ] 149 | }, 150 | { 151 | "cell_type": "code", 152 | "metadata": { 153 | "id": "ArdFEwDFvpVg" 154 | }, 155 | "source": [ 156 | "# --- Download dataset\n", 157 | "datasets.download(name='xr/pna-512')" 158 | ], 159 | "execution_count": null, 160 | "outputs": [] 161 | }, 162 | { 163 | "cell_type": "markdown", 164 | "metadata": { 165 | "id": "ChHRifmmvpVn" 166 | }, 167 | "source": [ 168 | "### Python generators\n", 169 | "\n", 170 | "Once the dataset is downloaded locally, Python generators to iterate through the dataset can be easily prepared using the `datasets.prepare(...)` method:" 171 | ] 172 | }, 173 | { 174 | "cell_type": "code", 175 | "metadata": { 176 | "id": "J4PpO5v0vpVq" 177 | }, 178 | "source": [ 179 | "# --- Prepare generators\n", 180 | "gen_train, gen_valid, client = datasets.prepare(name='xr/pna-512', keyword='seg-512')" 181 | ], 182 | "execution_count": null, 183 | "outputs": [] 184 | }, 185 | { 186 | "cell_type": "markdown", 187 | "metadata": { 188 | "id": "KDyHoKF_vpVr" 189 | }, 190 | "source": [ 191 | "The created generators, `gen_train` and `gen_valid`, are designed to yield two variables per iteration: `xs` and `ys`. Both `xs` and `ys` each represent a dictionary of NumPy arrays containing model input(s) and output(s) for a single *batch* of training. The use of Python generators provides a generic interface for data input for a number of machine learning libraries including Tensorflow 2.0 / Keras.\n", 192 | "\n", 193 | "Note that any valid Python iterable method can be used to loop through the generators indefinitely. For example the Python built-in `next(...)` method will yield the next batch of data:" 194 | ] 195 | }, 196 | { 197 | "cell_type": "code", 198 | "metadata": { 199 | "id": "MiBeft0lvpVy" 200 | }, 201 | "source": [ 202 | "# --- Yield one example\n", 203 | "xs, ys = next(gen_train)" 204 | ], 205 | "execution_count": null, 206 | "outputs": [] 207 | }, 208 | { 209 | "cell_type": "markdown", 210 | "metadata": { 211 | "id": "2-DlmcS5vpV3" 212 | }, 213 | "source": [ 214 | "### Data exploration\n", 215 | "\n", 216 | "To help facilitate algorithm design, each original chest radiograph has been resampled to a uniform `(512, 512)` matrix. Overall, the dataset comprises a total of `1,000` 2D images: a total of `500` negaative exams and `500` positive exams." 217 | ] 218 | }, 219 | { 220 | "cell_type": "markdown", 221 | "metadata": { 222 | "id": "RtFyWg-FvpV7" 223 | }, 224 | "source": [ 225 | "### `xs` dictionary\n", 226 | "\n", 227 | "The `xs` dictionary contains a single batch of model inputs:\n", 228 | "\n", 229 | "1. `dat`: input chest radiograph resampled to `(1, 512, 512, 1)` matrix shape" 230 | ] 231 | }, 232 | { 233 | "cell_type": "code", 234 | "metadata": { 235 | "id": "68qlizpWvpV7" 236 | }, 237 | "source": [ 238 | "# --- Print keys \n", 239 | "for key, arr in xs.items():\n", 240 | " print('xs key: {} | shape = {}'.format(key.ljust(8), arr.shape))" 241 | ], 242 | "execution_count": null, 243 | "outputs": [] 244 | }, 245 | { 246 | "cell_type": "markdown", 247 | "metadata": { 248 | "id": "cv1NZxOTvpV9" 249 | }, 250 | "source": [ 251 | "### `ys` dictionary\n", 252 | "\n", 253 | "The `ys` dictionary contains a single batch of model outputs:\n", 254 | "\n", 255 | "1. 
`pna`: output segmentation mask for pneumonia equal in size to the input `(1, 512, 512, 1)` matrix shape\n", 256 | "\n", 257 | "* 0 = pixels negative for pneumonia\n", 258 | "* 1 = pixels positive for pneumonia" 259 | ] 260 | }, 261 | { 262 | "cell_type": "code", 263 | "metadata": { 264 | "id": "VLroXDvnvpV-" 265 | }, 266 | "source": [ 267 | "# --- Print keys \n", 268 | "for key, arr in ys.items():\n", 269 | " print('ys key: {} | shape = {}'.format(key.ljust(8), arr.shape))" 270 | ], 271 | "execution_count": null, 272 | "outputs": [] 273 | }, 274 | { 275 | "cell_type": "markdown", 276 | "metadata": { 277 | "id": "haoxRUcWvpWA" 278 | }, 279 | "source": [ 280 | "### Visualization\n", 281 | "\n", 282 | "Use the following lines of code to visualize a single input image and mask using the `imshow(...)` method:" 283 | ] 284 | }, 285 | { 286 | "cell_type": "code", 287 | "metadata": { 288 | "id": "NrxVPeLYvpWA" 289 | }, 290 | "source": [ 291 | "# --- Show labels\n", 292 | "xs, ys = next(gen_train)\n", 293 | "imshow(xs['dat'][0], ys['pna'][0], radius=3)" 294 | ], 295 | "execution_count": null, 296 | "outputs": [] 297 | }, 298 | { 299 | "cell_type": "markdown", 300 | "metadata": { 301 | "id": "PNC4oWVRvpWC" 302 | }, 303 | "source": [ 304 | "Use the following lines of code to visualize an N x N mosaic of all images and masks in the current batch using the `imshow(...)` method:" 305 | ] 306 | }, 307 | { 308 | "cell_type": "code", 309 | "metadata": { 310 | "id": "pWY-P9pcvpWE" 311 | }, 312 | "source": [ 313 | "# --- Show \"montage\" of all images\n", 314 | "xs, ys = next(gen_train)\n", 315 | "imshow(xs['dat'], ys['pna'], figsize=(12, 12), radius=3)" 316 | ], 317 | "execution_count": null, 318 | "outputs": [] 319 | }, 320 | { 321 | "cell_type": "markdown", 322 | "metadata": { 323 | "id": "KID_bkiTvpWF" 324 | }, 325 | "source": [ 326 | "### Model inputs\n", 327 | "\n", 328 | "For every input in `xs`, a corresponding `Input(...)` variable can be created and returned in a `inputs` dictionary for ease of model development:" 329 | ] 330 | }, 331 | { 332 | "cell_type": "code", 333 | "metadata": { 334 | "id": "lTxi45yYvpWI" 335 | }, 336 | "source": [ 337 | "# --- Create model inputs\n", 338 | "inputs = client.get_inputs(Input)" 339 | ], 340 | "execution_count": null, 341 | "outputs": [] 342 | }, 343 | { 344 | "cell_type": "markdown", 345 | "metadata": { 346 | "id": "-oKaJZMavpWI" 347 | }, 348 | "source": [ 349 | "In this example, the equivalent Python code to generate `inputs` would be:\n", 350 | "\n", 351 | "```python\n", 352 | "inputs = {}\n", 353 | "inputs['dat'] = Input(shape=(1, 512, 512, 1))\n", 354 | "```" 355 | ] 356 | }, 357 | { 358 | "cell_type": "markdown", 359 | "metadata": { 360 | "id": "k3ShPnrZvpWI" 361 | }, 362 | "source": [ 363 | "# U-Net Architecture\n", 364 | "\n", 365 | "The **U-Net** architecture is a common fully-convolutional neural network used to perform instance segmentation. 
The network topology comprises symmetric contracting and expanding arms to map an original input image to an output segmentation mask that approximates the size of the original image:\n", 366 | "\n", 367 | "![U-Net Architecture](https://raw.githubusercontent.com/peterchang77/dl_tutor/master/cs190/spring_2020/notebooks/organ_segmentation/pngs/u-net-architecture.png)" 368 | ] 369 | }, 370 | { 371 | "cell_type": "markdown", 372 | "metadata": { 373 | "id": "JltDJb1uvpWJ" 374 | }, 375 | "source": [ 376 | "# Contracting Layers\n", 377 | "\n", 378 | "The contracting layers of a U-Net architecture are essentially identical to a standard feed-forward CNN. Compared to the original architecture above, several key modifications will be made for ease of implementation and to optimize for medical imaging tasks including:\n", 379 | "\n", 380 | "* same padding (vs. valid padding)\n", 381 | "* strided convolutions (vs. max-pooling)\n", 382 | "* smaller filters (channel depths)" 383 | ] 384 | }, 385 | { 386 | "cell_type": "markdown", 387 | "metadata": { 388 | "id": "4KcJ5p3CvpWJ" 389 | }, 390 | "source": [ 391 | "Let us start by defining the contracting layer architecture below:" 392 | ] 393 | }, 394 | { 395 | "cell_type": "code", 396 | "metadata": { 397 | "id": "9wf2zGMJvpWK" 398 | }, 399 | "source": [ 400 | "# --- Define kwargs dictionary\n", 401 | "kwargs = {\n", 402 | " 'kernel_size': (1, 3, 3),\n", 403 | " 'padding': 'same'}\n", 404 | "\n", 405 | "# --- Define lambda functions\n", 406 | "conv = lambda x, filters, strides : layers.Conv3D(filters=filters, strides=strides, **kwargs)(x)\n", 407 | "norm = lambda x : layers.BatchNormalization()(x)\n", 408 | "relu = lambda x : layers.ReLU()(x)\n", 409 | "\n", 410 | "# --- Define stride-1, stride-2 blocks\n", 411 | "conv1 = lambda filters, x : relu(norm(conv(x, filters, strides=1)))\n", 412 | "conv2 = lambda filters, x : relu(norm(conv(x, filters, strides=(1, 2, 2))))" 413 | ], 414 | "execution_count": null, 415 | "outputs": [] 416 | }, 417 | { 418 | "cell_type": "markdown", 419 | "metadata": { 420 | "id": "GZXE-An_vpWL" 421 | }, 422 | "source": [ 423 | "Using these lambda functions, let us define a simple 9-layer contracting network topology with a total of four subsample (stride-2 convolution) operations:" 424 | ] 425 | }, 426 | { 427 | "cell_type": "code", 428 | "metadata": { 429 | "id": "qEGx-AhDvpWL" 430 | }, 431 | "source": [ 432 | "# --- Define contracting layers\n", 433 | "l1 = conv1(16, inputs['dat'])\n", 434 | "l2 = conv1(32, conv2(32, l1))\n", 435 | "l3 = conv1(48, conv2(48, l2))\n", 436 | "l4 = conv1(64, conv2(64, l3))\n", 437 | "l5 = conv1(80, conv2(80, l4))" 438 | ], 439 | "execution_count": null, 440 | "outputs": [] 441 | }, 442 | { 443 | "cell_type": "markdown", 444 | "metadata": { 445 | "id": "1Cx4sO0wvpWL" 446 | }, 447 | "source": [ 448 | "**Checkpoint**: What is the shape of the `l5` feature map?" 449 | ] 450 | }, 451 | { 452 | "cell_type": "code", 453 | "metadata": { 454 | "id": "BotXignovpWL" 455 | }, 456 | "source": [ 457 | "" 458 | ], 459 | "execution_count": null, 460 | "outputs": [] 461 | }, 462 | { 463 | "cell_type": "markdown", 464 | "metadata": { 465 | "id": "dYaszKWvvpWL" 466 | }, 467 | "source": [ 468 | "# Expanding Layers" 469 | ] 470 | }, 471 | { 472 | "cell_type": "markdown", 473 | "metadata": { 474 | "id": "yH9i_CRuvpWL" 475 | }, 476 | "source": [ 477 | "The expanding layers are simply implemented by reversing the operations found in the contracting layers above. 
Specifically, each subsample operation is now replaced by a **convolutional transpose**. Due to the use of **same** padding, defining a transpose operation with the exact same parameters as a strided convolution will ensure that layers in the expanding pathway will exactly match the shape of the corresponding contracting layer.\n", 478 | "\n", 479 | "### Convolutional transpose\n", 480 | "\n", 481 | "Let us start by defining an additional lambda function for the convolutional transpose:" 482 | ] 483 | }, 484 | { 485 | "cell_type": "code", 486 | "metadata": { 487 | "id": "ig5vKp-0vpWM" 488 | }, 489 | "source": [ 490 | "# --- Define single transpose\n", 491 | "tran = lambda x, filters, strides : layers.Conv3DTranspose(filters=filters, strides=strides, **kwargs)(x)\n", 492 | "\n", 493 | "# --- Define transpose block\n", 494 | "tran2 = lambda filters, x : relu(norm(tran(x, filters, strides=(1, 2, 2))))" 495 | ], 496 | "execution_count": null, 497 | "outputs": [] 498 | }, 499 | { 500 | "cell_type": "markdown", 501 | "metadata": { 502 | "id": "eMlH1CejvpWN" 503 | }, 504 | "source": [ 505 | "Carefully compare these functions to the single `conv` operations as well as the `conv1` and `conv2` blocks above. Notice that they share the exact same configurations.\n", 506 | "\n", 507 | "Let us now apply the first convolutional transpose block to the `l5` feature map:" 508 | ] 509 | }, 510 | { 511 | "cell_type": "code", 512 | "metadata": { 513 | "id": "YgpluDu3vpWO" 514 | }, 515 | "source": [ 516 | "# --- Define expanding layers\n", 517 | "l6 = tran2(64, l5)" 518 | ], 519 | "execution_count": null, 520 | "outputs": [] 521 | }, 522 | { 523 | "cell_type": "markdown", 524 | "metadata": { 525 | "id": "4noJUVEEvpWO" 526 | }, 527 | "source": [ 528 | "**Checkpoint**: What is the shape of the `l6` feature map?" 529 | ] 530 | }, 531 | { 532 | "cell_type": "markdown", 533 | "metadata": { 534 | "id": "PHNd25PzvpWP" 535 | }, 536 | "source": [ 537 | "### Concatenation\n", 538 | "\n", 539 | "The first connection in this specific U-Net derived architecture is a link between the `l4` and the `l6` layers:\n", 540 | "\n", 541 | "```\n", 542 | "l1 -------------------> l9\n", 543 | " \\ /\n", 544 | " l2 -------------> l8\n", 545 | " \\ / \n", 546 | " l3 -------> l7\n", 547 | " \\ /\n", 548 | " l4 -> l6\n", 549 | " \\ /\n", 550 | " l5\n", 551 | "```\n", 552 | "\n", 553 | "To mediate the first connection between contracting and expanding layers, we must ensure that `l4` and `l6` match in feature map size (the number of filters / channel depth *do not* necessarily). Using the `same` padding as above should ensure that this is the case and thus simplifies the connection operation:" 554 | ] 555 | }, 556 | { 557 | "cell_type": "code", 558 | "metadata": { 559 | "id": "y-iF7nUmvpWP" 560 | }, 561 | "source": [ 562 | "# --- Ensure shapes match\n", 563 | "print(l4.shape)\n", 564 | "print(l6.shape)\n", 565 | "\n", 566 | "# --- Concatenate\n", 567 | "concat = lambda a, b : layers.Concatenate()([a, b])\n", 568 | "concat(l4, l6)" 569 | ], 570 | "execution_count": null, 571 | "outputs": [] 572 | }, 573 | { 574 | "cell_type": "markdown", 575 | "metadata": { 576 | "id": "X5eEaYFvvpWP" 577 | }, 578 | "source": [ 579 | "Note that since `l4` and `l6` are **exactly the same shape** (including matching channel depth), what additional operation could be used here instead of a concatenation?" 
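One possible answer to the checkpoint question above: because `l4` and `l6` match in both spatial shape and channel depth, the two feature maps could simply be summed element-wise (a residual-style connection) rather than concatenated. A minimal sketch, using the Keras layers already imported in this notebook:

```python
# --- Residual-style alternative to concatenation (requires identical shapes)
add = lambda a, b : layers.Add()([a, b])
combined = add(l4, l6)
```

Addition keeps the channel depth at 64 instead of doubling it to 128, so the subsequent `conv1` block carries fewer parameters; concatenation, on the other hand, lets the network learn how to weight the two pathways.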
580 | ] 581 | }, 582 | { 583 | "cell_type": "markdown", 584 | "metadata": { 585 | "id": "_1H0823_vpWP" 586 | }, 587 | "source": [ 588 | "### Full expansion\n", 589 | "\n", 590 | "Alternate the use of `conv1` and `tran2` blocks to build the remainder of the expanding pathway:" 591 | ] 592 | }, 593 | { 594 | "cell_type": "code", 595 | "metadata": { 596 | "id": "hgwH5dgWvpWQ" 597 | }, 598 | "source": [ 599 | "# --- Define expanding layers\n", 600 | "l7 = tran2(48, conv1(64, concat(l4, l6)))\n", 601 | "l8 = tran2(32, conv1(48, concat(l3, l7)))\n", 602 | "l9 = tran2(16, conv1(32, concat(l2, l8)))\n", 603 | "l10 = conv1(16, l9)" 604 | ], 605 | "execution_count": null, 606 | "outputs": [] 607 | }, 608 | { 609 | "cell_type": "markdown", 610 | "metadata": { 611 | "id": "xAcFVDiPvpWS" 612 | }, 613 | "source": [ 614 | "# Logits\n", 615 | "\n", 616 | "The last convolution projects the `l10` feature map into a total of just `n` feature maps, one for each possible class prediction. In this 2-class prediction task, a total of `2` feature maps will be needed. Recall that these feature maps essentially act as a set of **logit scores** for each voxel location throughout the image. As with a standard CNN architecture, **do not** use an activation here in the final convolution:" 617 | ] 618 | }, 619 | { 620 | "cell_type": "code", 621 | "metadata": { 622 | "id": "mJ4E8NYSvpWT" 623 | }, 624 | "source": [ 625 | "# --- Create logits\n", 626 | "logits = {}\n", 627 | "logits['pna'] = layers.Conv3D(filters=2, name='pna', **kwargs)(l10)" 628 | ], 629 | "execution_count": null, 630 | "outputs": [] 631 | }, 632 | { 633 | "cell_type": "markdown", 634 | "metadata": { 635 | "id": "L7knAqtAvpWT" 636 | }, 637 | "source": [ 638 | "# Model\n", 639 | "\n", 640 | "Let us first create our model:" 641 | ] 642 | }, 643 | { 644 | "cell_type": "code", 645 | "metadata": { 646 | "id": "J6DX6YTNvpWT" 647 | }, 648 | "source": [ 649 | "# --- Create model\n", 650 | "model = Model(inputs=inputs, outputs=logits)" 651 | ], 652 | "execution_count": null, 653 | "outputs": [] 654 | }, 655 | { 656 | "cell_type": "markdown", 657 | "metadata": { 658 | "id": "fvJM5yk4vpWT" 659 | }, 660 | "source": [ 661 | "### Custom Dice score metric\n", 662 | "\n", 663 | "The metric of choice for tracking performance of a medical image segmentation algorithm is the **Dice score**. The Dice score is not a default metric built in the Tensorflow library, however a custom metric is available for your convenience as part of the `jarvis-md` package. It is invoked using the `custom.dsc(cls=...)` call, where the argument `cls` refers to the number of *non-zero* classes to track (e.g. the background Dice score is typically not tracked). In this exercise, it will be important to track the performance of segmentation for **pneumonia** (class = 1) only, thus set the `cls` argument to `1`." 
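As a quick reminder of what this metric measures: for a predicted mask $P$ and ground-truth mask $G$ of the target class, the Dice score is

$$
\mathrm{DSC}(P, G) = \frac{2\,\lvert P \cap G \rvert}{\lvert P \rvert + \lvert G \rvert},
$$

ranging from 0 (no overlap) to 1 (perfect overlap). This is the same quantity computed manually by the `dice(...)` helper later in this notebook, which adds a small epsilon to the denominator for numerical stability.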
664 | ] 665 | }, 666 | { 667 | "cell_type": "code", 668 | "metadata": { 669 | "id": "Niit_A0zvpWT" 670 | }, 671 | "source": [ 672 | "# --- Compile model\n", 673 | "model.compile(\n", 674 | " optimizer=optimizers.Adam(learning_rate=2e-4),\n", 675 | " loss={'pna': losses.SparseCategoricalCrossentropy(from_logits=True)},\n", 676 | " metrics={'pna': custom.dsc(cls=1)},\n", 677 | " experimental_run_tf_function=False)" 678 | ], 679 | "execution_count": null, 680 | "outputs": [] 681 | }, 682 | { 683 | "cell_type": "markdown", 684 | "metadata": { 685 | "id": "x2BS9O6cvpWU" 686 | }, 687 | "source": [ 688 | "# Model Training\n", 689 | "\n", 690 | "### In-Memory Data\n", 691 | "\n", 692 | "The following line of code will load all training data into RAM memory. This strategy can be effective for increasing speed of training for small to medium-sized datasets." 693 | ] 694 | }, 695 | { 696 | "cell_type": "code", 697 | "metadata": { 698 | "id": "7YlLT6ojvpWU" 699 | }, 700 | "source": [ 701 | "# --- Load data into memory\n", 702 | "client.load_data_in_memory()" 703 | ], 704 | "execution_count": null, 705 | "outputs": [] 706 | }, 707 | { 708 | "cell_type": "markdown", 709 | "metadata": { 710 | "id": "rDSy6DnDvpWU" 711 | }, 712 | "source": [ 713 | "### Training\n", 714 | "\n", 715 | "Once the model has been compiled and the data prepared (via a generator), training can be invoked using the `model.fit(...)` method. Ensure that both the training and validation data generators are used. In this particular example, we are defining arbitrary epochs of 100 steps each. Training will proceed for 8 epochs in total. Validation statistics will be assess every fourth epoch. As needed, tune these arugments as need." 716 | ] 717 | }, 718 | { 719 | "cell_type": "code", 720 | "metadata": { 721 | "id": "n9G-Xu2avpWU" 722 | }, 723 | "source": [ 724 | "model.fit(\n", 725 | " x=gen_train, \n", 726 | " steps_per_epoch=100, \n", 727 | " epochs=8,\n", 728 | " validation_data=gen_valid,\n", 729 | " validation_steps=100,\n", 730 | " validation_freq=4)" 731 | ], 732 | "execution_count": null, 733 | "outputs": [] 734 | }, 735 | { 736 | "cell_type": "markdown", 737 | "metadata": { 738 | "id": "xRiSQPyjvpWU" 739 | }, 740 | "source": [ 741 | "# Evaluation\n", 742 | "\n", 743 | "To test the trained model, the following steps are required:\n", 744 | "\n", 745 | "* load data\n", 746 | "* use `model.predict(...)` to obtain logit scores\n", 747 | "* use `np.argmax(...)` to obtain prediction\n", 748 | "* compare prediction with ground-truth\n", 749 | "\n", 750 | "Recall that the generator used to train the model simply iterates through the dataset randomly. For model evaluation, the cohort must instead be loaded manually in an orderly way. For this tutorial, we will create new **test mode** data generators, which will simply load each example individually once for testing. " 751 | ] 752 | }, 753 | { 754 | "cell_type": "code", 755 | "metadata": { 756 | "id": "ysz9AF2kvpWU" 757 | }, 758 | "source": [ 759 | "# --- Create validation generator\n", 760 | "test_train, test_valid = client.create_generators(test=True)" 761 | ], 762 | "execution_count": null, 763 | "outputs": [] 764 | }, 765 | { 766 | "cell_type": "markdown", 767 | "metadata": { 768 | "id": "UjSEOO_SvpWV" 769 | }, 770 | "source": [ 771 | "### Dice score\n", 772 | "\n", 773 | "While the Dice score metric for Tensorflow has been provided already, an implementation must still be used to manually calculate the performance during validation. 
Use the following code cell block to implement:" 774 | ] 775 | }, 776 | { 777 | "cell_type": "code", 778 | "metadata": { 779 | "id": "HEuqS7V-vpWV" 780 | }, 781 | "source": [ 782 | "def dice(y_true, y_pred, c=1, epsilon=1):\n", 783 | " \"\"\"\n", 784 | " Method to calculate the Dice score coefficient for given class\n", 785 | " \n", 786 | " :params\n", 787 | " \n", 788 | " (np.ndarray) y_true : ground-truth label\n", 789 | " (np.ndarray) y_pred : predicted logits scores\n", 790 | " (int) c : class to calculate DSC on\n", 791 | " \n", 792 | " \"\"\"\n", 793 | " assert y_true.ndim == y_pred.ndim\n", 794 | " \n", 795 | " true = y_true[..., 0] == c\n", 796 | " pred = np.argmax(y_pred, axis=-1) == c \n", 797 | "\n", 798 | " A = np.count_nonzero(true & pred) * 2\n", 799 | " B = np.count_nonzero(true) + np.count_nonzero(pred) + epsilon\n", 800 | " \n", 801 | " return A / B" 802 | ], 803 | "execution_count": null, 804 | "outputs": [] 805 | }, 806 | { 807 | "cell_type": "markdown", 808 | "metadata": { 809 | "id": "7K-xl6uMvpWZ" 810 | }, 811 | "source": [ 812 | "Use the following lines of code to loop through the test set generator and run model prediction on each example:" 813 | ] 814 | }, 815 | { 816 | "cell_type": "code", 817 | "metadata": { 818 | "id": "9eRKIwmFvpWZ" 819 | }, 820 | "source": [ 821 | "# --- Test model\n", 822 | "dsc = []\n", 823 | "\n", 824 | "for x, y in test_valid:\n", 825 | " \n", 826 | " if y['pna'].any():\n", 827 | " \n", 828 | " # --- Predict\n", 829 | " logits = model.predict(x['dat'])\n", 830 | "\n", 831 | " if type(logits) is dict:\n", 832 | " logits = logits['pna']\n", 833 | "\n", 834 | " # --- Argmax\n", 835 | " dsc.append(dice(y['pna'][0], logits[0], c=1))\n", 836 | "\n", 837 | "dsc = np.array(dsc)" 838 | ], 839 | "execution_count": null, 840 | "outputs": [] 841 | }, 842 | { 843 | "cell_type": "markdown", 844 | "metadata": { 845 | "id": "2heHGcZ0vpWa" 846 | }, 847 | "source": [ 848 | "Use the following lines of code to calculate validation cohort performance:" 849 | ] 850 | }, 851 | { 852 | "cell_type": "code", 853 | "metadata": { 854 | "id": "SJoRM1YkvpWa" 855 | }, 856 | "source": [ 857 | "# --- Calculate accuracy\n", 858 | "print('{}: {:0.5f}'.format('Mean Dice'.ljust(20), np.mean(dsc)))\n", 859 | "print('{}: {:0.5f}'.format('Median Dice'.ljust(20), np.median(dsc)))\n", 860 | "print('{}: {:0.5f}'.format('25th-centile Dice'.ljust(20), np.percentile(dsc, 25)))\n", 861 | "print('{}: {:0.5f}'.format('75th-centile Dice'.ljust(20), np.percentile(dsc, 75)))" 862 | ], 863 | "execution_count": null, 864 | "outputs": [] 865 | }, 866 | { 867 | "cell_type": "markdown", 868 | "metadata": { 869 | "id": "MO6vVUjgvpWc" 870 | }, 871 | "source": [ 872 | "## Saving and Loading a Model\n", 873 | "\n", 874 | "After a model has been successfully trained, it can be saved and/or loaded by simply using the `model.save()` and `models.load_model()` methods. 
" 875 | ] 876 | }, 877 | { 878 | "cell_type": "code", 879 | "metadata": { 880 | "id": "6lI21ceyvpWc" 881 | }, 882 | "source": [ 883 | "# --- Serialize a model\n", 884 | "model.save('./cnn.hdf5')" 885 | ], 886 | "execution_count": null, 887 | "outputs": [] 888 | }, 889 | { 890 | "cell_type": "code", 891 | "metadata": { 892 | "id": "dMPVYS22vpWc" 893 | }, 894 | "source": [ 895 | "# --- Load a serialized model\n", 896 | "del model\n", 897 | "model = models.load_model('./cnn.hdf5', compile=False)" 898 | ], 899 | "execution_count": null, 900 | "outputs": [] 901 | } 902 | ] 903 | } -------------------------------------------------------------------------------- /sessions/pneumonia-detection/README.md: -------------------------------------------------------------------------------- 1 | # Pneumonia Detection Model Building 2 | 3 | ## Presenters 4 | - Felipe Kitamura 5 | - Ian Pan 6 | 7 | ## Session Date & Time 8 | 1. Monday, November 29, 2021 - 4:30 PM 9 | 2. Wednesday, December 1, 2021 - 9:30 AM 10 | -------------------------------------------------------------------------------- /sessions/tcga-gbm/README.md: -------------------------------------------------------------------------------- 1 | # Integrating Genomic & Imaging Data with TCGA-GBM 2 | 3 | ## Presenters 4 | - Gian Marco Conte 5 | - Pouria Rouzrokh 6 | 7 | ## Session Date & Time 8 | Monday, November 29, 2021 9 | 11:00 AM -------------------------------------------------------------------------------- /sessions/tcia-idc/README.md: -------------------------------------------------------------------------------- 1 | # Working with Public Datasets: TCIA & IDC 2 | 3 | ## Presenters 4 | - Andrey Fedorov 5 | - Justin Kirby 6 | 7 | ## Session Date & Time 8 | 1. Tuesday, November 30, 2021 - 11:00 AM 9 | 2. 
Wednesday, December 1, 2021 - 11:00 AM -------------------------------------------------------------------------------- /sessions/yolo/README.md: -------------------------------------------------------------------------------- 1 | # YOLO: Bounding Box Segmentation & Classification 2 | 3 | ## Presenters 4 | - Pouria Rouzrokh 5 | - Brad Erickson 6 | 7 | ## Session Date & Time 8 | Monday, November 29, 2021 9 | 9:30 AM -------------------------------------------------------------------------------- /sessions/yolo/Train_YOLOv5_Practice_Notebook.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 5, 4 | "metadata": { 5 | "kernelspec": { 6 | "display_name": "Python 3", 7 | "language": "python", 8 | "name": "python3" 9 | }, 10 | "language_info": { 11 | "codemirror_mode": { 12 | "name": "ipython", 13 | "version": 3 14 | }, 15 | "file_extension": ".py", 16 | "mimetype": "text/x-python", 17 | "name": "python", 18 | "nbconvert_exporter": "python", 19 | "pygments_lexer": "ipython3", 20 | "version": "3.7.10" 21 | }, 22 | "colab": { 23 | "name": "Train_YOLOv5_Practice_Notebook.ipynb", 24 | "provenance": [], 25 | "collapsed_sections": [], 26 | "include_colab_link": true 27 | }, 28 | "accelerator": "GPU" 29 | }, 30 | "cells": [ 31 | { 32 | "cell_type": "markdown", 33 | "metadata": { 34 | "id": "view-in-github", 35 | "colab_type": "text" 36 | }, 37 | "source": [ 38 | "\"Open" 39 | ] 40 | }, 41 | { 42 | "cell_type": "markdown", 43 | "metadata": { 44 | "id": "07e01ff0-c83d-402c-95dc-2cad366e1d03" 45 | }, 46 | "source": [ 47 | "## Brain Hemorrhage Detection Model\n", 48 | "\n", 49 | "Welcome to the RSNA2021 Object detection workshop!\n", 50 | "\n", 51 | "In this notebook, we train a YOLOv5 deep learning model to detect brain hemorrhage on Head CT scans. \n", 52 | "\n", 53 | "* For our model, we use the [Ultralytics](https://github.com/ultralytics/yolov5) implementation of YOLOv5.\n", 54 | "* For our data, we use the publicly available [CQ500 Head CT-scan dataset](http://headctstudy.qure.ai/dataset) of patients with brain hemorrhage. \n", 55 | "* For our labels, we use the bounding box annotation on CQ500 dataset by [Physionet](https://physionet.org/content/bhx-brain-bounding-box/1.1/).\n", 56 | "\n", 57 | "Hopefully, by reviewing this notebook you will be able to train a model that can detect hemorrhage lesions on brain CT scans, as plotted in the image below:\n", 58 | "
\n", 59 | "\n", 60 | "In this notebook, you are supposed to complete a few coding blocks in different cells. This will help you to think better about the processes we need to take for training and applying our model. Don't worry, in case you did not find out the right code to put in the cells, we will provide it to you. We are all learning after all!\n" 61 | ], 62 | "id": "07e01ff0-c83d-402c-95dc-2cad366e1d03" 63 | }, 64 | { 65 | "cell_type": "markdown", 66 | "metadata": { 67 | "id": "oAKCHZoSUmAN" 68 | }, 69 | "source": [ 70 | "### Part 0: Setting up the working directory\n", 71 | "We start our work by installing some required libraries and cloning our DICOMs and ground truths from a GitHub repository that we have made before. In this notebook, we will train our model on 3500 DICOMs from the CQ500 dataset. Feel free to download the entire dataset and train stronger models on it after this workshop!" 72 | ], 73 | "id": "oAKCHZoSUmAN" 74 | }, 75 | { 76 | "cell_type": "code", 77 | "metadata": { 78 | "id": "HU565R0cbDh3" 79 | }, 80 | "source": [ 81 | "# Installing required packages\n", 82 | "\n", 83 | "!pip install python-gdcm pydicom -q\n", 84 | "!pip uninstall PyYAML -y -q\n", 85 | "!pip install PyYAML==5.3.1 -q\n", 86 | "!pip install --upgrade scikit-learn -q\n", 87 | "!pip install --upgrade pillow -q" 88 | ], 89 | "id": "HU565R0cbDh3", 90 | "execution_count": null, 91 | "outputs": [] 92 | }, 93 | { 94 | "cell_type": "markdown", 95 | "metadata": { 96 | "id": "zyG9ovYYtDkO" 97 | }, 98 | "source": [ 99 | "**MAKE SURE TO RESTART YOUR RUNTIME BEFORE PROCEEDING. YOU CAN DO THIS BY CLICKING ON THE RUNTIME MENU FROM THE TOP OF THE PAGE -> RESTART RUNTIME**" 100 | ], 101 | "id": "zyG9ovYYtDkO" 102 | }, 103 | { 104 | "cell_type": "code", 105 | "metadata": { 106 | "id": "aA6KK1za_xEp" 107 | }, 108 | "source": [ 109 | "# Removing the \"sample_data\" folder that google colab creates by default\n", 110 | "\n", 111 | "import shutil\n", 112 | "shutil.rmtree('sample_data', ignore_errors=True)" 113 | ], 114 | "id": "aA6KK1za_xEp", 115 | "execution_count": null, 116 | "outputs": [] 117 | }, 118 | { 119 | "cell_type": "code", 120 | "metadata": { 121 | "id": "3CuAOpWK5IYa" 122 | }, 123 | "source": [ 124 | "# Importing DICOM and label files to our working directory\n", 125 | "%%time\n", 126 | "\n", 127 | "import os\n", 128 | "if not os.path.exists('RSNA2021_YOLOv5_Workshop'): # To make the cell work prperly if run multiple times. \n", 129 | " !git clone https://github.com/Mayo-Radiology-Informatics-Lab/RSNA2021_YOLOv5_Workshop.git" 130 | ], 131 | "id": "3CuAOpWK5IYa", 132 | "execution_count": null, 133 | "outputs": [] 134 | }, 135 | { 136 | "cell_type": "markdown", 137 | "metadata": { 138 | "id": "6HywYobnwBPr" 139 | }, 140 | "source": [ 141 | "OK, now feel free to check your working directory on the right side of the screen. You should click on the folder icon and will then see a folder named \"RSNA2021_YOLOv5_Workshop\". If you look inside that folder, you will see a folder containing our DICOM files and a CSV file containing our labels." 142 | ], 143 | "id": "6HywYobnwBPr" 144 | }, 145 | { 146 | "cell_type": "markdown", 147 | "metadata": { 148 | "id": "fda2b69f-71e8-41e7-a610-eeb2221e84e1" 149 | }, 150 | "source": [ 151 | "### Part 1: Extracting CT images out of DICOMs\n", 152 | "YOLO works on 2-dimensional images, therefore we need to convert our DICOM files to images before we can train a model on them. 
For this, we first create a list of paths to all our DICOMs, then read each DICOM with Pydicom library, and finally convert the DICOM arrays to PNG images, while bringing them to the brain window." 153 | ], 154 | "id": "fda2b69f-71e8-41e7-a610-eeb2221e84e1" 155 | }, 156 | { 157 | "cell_type": "code", 158 | "metadata": { 159 | "id": "75318a77-72ee-41f6-b0ed-fc7c666eae4c" 160 | }, 161 | "source": [ 162 | "# Collecting the paths to DICOM files\n", 163 | "\n", 164 | "import os\n", 165 | "\n", 166 | "DICOMs_dir = 'RSNA2021_YOLOv5_Workshop/DICOMs'\n", 167 | "dcmpaths = list()\n", 168 | "for root, dirs, files in os.walk(DICOMs_dir):\n", 169 | " for file in files:\n", 170 | " if file.endswith('.dcm'):\n", 171 | " dcmpaths.append(os.path.join(root, file))\n", 172 | "\n", 173 | "print(f'{len(dcmpaths)} DICOMs were found!')" 174 | ], 175 | "id": "75318a77-72ee-41f6-b0ed-fc7c666eae4c", 176 | "execution_count": null, 177 | "outputs": [] 178 | }, 179 | { 180 | "cell_type": "code", 181 | "metadata": { 182 | "id": "i_2VgHm6ZXOS" 183 | }, 184 | "source": [ 185 | "import pydicom\n", 186 | "import numpy as np\n", 187 | "\n", 188 | "def Read_DICOM(dcmpath):\n", 189 | " \"\"\"\n", 190 | " Recive a DICOM file, extract its image array, and convert the pixel values\n", 191 | " to the Hounsfield Unit (HU). \n", 192 | " inputs:\n", 193 | " - dcmpath: Path to a DICOM file saved on disk.\n", 194 | " outputs:\n", 195 | " - img: Extracted image array from the DICOM file that has been \n", 196 | " conveted to HU.\n", 197 | " \"\"\"\n", 198 | "\n", 199 | " # What you should do:\n", 200 | "\n", 201 | " # 1- Load the DICOM file using pydicom (tip: use pydicom.dcmread command).\n", 202 | "\n", 203 | " # 2- Extract the image array from the DICOM (tip: the image array inside a \n", 204 | " # DICOM file could be accessed using dcm.pixel_array command). \n", 205 | "\n", 206 | " # 3- Get the DICOM slope (tip: use dcm.RescaleSlope). \n", 207 | "\n", 208 | " # 4- Get the DICOM intercept (tip: use dcm.RescaleIntercept).\n", 209 | "\n", 210 | " # 5- Multiply each pixel value of the image array in the DICOM slope and \n", 211 | " # then add the output to the DICOM intercept.\n", 212 | "\n", 213 | " ##### START YOUR CODE HERE (~3 - 5 lines): \n", 214 | "\n", 215 | "\n", 216 | " ##### END YOUR CODE HERE.\n", 217 | "\n", 218 | " return img" 219 | ], 220 | "id": "i_2VgHm6ZXOS", 221 | "execution_count": null, 222 | "outputs": [] 223 | }, 224 | { 225 | "cell_type": "code", 226 | "metadata": { 227 | "cellView": "form", 228 | "id": "wNmkC4LiZ5qY" 229 | }, 230 | "source": [ 231 | "#@title Code to complete the previous cell!\n", 232 | "\n", 233 | "# dcm = pydicom.dcmread(dcmpath)\n", 234 | "# img = dcm.pixel_array\n", 235 | "# intercept = float(dcm.RescaleIntercept)\n", 236 | "# slope = float(dcm.RescaleSlope)\n", 237 | "# img = slope * img + intercept" 238 | ], 239 | "id": "wNmkC4LiZ5qY", 240 | "execution_count": null, 241 | "outputs": [] 242 | }, 243 | { 244 | "cell_type": "code", 245 | "metadata": { 246 | "id": "X16ZNuK6ZM-F" 247 | }, 248 | "source": [ 249 | "from PIL import Image\n", 250 | "import matplotlib.pyplot as plt\n", 251 | "\n", 252 | "images_dir = 'images'\n", 253 | "os.makedirs(images_dir, exist_ok=True)\n", 254 | "\n", 255 | "\n", 256 | "def Apply_window(img, ww=80, wl=40):\n", 257 | " \"\"\"\n", 258 | " A function to apply windowing on the CT-scan. 
Default values represent the brain window.\n", 259 | " \"\"\"\n", 260 | " U = 255\n", 261 | " W = U / ww\n", 262 | " b = (-U/ww) * (wl-ww/2)\n", 263 | " img = W*img + b\n", 264 | " img = np.where(img > U, U, img)\n", 265 | " img = np.where(img < 0, 0, img)\n", 266 | " return img" 267 | ], 268 | "id": "X16ZNuK6ZM-F", 269 | "execution_count": null, 270 | "outputs": [] 271 | }, 272 | { 273 | "cell_type": "code", 274 | "metadata": { 275 | "id": "68606375-c9f4-4cf8-986a-5cce74556a8a" 276 | }, 277 | "source": [ 278 | "def Extract_image(dcmpath, test_run=False):\n", 279 | " \"\"\"\n", 280 | " Extract imaging array out of a given DICOM file.\n", 281 | " \"\"\"\n", 282 | " \n", 283 | " # Reading the imaging data from the DICOM file and converting \n", 284 | " # its pixel values to the Hounsfield Unit (HU).\n", 285 | " img = Read_DICOM(dcmpath)\n", 286 | " \n", 287 | " # Widnowing to the brain window.\n", 288 | " img = Apply_window(img)\n", 289 | " \n", 290 | " # Normalizing and changing the img to 8-bit.\n", 291 | " img -= img.mean()\n", 292 | " img /= (img.std() + 1e-10)\n", 293 | " img -= img.min()\n", 294 | " img = (255 * img/np.max(img)).astype('uint8')\n", 295 | " \n", 296 | " # Saving the img as a PNG file to images_dir\n", 297 | " # A DICOM's name is : {SOPInstanceUID}.dcm\n", 298 | " # We name the image files similiarly: {SOPInstanceUID}.png\n", 299 | " \n", 300 | " pil_img = Image.fromarray(img)\n", 301 | " pil_img.save(f'{images_dir}/{dcmpath.split(\"/\")[-1][:-4]}.png')\n", 302 | " \n", 303 | " # Testing the function's performance if needed\n", 304 | " if test_run:\n", 305 | " plt.imshow(pil_img, cmap='gray')\n", 306 | " \n", 307 | "# Testing the Extract_image function\n", 308 | "Extract_image(dcmpaths[400], test_run=True)" 309 | ], 310 | "id": "68606375-c9f4-4cf8-986a-5cce74556a8a", 311 | "execution_count": null, 312 | "outputs": [] 313 | }, 314 | { 315 | "cell_type": "code", 316 | "metadata": { 317 | "id": "9482ad50-1c28-453f-b7f5-d8b038c99b05" 318 | }, 319 | "source": [ 320 | "# Extracting CT images out of all DICOMs \n", 321 | "\n", 322 | "from tqdm import tqdm\n", 323 | "\n", 324 | "for dcmpath in tqdm(dcmpaths):\n", 325 | " Extract_image(dcmpath)" 326 | ], 327 | "id": "9482ad50-1c28-453f-b7f5-d8b038c99b05", 328 | "execution_count": null, 329 | "outputs": [] 330 | }, 331 | { 332 | "cell_type": "code", 333 | "metadata": { 334 | "id": "d4feff68-134c-4f5c-9dad-e2d012020e80" 335 | }, 336 | "source": [ 337 | "# Collecting the paths to all saved images\n", 338 | "\n", 339 | "imgpaths = [os.path.join('images', file) for file in os.listdir('images')]\n", 340 | "print(f'{len(imgpaths)} images were found!')" 341 | ], 342 | "id": "d4feff68-134c-4f5c-9dad-e2d012020e80", 343 | "execution_count": null, 344 | "outputs": [] 345 | }, 346 | { 347 | "cell_type": "markdown", 348 | "metadata": { 349 | "id": "Z8a7x2Ciy17t" 350 | }, 351 | "source": [ 352 | "OK, now check your working directory again. You should see a folder named \"images\" in which you should find all images from our dataset." 353 | ], 354 | "id": "Z8a7x2Ciy17t" 355 | }, 356 | { 357 | "cell_type": "markdown", 358 | "metadata": { 359 | "id": "9f2dc5d0-c99c-4830-b04b-3ac13c34c5d1" 360 | }, 361 | "source": [ 362 | "### Part 2: Collecting and visualizing the annotated bounding box labels\n", 363 | "Now that we took care of the images, let's take a look at our ground truth labels. For this, we use Pandas to load our annotation CSV file." 
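        ,
        "\n",
        "(An optional aside, not part of the original workshop code: once the images and the CSV are in place, a quick sanity check is to confirm that every annotated slice actually has an extracted PNG. A minimal sketch, assuming the `images` folder and `labels.csv` produced above:)\n",
        "\n",
        "```python\n",
        "import os\n",
        "import pandas as pd\n",
        "\n",
        "png_uids = {os.path.splitext(f)[0] for f in os.listdir('images')}   # SOPInstanceUIDs with a saved PNG\n",
        "csv_uids = set(pd.read_csv('RSNA2021_YOLOv5_Workshop/labels.csv').SOPInstanceUID)   # annotated SOPInstanceUIDs\n",
        "print(f'{len(csv_uids - png_uids)} annotated slices have no extracted image')\n",
        "```\n"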
364 |       ],
365 |       "id": "9f2dc5d0-c99c-4830-b04b-3ac13c34c5d1"
366 |     },
367 |     {
368 |       "cell_type": "code",
369 |       "metadata": {
370 |         "id": "866e2ad2-2b0e-4c8c-86aa-499f35b00c9b"
371 |       },
372 |       "source": [
373 |         "# Importing our annotation dataset as a Pandas dataframe\n",
374 |         "\n",
375 |         "import pandas as pd\n",
376 |         "\n",
377 |         "labels_csv_path = 'RSNA2021_YOLOv5_Workshop/labels.csv'\n",
378 |         "labels_df = pd.read_csv(labels_csv_path)\n",
379 |         "\n",
380 |         "num_bboxes = len(labels_df)\n",
381 |         "num_unique_images = len(pd.unique(labels_df.SOPInstanceUID))\n",
382 |         "num_unique_scans = len(pd.unique(labels_df.SeriesInstanceUID)) \n",
383 |         "print (f'The dataframe includes {num_bboxes} bounding boxes from {num_unique_images} images and {num_unique_scans} scans!\\n')\n",
384 |         "\n",
385 |         "labels_df.head()"
386 |       ],
387 |       "id": "866e2ad2-2b0e-4c8c-86aa-499f35b00c9b",
388 |       "execution_count": null,
389 |       "outputs": []
390 |     },
391 |     {
392 |       "cell_type": "markdown",
393 |       "metadata": {
394 |         "id": "cT60SiYtzRcu"
395 |       },
396 |       "source": [
397 |         "Looking at the above dataframe, please note two things:\n",
398 |         "\n",
399 |         "1. Each row in our dataframe corresponds to one single **bounding box** for a hemorrhage lesion (not to a CT slice or a patient). So a single slice of a CT scan may appear on more than one row if it contains more than one hemorrhage lesion!\n",
400 |         "2. DICOMs are also de-identified and do not have a PatientID tag, yet they do have *StudyInstanceUID* tags. We will soon talk about this more!"
401 |       ],
402 |       "id": "cT60SiYtzRcu"
403 |     },
404 |     {
405 |       "cell_type": "code",
406 |       "metadata": {
407 |         "id": "e70ce180-60e4-4ead-907b-9fe5ac40f7e3"
408 |       },
409 |       "source": [
410 |         "# Finding out the possible labels and assigning a color to each label\n",
411 |         "\n",
412 |         "all_possible_labels = set(labels_df.labelName.tolist())\n",
413 |         "colors = ['red', 'blue', 'green', 'orange', 'pink', 'purple']\n",
414 |         "label_color_dict = {label:color for label, color in zip(all_possible_labels, colors)}\n",
415 |         "\n",
416 |         "for label in label_color_dict.keys():\n",
417 |         "    print(f'{label}: {label_color_dict[label]}')"
418 |       ],
419 |       "id": "e70ce180-60e4-4ead-907b-9fe5ac40f7e3",
420 |       "execution_count": null,
421 |       "outputs": []
422 |     },
423 |     {
424 |       "cell_type": "markdown",
425 |       "metadata": {
426 |         "id": "zNzMKdhx1FwB"
427 |       },
428 |       "source": [
429 |         "It's a good idea to plot some images and their manually annotated bounding boxes before we start the training. This will give us a better sense of the data we are dealing with. Feel free to run the cell below a few times to see several different images with their annotations."
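        ,
        "\n",
        "For reference, the `data` column of the CSV stores each box as a dictionary-like string with `x`, `y`, `width`, and `height` keys; this is what the plotting cell below parses with `ast.literal_eval`. A minimal, hedged illustration with a made-up value:\n",
        "\n",
        "```python\n",
        "import ast\n",
        "\n",
        "raw = \"{'x': 100, 'y': 150, 'width': 60, 'height': 40}\"   # hypothetical content of one labels_df['data'] entry\n",
        "bbox = ast.literal_eval(raw)                               # -> an ordinary Python dict\n",
        "print(bbox['x'], bbox['y'], bbox['width'], bbox['height'])\n",
        "```\n"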
430 | ], 431 | "id": "zNzMKdhx1FwB" 432 | }, 433 | { 434 | "cell_type": "code", 435 | "metadata": { 436 | "id": "fa6a4ed4-a1a2-4e5e-911b-6e815cb52b0e" 437 | }, 438 | "source": [ 439 | "# Plotting random sample images along with their annotated bounding boxes\n", 440 | "\n", 441 | "import matplotlib.pyplot as plt\n", 442 | "import matplotlib.patches as patches\n", 443 | "from skimage.io import imread\n", 444 | "import ast\n", 445 | "\n", 446 | "def Collect_GT_bboxes(imgpath):\n", 447 | " \"\"\"\n", 448 | " Collect ground truth (manually annotated) bounding box coordinates for a given imgpath.\n", 449 | " \"\"\"\n", 450 | " sop_uid = imgpath.split('/')[-1][:-4]\n", 451 | " img_df = labels_df[labels_df.SOPInstanceUID==sop_uid]\n", 452 | " bboxes = list()\n", 453 | " for i, row in img_df.iterrows():\n", 454 | " bbox = ast.literal_eval(row['data'])\n", 455 | " bbox['labelName'] = row['labelName']\n", 456 | " bboxes.append(bbox)\n", 457 | " return bboxes \n", 458 | "\n", 459 | "# Plot bboxes on 9 random imgpaths\n", 460 | "from random import choices\n", 461 | "sample_imgpaths = choices(imgpaths, k=9)\n", 462 | "fig, axes = plt.subplots(3, 3, figsize=(12, 12))\n", 463 | "for i in range(3):\n", 464 | " for j in range(3):\n", 465 | " imgpath = sample_imgpaths[i*3 + j]\n", 466 | " img = imread(imgpath)\n", 467 | " bboxes = Collect_GT_bboxes(imgpath)\n", 468 | " axes[i, j].imshow(img, cmap='gray')\n", 469 | " labels = list()\n", 470 | " for bbox in bboxes:\n", 471 | " xmin = bbox['x'] \n", 472 | " ymin = bbox['y']\n", 473 | " label = bbox['labelName']\n", 474 | " labels.append(label)\n", 475 | " rect = patches.Rectangle((xmin, ymin), bbox['width'], bbox['height'], \n", 476 | " linewidth=1, \n", 477 | " edgecolor=label_color_dict[label], \n", 478 | " facecolor='none')\n", 479 | " axes[i, j].add_patch(rect)\n", 480 | " axes[i, j].axis('off')\n", 481 | " axes[i, j].set_title('\\n'.join(labels))\n", 482 | "for label in label_color_dict.keys():\n", 483 | " print(f'{label}: {label_color_dict[label]}')" 484 | ], 485 | "id": "fa6a4ed4-a1a2-4e5e-911b-6e815cb52b0e", 486 | "execution_count": null, 487 | "outputs": [] 488 | }, 489 | { 490 | "cell_type": "markdown", 491 | "metadata": { 492 | "id": "88410834-3a35-493b-a3dd-28bf0297fd97" 493 | }, 494 | "source": [ 495 | "### Part 3: Data splitting and setting up the training and validation files" 496 | ], 497 | "id": "88410834-3a35-493b-a3dd-28bf0297fd97" 498 | }, 499 | { 500 | "cell_type": "markdown", 501 | "metadata": { 502 | "id": "EwdZgNXp1mCN" 503 | }, 504 | "source": [ 505 | "**Ideally, we would like to split our data into different sets (training, validation, test) based on PatientIDs. Here, we need to do the split based on StudyInstanceUID, which we assume are unique for each patient (A patient may have multiple studies and a study may contain more than one scan, yet we assume that each patient in our pool has only one study)**. DICOMs also have *SeriesInstanceUID* tags that are the same for all DICOMs from a single CT scan, but different between scans, and *SOPInstanceUID* tags, which are unique for each DICOM regardless of the patient or the scan it belongs to. \n", 506 | "\n", 507 | "Saying above, let's start splitting our data to a training and a validation set. Ideally, we would like to also have a test set or endorse a k-fold cross-validation strategy, yet we keep things simple for this workshop!" 
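        ,
        "\n",
        "If you would like to see this UID hierarchy in the data yourself, a small optional check (a sketch, not part of the original workshop code) is to count how many scans and annotated slices each study contains:\n",
        "\n",
        "```python\n",
        "# Assumes labels_df has been loaded as in Part 2\n",
        "scans_per_study = labels_df.groupby('StudyInstanceUID')['SeriesInstanceUID'].nunique()\n",
        "slices_per_study = labels_df.groupby('StudyInstanceUID')['SOPInstanceUID'].nunique()\n",
        "print(scans_per_study.describe())\n",
        "print(slices_per_study.describe())\n",
        "```\n"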
508 |       ],
509 |       "id": "EwdZgNXp1mCN"
510 |     },
511 |     {
512 |       "cell_type": "code",
513 |       "metadata": {
514 |         "id": "09752bc6-7a0b-4b48-a914-25b32e780804"
515 |       },
516 |       "source": [
517 |         "# Building the folders needed for data splitting\n",
518 |         "# These folders will be added to our working directory.\n",
519 |         "\n",
520 |         "train_imgs_dir = 'model_data/images/train'\n",
521 |         "train_labels_dir = 'model_data/labels/train'\n",
522 |         "valid_imgs_dir = 'model_data/images/valid'\n",
523 |         "valid_labels_dir = 'model_data/labels/valid'\n",
524 |         "\n",
525 |         "os.makedirs(train_imgs_dir, exist_ok=True)\n",
526 |         "os.makedirs(train_labels_dir, exist_ok=True)\n",
527 |         "os.makedirs(valid_imgs_dir, exist_ok=True)\n",
528 |         "os.makedirs(valid_labels_dir, exist_ok=True)"
529 |       ],
530 |       "id": "09752bc6-7a0b-4b48-a914-25b32e780804",
531 |       "execution_count": null,
532 |       "outputs": []
533 |     },
534 |     {
535 |       "cell_type": "markdown",
536 |       "metadata": {
537 |         "id": "UcubTtGi2WpA"
538 |       },
539 |       "source": [
540 |         "Here we create three lists based on our dataframe: \n",
541 |         "\n",
542 |         "* X: A list of all SOPInstanceUIDs we got from all rows (don't forget that each DICOM has a unique SOPInstanceUID, so this list includes all our DICOM names).\n",
543 |         "* Y: A list of all labels (or the types of hemorrhage lesions) for all rows.\n",
544 |         "* groups: A list of all StudyInstanceUIDs.\n",
545 |         "\n",
546 |         "What we need to do is split our data into two folds (80% training and 20% validation), while controlling for the group variable (so that all rows with one specific StudyInstanceUID tag go entirely to either the training or the validation set) and also stratifying on the labels (so that each label type is split between the training and validation sets as close to 80-20 as possible)."
547 |       ],
548 |       "id": "UcubTtGi2WpA"
549 |     },
550 |     {
551 |       "cell_type": "code",
552 |       "metadata": {
553 |         "id": "UgOE_tr1Mt1q"
554 |       },
555 |       "source": [
556 |         "# To make the cell work properly if run multiple times. \n",
557 |         "try:\n",
558 |         "    train_df \n",
559 |         "    valid_df\n",
560 |         "except NameError:\n",
561 |         "    train_df = None; valid_df=None"
562 |       ],
563 |       "id": "UgOE_tr1Mt1q",
564 |       "execution_count": null,
565 |       "outputs": []
566 |     },
567 |     {
568 |       "cell_type": "code",
569 |       "metadata": {
570 |         "id": "hga3Z677aHmR"
571 |       },
572 |       "source": [
573 |         "# Splitting the data into training and validation sets.\n",
574 |         "\n",
575 |         "X = labels_df.SOPInstanceUID.tolist()\n",
576 |         "Y = labels_df.labelName.tolist()\n",
577 |         "groups = labels_df.StudyInstanceUID.tolist()\n",
578 |         "\n",
579 |         "# To make the cell work properly if run multiple times. 
\n",
580 |         "if train_df is None and valid_df is None: \n",
581 |         "    from sklearn.model_selection import StratifiedGroupKFold\n",
582 |         "    cv = StratifiedGroupKFold(n_splits=5)\n",
583 |         "    train_idxs, valid_idxs = next(iter(cv.split(X, Y, groups)))\n",
584 |         "    train_df = labels_df.loc[train_idxs]\n",
585 |         "    valid_df = labels_df.loc[valid_idxs]\n",
586 |         "    train_uids = set(train_df.StudyInstanceUID.tolist())\n",
587 |         "    valid_uids = set(valid_df.StudyInstanceUID.tolist())\n",
588 |         "\n",
589 |         "print(f'Number of hemorrhage instances (bboxes) in the training set: {len(train_df)} - Number of patients: {len(train_uids)}')\n",
590 |         "print(f'Class distribution in the training set: ')\n",
591 |         "print(train_df['labelName'].value_counts())\n",
592 |         "print('\\n***********\\n')\n",
593 |         "print(f'Number of hemorrhage instances (bboxes) in the validation set: {len(valid_df)} - Number of patients: {len(valid_uids)}')\n",
594 |         "print(f'Class distribution in the validation set: ')\n",
595 |         "print(valid_df['labelName'].value_counts())\n",
596 |         "\n",
597 |         "# Making sure training and validation data have no patients in common\n",
598 |         "assert len(set(train_df['StudyInstanceUID'].tolist()).intersection(set(valid_df['StudyInstanceUID'].tolist()))) == 0"
599 |       ],
600 |       "id": "hga3Z677aHmR",
601 |       "execution_count": null,
602 |       "outputs": []
603 |     },
604 |     {
605 |       "cell_type": "markdown",
606 |       "metadata": {
607 |         "id": "61pWUtxr5HaF"
608 |       },
609 |       "source": [
610 |         "Notice how easily the tedious task of grouped, stratified splitting can be done with Scikit-learn's StratifiedGroupKFold! Check the numbers above to see how well stratified our split is across the labels. \n",
611 |         "\n",
612 |         "Please note that StratifiedGroupKFold is only available in recent versions of Scikit-learn, which is why we upgraded the default Scikit-learn installation of Google Colab in the first cell of this notebook. \n",
613 |         "\n",
614 |         "Now that we have built the folds, let's proceed by actually copying our images into the training and validation directories."
615 |       ],
616 |       "id": "61pWUtxr5HaF"
617 |     },
618 |     {
619 |       "cell_type": "code",
620 |       "metadata": {
621 |         "id": "25e3b704-57a3-4ffb-9cb1-da87aef05491"
622 |       },
623 |       "source": [
624 |         "# Copying the images to their folders in the \"model_data\" directory\n",
625 |         "\n",
626 |         "for i, row in train_df.iterrows():\n",
627 |         "    imgpath = f'images/{row.SOPInstanceUID}.png'\n",
628 |         "    shutil.copy(imgpath, os.path.join(train_imgs_dir, f'{row.SOPInstanceUID}.png'))\n",
629 |         "    \n",
630 |         "for i, row in valid_df.iterrows():\n",
631 |         "    imgpath = f'images/{row.SOPInstanceUID}.png'\n",
632 |         "    shutil.copy(imgpath, os.path.join(valid_imgs_dir, f'{row.SOPInstanceUID}.png'))"
633 |       ],
634 |       "id": "25e3b704-57a3-4ffb-9cb1-da87aef05491",
635 |       "execution_count": null,
636 |       "outputs": []
637 |     },
638 |     {
639 |       "cell_type": "markdown",
640 |       "metadata": {
641 |         "id": "N8krypmQ54re"
642 |       },
643 |       "source": [
644 |         "YOLO also needs two TXT files listing the paths to the images in our training and validation directories, so let's create those files:"
645 |       ],
646 |       "id": "N8krypmQ54re"
647 |     },
648 |     {
649 |       "cell_type": "code",
650 |       "metadata": {
651 |         "id": "d2efb5ef-c73c-43db-8bb8-b3b5c320ec85"
652 |       },
653 |       "source": [
654 |         "# To make the cell work properly if run multiple times. 
\n",
655 |         "if os.path.exists('model_data/train.txt'): os.remove('model_data/train.txt')\n",
656 |         "if os.path.exists('model_data/val.txt'): os.remove('model_data/val.txt')\n",
657 |         "\n",
658 |         "# Building the TXT files listing the paths to all images in the training and validation sets\n",
659 |         "\n",
660 |         "with open('model_data/train.txt', 'w') as f:\n",
661 |         "    for file in os.listdir(train_imgs_dir):\n",
662 |         "        f.write(os.path.join(train_imgs_dir, file)+'\\n')\n",
663 |         "    \n",
664 |         "with open('model_data/val.txt', 'w') as f:\n",
665 |         "    for file in os.listdir(valid_imgs_dir):\n",
666 |         "        f.write(os.path.join(valid_imgs_dir, file)+'\\n')"
667 |       ],
668 |       "id": "d2efb5ef-c73c-43db-8bb8-b3b5c320ec85",
669 |       "execution_count": null,
670 |       "outputs": []
671 |     },
672 |     {
673 |       "cell_type": "markdown",
674 |       "metadata": {
675 |         "id": "NPFPio0i6Hk5"
676 |       },
677 |       "source": [
678 |         "OK, now we should create the labels for training our YOLO model.\n",
679 |         "YOLO needs one TXT file per image, with one line per bounding box in that image; each line holds the class index of the box followed by its coordinates. \n",
680 |         "\n",
681 |         "We will create these files below. To do this, we first assign a unique index to each class of brain hemorrhage in our dataset and then write the box locations to the files. Here is where we will need your coding again:\n",
682 |         "\n",
683 |         "Our annotations describe each box with four numbers:\n",
684 |         "\n",
685 |         "* x_min: The x-coordinate of the upper-left corner of the box in pixels.\n",
686 |         "* y_min: The y-coordinate of the upper-left corner of the box in pixels.\n",
687 |         "* width: The width of the box in pixels.\n",
688 |         "* height: The height of the box in pixels.\n",
689 |         "\n",
690 |         "We should note that the format in which YOLO receives ground truth labels (and later writes its predictions) is special. YOLO expects every number as a fraction of the image size, not as an absolute pixel value, and it describes a box by its center point rather than its upper-left corner. For example, if the image size is 512 * 512 (as in our case) and the x coordinate of a box center is 100, YOLO expects the float 100/512 in our labels. The same rule applies to all the other bounding box numbers YOLO needs."
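        ,
        "\n",
        "As a small worked example (with made-up numbers): for a 512 x 512 image and a box with x_min = 100, y_min = 150, width = 60, height = 40, the center is at (100 + 60/2, 150 + 40/2) = (130, 170), so the YOLO values become 130/512 ≈ 0.254, 170/512 ≈ 0.332, 60/512 ≈ 0.117 and 40/512 ≈ 0.078, and the label line for, say, class index 2 would be:\n",
        "\n",
        "```\n",
        "2 0.254 0.332 0.117 0.078\n",
        "```\n",
        "\n",
        "(Treat this as a gentle spoiler: the conversion itself is exactly what you are asked to implement in the next cell.)\n"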
691 | ], 692 | "id": "NPFPio0i6Hk5" 693 | }, 694 | { 695 | "cell_type": "code", 696 | "metadata": { 697 | "id": "fEwS8JiXizOB" 698 | }, 699 | "source": [ 700 | "def Convert_bbox_toYOLO(bbox, image_size):\n", 701 | " \"\"\"\n", 702 | " Receive the coordinates for a given bbox and convert its coordinates to the \n", 703 | " YOLO format.\n", 704 | " \n", 705 | " inputs:\n", 706 | " - bbox (dict): a dictionary with the following keys: \n", 707 | " (all have absolute values.)\n", 708 | " -- 'x': the x coordinate for the top left point of the bounding box.\n", 709 | " -- 'y': the y coordinate for the top left point of the bounding box.\n", 710 | " -- 'width': the width of the bounding box.\n", 711 | " -- 'height': the height of the bounding box.\n", 712 | " \n", 713 | " - image_size (int): the shape of the image is (image_size, image_size)\n", 714 | " \n", 715 | " outputs:\n", 716 | " - yolo_bbox (dict): a dictionary with the following keys:\n", 717 | " (all have values scaled between 0 - 1 based on the image_size.)\n", 718 | " -- 'x_center': the x coordinate for the center of the bounding box.\n", 719 | " -- 'y_center': the y coordinate for the center of the bounding box.\n", 720 | " -- 'width': the width of the bounding box.\n", 721 | " -- 'height': the height of the bounding box.\n", 722 | " \"\"\"\n", 723 | "\n", 724 | " yolo_bbox = dict()\n", 725 | "\n", 726 | " ##### START YOUR CODE HERE (4 lines of code):\n", 727 | "\n", 728 | " \n", 729 | " ##### END YOUR CODE HERE.\n", 730 | "\n", 731 | " return yolo_bbox" 732 | ], 733 | "id": "fEwS8JiXizOB", 734 | "execution_count": null, 735 | "outputs": [] 736 | }, 737 | { 738 | "cell_type": "code", 739 | "metadata": { 740 | "cellView": "form", 741 | "id": "K-tpWKk6moko" 742 | }, 743 | "source": [ 744 | "#@title Code to complete the previous cell!\n", 745 | "\n", 746 | "# yolo_bbox['x_center'] = (bbox['x'] + bbox['width'] / 2) / image_size\n", 747 | "# yolo_bbox['y_center'] = (bbox['y'] + bbox['height'] / 2) / image_size\n", 748 | "# yolo_bbox['width'] = bbox['width'] / image_size\n", 749 | "# yolo_bbox['height'] = bbox['height'] / image_size" 750 | ], 751 | "id": "K-tpWKk6moko", 752 | "execution_count": null, 753 | "outputs": [] 754 | }, 755 | { 756 | "cell_type": "code", 757 | "metadata": { 758 | "id": "52560851-198f-49a7-b118-5ff97f27c88f" 759 | }, 760 | "source": [ 761 | "# To make the cell work prperly if run multiple times. 
\n", 762 | "shutil.rmtree(train_labels_dir, ignore_errors=True) \n", 763 | "os.makedirs(train_labels_dir, exist_ok=True)\n", 764 | "shutil.rmtree(valid_labels_dir, ignore_errors=True)\n", 765 | "os.makedirs(valid_labels_dir, exist_ok=True)\n", 766 | "\n", 767 | "# Creating the labels\n", 768 | "\n", 769 | "label_to_index_dict = {\n", 770 | " 'Chronic': 0,\n", 771 | " 'Intraventricular': 1,\n", 772 | " 'Subdural': 2,\n", 773 | " 'Intraparenchymal': 3,\n", 774 | " 'Subarachnoid': 4,\n", 775 | " 'Epidural': 5\n", 776 | "}\n", 777 | "\n", 778 | "# Creating the TXT file for each image\n", 779 | "# Each image will have a single TXT file including all its labels (each file \n", 780 | "# may have multiple lines, each for one bounding box on that image).\n", 781 | "\n", 782 | "for img_file in os.listdir(train_imgs_dir):\n", 783 | " bboxes = Collect_GT_bboxes(f'images/{img_file}')\n", 784 | " label_file = img_file[:-4] + '.txt'\n", 785 | " with open(f'{train_labels_dir}/{label_file}', 'w') as f:\n", 786 | " for bbox in bboxes:\n", 787 | " label = str(label_to_index_dict[bbox['labelName']])\n", 788 | " yolo_bbox = Convert_bbox_toYOLO(bbox, image_size=512)\n", 789 | " x_center = yolo_bbox['x_center']\n", 790 | " y_center = yolo_bbox['y_center']\n", 791 | " width = yolo_bbox['width']\n", 792 | " height = yolo_bbox['height']\n", 793 | " line_to_write = ' '.join([label, str(x_center), \n", 794 | " str(y_center), str(width), str(height)])\n", 795 | " f.write(line_to_write)\n", 796 | " f.write('\\n')\n", 797 | " \n", 798 | "for img_file in os.listdir(valid_imgs_dir):\n", 799 | " bboxes = Collect_GT_bboxes(f'images/{img_file}')\n", 800 | " label_file = img_file[:-4] + '.txt'\n", 801 | " with open(f'{valid_labels_dir}/{label_file}', 'w') as f:\n", 802 | " for bbox in bboxes:\n", 803 | " # Simply copy your code from the block above.\n", 804 | " label = str(label_to_index_dict[bbox['labelName']])\n", 805 | " yolo_bbox = Convert_bbox_toYOLO(bbox, image_size=512)\n", 806 | " x_center = yolo_bbox['x_center']\n", 807 | " y_center = yolo_bbox['y_center']\n", 808 | " width = yolo_bbox['width']\n", 809 | " height = yolo_bbox['height']\n", 810 | " line_to_write = ' '.join([label, str(x_center), \n", 811 | " str(y_center), str(width), str(height)])\n", 812 | " f.write(line_to_write)\n", 813 | " f.write('\\n')" 814 | ], 815 | "id": "52560851-198f-49a7-b118-5ff97f27c88f", 816 | "execution_count": null, 817 | "outputs": [] 818 | }, 819 | { 820 | "cell_type": "markdown", 821 | "metadata": { 822 | "id": "5cd38bfc-52d5-4fc7-9a99-d90561412cfb" 823 | }, 824 | "source": [ 825 | "### Part 4: Downloading the YOLO model and configuring it\n", 826 | "Alright, now that we have all the images and labels set up, we can clone the Ultralytics YOLOv5 repository and configure that for our training." 827 | ], 828 | "id": "5cd38bfc-52d5-4fc7-9a99-d90561412cfb" 829 | }, 830 | { 831 | "cell_type": "code", 832 | "metadata": { 833 | "id": "e67bd781-6079-4c99-888a-7626a0a6bc41" 834 | }, 835 | "source": [ 836 | "# Clonning the YOLOv5 directory from Ultralytics GitHub page\n", 837 | "\n", 838 | "model_dir = 'yolov5'\n", 839 | "\n", 840 | "# To make the cell work prperly if run multiple times. 
\n", 841 | "shutil.rmtree(model_dir, ignore_errors=True)\n", 842 | "\n", 843 | "!git clone https://github.com/ultralytics/yolov5 {model_dir} # clone repo" 844 | ], 845 | "id": "e67bd781-6079-4c99-888a-7626a0a6bc41", 846 | "execution_count": null, 847 | "outputs": [] 848 | }, 849 | { 850 | "cell_type": "markdown", 851 | "metadata": { 852 | "id": "pn3kuuF58791" 853 | }, 854 | "source": [ 855 | "As the first step in our configuration, we need to change the YAML file in the model directory. We should give it the path to our training and validation directories of images, the number of classes, and the name of classes." 856 | ], 857 | "id": "pn3kuuF58791" 858 | }, 859 | { 860 | "cell_type": "code", 861 | "metadata": { 862 | "id": "c25a0b07-4ec7-4dd2-813b-934b3ad4cf3a" 863 | }, 864 | "source": [ 865 | "# Configuring the data.yaml file\n", 866 | "\n", 867 | "import yaml\n", 868 | "\n", 869 | "data = dict(\n", 870 | " train = train_imgs_dir,\n", 871 | " val = valid_imgs_dir, \n", 872 | " nc = 6, # number of classes\n", 873 | " names = list(label_to_index_dict.keys()) # classes\n", 874 | " )\n", 875 | "\n", 876 | "with open(f'{model_dir}/data/data.yaml', 'w') as file:\n", 877 | " yaml.dump(data, file, default_flow_style=False)\n", 878 | "\n", 879 | "with open(f'{model_dir}/data/data.yaml', 'r') as file:\n", 880 | " for line in file.readlines():\n", 881 | " print(line.strip())" 882 | ], 883 | "id": "c25a0b07-4ec7-4dd2-813b-934b3ad4cf3a", 884 | "execution_count": null, 885 | "outputs": [] 886 | }, 887 | { 888 | "cell_type": "markdown", 889 | "metadata": { 890 | "id": "59750870-ba3c-48f4-90a2-1b7a361cdb97" 891 | }, 892 | "source": [ 893 | "### Part 5: View and modify the hyperparameters (optional)\n", 894 | "The Ultralytics implementation of YOLO gives us the ability to change many of the hyperparameters YOLO works with. These are all accessible in a YAML file in the model's directory. We will not touch these settings for now, but let's visualize them before we proceed:" 895 | ], 896 | "id": "59750870-ba3c-48f4-90a2-1b7a361cdb97" 897 | }, 898 | { 899 | "cell_type": "code", 900 | "metadata": { 901 | "id": "162c5d86-9ba8-408e-8383-a3047d4e1dad" 902 | }, 903 | "source": [ 904 | "with open(f'{model_dir}/data/hyps/hyp.scratch.yaml', 'r') as file:\n", 905 | " for line in file.readlines():\n", 906 | " print(line.strip())" 907 | ], 908 | "id": "162c5d86-9ba8-408e-8383-a3047d4e1dad", 909 | "execution_count": null, 910 | "outputs": [] 911 | }, 912 | { 913 | "cell_type": "markdown", 914 | "metadata": { 915 | "id": "eb6a7344-20b6-48cf-96c2-83838bf32c64" 916 | }, 917 | "source": [ 918 | "### Part 6: Training\n", 919 | "Perfect, now everything is set for us to start the training! Fortunately, the training itself could be run using one line of code! We just need to determine the image size, batch size, number of epochs, the directory to our model's directory, a name for our project, and a name for our current run of experiment. \n", 920 | "\n", 921 | "Please note that the training command should be run from the command line, that is why we have put an \"!\" mark before the line we do the training.\n", 922 | "\n", 923 | "**Note: In Google Colab, we cannot run the training with a batch size greater than 8, otherwise we will hit the memory limits. Training with such a small batch size on the other hand may take a lot of time, so we only train our model for one epoch. 
Please run this notebook locally or give it more time later for a full training.**"
924 |       ],
925 |       "id": "eb6a7344-20b6-48cf-96c2-83838bf32c64"
926 |     },
927 |     {
928 |       "cell_type": "code",
929 |       "metadata": {
930 |         "jupyter": {
931 |           "outputs_hidden": true
932 |         },
933 |         "tags": [],
934 |         "id": "15eaff63-d161-411b-97de-053533d81b8e"
935 |       },
936 |       "source": [
937 |         "%%time\n",
938 |         "# Training Hyperparameters\n",
939 |         "\n",
940 |         "IMAGE_SIZE = 512\n",
941 |         "BATCH_SIZE = 8\n",
942 |         "EPOCHS = 1\n",
943 |         "WEIGHTS_PATH = f'{model_dir}/yolov5x.pt'\n",
944 |         "PROJECT_dir = f'{model_dir}/RSNA_YOLO_Project'\n",
945 |         "RUN = 'Exp1'\n",
946 |         "\n",
947 |         "# To make the cell work properly if run multiple times. \n",
948 |         "shutil.rmtree(os.path.join(PROJECT_dir, RUN), ignore_errors=True)\n",
949 |         "\n",
950 |         "# Training\n",
951 |         "!python {model_dir}/train.py --img {IMAGE_SIZE} \\\n",
952 |         "                  --batch {BATCH_SIZE} \\\n",
953 |         "                  --epochs {EPOCHS} \\\n",
954 |         "                  --data {model_dir}/data/data.yaml \\\n",
955 |         "                  --weights {WEIGHTS_PATH} \\\n",
956 |         "                  --project {PROJECT_dir}\\\n",
957 |         "                  --name {RUN} \\"
958 |       ],
959 |       "id": "15eaff63-d161-411b-97de-053533d81b8e",
960 |       "execution_count": null,
961 |       "outputs": []
962 |     },
963 |     {
964 |       "cell_type": "markdown",
965 |       "metadata": {
966 |         "id": "UyWmWE4P54wT"
967 |       },
968 |       "source": [
969 |         "As the training will probably take longer than the time we have available in this workshop, we will work with a pre-trained model from now on. We have already trained this model using almost the same configuration we set up earlier. You can download that model and place it in the expected location in our working directory by running the following cells:"
970 |       ],
971 |       "id": "UyWmWE4P54wT"
972 |     },
973 |     {
974 |       "cell_type": "code",
975 |       "metadata": {
976 |         "id": "4rX_G6ysmRs9"
977 |       },
978 |       "source": [
979 |         "import gdown\n",
980 |         "url = 'https://drive.google.com/uc?export=download&id=1rpMzYyna1N3bcZ-_ZmEVr9IhA8FDpjTH'\n",
981 |         "output = 'Pretrained_YOLO.zip'\n",
982 |         "if not os.path.exists(output): # To make the cell work properly if run multiple times. \n",
983 |         "    gdown.download(url, output, quiet=False)"
984 |       ],
985 |       "id": "4rX_G6ysmRs9",
986 |       "execution_count": null,
987 |       "outputs": []
988 |     },
989 |     {
990 |       "cell_type": "code",
991 |       "metadata": {
992 |         "id": "6JhTkgkunDu4"
993 |       },
994 |       "source": [
995 |         "# Let's replace the model we started training above with the downloaded pre-trained one.\n",
996 |         "shutil.rmtree(f'{PROJECT_dir}/{RUN}', ignore_errors=True)\n",
997 |         "!unzip Pretrained_YOLO.zip -d {PROJECT_dir}"
998 |       ],
999 |       "id": "6JhTkgkunDu4",
1000 |       "execution_count": null,
1001 |       "outputs": []
1002 |     },
1003 |     {
1004 |       "cell_type": "markdown",
1005 |       "metadata": {
1006 |         "id": "EB-767wg_GD-"
1007 |       },
1008 |       "source": [
1009 |         "All right, now let's assume that we have done a full training. The Ultralytics implementation of YOLOv5 plots a lot of useful curves and logs plenty of information during training. You can easily visualize that information by looking at your model_dir/project/run directory, or even by using loggers like TensorBoard or WandB."
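        ,
        "\n",
        "For example, if you want to browse those logs interactively in Colab, one possible sketch (assuming YOLOv5's default TensorBoard logging into the project directory used above) is:\n",
        "\n",
        "```python\n",
        "# Load the TensorBoard notebook extension and point it at the YOLOv5 project directory\n",
        "%load_ext tensorboard\n",
        "%tensorboard --logdir yolov5/RSNA_YOLO_Project\n",
        "```\n"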
1010 | ], 1011 | "id": "EB-767wg_GD-" 1012 | }, 1013 | { 1014 | "cell_type": "code", 1015 | "metadata": { 1016 | "id": "b16ba308-25d1-4552-a98a-3a06a2a06c11" 1017 | }, 1018 | "source": [ 1019 | "# Visualizing the training performance\n", 1020 | "\n", 1021 | "def Show_performance(run_name:str, project_dir:str=PROJECT_dir):\n", 1022 | " \n", 1023 | " run_path = os.path.join(project_dir, run_name)\n", 1024 | " \n", 1025 | " # Model's performance\n", 1026 | " results_df = pd.read_csv(os.path.join(run_path, 'results.csv'))\n", 1027 | " metrics = {'precision':[], 'recall':[], 'mAP.05':[]}\n", 1028 | " for i, row in results_df.iterrows():\n", 1029 | " for metric, key in zip(['precision', 'recall', 'mAP.05'], [' metrics/precision', ' metrics/recall', ' metrics/mAP_0.5']):\n", 1030 | " metrics[metric].append(row[key]) \n", 1031 | " max_precision = max(metrics['precision'])\n", 1032 | " max_recall = max(metrics['recall'])\n", 1033 | " max_mAP = max(metrics['mAP.05'])\n", 1034 | " print(f\"Best precision: Epoch {metrics['precision'].index(max_precision)} -> {max_precision}\")\n", 1035 | " print(f\"Best recall: Epoch {metrics['recall'].index(max_recall)} -> {max_recall}\")\n", 1036 | " print(f\"Best mAP.05: Epoch {metrics['mAP.05'].index(max_mAP)} -> {max_mAP}\")\n", 1037 | " \n", 1038 | " # Training curves\n", 1039 | " print(\"\\nDisplaying the training curves:\")\n", 1040 | " plt.figure(figsize = (12,12))\n", 1041 | " plt.axis('off')\n", 1042 | " plt.imshow(plt.imread(os.path.join(run_path, 'results.png')));\n", 1043 | " plt.show()\n", 1044 | " \n", 1045 | " #GTs vs predictions\n", 1046 | " print(\"\\nDisplaying the ground truths vs predictions for three example batches from the validation set:\")\n", 1047 | " fig, axes = plt.subplots(3, 2, figsize=(15, 20))\n", 1048 | " for i in range(3):\n", 1049 | " axes[i, 0].imshow(plt.imread(os.path.join(run_path, f'val_batch{i}_labels.jpg')))\n", 1050 | " axes[i, 1].imshow(plt.imread(os.path.join(run_path, f'val_batch{i}_pred.jpg')))\n", 1051 | " axes[i, 0].axis('off')\n", 1052 | " axes[i, 1].axis('off')\n", 1053 | " axes[i, 0].set_title('Ground Truths')\n", 1054 | " axes[i, 1].set_title('Predictions')" 1055 | ], 1056 | "id": "b16ba308-25d1-4552-a98a-3a06a2a06c11", 1057 | "execution_count": null, 1058 | "outputs": [] 1059 | }, 1060 | { 1061 | "cell_type": "code", 1062 | "metadata": { 1063 | "id": "3c04bfaa-b298-4379-9a22-3a665d390eec" 1064 | }, 1065 | "source": [ 1066 | "Show_performance(RUN)" 1067 | ], 1068 | "id": "3c04bfaa-b298-4379-9a22-3a665d390eec", 1069 | "execution_count": null, 1070 | "outputs": [] 1071 | }, 1072 | { 1073 | "cell_type": "markdown", 1074 | "metadata": { 1075 | "id": "P6N6Td_CFx_m" 1076 | }, 1077 | "source": [ 1078 | "**Please note that the best mAP of our model has been about 0.4. Though not very bad, this mAP is not very high. Rather than the training hyperparameters, What other factors do you think can explain this phenomenon?**" 1079 | ], 1080 | "id": "P6N6Td_CFx_m" 1081 | }, 1082 | { 1083 | "cell_type": "markdown", 1084 | "metadata": { 1085 | "id": "sQpA826WwNe4" 1086 | }, 1087 | "source": [ 1088 | "### Part 7: Inference\n", 1089 | "As the final part of our notebook, let's apply our YOLO model to a set of images. For doing this, we separate a part of images that our model had not seen during the training (Please note that these images are different than the validation images you created above, as we are now working with a model that we had trained before; not a model that you trained here). 
" 1090 | ], 1091 | "id": "sQpA826WwNe4" 1092 | }, 1093 | { 1094 | "cell_type": "code", 1095 | "metadata": { 1096 | "id": "5hDzxmbq_0vp" 1097 | }, 1098 | "source": [ 1099 | "# Seperating a part of our images which our pre-trained model has not seen for training\n", 1100 | "\n", 1101 | "inference_imgs_dir = 'Inference_imgs'\n", 1102 | "os.makedirs(inference_imgs_dir, exist_ok=True)\n", 1103 | "\n", 1104 | "with open(f'{PROJECT_dir}/{RUN}/images_for_inference.txt', 'r') as f:\n", 1105 | " inference_img_names = [line.strip() for line in f.readlines()]\n", 1106 | "\n", 1107 | "# Copying 20 images for inference to the inference_imgs_dir\n", 1108 | "\n", 1109 | "count_copied = 0\n", 1110 | "for image in os.listdir('images'):\n", 1111 | " if image in inference_img_names:\n", 1112 | " shutil.copy(f'images/{image}', f'{inference_imgs_dir}/{image}')\n", 1113 | " count_copied += 1\n", 1114 | " if count_copied == 20:\n", 1115 | " break" 1116 | ], 1117 | "id": "5hDzxmbq_0vp", 1118 | "execution_count": null, 1119 | "outputs": [] 1120 | }, 1121 | { 1122 | "cell_type": "code", 1123 | "metadata": { 1124 | "id": "Bje-WKxMLsap" 1125 | }, 1126 | "source": [ 1127 | "# Setting up the pipeline for inference\n", 1128 | "\n", 1129 | "weights_path = f'{PROJECT_dir}/{RUN}/weights/best.pt'\n", 1130 | "destination_dir = 'Inference_Results'\n", 1131 | "output_name = 'Pretrained_YOLOv5_Results'\n", 1132 | "img_size = 512\n", 1133 | "conf: float = 0.222\n", 1134 | "iou_threshold: float = 0.5\n", 1135 | "max_dt = 5" 1136 | ], 1137 | "id": "Bje-WKxMLsap", 1138 | "execution_count": null, 1139 | "outputs": [] 1140 | }, 1141 | { 1142 | "cell_type": "code", 1143 | "metadata": { 1144 | "id": "qmQjQDTNM7FF" 1145 | }, 1146 | "source": [ 1147 | "# We should now run the detect.py code from the model directory and pass all \n", 1148 | "# the arguments that YOLO needs. \n", 1149 | "\n", 1150 | "!python {model_dir}/detect.py --weights {weights_path} \\\n", 1151 | " --source {inference_imgs_dir} \\\n", 1152 | " --project {destination_dir} \\\n", 1153 | " --name {output_name}\\\n", 1154 | " --img {img_size} \\\n", 1155 | " --conf {conf} \\\n", 1156 | " --iou-thres {iou_threshold} \\\n", 1157 | " --max-det {max_dt} \\\n", 1158 | " --save-txt \\\n", 1159 | " --save-conf \\\n", 1160 | " --exist-ok" 1161 | ], 1162 | "id": "qmQjQDTNM7FF", 1163 | "execution_count": null, 1164 | "outputs": [] 1165 | }, 1166 | { 1167 | "cell_type": "markdown", 1168 | "metadata": { 1169 | "id": "zRnaqsBXB0DD" 1170 | }, 1171 | "source": [ 1172 | "Before we wrap up this notebook, we can also visualize some of the YOLO predictions!" 
1173 | ], 1174 | "id": "zRnaqsBXB0DD" 1175 | }, 1176 | { 1177 | "cell_type": "code", 1178 | "metadata": { 1179 | "id": "NKzxo1ehttp9" 1180 | }, 1181 | "source": [ 1182 | "def Show_prediction(img_path):\n", 1183 | " \"\"\"\n", 1184 | " A function to plot an image and show the bounding boxes predicted by YOLO for that image.\n", 1185 | " \"\"\"\n", 1186 | " index_to_label_dict = {value: key for key, value in label_to_index_dict.items()}\n", 1187 | " fig, ax = plt.subplots(1, 1, figsize=(7, 7))\n", 1188 | " ax.imshow(imread(img_path), cmap='gray')\n", 1189 | " label_path = f'{destination_dir}/{output_name}/labels/{img_path.split(\"/\")[-1][:-4]}.txt'\n", 1190 | " with open(label_path, 'r') as f:\n", 1191 | " for i, line in enumerate(f.readlines()):\n", 1192 | " line = [float(value) for value in line.strip().split(' ')]\n", 1193 | " label = index_to_label_dict[int(line[0])]\n", 1194 | " color = label_color_dict[label]\n", 1195 | " confidence = round(float(line[-1]), 2)\n", 1196 | " x_min = (line[1] - line[3]/2) * 512\n", 1197 | " y_min = (line[2] - line[4]/2) * 512\n", 1198 | " width = line[3] * 512\n", 1199 | " height = line[4] * 512 \n", 1200 | " rect = patches.Rectangle((x_min, y_min), width, height, linewidth=1, edgecolor=color, facecolor='none')\n", 1201 | " ax.add_patch(rect)\n", 1202 | " ax.text(5, 15+i*25, f'{label}-confidence: {confidence}', color='w', fontsize=10, bbox={'alpha':0.6,'color':color})\n", 1203 | "\n", 1204 | "# Now we build a list of all paths to images we had split for validaiton of our model\n", 1205 | "# And then feed them to the function we defined above.\n", 1206 | "# Feel free to visualize the predictions for multple images and see how the results \n", 1207 | "# look like!\n", 1208 | "img_paths = [os.path.join(inference_imgs_dir, img) for img in os.listdir(inference_imgs_dir)]\n", 1209 | "Show_prediction(img_paths[0])" 1210 | ], 1211 | "id": "NKzxo1ehttp9", 1212 | "execution_count": null, 1213 | "outputs": [] 1214 | }, 1215 | { 1216 | "cell_type": "markdown", 1217 | "metadata": { 1218 | "id": "t3JYz5QJFb7h" 1219 | }, 1220 | "source": [ 1221 | "That's it! Congratulations on training and applying your first YOLOv5 model. Hopefully, you have learned how to train more YOLO models on your custom datasets. In case you are interested, there are a few more things you can try with YOLOv5:\n", 1222 | "\n", 1223 | "* Changing the hyperparameters (e.g., IOU, learning rate, etc.) and retraining your model.\n", 1224 | "* Applying test time augmentation (TTA) during your inference.\n", 1225 | "* Exploring the different augmentations YOLO does during the training, especially the mosaic augmentation. \n", 1226 | "\n", 1227 | "\n", 1228 | "\n", 1229 | "**Thank you for attending this workshop!**" 1230 | ], 1231 | "id": "t3JYz5QJFb7h" 1232 | } 1233 | ] 1234 | } -------------------------------------------------------------------------------- /sessions/yolo/YOLO_RSNA2021.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/RSNA/AI-Deep-Learning-Lab-2021/bf7119ef9426453951dcd4c452e109f466c2f1ec/sessions/yolo/YOLO_RSNA2021.pdf --------------------------------------------------------------------------------