├── .gitattributes
├── .gitignore
├── Lung Cancer 3D CNN.ipynb
├── Project Report.docx
└── README.md


/.gitattributes:
--------------------------------------------------------------------------------
1 | # Auto detect text files and perform LF normalization
2 | * text=auto


--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
 1 | # Byte-compiled / optimized / DLL files
 2 | __pycache__/
 3 | *.py[cod]
 4 | *$py.class
 5 | 
 6 | # C extensions
 7 | *.so
 8 | 
 9 | # Distribution / packaging
10 | .Python
11 | env/
12 | build/
13 | develop-eggs/
14 | dist/
15 | downloads/
16 | eggs/
17 | .eggs/
18 | lib/
19 | lib64/
20 | parts/
21 | sdist/
22 | var/
23 | wheels/
24 | *.egg-info/
25 | .installed.cfg
26 | *.egg
27 | 
28 | # PyInstaller
29 | #  Usually these files are written by a python script from a template
30 | #  before PyInstaller builds the exe, so as to inject date/other infos into it.
31 | *.manifest
32 | *.spec
33 | 
34 | # Installer logs
35 | pip-log.txt
36 | pip-delete-this-directory.txt
37 | 
38 | # Unit test / coverage reports
39 | htmlcov/
40 | .tox/
41 | .coverage
42 | .coverage.*
43 | .cache
44 | nosetests.xml
45 | coverage.xml
46 | *,cover
47 | .hypothesis/
48 | 
49 | # Translations
50 | *.mo
51 | *.pot
52 | 
53 | # Django stuff:
54 | *.log
55 | local_settings.py
56 | 
57 | # Flask stuff:
58 | instance/
59 | .webassets-cache
60 | 
61 | # Scrapy stuff:
62 | .scrapy
63 | 
64 | # Sphinx documentation
65 | docs/_build/
66 | 
67 | # PyBuilder
68 | target/
69 | 
70 | # Jupyter Notebook
71 | .ipynb_checkpoints
72 | 
73 | # pyenv
74 | .python-version
75 | 
76 | # celery beat schedule file
77 | celerybeat-schedule
78 | 
79 | # dotenv
80 | .env
81 | 
82 | # virtualenv
83 | .venv/
84 | venv/
85 | ENV/
86 | 
87 | # Spyder project settings
88 | .spyderproject
89 | 
90 | # Rope project settings
91 | .ropeproject
92 | 


--------------------------------------------------------------------------------
/Lung Cancer 3D CNN.ipynb:
--------------------------------------------------------------------------------
   1 | {
   2 |  "cells": [
   3 |   {
   4 |    "cell_type": "markdown",
   5 |    "metadata": {},
   6 |    "source": [
   7 |     "# Lung Cancer Detection using 3D Convolutional Neural Networks\n"
   8 |    ]
   9 |   },
  10 |   {
  11 |    "cell_type": "markdown",
  12 |    "metadata": {},
  13 |    "source": [
  14 |     "Kaggle.com - Data Science Bowl 2017 Competition \n",
  15 |     "\n",
  16 |     "Competition link: https://www.kaggle.com/c/data-science-bowl-2017\n"
  17 |    ]
  18 |   },
  19 |   {
  20 |    "cell_type": "markdown",
  21 |    "metadata": {},
  22 |    "source": [
  23 |     "# Problem statement\n",
  24 |     "\n",
  25 |     "\n",
  26 |     "In the United States, lung cancer strikes 225,000 people every year, and accounts for $12 billion in health care costs. Early detection is critical to give patients the best chance at recovery and survival. One year ago, the office of the U.S. Vice President spearheaded a bold new initiative, the Cancer Moonshot, to make a decade's worth of progress in cancer prevention, diagnosis, and treatment in just 5 years. \n",
  27 |     "\n",
  28 |     "In 2017, the Data Science Bowl will be a critical milestone in support of the Cancer Moonshot by convening the data science and medical communities to develop lung cancer detection algorithms. Using a data set of thousands of high-resolution lung scans provided by the National Cancer Institute, participants will develop algorithms that accurately determine when lesions in the lungs are cancerous. This will dramatically reduce the false positive rate that plagues the current detection technology, get patients earlier access to life-saving interventions, and give radiologists more time to spend with their patients.\n",
  29 |     "\n"
  30 |    ]
  31 |   },
  32 |   {
  33 |    "cell_type": "markdown",
  34 |    "metadata": {},
  35 |    "source": [
  36 |     "# Goal\n",
  37 |     "\n",
  38 |     "The goal of this project is to pre-process the provided data (slices of CT scans) and to train and validate a 3D Convolutional Neural Network that predicts whether a patient has lung cancer. An accurate, automated classifier of this kind could support the identification and treatment of cancer in its early stages."
  39 |    ]
  40 |   },
  41 |   {
  42 |    "cell_type": "markdown",
  43 |    "metadata": {},
  44 |    "source": [
  45 |     "## Importing dataset and libraries"
  46 |    ]
  47 |   },
  48 |   {
  49 |    "cell_type": "code",
  50 |    "execution_count": null,
  51 |    "metadata": {
  52 |     "collapsed": true
  53 |    },
  54 |    "outputs": [],
  55 |    "source": [
  56 |     "import numpy as np\n",
  57 |     "import pandas as pd\n",
  58 |     "import dicom\n",
  59 |     "import os\n",
  60 |     "import matplotlib.pyplot as plt\n",
  61 |     "import cv2\n",
  62 |     "import math\n",
  63 |     "\n",
  64 |     "##Data directory\n",
  65 |     "dataDirectory = 'Lung_Cancer/stage1/stage1/'\n",
  66 |     "lungPatients = os.listdir(dataDirectory)\n",
  67 |     "\n",
  68 |     "##Read labels csv \n",
  69 |     "labels = pd.read_csv('Lung_Cancer/stage1_labels/stage1_labels.csv', index_col=0)\n",
  70 |     "\n",
  71 |     "## Resize each slice to size x size pixels (50 x 50)\n",
  72 |     "size = 50\n",
  73 |     "\n",
  74 |     "## Setting z-dimension (number of slices) to 20\n",
  75 |     "NoSlices = 20"
  76 |    ]
  77 |   },
  78 |   {
  79 |    "cell_type": "markdown",
  80 |    "metadata": {},
  81 |    "source": [
  82 |     "# Data Preprocessing"
  83 |    ]
  84 |   },
  85 |   {
  86 |    "cell_type": "markdown",
  87 |    "metadata": {},
  88 |    "source": [
  89 |     "## Functions for chunking, averaging, and processing the images\n",
  90 |     "\n",
  91 |     "Chunks - The number of slices (z-dimension) differs from one patient to another. The chunks function splits a patient's list of slices into equally sized groups and yields at most 20 of them, so every scan ends up with the same depth.\n",
  92 |     "\n",
  93 |     "Mean - Averages the slices within a chunk into a single slice.\n",
  94 |     "\n",
  95 |     "Data Processing - Reads each patient's DICOM slices, sorts them by position along the z-axis, resizes every slice to 50 x 50, and reduces the scan to 20 averaged chunks. The processed data for all labeled patients is then saved to a numpy file. "
  96 |    ]
  97 |   },
  98 |   {
  99 |    "cell_type": "code",
 100 |    "execution_count": 2,
 101 |    "metadata": {
 102 |     "collapsed": false
 103 |    },
 104 |    "outputs": [
 105 |     {
 106 |      "name": "stdout",
 107 |      "output_type": "stream",
 108 |      "text": [
 109 |       "Saved - 0\n",
 110 |       "Data is unlabeled\n",
 111 |       "Data is unlabeled\n",
 112 |       "Data is unlabeled\n",
 113 |       "Data is unlabeled\n",
 114 |       "Data is unlabeled\n",
 115 |       "Data is unlabeled\n",
 116 |       "Saved - 100\n",
 117 |       "Data is unlabeled\n",
 118 |       "Data is unlabeled\n",
 119 |       "Data is unlabeled\n",
 120 |       "Data is unlabeled\n",
 121 |       "Data is unlabeled\n",
 122 |       "Data is unlabeled\n",
 123 |       "Data is unlabeled\n",
 124 |       "Data is unlabeled\n",
 125 |       "Data is unlabeled\n",
 126 |       "Data is unlabeled\n",
 127 |       "Data is unlabeled\n",
 128 |       "Data is unlabeled\n",
 129 |       "Saved - 200\n",
 130 |       "Data is unlabeled\n",
 131 |       "Data is unlabeled\n",
 132 |       "Data is unlabeled\n",
 133 |       "Data is unlabeled\n",
 134 |       "Data is unlabeled\n",
 135 |       "Data is unlabeled\n",
 136 |       "Data is unlabeled\n",
 137 |       "Saved - 300\n",
 138 |       "Data is unlabeled\n",
 139 |       "Data is unlabeled\n",
 140 |       "Data is unlabeled\n",
 141 |       "Data is unlabeled\n",
 142 |       "Data is unlabeled\n",
 143 |       "Data is unlabeled\n",
 144 |       "Saved - 400\n",
 145 |       "Data is unlabeled\n",
 146 |       "Data is unlabeled\n",
 147 |       "Data is unlabeled\n",
 148 |       "Data is unlabeled\n",
 149 |       "Data is unlabeled\n",
 150 |       "Data is unlabeled\n",
 151 |       "Data is unlabeled\n",
 152 |       "Data is unlabeled\n",
 153 |       "Data is unlabeled\n",
 154 |       "Data is unlabeled\n",
 155 |       "Data is unlabeled\n",
 156 |       "Data is unlabeled\n",
 157 |       "Saved - 500\n",
 158 |       "Data is unlabeled\n",
 159 |       "Data is unlabeled\n",
 160 |       "Data is unlabeled\n",
 161 |       "Data is unlabeled\n",
 162 |       "Data is unlabeled\n",
 163 |       "Data is unlabeled\n",
 164 |       "Data is unlabeled\n",
 165 |       "Data is unlabeled\n",
 166 |       "Data is unlabeled\n",
 167 |       "Data is unlabeled\n",
 168 |       "Data is unlabeled\n",
 169 |       "Data is unlabeled\n",
 170 |       "Data is unlabeled\n",
 171 |       "Saved - 600\n",
 172 |       "Data is unlabeled\n",
 173 |       "Data is unlabeled\n",
 174 |       "Data is unlabeled\n",
 175 |       "Data is unlabeled\n",
 176 |       "Data is unlabeled\n",
 177 |       "Data is unlabeled\n",
 178 |       "Data is unlabeled\n",
 179 |       "Data is unlabeled\n",
 180 |       "Data is unlabeled\n",
 181 |       "Data is unlabeled\n",
 182 |       "Data is unlabeled\n",
 183 |       "Data is unlabeled\n",
 184 |       "Data is unlabeled\n",
 185 |       "Data is unlabeled\n",
 186 |       "Saved - 700\n",
 187 |       "Data is unlabeled\n",
 188 |       "Data is unlabeled\n",
 189 |       "Data is unlabeled\n",
 190 |       "Data is unlabeled\n",
 191 |       "Data is unlabeled\n",
 192 |       "Data is unlabeled\n",
 193 |       "Data is unlabeled\n",
 194 |       "Data is unlabeled\n",
 195 |       "Data is unlabeled\n",
 196 |       "Data is unlabeled\n",
 197 |       "Data is unlabeled\n",
 198 |       "Saved - 800\n",
 199 |       "Data is unlabeled\n",
 200 |       "Data is unlabeled\n",
 201 |       "Data is unlabeled\n",
 202 |       "Data is unlabeled\n",
 203 |       "Data is unlabeled\n",
 204 |       "Data is unlabeled\n",
 205 |       "Data is unlabeled\n",
 206 |       "Data is unlabeled\n",
 207 |       "Data is unlabeled\n",
 208 |       "Data is unlabeled\n",
 209 |       "Data is unlabeled\n",
 210 |       "Data is unlabeled\n",
 211 |       "Data is unlabeled\n",
 212 |       "Data is unlabeled\n",
 213 |       "Data is unlabeled\n",
 214 |       "Data is unlabeled\n",
 215 |       "Data is unlabeled\n",
 216 |       "Data is unlabeled\n",
 217 |       "Data is unlabeled\n",
 218 |       "Data is unlabeled\n",
 219 |       "Saved - 900\n",
 220 |       "Data is unlabeled\n",
 221 |       "Data is unlabeled\n",
 222 |       "Data is unlabeled\n",
 223 |       "Data is unlabeled\n",
 224 |       "Data is unlabeled\n",
 225 |       "Data is unlabeled\n",
 226 |       "Data is unlabeled\n",
 227 |       "Data is unlabeled\n",
 228 |       "Data is unlabeled\n",
 229 |       "Data is unlabeled\n",
 230 |       "Data is unlabeled\n",
 231 |       "Data is unlabeled\n",
 232 |       "Data is unlabeled\n",
 233 |       "Data is unlabeled\n",
 234 |       "Data is unlabeled\n",
 235 |       "Data is unlabeled\n",
 236 |       "Data is unlabeled\n",
 237 |       "Saved - 1000\n",
 238 |       "Data is unlabeled\n",
 239 |       "Data is unlabeled\n",
 240 |       "Data is unlabeled\n",
 241 |       "Data is unlabeled\n",
 242 |       "Data is unlabeled\n",
 243 |       "Data is unlabeled\n",
 244 |       "Data is unlabeled\n",
 245 |       "Data is unlabeled\n",
 246 |       "Data is unlabeled\n",
 247 |       "Data is unlabeled\n",
 248 |       "Data is unlabeled\n",
 249 |       "Data is unlabeled\n",
 250 |       "Data is unlabeled\n",
 251 |       "Data is unlabeled\n",
 252 |       "Saved - 1100\n",
 253 |       "Data is unlabeled\n",
 254 |       "Data is unlabeled\n",
 255 |       "Data is unlabeled\n",
 256 |       "Data is unlabeled\n",
 257 |       "Data is unlabeled\n",
 258 |       "Data is unlabeled\n",
 259 |       "Data is unlabeled\n",
 260 |       "Data is unlabeled\n",
 261 |       "Data is unlabeled\n",
 262 |       "Data is unlabeled\n",
 263 |       "Data is unlabeled\n",
 264 |       "Data is unlabeled\n",
 265 |       "Data is unlabeled\n",
 266 |       "Saved - 1200\n",
 267 |       "Data is unlabeled\n",
 268 |       "Data is unlabeled\n",
 269 |       "Data is unlabeled\n",
 270 |       "Data is unlabeled\n",
 271 |       "Data is unlabeled\n",
 272 |       "Data is unlabeled\n",
 273 |       "Data is unlabeled\n",
 274 |       "Data is unlabeled\n",
 275 |       "Data is unlabeled\n",
 276 |       "Data is unlabeled\n",
 277 |       "Data is unlabeled\n",
 278 |       "Data is unlabeled\n",
 279 |       "Data is unlabeled\n",
 280 |       "Data is unlabeled\n",
 281 |       "Data is unlabeled\n",
 282 |       "Data is unlabeled\n",
 283 |       "Saved - 1300\n",
 284 |       "Data is unlabeled\n",
 285 |       "Data is unlabeled\n",
 286 |       "Data is unlabeled\n",
 287 |       "Data is unlabeled\n",
 288 |       "Data is unlabeled\n",
 289 |       "Data is unlabeled\n",
 290 |       "Data is unlabeled\n",
 291 |       "Data is unlabeled\n",
 292 |       "Data is unlabeled\n",
 293 |       "Data is unlabeled\n",
 294 |       "Data is unlabeled\n",
 295 |       "Data is unlabeled\n",
 296 |       "Saved - 1400\n",
 297 |       "Data is unlabeled\n",
 298 |       "Data is unlabeled\n",
 299 |       "Data is unlabeled\n",
 300 |       "Data is unlabeled\n",
 301 |       "Data is unlabeled\n",
 302 |       "Data is unlabeled\n",
 303 |       "Data is unlabeled\n",
 304 |       "Data is unlabeled\n",
 305 |       "Data is unlabeled\n",
 306 |       "Data is unlabeled\n",
 307 |       "Data is unlabeled\n",
 308 |       "Data is unlabeled\n",
 309 |       "Data is unlabeled\n",
 310 |       "Data is unlabeled\n",
 311 |       "Data is unlabeled\n",
 312 |       "Saved - 1500\n",
 313 |       "Data is unlabeled\n",
 314 |       "Data is unlabeled\n",
 315 |       "Data is unlabeled\n",
 316 |       "Data is unlabeled\n",
 317 |       "Data is unlabeled\n",
 318 |       "Data is unlabeled\n",
 319 |       "Data is unlabeled\n",
 320 |       "Data is unlabeled\n",
 321 |       "Data is unlabeled\n",
 322 |       "Data is unlabeled\n"
 323 |      ]
 324 |     }
 325 |    ],
 326 |    "source": [
 327 |     "def chunks(l, n):  # yield consecutive n-sized pieces of l, at most NoSlices (global) of them\n",
 328 |     "    count = 0\n",
 329 |     "    for i in range(0, len(l), n):\n",
 330 |     "        if (count < NoSlices):\n",
 331 |     "            yield l[i:i + n]\n",
 332 |     "            count = count + 1\n",
 333 |     "\n",
 334 |     "\n",
 335 |     "def mean(l):\n",
 336 |     "    return sum(l) / len(l)\n",
 337 |     "\n",
 338 |     "\n",
 339 |     "def dataProcessing(patient, labels_df, size=50, noslices=20, visualize=False):\n",
 340 |     "    label = labels_df.at[patient, 'cancer']  # raises KeyError if the patient has no label\n",
 341 |     "    path = dataDirectory + patient\n",
 342 |     "    slices = [dicom.read_file(path + '/' + s) for s in os.listdir(path)]\n",
 343 |     "    slices.sort(key=lambda x: int(x.ImagePositionPatient[2]))\n",
 344 |     "\n",
 345 |     "    new_slices = []\n",
 346 |     "    slices = [cv2.resize(np.array(each_slice.pixel_array), (size, size)) for each_slice in slices]\n",
 347 |     "\n",
 348 |     "    chunk_sizes = math.floor(len(slices) / noslices)\n",
 349 |     "    for slice_chunk in chunks(slices, chunk_sizes):\n",
 350 |     "        slice_chunk = list(map(mean, zip(*slice_chunk)))\n",
 351 |     "        new_slices.append(slice_chunk)\n",
 352 |     "\n",
 353 |     "    if label == 1:\n",
 354 |     "        label = np.array([0, 1])\n",
 355 |     "    elif label == 0:\n",
 356 |     "        label = np.array([1, 0])\n",
 357 |     "    return np.array(new_slices), label\n",
 358 |     "\n",
 359 |     "\n",
 360 |     "imageData = []\n",
 361 |     "for num, patient in enumerate(lungPatients):\n",
 362 |     "    if num % 100 == 0:\n",
 363 |     "        print('Saved -', num)\n",
 364 |     "    try:\n",
 365 |     "        img_data, label = dataProcessing(patient, labels, size=size, noslices=NoSlices)\n",
 366 |     "        imageData.append([img_data, label, patient])\n",
 367 |     "    except KeyError as e:\n",
 368 |     "        print('Data is unlabeled')\n",
 369 |     "\n",
 370 |     "        \n",
 371 |     "## Save the results as a numpy file; dtype=object because each entry mixes arrays and a string\n",
 372 |     "np.save('imageDataNew-{}-{}-{}.npy'.format(size, size, NoSlices), np.array(imageData, dtype=object))"
 373 |    ]
 374 |   },
 375 |   {
 376 |    "cell_type": "markdown",
 377 |    "metadata": {},
 378 |    "source": [
 379 |     "The output above shows that many patients have no entry in the labels file; these scans are skipped as unlabeled."
 380 |    ]
 381 |   },
 382 |   {
 383 |    "cell_type": "markdown",
 384 |    "metadata": {},
 385 |    "source": [
 386 |     "# Feeding pre-processed data into 3D Convolution layers"
 387 |    ]
 388 |   },
 389 |   {
 390 |    "cell_type": "markdown",
 391 |    "metadata": {},
 392 |    "source": [
 393 |     "## Using the TensorFlow framework for Conv3D and importing libraries\n"
 394 |    ]
 395 |   },
 396 |   {
 397 |    "cell_type": "code",
 398 |    "execution_count": null,
 399 |    "metadata": {
 400 |     "collapsed": true
 401 |    },
 402 |    "outputs": [],
 403 |    "source": [
 404 |     "import tensorflow as tf\n",
 405 |     "import pandas as pd\n",
 406 |     "import tflearn\n",
 407 |     "from tflearn.layers.conv import conv_3d, max_pool_3d\n",
 408 |     "from tflearn.layers.core import input_data, dropout, fully_connected\n",
 409 |     "from tflearn.layers.estimator import regression\n",
 410 |     "import numpy as np\n",
 411 |     "import matplotlib.pyplot as plt"
 412 |    ]
 413 |   },
 414 |   {
 415 |    "cell_type": "markdown",
 416 |    "metadata": {},
 417 |    "source": [
 418 |     "## Loading the numpy file and setting train and validation datasets"
 419 |    ]
 420 |   },
 421 |   {
 422 |    "cell_type": "code",
 423 |    "execution_count": null,
 424 |    "metadata": {
 425 |     "collapsed": true
 426 |    },
 427 |    "outputs": [],
 428 |    "source": [
 429 |     "imageData = np.load('imageDataNew-50-50-20.npy', allow_pickle=True)\n",
 430 |     "trainingData = imageData[0:800]\n",
 431 |     "validationData = imageData[-200:-100]\n",
 432 |     "\n",
 433 |     "x = tf.placeholder('float')\n",
 434 |     "y = tf.placeholder('float')\n",
 435 |     "size = 50\n",
 436 |     "keep_rate = 0.8\n",
 437 |     "NoSlices = 20"
 438 |    ]
 439 |   },
 440 |   {
 441 |    "cell_type": "markdown",
 442 |    "metadata": {},
 443 |    "source": [
 444 |     "## Building 3D CNN model\n"
 445 |    ]
 446 |   },
 447 |   {
 448 |    "cell_type": "code",
 449 |    "execution_count": 3,
 450 |    "metadata": {
 451 |     "collapsed": false
 452 |    },
 453 |    "outputs": [
 454 |     {
 455 |      "name": "stdout",
 456 |      "output_type": "stream",
 457 |      "text": [
 458 |       "Epoch 1 completed out of 10 loss: 3.88795779023e+13\n",
 459 |       "Accuracy: 0.7\n",
 460 |       "Epoch 2 completed out of 10 loss: 1.24509060778e+13\n",
 461 |       "Accuracy: 0.6\n",
 462 |       "Epoch 3 completed out of 10 loss: 6.06095143603e+12\n",
 463 |       "Accuracy: 0.67\n",
 464 |       "Epoch 4 completed out of 10 loss: 3.88584119686e+12\n",
 465 |       "Accuracy: 0.71\n",
 466 |       "Epoch 5 completed out of 10 loss: 3.01922889866e+12\n",
 467 |       "Accuracy: 0.58\n",
 468 |       "Epoch 6 completed out of 10 loss: 1.83550548122e+12\n",
 469 |       "Accuracy: 0.48\n",
 470 |       "Epoch 7 completed out of 10 loss: 1.58098134931e+12\n",
 471 |       "Accuracy: 0.61\n",
 472 |       "Epoch 8 completed out of 10 loss: 1.02996348973e+12\n",
 473 |       "Accuracy: 0.67\n",
 474 |       "Epoch 9 completed out of 10 loss: 845322210816.0\n",
 475 |       "Accuracy: 0.52\n",
 476 |       "Epoch 10 completed out of 10 loss: 613924536640.0\n",
 477 |       "Accuracy: 0.6\n",
 478 |       "Final Accuracy: 0.61\n",
 479 |       "Patient:  dc1ecce5e7f8a4be9082cb5650fa62bd\n",
 480 |       "Actual:  No Cancer\n",
 481 |       "Predcited:  No Cancer\n",
 482 |       "Patient:  dc5cd907d9de1ed0609832f5bf1fc6e2\n",
 483 |       "Actual:  No Cancer\n",
 484 |       "Predcited:  Cancer\n",
 485 |       "Patient:  dc66d11755fd073a59743d0df6b62ee2\n",
 486 |       "Actual:  No Cancer\n",
 487 |       "Predcited:  Cancer\n",
 488 |       "Patient:  dc9854bcdcc71b690d9806438009001d\n",
 489 |       "Actual:  Cancer\n",
 490 |       "Predcited:  No Cancer\n",
 491 |       "Patient:  dcb426dd025b609489c8f520d6d644b7\n",
 492 |       "Actual:  No Cancer\n",
 493 |       "Predcited:  No Cancer\n",
 494 |       "Patient:  dcde02d4757bb845376fa6dbb0351df6\n",
 495 |       "Actual:  No Cancer\n",
 496 |       "Predcited:  No Cancer\n",
 497 |       "Patient:  dcdf0b64b314e08e8f71f3bec9ecb080\n",
 498 |       "Actual:  No Cancer\n",
 499 |       "Predcited:  No Cancer\n",
 500 |       "Patient:  dcf5fd36b9fff9183f63df286bf8eef9\n",
 501 |       "Actual:  No Cancer\n",
 502 |       "Predcited:  No Cancer\n",
 503 |       "Patient:  dcf75f484b2d2712e5033ba61fd6e2a0\n",
 504 |       "Actual:  No Cancer\n",
 505 |       "Predcited:  Cancer\n",
 506 |       "Patient:  dd281294b34eb6deb76ef9f38169d50e\n",
 507 |       "Actual:  No Cancer\n",
 508 |       "Predcited:  Cancer\n",
 509 |       "Patient:  dd571c3949cdae0b59fc0542bb23e06a\n",
 510 |       "Actual:  No Cancer\n",
 511 |       "Predcited:  Cancer\n",
 512 |       "Patient:  dd5764803d51c71a27707d9db8c84aac\n",
 513 |       "Actual:  No Cancer\n",
 514 |       "Predcited:  No Cancer\n",
 515 |       "Patient:  de04fbf5e6c2389f0d039398bdcda971\n",
 516 |       "Actual:  No Cancer\n",
 517 |       "Predcited:  No Cancer\n",
 518 |       "Patient:  de4d3724030397e71d2ac2ab16df5fba\n",
 519 |       "Actual:  No Cancer\n",
 520 |       "Predcited:  Cancer\n",
 521 |       "Patient:  de635c85f320131ee743733bb04e65b9\n",
 522 |       "Actual:  No Cancer\n",
 523 |       "Predcited:  Cancer\n",
 524 |       "Patient:  de881c07adc8d53e52391fac066ccb9f\n",
 525 |       "Actual:  No Cancer\n",
 526 |       "Predcited:  No Cancer\n",
 527 |       "Patient:  de9f65a7a70b73ce2ffef1e4a2613eee\n",
 528 |       "Actual:  No Cancer\n",
 529 |       "Predcited:  Cancer\n",
 530 |       "Patient:  df015da931ad5312ee7b24b201b67478\n",
 531 |       "Actual:  Cancer\n",
 532 |       "Predcited:  No Cancer\n",
 533 |       "Patient:  df1354de25723c9a55e1241d4c40ffe2\n",
 534 |       "Actual:  No Cancer\n",
 535 |       "Predcited:  No Cancer\n",
 536 |       "Patient:  df54dc42705decd3f75ec8fd8040e76e\n",
 537 |       "Actual:  No Cancer\n",
 538 |       "Predcited:  No Cancer\n",
 539 |       "Patient:  df75d5a21b4289e8df6e2d0e135ac48f\n",
 540 |       "Actual:  No Cancer\n",
 541 |       "Predcited:  No Cancer\n",
 542 |       "Patient:  df761dd787bfc439890740ccce934f36\n",
 543 |       "Actual:  Cancer\n",
 544 |       "Predcited:  Cancer\n",
 545 |       "Patient:  df8614fd49a196123c5b88584dd5dd65\n",
 546 |       "Actual:  No Cancer\n",
 547 |       "Predcited:  No Cancer\n",
 548 |       "Patient:  e00832e96709eb85f8e0e608ca02c2b5\n",
 549 |       "Actual:  Cancer\n",
 550 |       "Predcited:  No Cancer\n",
 551 |       "Patient:  e10c2b829c39d4a500c09caf04d461a1\n",
 552 |       "Actual:  Cancer\n",
 553 |       "Predcited:  No Cancer\n",
 554 |       "Patient:  e127111e994be5f79bb0cea52c9d563e\n",
 555 |       "Actual:  No Cancer\n",
 556 |       "Predcited:  No Cancer\n",
 557 |       "Patient:  e129305f6d074d08cd2de0ebdfeaa576\n",
 558 |       "Actual:  No Cancer\n",
 559 |       "Predcited:  No Cancer\n",
 560 |       "Patient:  e1584618a0c72f124fe618e1ed9b3e55\n",
 561 |       "Actual:  No Cancer\n",
 562 |       "Predcited:  Cancer\n",
 563 |       "Patient:  e163325ccf00afde107c80dfce2bce80\n",
 564 |       "Actual:  No Cancer\n",
 565 |       "Predcited:  No Cancer\n",
 566 |       "Patient:  e188bdeea72bb41d980dc2556dc8aafa\n",
 567 |       "Actual:  No Cancer\n",
 568 |       "Predcited:  No Cancer\n",
 569 |       "Patient:  e1c92d3f85a37bd8bb963345b6d66e03\n",
 570 |       "Actual:  No Cancer\n",
 571 |       "Predcited:  No Cancer\n",
 572 |       "Patient:  e1e47812eecd80466cf7f5b0160de446\n",
 573 |       "Actual:  No Cancer\n",
 574 |       "Predcited:  No Cancer\n",
 575 |       "Patient:  e1f3a01e73d706b7e9c30c0a17a4c0b5\n",
 576 |       "Actual:  No Cancer\n",
 577 |       "Predcited:  No Cancer\n",
 578 |       "Patient:  e2a7eaebd0830061e77690aa48f11936\n",
 579 |       "Actual:  No Cancer\n",
 580 |       "Predcited:  No Cancer\n",
 581 |       "Patient:  e2b7fe7fbb002029640c0e65e3051888\n",
 582 |       "Actual:  Cancer\n",
 583 |       "Predcited:  Cancer\n",
 584 |       "Patient:  e2bcbfe1ab0f9ddc5d6234f819cd5df5\n",
 585 |       "Actual:  No Cancer\n",
 586 |       "Predcited:  No Cancer\n",
 587 |       "Patient:  e2ea2f046495909ff89e18e05f710fee\n",
 588 |       "Actual:  No Cancer\n",
 589 |       "Predcited:  No Cancer\n",
 590 |       "Patient:  e3034ac9c2799b9a9cee2111593d9853\n",
 591 |       "Actual:  No Cancer\n",
 592 |       "Predcited:  No Cancer\n",
 593 |       "Patient:  e3423505ef6b43f03c5d7bde52a5a78c\n",
 594 |       "Actual:  Cancer\n",
 595 |       "Predcited:  No Cancer\n",
 596 |       "Patient:  e38789c5eabb3005bfb82a5298055ba0\n",
 597 |       "Actual:  Cancer\n",
 598 |       "Predcited:  Cancer\n",
 599 |       "Patient:  e3a9a6f8d21c6c459728066bcf18c615\n",
 600 |       "Actual:  No Cancer\n",
 601 |       "Predcited:  No Cancer\n",
 602 |       "Patient:  e3e518324e1a85b85f15d9127ed9ea89\n",
 603 |       "Actual:  No Cancer\n",
 604 |       "Predcited:  No Cancer\n",
 605 |       "Patient:  e414153d0e52f70cbe27c129911445a0\n",
 606 |       "Actual:  No Cancer\n",
 607 |       "Predcited:  No Cancer\n",
 608 |       "Patient:  e42815372aa308f5943847ad06f529de\n",
 609 |       "Actual:  No Cancer\n",
 610 |       "Predcited:  No Cancer\n",
 611 |       "Patient:  e43afa905c8e279f818b2d5104f6762b\n",
 612 |       "Actual:  Cancer\n",
 613 |       "Predcited:  No Cancer\n",
 614 |       "Patient:  e4421d2d5318845c1cccbc6fa308a96e\n",
 615 |       "Actual:  No Cancer\n",
 616 |       "Predcited:  No Cancer\n",
 617 |       "Patient:  e4436b5914162ff7efea2bdfb71c19ae\n",
 618 |       "Actual:  No Cancer\n",
 619 |       "Predcited:  Cancer\n",
 620 |       "Patient:  e46973b13a7a6f421430d81fc1dda970\n",
 621 |       "Actual:  No Cancer\n",
 622 |       "Predcited:  No Cancer\n",
 623 |       "Patient:  e4a87107f94e4a8e32b735d18cef1137\n",
 624 |       "Actual:  No Cancer\n",
 625 |       "Predcited:  No Cancer\n",
 626 |       "Patient:  e4ff18b33b7110a64f497e177102f23d\n",
 627 |       "Actual:  No Cancer\n",
 628 |       "Predcited:  No Cancer\n",
 629 |       "Patient:  e537c91cdfa97d20a39df7ef04a52570\n",
 630 |       "Actual:  Cancer\n",
 631 |       "Predcited:  No Cancer\n",
 632 |       "Patient:  e5438d842118e579a340a78f3c5775cc\n",
 633 |       "Actual:  No Cancer\n",
 634 |       "Predcited:  No Cancer\n",
 635 |       "Patient:  e54b574a7e7c650edc224cbdede9e675\n",
 636 |       "Actual:  Cancer\n",
 637 |       "Predcited:  No Cancer\n",
 638 |       "Patient:  e56b9f25a47a42f4ae4085005c46109c\n",
 639 |       "Actual:  Cancer\n",
 640 |       "Predcited:  No Cancer\n",
 641 |       "Patient:  e572e978c2b50aca781e6302937e5b13\n",
 642 |       "Actual:  Cancer\n",
 643 |       "Predcited:  Cancer\n",
 644 |       "Patient:  e58b78dc31d80a50285816f4ecd661e3\n",
 645 |       "Actual:  Cancer\n",
 646 |       "Predcited:  No Cancer\n",
 647 |       "Patient:  e58cc57cab8a1738041b72b156fedc56\n",
 648 |       "Actual:  No Cancer\n",
 649 |       "Predcited:  No Cancer\n",
 650 |       "Patient:  e5c68cfa0f33540da3098800f0daae2c\n",
 651 |       "Actual:  Cancer\n",
 652 |       "Predcited:  No Cancer\n",
 653 |       "Patient:  e5cf847e616cc2fe94816ffa547d2614\n",
 654 |       "Actual:  Cancer\n",
 655 |       "Predcited:  No Cancer\n",
 656 |       "Patient:  e608c0e6cf3adf3c9939593a3c322ef7\n",
 657 |       "Actual:  No Cancer\n",
 658 |       "Predcited:  No Cancer\n",
 659 |       "Patient:  e6214ef879c6d01ae598161e50e23c0c\n",
 660 |       "Actual:  No Cancer\n",
 661 |       "Predcited:  No Cancer\n",
 662 |       "Patient:  e63f43056330bc418a11208aa3a9e7f0\n",
 663 |       "Actual:  No Cancer\n",
 664 |       "Predcited:  No Cancer\n",
 665 |       "Patient:  e659f6517c4df17e86d4d87181396ea6\n",
 666 |       "Actual:  Cancer\n",
 667 |       "Predcited:  No Cancer\n",
 668 |       "Patient:  e67bc6cd24a71a486b626592d591a2da\n",
 669 |       "Actual:  No Cancer\n",
 670 |       "Predcited:  No Cancer\n",
 671 |       "Patient:  e6b3e750c6c7a70ca512d77defcfe615\n",
 672 |       "Actual:  Cancer\n",
 673 |       "Predcited:  No Cancer\n",
 674 |       "Patient:  e6d4a747235bfcc1feac759571c8485c\n",
 675 |       "Actual:  No Cancer\n",
 676 |       "Predcited:  No Cancer\n",
 677 |       "Patient:  e6d8b2631843a24e6761f2723ea30788\n",
 678 |       "Actual:  No Cancer\n",
 679 |       "Predcited:  No Cancer\n",
 680 |       "Patient:  e6f4757b8f315f31559c5c256cb8dead\n",
 681 |       "Actual:  No Cancer\n",
 682 |       "Predcited:  No Cancer\n",
 683 |       "Patient:  e709901da9ba15a95d4a29906edc01dd\n",
 684 |       "Actual:  Cancer\n",
 685 |       "Predcited:  No Cancer\n",
 686 |       "Patient:  e787e5fd289a9f1f6bba31569b7ad384\n",
 687 |       "Actual:  No Cancer\n",
 688 |       "Predcited:  No Cancer\n",
 689 |       "Patient:  e79f52e833ccca893509f0fdeeb26e9f\n",
 690 |       "Actual:  No Cancer\n",
 691 |       "Predcited:  No Cancer\n",
 692 |       "Patient:  e7adb2e4409683b9490e34b6b3604d9e\n",
 693 |       "Actual:  No Cancer\n",
 694 |       "Predcited:  Cancer\n",
 695 |       "Patient:  e7cb27a5362a7098e1437bfb1224d2dc\n",
 696 |       "Actual:  No Cancer\n",
 697 |       "Predcited:  No Cancer\n",
 698 |       "Patient:  e7d76f0723911280b64f0f83a4990c97\n",
 699 |       "Actual:  No Cancer\n",
 700 |       "Predcited:  No Cancer\n",
 701 |       "Patient:  e858263b89f0bb57597bcff325eaeecf\n",
 702 |       "Actual:  No Cancer\n",
 703 |       "Predcited:  No Cancer\n",
 704 |       "Patient:  e8be143b9f5e352f71043b24f79f5a17\n",
 705 |       "Actual:  No Cancer\n",
 706 |       "Predcited:  No Cancer\n",
 707 |       "Patient:  e8eb842ee04bbad407f85fe671f24d4f\n",
 708 |       "Actual:  Cancer\n",
 709 |       "Predcited:  No Cancer\n",
 710 |       "Patient:  e92a2ed80510513497d5252b001cfa3e\n",
 711 |       "Actual:  No Cancer\n",
 712 |       "Predcited:  No Cancer\n",
 713 |       "Patient:  e977737394cee9abb19ad07310aae8eb\n",
 714 |       "Actual:  No Cancer\n",
 715 |       "Predcited:  No Cancer\n",
 716 |       "Patient:  e9ccf1ce85c39779fafb9ec703c71555\n",
 717 |       "Actual:  Cancer\n",
 718 |       "Predcited:  No Cancer\n",
 719 |       "Patient:  ea7373271a2441b5864df2053c0f5c3e\n",
 720 |       "Actual:  Cancer\n",
 721 |       "Predcited:  No Cancer\n",
 722 |       "Patient:  eacb38abacf1214f3b456b6c9fa78697\n",
 723 |       "Actual:  No Cancer\n",
 724 |       "Predcited:  No Cancer\n",
 725 |       "Patient:  ead64f9269f2200e1d439960a1e069b4\n",
 726 |       "Actual:  No Cancer\n",
 727 |       "Predcited:  No Cancer\n",
 728 |       "Patient:  eaf753dc137e12fd06e96d27f3111043\n",
 729 |       "Actual:  Cancer\n",
 730 |       "Predcited:  No Cancer\n",
 731 |       "Patient:  eb008af181f3791fdce2376cf4773733\n",
 732 |       "Actual:  Cancer\n",
 733 |       "Predcited:  Cancer\n",
 734 |       "Patient:  eb8d5136918d6859ca3cc3abafe369ac\n",
 735 |       "Actual:  No Cancer\n",
 736 |       "Predcited:  No Cancer\n",
 737 |       "Patient:  eba18d04b18084ef64be8f22bb7905ca\n",
 738 |       "Actual:  No Cancer\n",
 739 |       "Predicted:  No Cancer\n",
 740 |       "Patient:  eba4bfb93928d424ff21b5be96b5c09b\n",
 741 |       "Actual:  No Cancer\n",
 742 |       "Predicted:  No Cancer\n",
 743 |       "Patient:  ebd601d40a18634b100c92e7db39f585\n",
 744 |       "Actual:  Cancer\n",
 745 |       "Predicted:  No Cancer\n",
 746 |       "Patient:  ed0f3c1619b2becec76ba5df66e1ea56\n",
 747 |       "Actual:  Cancer\n",
 748 |       "Predicted:  No Cancer\n",
 749 |       "Patient:  ed49b57854f5580658fb3510676e03dd\n",
 750 |       "Actual:  Cancer\n",
 751 |       "Predicted:  Cancer\n",
 752 |       "Patient:  ed83b655a1bbad40a782ad13cf27ce8f\n",
 753 |       "Actual:  No Cancer\n",
 754 |       "Predicted:  No Cancer\n",
 755 |       "Patient:  eda58f4918c4b506cd156702bf8a56a3\n",
 756 |       "Actual:  No Cancer\n",
 757 |       "Predicted:  No Cancer\n",
 758 |       "Patient:  edad1a7e85b5443e0ae9e654d2adbcba\n",
 759 |       "Actual:  No Cancer\n",
 760 |       "Predicted:  No Cancer\n",
 761 |       "Patient:  edae2e1edd1217d0c9e20eff2a7b2dd8\n",
 762 |       "Actual:  No Cancer\n",
 763 |       "Predicted:  No Cancer\n",
 764 |       "Patient:  edbf53a8478049de1494b213fdf942e6\n",
 765 |       "Actual:  No Cancer\n",
 766 |       "Predicted:  Cancer\n",
 767 |       "Patient:  ee71210fa398cbb080f6c537a503e806\n",
 768 |       "Actual:  No Cancer\n",
 769 |       "Predicted:  Cancer\n",
 770 |       "Patient:  ee88217bee233a3bfc971b450e3d8b85\n",
 771 |       "Actual:  No Cancer\n",
 772 |       "Predicted:  No Cancer\n",
 773 |       "Patient:  ee984e8fba88691aac4992fbb14f6e97\n",
 774 |       "Actual:  No Cancer\n",
 775 |       "Predicted:  Cancer\n",
 776 |       "Patient:  ee9c580272cd02741df7299892602ac7\n",
 777 |       "Actual:  No Cancer\n",
 778 |       "Predicted:  No Cancer\n",
 779 |       "Predicted   0   1\n",
 780 |       "Actual           \n",
 781 |       "0          56  17\n",
 782 |       "1          18   9\n"
 783 |      ]
 784 |     },
 785 |     {
 786 |      "name": "stderr",
 787 |      "output_type": "stream",
 788 |      "text": [
 789 |       "C:\\Users\\ashwa\\Anaconda3\\envs\\tensorflow-gpu\\lib\\site-packages\\matplotlib\\collections.py:590: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison\n",
 790 |       "  if self._edgecolors == str('face'):\n"
 791 |      ]
 792 |     }
 793 |    ],
 794 |    "source": [
 795 |     "def convolution3d(x, W):\n",
 796 |     "    return tf.nn.conv3d(x, W, strides=[1, 1, 1, 1, 1], padding='SAME')\n",
 797 |     "\n",
 798 |     "\n",
 799 |     "def maxpooling3d(x):\n",
 800 |     "    return tf.nn.max_pool3d(x, ksize=[1, 2, 2, 2, 1], strides=[1, 2, 2, 2, 1], padding='SAME')\n",
 801 |     "\n",
 802 |     "\n",
 803 |     "def cnn(x):\n",
 804 |     "    x = tf.reshape(x, shape=[-1, size, size, NoSlices, 1])\n",
 805 |     "    convolution1 = tf.nn.relu(\n",
 806 |     "        convolution3d(x, tf.Variable(tf.random_normal([3, 3, 3, 1, 32]))) + tf.Variable(tf.random_normal([32])))\n",
 807 |     "    convolution1 = maxpooling3d(convolution1)\n",
 808 |     "    convolution2 = tf.nn.relu(\n",
 809 |     "        convolution3d(convolution1, tf.Variable(tf.random_normal([3, 3, 3, 32, 64]))) + tf.Variable(\n",
 810 |     "            tf.random_normal([64])))\n",
 811 |     "    convolution2 = maxpooling3d(convolution2)\n",
 812 |     "    convolution3 = tf.nn.relu(\n",
 813 |     "        convolution3d(convolution2, tf.Variable(tf.random_normal([3, 3, 3, 64, 128]))) + tf.Variable(\n",
 814 |     "            tf.random_normal([128])))\n",
 815 |     "    convolution3 = maxpooling3d(convolution3)\n",
 816 |     "    convolution4 = tf.nn.relu(\n",
 817 |     "        convolution3d(convolution3, tf.Variable(tf.random_normal([3, 3, 3, 128, 256]))) + tf.Variable(\n",
 818 |     "            tf.random_normal([256])))\n",
 819 |     "    convolution4 = maxpooling3d(convolution4)\n",
 820 |     "    convolution5 = tf.nn.relu(\n",
 821 |     "        convolution3d(convolution4, tf.Variable(tf.random_normal([3, 3, 3, 256, 512]))) + tf.Variable(\n",
 822 |     "            tf.random_normal([512])))\n",
 823 |     "    convolution5 = maxpooling3d(convolution5)  # pool the fifth convolution (previously pooled convolution4 again)\n",
 824 |     "    fullyconnected = tf.reshape(convolution5, [-1, 2048])  # 2 x 2 x 1 x 512 after five poolings of 50 x 50 x 20\n",
 825 |     "    fullyconnected = tf.nn.relu(\n",
 826 |     "        tf.matmul(fullyconnected, tf.Variable(tf.random_normal([2048, 1024]))) + tf.Variable(tf.random_normal([1024])))\n",
 827 |     "    fullyconnected = tf.nn.dropout(fullyconnected, keep_rate)\n",
 828 |     "    output = tf.matmul(fullyconnected, tf.Variable(tf.random_normal([1024, 2]))) + tf.Variable(tf.random_normal([2]))\n",
 829 |     "    return output\n",
 830 |     "\n",
 831 |     "\n",
 832 |     "def network(x):\n",
 833 |     "    prediction = cnn(x)\n",
 834 |     "    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction, labels=y))\n",
 835 |     "    optimizer = tf.train.AdamOptimizer(learning_rate=1e-3).minimize(cost)\n",
 836 |     "    epochs = 10\n",
 837 |     "    with tf.Session() as session:\n",
 838 |     "        session.run(tf.global_variables_initializer())\n",
 839 |     "        for epoch in range(epochs):\n",
 840 |     "            epoch_loss = 0\n",
 841 |     "            for data in trainingData:\n",
 842 |     "                try:\n",
 843 |     "                    X = data[0]\n",
 844 |     "                    Y = data[1]\n",
 845 |     "                    _, c = session.run([optimizer, cost], feed_dict={x: X, y: Y})\n",
 846 |     "                    epoch_loss += c\n",
 847 |     "                except Exception as e:\n",
 848 |     "                    pass\n",
 849 |     "\n",
 850 |     "            correct = tf.equal(tf.argmax(prediction, 1), tf.argmax(y, 1))\n",
 851 |     "           # if tf.argmax(prediction, 1) == 0:\n",
 852 |     "            accuracy = tf.reduce_mean(tf.cast(correct, 'float'))\n",
 853 |     "            print('Epoch', epoch + 1, 'completed out of', epochs, 'loss:', epoch_loss)\n",
 854 |     "            # print('Correct:',correct.eval({x:[i[0] for i in validationData], y:[i[1] for i in validationData]}))\n",
 855 |     "            print('Accuracy:', accuracy.eval({x: [i[0] for i in validationData], y: [i[1] for i in validationData]}))\n",
 856 |     "        print('Final Accuracy:', accuracy.eval({x: [i[0] for i in validationData], y: [i[1] for i in validationData]}))\n",
 857 |     "        patients = []\n",
 858 |     "        actual = []\n",
 859 |     "        predicted = []\n",
 860 |     "\n",
 861 |     "        finalprediction = tf.argmax(prediction, 1)\n",
 862 |     "        actualprediction = tf.argmax(y, 1)\n",
 863 |     "        for i in range(len(validationData)):\n",
 864 |     "            patients.append(validationData[i][2])\n",
 865 |     "        for i in finalprediction.eval({x: [i[0] for i in validationData], y: [i[1] for i in validationData]}):\n",
 866 |     "            if(i==1):\n",
 867 |     "                predicted.append(\"Cancer\")\n",
 868 |     "            else:\n",
 869 |     "                predicted.append(\"No Cancer\")\n",
 870 |     "        for i in actualprediction.eval({x: [i[0] for i in validationData], y: [i[1] for i in validationData]}):\n",
 871 |     "            if(i==1):\n",
 872 |     "                actual.append(\"Cancer\")\n",
 873 |     "            else:\n",
 874 |     "                actual.append(\"No Cancer\")\n",
 875 |     "        for i in range(len(patients)):\n",
 876 |     "            print(\"Patient: \",patients[i])\n",
 877 |     "            print(\"Actual: \", actual[i])\n",
 878 |     "            print(\"Predicted: \", predicted[i])\n",
 879 |     "\n",
 880 |     "        from sklearn.metrics import confusion_matrix\n",
 881 |     "        y_actual = pd.Series(\n",
 882 |     "            (actualprediction.eval({x: [i[0] for i in validationData], y: [i[1] for i in validationData]})),\n",
 883 |     "            name='Actual')\n",
 884 |     "        y_predicted = pd.Series(\n",
 885 |     "            (finalprediction.eval({x: [i[0] for i in validationData], y: [i[1] for i in validationData]})),\n",
 886 |     "            name='Predicted')\n",
 887 |     "        df_confusion = pd.crosstab(y_actual, y_predicted)\n",
 888 |     "        print(df_confusion)\n",
 889 |     "\n",
 890 |     "        ## Function to plot confusion matrix\n",
 891 |     "        def plot_confusion_matrix(df_confusion, title='Confusion matrix', cmap=plt.cm.gray_r):\n",
 892 |     "            \n",
 893 |     "            plt.matshow(df_confusion, cmap=cmap)  # imshow  \n",
 894 |     "            # plt.title(title)\n",
 895 |     "            plt.colorbar()\n",
 896 |     "            tick_marks = np.arange(len(df_confusion.columns))\n",
 897 |     "            plt.xticks(tick_marks, df_confusion.columns, rotation=45)\n",
 898 |     "            plt.yticks(tick_marks, df_confusion.index)\n",
 899 |     "            # plt.tight_layout()\n",
 900 |     "            plt.ylabel(df_confusion.index.name)\n",
 901 |     "            plt.xlabel(df_confusion.columns.name)\n",
 902 |     "            plt.show()\n",
 903 |     "        plot_confusion_matrix(df_confusion)\n",
 904 |     "        # print(y_true,y_pred)\n",
 905 |     "        # print(confusion_matrix(y_true, y_pred))\n",
 906 |     "        # print(actualprediction.eval({x:[i[0] for i in validationData], y:[i[1] for i in validationData]}))\n",
 907 |     "        # print(finalprediction.eval({x:[i[0] for i in validationData], y:[i[1] for i in validationData]}))\n",
 908 |     "network(x)"
 909 |    ]
 910 |   },
 911 |   {
 912 |    "cell_type": "markdown",
 913 |    "metadata": {},
 914 |    "source": [
 915 |     "After training for 10 epochs, the accuracy is 61%. The confusion matrix shows 56 true negatives among the 100 validation scans. There are 18 false negatives, meaning that 18 of the 27 actual cancer cases were missed. This is a serious problem: in cancer prediction a missed case is the most costly error, so false negatives should be avoided as far as possible."
 916 |    ]
 917 |   },
 918 |   {
 919 |    "cell_type": "markdown",
 920 |    "metadata": {},
 921 |    "source": [
 922 |     "## Detecting whether a patient has cancer or not"
 923 |    ]
 924 |   },
 925 |   {
 926 |    "cell_type": "code",
 927 |    "execution_count": 6,
 928 |    "metadata": {
 929 |     "collapsed": false
 930 |    },
 931 |    "outputs": [
 932 |     {
 933 |      "name": "stdout",
 934 |      "output_type": "stream",
 935 |      "text": [
 936 |       "Patient:  ffe02fe7d2223743f7fb455dfaff3842\n",
 937 |       "Predicted:  No Cancer\n"
 938 |      ]
 939 |     }
 940 |    ],
 941 |    "source": [
 942 |     "import tensorflow as tf\n",
 943 |     "import pandas as pd\n",
 944 |     "import tflearn\n",
 945 |     "from tflearn.layers.conv import conv_3d, max_pool_3d\n",
 946 |     "from tflearn.layers.core import input_data, dropout, fully_connected\n",
 947 |     "from tflearn.layers.estimator import regression\n",
 948 |     "import numpy as np\n",
 949 |     "import pandas as pd\n",
 950 |     "import matplotlib.pyplot as plt\n",
 951 |     "\n",
 952 |     "imageData = np.load('imageDataNew-50-50-20.npy')\n",
 953 |     "trainingData = imageData[0:800]\n",
 954 |     "testData = imageData[-1:]\n",
 955 |     "\n",
 956 |     "x = tf.placeholder('float')\n",
 957 |     "y = tf.placeholder('float')\n",
 958 |     "size = 50\n",
 959 |     "keep_rate = 0.8\n",
 960 |     "NoSlices = 20\n",
 961 |     "\n",
 962 |     "\n",
 963 |     "def convolution3d(x, W):\n",
 964 |     "    return tf.nn.conv3d(x, W, strides=[1, 1, 1, 1, 1], padding='SAME')\n",
 965 |     "\n",
 966 |     "\n",
 967 |     "def maxpooling3d(x):\n",
 968 |     "    return tf.nn.max_pool3d(x, ksize=[1, 2, 2, 2, 1], strides=[1, 2, 2, 2, 1], padding='SAME')\n",
 969 |     "\n",
 970 |     "\n",
 971 |     "def cnn(x):\n",
 972 |     "    x = tf.reshape(x, shape=[-1, size, size, NoSlices, 1])\n",
 973 |     "    convolution1 = tf.nn.relu(\n",
 974 |     "        convolution3d(x, tf.Variable(tf.random_normal([3, 3, 3, 1, 32]))) + tf.Variable(tf.random_normal([32])))\n",
 975 |     "    convolution1 = maxpooling3d(convolution1)\n",
 976 |     "    convolution2 = tf.nn.relu(\n",
 977 |     "        convolution3d(convolution1, tf.Variable(tf.random_normal([3, 3, 3, 32, 64]))) + tf.Variable(\n",
 978 |     "            tf.random_normal([64])))\n",
 979 |     "    convolution2 = maxpooling3d(convolution2)\n",
 980 |     "    convolution3 = tf.nn.relu(\n",
 981 |     "        convolution3d(convolution2, tf.Variable(tf.random_normal([3, 3, 3, 64, 128]))) + tf.Variable(\n",
 982 |     "            tf.random_normal([128])))\n",
 983 |     "    convolution3 = maxpooling3d(convolution3)\n",
 984 |     "    convolution4 = tf.nn.relu(\n",
 985 |     "        convolution3d(convolution3, tf.Variable(tf.random_normal([3, 3, 3, 128, 256]))) + tf.Variable(\n",
 986 |     "            tf.random_normal([256])))\n",
 987 |     "    convolution4 = maxpooling3d(convolution4)\n",
 988 |     "    convolution5 = tf.nn.relu(\n",
 989 |     "        convolution3d(convolution4, tf.Variable(tf.random_normal([3, 3, 3, 256, 512]))) + tf.Variable(\n",
 990 |     "            tf.random_normal([512])))\n",
 991 |     "    convolution5 = maxpooling3d(convolution5)  # pool the fifth convolution (previously pooled convolution4 again)\n",
 992 |     "    fullyconnected = tf.reshape(convolution5, [-1, 2048])  # 2 x 2 x 1 x 512 after five poolings of 50 x 50 x 20\n",
 993 |     "    fullyconnected = tf.nn.relu(\n",
 994 |     "        tf.matmul(fullyconnected, tf.Variable(tf.random_normal([2048, 1024]))) + tf.Variable(tf.random_normal([1024])))\n",
 995 |     "    fullyconnected = tf.nn.dropout(fullyconnected, keep_rate)\n",
 996 |     "    output = tf.matmul(fullyconnected, tf.Variable(tf.random_normal([1024, 2]))) + tf.Variable(tf.random_normal([2]))\n",
 997 |     "    return output\n",
 998 |     "\n",
 999 |     "\n",
1000 |     "def network(x):\n",
1001 |     "    prediction = cnn(x)\n",
1002 |     "    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction, labels=y))\n",
1003 |     "    optimizer = tf.train.AdamOptimizer(learning_rate=1e-3).minimize(cost)\n",
1004 |     "    epochs = 10\n",
1005 |     "    with tf.Session() as session:\n",
1006 |     "        session.run(tf.global_variables_initializer())\n",
1007 |     "        for epoch in range(epochs):\n",
1008 |     "            epoch_loss = 0\n",
1009 |     "            for data in trainingData:\n",
1010 |     "                try:\n",
1011 |     "                    X = data[0]\n",
1012 |     "                    Y = data[1]\n",
1013 |     "                    _, c = session.run([optimizer, cost], feed_dict={x: X, y: Y})\n",
1014 |     "                    epoch_loss += c\n",
1015 |     "                except Exception as e:\n",
1016 |     "                    pass\n",
1017 |     "\n",
1018 |     "        patients = []\n",
1019 |     "        actual = []\n",
1020 |     "        predicted = []\n",
1021 |     "\n",
1022 |     "        finalprediction = tf.argmax(prediction, 1)\n",
1023 |     "        actualprediction = tf.argmax(y, 1)\n",
1024 |     "        for i in range(len(testData)):\n",
1025 |     "            patients.append(testData[i][2])\n",
1026 |     "        for i in finalprediction.eval({x: [i[0] for i in testData], y: [i[1] for i in testData]}):\n",
1027 |     "            if(i==1):\n",
1028 |     "                predicted.append(\"Cancer\")\n",
1029 |     "            else:\n",
1030 |     "                predicted.append(\"No Cancer\")\n",
1031 |     "\n",
1032 |     "        for i in range(len(patients)):\n",
1033 |     "            print(\"Patient: \",patients[i])\n",
1034 |     "            print(\"Predicted: \", predicted[i])\n",
1035 |     "\n",
1036 |     "     \n",
1037 |     "network(x)"
1038 |    ]
1039 |   },
1040 |   {
1041 |    "cell_type": "markdown",
1042 |    "metadata": {},
1043 |    "source": [
1044 |     "For patient ID ffe02fe7d2223743f7fb455dfaff3842, the model's prediction is: No Cancer"
1045 |    ]
1046 |   }
1047 |  ],
1048 |  "metadata": {
1049 |   "anaconda-cloud": {},
1050 |   "kernelspec": {
1051 |    "display_name": "Python [default]",
1052 |    "language": "python",
1053 |    "name": "python3"
1054 |   },
1055 |   "language_info": {
1056 |    "codemirror_mode": {
1057 |     "name": "ipython",
1058 |     "version": 3
1059 |    },
1060 |    "file_extension": ".py",
1061 |    "mimetype": "text/x-python",
1062 |    "name": "python",
1063 |    "nbconvert_exporter": "python",
1064 |    "pygments_lexer": "ipython3",
1065 |    "version": "3.5.2"
1066 |   }
1067 |  },
1068 |  "nbformat": 4,
1069 |  "nbformat_minor": 2
1070 | }
1071 | 


--------------------------------------------------------------------------------
/Project Report.docx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/srujanielango/Lung-Cancer-Detection-using-3D-Convolutional-Neural-Networks/fcbfdaeea2747487a33aeb66cb4474faeed76c10/Project Report.docx


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # Implementing 3D CNN for Lung Cancer Detection
 2 | 
 3 | •	Download and install CUDA so that the GPU can be used for processing, which speeds up training considerably. Also download cuDNN and copy the contents of its folders into the corresponding folders of the CUDA installation
 4 | 
 5 | •	Install Anaconda with Python 3.5
 6 | 
 7 | •	Create a conda environment from the command prompt and name it tensorflow-gpu. Follow the instructions on this page to set up TensorFlow with GPU support on the system: https://www.tensorflow.org/install/install_windows 
 8 | 
 9 | •	Activate the environment
10 | 
11 | •	Import the necessary libraries listed below:
12 | 
13 | •	OpenCV, Dicom, pandas, tensorflow, numpy, os, matplotlib, scikit-learn
14 | 
15 | •	After the packages are imported, make sure the code's indentation is preserved exactly, since incorrect indentation causes errors in Python
16 | 
17 | •	Open Jupyter Notebook from within the activated environment
18 | 
19 | •	Lung Cancer 3D CNN.ipynb contains the 3D CNN model to be trained, as well as a cell that predicts whether an individual patient has cancer
20 | 
21 | •	Once the parameters for each layer of the network are specified, the model is created. The training data is then passed to it, and the trained model is the output
22 | 
23 | •	We then run the test data through the trained model and obtain an accuracy. If the accuracy is stagnant, it suggests that the model is overfitting considerably and that the available data is insufficient. Using the trained model, we can predict for each patient whether they have cancer or not
24 | 
25 | •	The steps above are what a person needs to follow in order to run the code and reproduce the results
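
Before opening the notebook, it can help to confirm that the libraries listed above are visible from the activated environment. The snippet below is a minimal sketch of such a check, not part of the original project. The module names are assumptions, because the list gives package names rather than import names; in particular, Dicom is looked up under both `dicom` (pydicom before 1.0) and `pydicom`. The helper `missing_packages` is hypothetical.

```python
from importlib.util import find_spec

# Assumed mapping from the package names in the list above to import names.
# Each entry holds one or more candidate modules; one match is enough.
REQUIRED = {
    "OpenCV": ("cv2",),
    "Dicom": ("dicom", "pydicom"),
    "pandas": ("pandas",),
    "tensorflow": ("tensorflow",),
    "numpy": ("numpy",),
    "matplotlib": ("matplotlib",),
    "scikit-learn": ("sklearn",),
}


def missing_packages(required):
    """Return the package names for which no candidate module can be found."""
    return [
        name
        for name, candidates in required.items()
        if not any(find_spec(module) is not None for module in candidates)
    ]


missing = missing_packages(REQUIRED)
print("Missing packages:", ", ".join(missing) if missing else "none")
```

Running this inside the activated conda environment lists any package that still needs to be installed. It only checks that the modules can be located, not that their versions are compatible.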
26 | 
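
Accuracy alone says little about how many cancer cases are missed, so the following sketch derives sensitivity and specificity from confusion-matrix counts. It is an illustration under stated assumptions rather than code from the notebook. The function `binary_metrics` is hypothetical, and the counts are those of the validation confusion matrix printed in the notebook (rows are actual, columns are predicted). Because the notebook's graph keeps dropout active during evaluation, separate evaluation passes can give slightly different counts.

```python
def binary_metrics(tn, fp, fn, tp):
    """Compute accuracy, sensitivity and specificity from confusion-matrix counts.

    Assumes at least one actual positive and one actual negative,
    otherwise the divisions below are undefined.
    """
    total = tn + fp + fn + tp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # share of actual cancer cases detected
        "specificity": tn / (tn + fp),  # share of cancer-free patients cleared
    }


# Counts taken from the notebook's printed validation confusion matrix:
#   Predicted   0   1
#   Actual 0   56  17
#   Actual 1   18   9
metrics = binary_metrics(tn=56, fp=17, fn=18, tp=9)
for name, value in metrics.items():
    print("{}: {:.2f}".format(name, value))
```

A low sensitivity means that many patients with cancer are predicted as "No Cancer", which is the error this project most needs to reduce.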


--------------------------------------------------------------------------------