├── LICENSE ├── .gitignore ├── README.md ├── simple-logreg.ipynb └── inference-45-55-600-epochs-tuned-effnet-b5-30-ep.ipynb /LICENSE: -------------------------------------------------------------------------------- 1 | This is free and unencumbered software released into the public domain. 2 | 3 | Anyone is free to copy, modify, publish, use, compile, sell, or 4 | distribute this software, either in source code form or as a compiled 5 | binary, for any purpose, commercial or non-commercial, and by any 6 | means. 7 | 8 | In jurisdictions that recognize copyright laws, the author or authors 9 | of this software dedicate any and all copyright interest in the 10 | software to the public domain. We make this dedication for the benefit 11 | of the public at large and to the detriment of our heirs and 12 | successors. We intend this dedication to be an overt act of 13 | relinquishment in perpetuity of all present and future rights to this 14 | software under copyright law. 15 | 16 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 17 | EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 18 | MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 19 | IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR 20 | OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, 21 | ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR 22 | OTHER DEALINGS IN THE SOFTWARE. 
23 | 24 | For more information, please refer to 25 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | 6 | # C extensions 7 | *.so 8 | 9 | # Distribution / packaging 10 | .Python 11 | build/ 12 | develop-eggs/ 13 | dist/ 14 | downloads/ 15 | eggs/ 16 | .eggs/ 17 | lib/ 18 | lib64/ 19 | parts/ 20 | sdist/ 21 | var/ 22 | wheels/ 23 | pip-wheel-metadata/ 24 | share/python-wheels/ 25 | *.egg-info/ 26 | .installed.cfg 27 | *.egg 28 | MANIFEST 29 | 30 | # PyInstaller 31 | # Usually these files are written by a python script from a template 32 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 33 | *.manifest 34 | *.spec 35 | 36 | # Installer logs 37 | pip-log.txt 38 | pip-delete-this-directory.txt 39 | 40 | # Unit test / coverage reports 41 | htmlcov/ 42 | .tox/ 43 | .nox/ 44 | .coverage 45 | .coverage.* 46 | .cache 47 | nosetests.xml 48 | coverage.xml 49 | *.cover 50 | *.py,cover 51 | .hypothesis/ 52 | .pytest_cache/ 53 | 54 | # Translations 55 | *.mo 56 | *.pot 57 | 58 | # Django stuff: 59 | *.log 60 | local_settings.py 61 | db.sqlite3 62 | db.sqlite3-journal 63 | 64 | # Flask stuff: 65 | instance/ 66 | .webassets-cache 67 | 68 | # Scrapy stuff: 69 | .scrapy 70 | 71 | # Sphinx documentation 72 | docs/_build/ 73 | 74 | # PyBuilder 75 | target/ 76 | 77 | # Jupyter Notebook 78 | .ipynb_checkpoints 79 | 80 | # IPython 81 | profile_default/ 82 | ipython_config.py 83 | 84 | # pyenv 85 | .python-version 86 | 87 | # pipenv 88 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. 
89 | # However, in case of collaboration, if having platform-specific dependencies or dependencies 90 | # having no cross-platform support, pipenv may install dependencies that don't work, or not 91 | # install all needed dependencies. 92 | #Pipfile.lock 93 | 94 | # PEP 582; used by e.g. github.com/David-OConnor/pyflow 95 | __pypackages__/ 96 | 97 | # Celery stuff 98 | celerybeat-schedule 99 | celerybeat.pid 100 | 101 | # SageMath parsed files 102 | *.sage.py 103 | 104 | # Environments 105 | .env 106 | .venv 107 | env/ 108 | venv/ 109 | ENV/ 110 | env.bak/ 111 | venv.bak/ 112 | 113 | # Spyder project settings 114 | .spyderproject 115 | .spyproject 116 | 117 | # Rope project settings 118 | .ropeproject 119 | 120 | # mkdocs documentation 121 | /site 122 | 123 | # mypy 124 | .mypy_cache/ 125 | .dmypy.json 126 | dmypy.json 127 | 128 | # Pyre type checker 129 | .pyre/ 130 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # OSIC Pulmonary Fibrosis Progression 1st place solution 2 | 3 | ## Introduction 4 | 5 | In the [OSIC Pulmonary Fibrosis Progression](https://www.kaggle.com/c/osic-pulmonary-fibrosis-progression) competition, competitors were asked to predict the severity of a patient’s decline in lung function based on a CT scan of their lungs and some additional tabular data fields. The challenge was to use machine learning techniques to make a prediction with the image, metadata, and baseline FVC as input. The task wasn't simple. Because so little data was available, it was not easy to use traditional Computer Vision approaches to model the dependency between CT scans and patient FVC values. Moreover, the public leaderboard score was based on only 15% of the test data and didn't correlate with validation at all, which made it really hard to select the best models for the final submission. 
6 | 7 | It is hard to cover every subtle point of the competition here, so I recommend visiting the competition link above and reading about it yourself if you are interested! 8 | 9 | ## Training, models and final solution 10 | 11 | I tried a lot of things, but ironically the backbone of my best solution turned out to be this [kernel](https://www.kaggle.com/khoongweihao/efficientnets-quantile-regression-inference) :) 12 | 13 | Here is what I did to reach the final score. First, I trained both models (Quantile Regression + EfficientNet b5) from scratch. For both models, I lowered the number of epochs: 30 epochs for EfficientNet and 600 epochs for Quantile Regression. Then I changed the Quantile Regression architecture a bit, because my version worked slightly better on validation. I also removed all the "Percent"-related features from both models, which turned out to give a huge boost on the private leaderboard. The hardest decision was how to choose the weights for the blend. I simply gave a slightly higher weight to Quantile Regression because it seemed to work better for me. Finally, I made a few more improvements to the backbone notebook. For example, it selected the quantile for the EfficientNet models based on the best log-likelihood score. That step took ages to finish and did not look like a good decision to me, so I fixed the quantile at 0.5 and skipped the selection, which let my inference notebooks run in just 3 minutes in total. 14 | 15 | ## How to run 16 | 17 | `effnet_b5_training_30_epochs.ipynb` - contains the training pipeline for the best EfficientNet b5 model

18 | `inference-45-55-600-epochs-tuned-effnet-b5-30-ep.ipynb` - contains the inference part of the kernel with the highest leaderboard score

19 | `simple-logreg.ipynb` - contains a naive approach that turned out to land in the bronze zone

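The blend described in the training section can be sketched as a weighted average of the two models' FVC predictions. This is only a minimal illustration, not the code from the inference notebook: the function name `blend_predictions` and the sample numbers are hypothetical, and the 0.45 / 0.55 split is an assumption inferred from the notebook filename and from the "slightly higher weight to Quantile Regression" remark.

```python
import numpy as np

# Assumed weights (not confirmed by the README): the notebook name hints at 45/55,
# with the larger share going to Quantile Regression.
W_EFFNET = 0.45
W_QUANTILE = 0.55


def blend_predictions(effnet_pred, quantile_pred,
                      w_effnet=W_EFFNET, w_quantile=W_QUANTILE):
    """Return the weighted average of two aligned prediction arrays."""
    effnet_pred = np.asarray(effnet_pred, dtype=float)
    quantile_pred = np.asarray(quantile_pred, dtype=float)
    if effnet_pred.shape != quantile_pred.shape:
        raise ValueError("both models must predict the same rows")
    if not np.isclose(w_effnet + w_quantile, 1.0):
        raise ValueError("blend weights should sum to 1")
    return w_effnet * effnet_pred + w_quantile * quantile_pred


# Hypothetical FVC predictions (in ml) for three patient-week rows.
fvc_effnet = [2800.0, 2750.0, 2700.0]
fvc_quantile = [2900.0, 2850.0, 2800.0]
blended = blend_predictions(fvc_effnet, fvc_quantile)
print(blended)
```

With weights that sum to 1, the blended value always stays between the two model outputs, so a slightly larger Quantile Regression weight only nudges the result toward that model.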
20 | 21 | ## Final words 22 | 23 | Below I will attach links to my final submission notebooks, along with Medium and Kaggle writeups. 24 | 25 | [1st Place notebook](https://www.kaggle.com/artkulak/inference-45-55-600-epochs-tuned-effnet-b5-30-ep) 26 | 27 | [Bronze zone very simple solution](https://www.kaggle.com/artkulak/simple-logreg) 28 | 29 | [Kaggle writeup](https://www.kaggle.com/c/osic-pulmonary-fibrosis-progression/discussion/189346) 30 | 31 | [Medium writeup](https://medium.com/@artkulakov/how-i-achieved-the-1st-place-in-kaggle-osic-pulmonary-fibrosis-progression-competition-e410962c4edc) 32 | -------------------------------------------------------------------------------- /simple-logreg.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": { 7 | "_cell_guid": "b1076dfc-b9ad-4769-8c92-a6c4dae69d19", 8 | "_uuid": "8f2839f25d086af736a60e9eeb907d3b93b6e0e5", 9 | "execution": { 10 | "iopub.execute_input": "2020-10-05T12:23:51.286614Z", 11 | "iopub.status.busy": "2020-10-05T12:23:51.285687Z", 12 | "iopub.status.idle": "2020-10-05T12:23:52.299003Z", 13 | "shell.execute_reply": "2020-10-05T12:23:52.298169Z" 14 | }, 15 | "papermill": { 16 | "duration": 1.030385, 17 | "end_time": "2020-10-05T12:23:52.299142", 18 | "exception": false, 19 | "start_time": "2020-10-05T12:23:51.268757", 20 | "status": "completed" 21 | }, 22 | "tags": [] 23 | }, 24 | "outputs": [], 25 | "source": [ 26 | "import numpy as np\n", 27 | "import pandas as pd\n", 28 | "from sklearn.preprocessing import LabelEncoder\n", 29 | "import matplotlib.pyplot as plt\n", 30 | "import os\n", 31 | "from pathlib import Path\n", 32 | "from sklearn.model_selection import KFold\n", 33 | "from sklearn.metrics import mean_absolute_error\n", 34 | "\n", 35 | "le = LabelEncoder()\n", 36 | "ROOT = Path(\"../input/osic-pulmonary-fibrosis-progression\")" 37 | ] 38 | }, 39 | { 40 | "cell_type": 
"code", 41 | "execution_count": 2, 42 | "metadata": { 43 | "execution": { 44 | "iopub.execute_input": "2020-10-05T12:23:52.326476Z", 45 | "iopub.status.busy": "2020-10-05T12:23:52.325571Z", 46 | "iopub.status.idle": "2020-10-05T12:23:52.350858Z", 47 | "shell.execute_reply": "2020-10-05T12:23:52.350079Z" 48 | }, 49 | "papermill": { 50 | "duration": 0.041858, 51 | "end_time": "2020-10-05T12:23:52.350977", 52 | "exception": false, 53 | "start_time": "2020-10-05T12:23:52.309119", 54 | "status": "completed" 55 | }, 56 | "tags": [] 57 | }, 58 | "outputs": [], 59 | "source": [ 60 | "train = pd.read_csv(ROOT / 'train.csv')\n", 61 | "test = pd.read_csv(ROOT / 'test.csv')\n", 62 | "sub = pd.read_csv(ROOT / 'sample_submission.csv')" 63 | ] 64 | }, 65 | { 66 | "cell_type": "code", 67 | "execution_count": 3, 68 | "metadata": { 69 | "execution": { 70 | "iopub.execute_input": "2020-10-05T12:23:52.431084Z", 71 | "iopub.status.busy": "2020-10-05T12:23:52.384694Z", 72 | "iopub.status.idle": "2020-10-05T12:23:52.715137Z", 73 | "shell.execute_reply": "2020-10-05T12:23:52.714377Z" 74 | }, 75 | "papermill": { 76 | "duration": 0.35549, 77 | "end_time": "2020-10-05T12:23:52.715260", 78 | "exception": false, 79 | "start_time": "2020-10-05T12:23:52.359770", 80 | "status": "completed" 81 | }, 82 | "tags": [] 83 | }, 84 | "outputs": [], 85 | "source": [ 86 | "# create training data\n", 87 | "\n", 88 | "trainData = []\n", 89 | "for p in train['Patient'].unique():\n", 90 | " patientData = train[train['Patient'] == p]\n", 91 | " firstMeasure = list(patientData.iloc[0, :].values)\n", 92 | " for i, week in enumerate(patientData['Weeks'].iloc[1:]):\n", 93 | " fvc = patientData.iloc[i, 2]\n", 94 | " trainDataPoint = firstMeasure + [week, fvc]\n", 95 | " trainData.append(trainDataPoint)\n", 96 | "trainData = pd.DataFrame(trainData)\n", 97 | "\n", 98 | "trainData.columns = ['PatientID', 'first_week', 'first_FVC', 'first_Percent', 'Age', 'Sex', 'SmokingStatus'] + ['target_week', 'target_FVC']\n", 99 | 
"trainData['delta_week'] = trainData['target_week'] - trainData['first_week']\n", 100 | "trainData.drop(columns = ['first_Percent', 'target_week', 'first_week'], inplace = True)" 101 | ] 102 | }, 103 | { 104 | "cell_type": "code", 105 | "execution_count": 4, 106 | "metadata": { 107 | "execution": { 108 | "iopub.execute_input": "2020-10-05T12:23:52.747748Z", 109 | "iopub.status.busy": "2020-10-05T12:23:52.746971Z", 110 | "iopub.status.idle": "2020-10-05T12:23:52.764432Z", 111 | "shell.execute_reply": "2020-10-05T12:23:52.764995Z" 112 | }, 113 | "papermill": { 114 | "duration": 0.040873, 115 | "end_time": "2020-10-05T12:23:52.765163", 116 | "exception": false, 117 | "start_time": "2020-10-05T12:23:52.724290", 118 | "status": "completed" 119 | }, 120 | "tags": [] 121 | }, 122 | "outputs": [], 123 | "source": [ 124 | "# create testing data\n", 125 | "subSplit = np.array(list(sub['Patient_Week'].apply(lambda x: x.split('_')).values))\n", 126 | "testData = []\n", 127 | "for p in np.unique(subSplit[:, 0]):\n", 128 | " patientData = test[test['Patient'] == p]\n", 129 | " firstMeasure = list(patientData.iloc[0, :].values)\n", 130 | " for week in subSplit[subSplit[:, 0] == p, 1]:\n", 131 | " testDataPoint = firstMeasure + [week]\n", 132 | " testData.append(testDataPoint)\n", 133 | "testData = pd.DataFrame(testData)\n", 134 | "testData.columns = ['PatientID', 'first_week', 'first_FVC', 'first_Percent', 'Age', 'Sex', 'SmokingStatus'] + ['target_week']\n", 135 | "\n", 136 | "testData['delta_week'] = testData['target_week'].map(int) - testData['first_week']\n", 137 | "testData.drop(columns = ['first_Percent', 'first_week'], inplace = True)" 138 | ] 139 | }, 140 | { 141 | "cell_type": "code", 142 | "execution_count": 5, 143 | "metadata": { 144 | "execution": { 145 | "iopub.execute_input": "2020-10-05T12:23:52.795129Z", 146 | "iopub.status.busy": "2020-10-05T12:23:52.794375Z", 147 | "iopub.status.idle": "2020-10-05T12:23:52.797644Z", 148 | "shell.execute_reply": 
"2020-10-05T12:23:52.797039Z" 149 | }, 150 | "papermill": { 151 | "duration": 0.023534, 152 | "end_time": "2020-10-05T12:23:52.797778", 153 | "exception": false, 154 | "start_time": "2020-10-05T12:23:52.774244", 155 | "status": "completed" 156 | }, 157 | "tags": [] 158 | }, 159 | "outputs": [], 160 | "source": [ 161 | "# fe engineering\n", 162 | "# trainData.drop(columns = ['PatientID'], inplace = True)\n", 163 | "# testData.drop(columns = ['PatientID'], inplace = True)\n", 164 | "\n", 165 | "trainData['Sex'] = le.fit_transform(trainData['Sex'])\n", 166 | "testData['Sex'] = le.transform(testData['Sex'])\n", 167 | "\n", 168 | "trainData['SmokingStatus'] = le.fit_transform(trainData['SmokingStatus'])\n", 169 | "testData['SmokingStatus'] = le.transform(testData['SmokingStatus'])" 170 | ] 171 | }, 172 | { 173 | "cell_type": "code", 174 | "execution_count": 6, 175 | "metadata": { 176 | "execution": { 177 | "iopub.execute_input": "2020-10-05T12:23:52.823763Z", 178 | "iopub.status.busy": "2020-10-05T12:23:52.822949Z", 179 | "iopub.status.idle": "2020-10-05T12:23:53.066779Z", 180 | "shell.execute_reply": "2020-10-05T12:23:53.066038Z" 181 | }, 182 | "papermill": { 183 | "duration": 0.259927, 184 | "end_time": "2020-10-05T12:23:53.066920", 185 | "exception": false, 186 | "start_time": "2020-10-05T12:23:52.806993", 187 | "status": "completed" 188 | }, 189 | "tags": [] 190 | }, 191 | "outputs": [], 192 | "source": [ 193 | "from sklearn.linear_model import LinearRegression\n", 194 | "from sklearn.neighbors import KNeighborsRegressor\n", 195 | "\n", 196 | "model = LinearRegression()\n", 197 | "model.fit(trainData.drop(columns = ['PatientID', 'target_FVC']), trainData['target_FVC'])\n", 198 | "prediction = model.predict(testData.drop(columns = ['PatientID', 'target_week']))" 199 | ] 200 | }, 201 | { 202 | "cell_type": "code", 203 | "execution_count": 7, 204 | "metadata": { 205 | "execution": { 206 | "iopub.execute_input": "2020-10-05T12:23:53.122787Z", 207 | "iopub.status.busy": 
"2020-10-05T12:23:53.107111Z", 208 | "iopub.status.idle": "2020-10-05T12:23:53.129995Z", 209 | "shell.execute_reply": "2020-10-05T12:23:53.129169Z" 210 | }, 211 | "papermill": { 212 | "duration": 0.053582, 213 | "end_time": "2020-10-05T12:23:53.130154", 214 | "exception": false, 215 | "start_time": "2020-10-05T12:23:53.076572", 216 | "status": "completed" 217 | }, 218 | "tags": [] 219 | }, 220 | "outputs": [], 221 | "source": [ 222 | "sub = []\n", 223 | "for i in range(testData.shape[0]):\n", 224 | " patient, week, pred = testData.loc[i, 'PatientID'], testData.loc[i, 'target_week'], prediction[i]\n", 225 | " confidence = 225\n", 226 | " sub.append([patient + '_' + str(week), pred, confidence])\n", 227 | "sub = pd.DataFrame(sub)\n", 228 | "sub.columns = ['Patient_Week', 'FVC', 'Confidence']" 229 | ] 230 | }, 231 | { 232 | "cell_type": "code", 233 | "execution_count": 8, 234 | "metadata": { 235 | "execution": { 236 | "iopub.execute_input": "2020-10-05T12:23:53.157395Z", 237 | "iopub.status.busy": "2020-10-05T12:23:53.156434Z", 238 | "iopub.status.idle": "2020-10-05T12:23:53.555078Z", 239 | "shell.execute_reply": "2020-10-05T12:23:53.554261Z" 240 | }, 241 | "papermill": { 242 | "duration": 0.413318, 243 | "end_time": "2020-10-05T12:23:53.555215", 244 | "exception": false, 245 | "start_time": "2020-10-05T12:23:53.141897", 246 | "status": "completed" 247 | }, 248 | "tags": [] 249 | }, 250 | "outputs": [], 251 | "source": [ 252 | "sub.to_csv('submission.csv', index=False)" 253 | ] 254 | }, 255 | { 256 | "cell_type": "code", 257 | "execution_count": null, 258 | "metadata": { 259 | "papermill": { 260 | "duration": 0.009311, 261 | "end_time": "2020-10-05T12:23:53.574285", 262 | "exception": false, 263 | "start_time": "2020-10-05T12:23:53.564974", 264 | "status": "completed" 265 | }, 266 | "tags": [] 267 | }, 268 | "outputs": [], 269 | "source": [] 270 | } 271 | ], 272 | "metadata": { 273 | "kernelspec": { 274 | "display_name": "Python 3", 275 | "language": "python", 276 | 
"name": "python3" 277 | }, 278 | "language_info": { 279 | "codemirror_mode": { 280 | "name": "ipython", 281 | "version": 3 282 | }, 283 | "file_extension": ".py", 284 | "mimetype": "text/x-python", 285 | "name": "python", 286 | "nbconvert_exporter": "python", 287 | "pygments_lexer": "ipython3", 288 | "version": "3.7.6" 289 | }, 290 | "papermill": { 291 | "duration": 7.463196, 292 | "end_time": "2020-10-05T12:23:53.692046", 293 | "environment_variables": {}, 294 | "exception": null, 295 | "input_path": "__notebook__.ipynb", 296 | "output_path": "__notebook__.ipynb", 297 | "parameters": {}, 298 | "start_time": "2020-10-05T12:23:46.228850", 299 | "version": "2.1.0" 300 | } 301 | }, 302 | "nbformat": 4, 303 | "nbformat_minor": 4 304 | } 305 | -------------------------------------------------------------------------------- /inference-45-55-600-epochs-tuned-effnet-b5-30-ep.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "papermill": { 7 | "duration": 0.045308, 8 | "end_time": "2020-10-05T09:02:17.066605", 9 | "exception": false, 10 | "start_time": "2020-10-05T09:02:17.021297", 11 | "status": "completed" 12 | }, 13 | "tags": [] 14 | }, 15 | "source": [ 16 | "# Imports" 17 | ] 18 | }, 19 | { 20 | "cell_type": "code", 21 | "execution_count": 1, 22 | "metadata": { 23 | "_kg_hide-output": true, 24 | "execution": { 25 | "iopub.execute_input": "2020-10-05T09:02:17.151799Z", 26 | "iopub.status.busy": "2020-10-05T09:02:17.150974Z", 27 | "iopub.status.idle": "2020-10-05T09:02:35.085064Z", 28 | "shell.execute_reply": "2020-10-05T09:02:35.083972Z" 29 | }, 30 | "papermill": { 31 | "duration": 17.973228, 32 | "end_time": "2020-10-05T09:02:35.085201", 33 | "exception": false, 34 | "start_time": "2020-10-05T09:02:17.111973", 35 | "status": "completed" 36 | }, 37 | "tags": [] 38 | }, 39 | "outputs": [ 40 | { 41 | "name": "stdout", 42 | "output_type": "stream", 43 | 
"text": [ 44 | "Looking in links: ./\r\n", 45 | "Processing /kaggle/input/kerasapplications/keras-team-keras-applications-3b180cb\r\n", 46 | "Requirement already satisfied: numpy>=1.9.1 in /opt/conda/lib/python3.7/site-packages (from Keras-Applications==1.0.8) (1.18.5)\r\n", 47 | "Requirement already satisfied: h5py in /opt/conda/lib/python3.7/site-packages (from Keras-Applications==1.0.8) (2.10.0)\r\n", 48 | "Requirement already satisfied: six in /opt/conda/lib/python3.7/site-packages (from h5py->Keras-Applications==1.0.8) (1.14.0)\r\n", 49 | "Building wheels for collected packages: Keras-Applications\r\n", 50 | " Building wheel for Keras-Applications (setup.py) ... \u001b[?25l-\b \b\\\b \b|\b \bdone\r\n", 51 | "\u001b[?25h Created wheel for Keras-Applications: filename=Keras_Applications-1.0.8-py3-none-any.whl size=50704 sha256=eb52f40d2c2029861b51bb8858eff96f384a865e21addd19c57a4bafcc28794d\r\n", 52 | " Stored in directory: /root/.cache/pip/wheels/f4/96/13/eccdd9391bd8df958d78851b98ec4dc207ba05b67b011eb70a\r\n", 53 | "Successfully built Keras-Applications\r\n", 54 | "Installing collected packages: Keras-Applications\r\n", 55 | "Successfully installed Keras-Applications-1.0.8\r\n", 56 | "Looking in links: ./\r\n", 57 | "Processing /kaggle/input/efficientnet/efficientnet-1.1.0\r\n", 58 | "Requirement already satisfied: keras_applications<=1.0.8,>=1.0.7 in /opt/conda/lib/python3.7/site-packages (from efficientnet==1.1.0) (1.0.8)\r\n", 59 | "Requirement already satisfied: scikit-image in /opt/conda/lib/python3.7/site-packages (from efficientnet==1.1.0) (0.16.2)\r\n", 60 | "Requirement already satisfied: h5py in /opt/conda/lib/python3.7/site-packages (from keras_applications<=1.0.8,>=1.0.7->efficientnet==1.1.0) (2.10.0)\r\n", 61 | "Requirement already satisfied: numpy>=1.9.1 in /opt/conda/lib/python3.7/site-packages (from keras_applications<=1.0.8,>=1.0.7->efficientnet==1.1.0) (1.18.5)\r\n", 62 | "Requirement already satisfied: scipy>=0.19.0 in 
/opt/conda/lib/python3.7/site-packages (from scikit-image->efficientnet==1.1.0) (1.4.1)\r\n", 63 | "Requirement already satisfied: matplotlib!=3.0.0,>=2.0.0 in /opt/conda/lib/python3.7/site-packages (from scikit-image->efficientnet==1.1.0) (3.2.1)\r\n", 64 | "Requirement already satisfied: networkx>=2.0 in /opt/conda/lib/python3.7/site-packages (from scikit-image->efficientnet==1.1.0) (2.4)\r\n", 65 | "Requirement already satisfied: pillow>=4.3.0 in /opt/conda/lib/python3.7/site-packages (from scikit-image->efficientnet==1.1.0) (7.2.0)\r\n", 66 | "Requirement already satisfied: imageio>=2.3.0 in /opt/conda/lib/python3.7/site-packages (from scikit-image->efficientnet==1.1.0) (2.8.0)\r\n", 67 | "Requirement already satisfied: PyWavelets>=0.4.0 in /opt/conda/lib/python3.7/site-packages (from scikit-image->efficientnet==1.1.0) (1.1.1)\r\n", 68 | "Requirement already satisfied: six in /opt/conda/lib/python3.7/site-packages (from h5py->keras_applications<=1.0.8,>=1.0.7->efficientnet==1.1.0) (1.14.0)\r\n", 69 | "Requirement already satisfied: cycler>=0.10 in /opt/conda/lib/python3.7/site-packages (from matplotlib!=3.0.0,>=2.0.0->scikit-image->efficientnet==1.1.0) (0.10.0)\r\n", 70 | "Requirement already satisfied: kiwisolver>=1.0.1 in /opt/conda/lib/python3.7/site-packages (from matplotlib!=3.0.0,>=2.0.0->scikit-image->efficientnet==1.1.0) (1.2.0)\r\n", 71 | "Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /opt/conda/lib/python3.7/site-packages (from matplotlib!=3.0.0,>=2.0.0->scikit-image->efficientnet==1.1.0) (2.4.7)\r\n", 72 | "Requirement already satisfied: python-dateutil>=2.1 in /opt/conda/lib/python3.7/site-packages (from matplotlib!=3.0.0,>=2.0.0->scikit-image->efficientnet==1.1.0) (2.8.1)\r\n", 73 | "Requirement already satisfied: decorator>=4.3.0 in /opt/conda/lib/python3.7/site-packages (from networkx>=2.0->scikit-image->efficientnet==1.1.0) (4.4.2)\r\n", 74 | "Building wheels for collected packages: efficientnet\r\n", 75 | " Building 
wheel for efficientnet (setup.py) ... \u001b[?25l-\b \b\\\b \bdone\r\n", 76 | "\u001b[?25h Created wheel for efficientnet: filename=efficientnet-1.1.0-py3-none-any.whl size=14141 sha256=8e595e4c423f4523eed3a875b4096d29947a6f6a2adf4575d0870f0a3c30e205\r\n", 77 | " Stored in directory: /root/.cache/pip/wheels/24/f5/31/3cc20871288fe532128224a3f5af7b4d67efb9835bd5683522\r\n", 78 | "Successfully built efficientnet\r\n", 79 | "Installing collected packages: efficientnet\r\n", 80 | "Successfully installed efficientnet-1.1.0\r\n" 81 | ] 82 | } 83 | ], 84 | "source": [ 85 | "!pip install ../input/kerasapplications/keras-team-keras-applications-3b180cb -f ./ --no-index\n", 86 | "!pip install ../input/efficientnet/efficientnet-1.1.0/ -f ./ --no-index" 87 | ] 88 | }, 89 | { 90 | "cell_type": "code", 91 | "execution_count": 2, 92 | "metadata": { 93 | "_cell_guid": "79c7e3d0-c299-4dcb-8224-4455121ee9b0", 94 | "_uuid": "d629ff2d2480ee46fbb7e2d37f6b5fab8052498a", 95 | "execution": { 96 | "iopub.execute_input": "2020-10-05T09:02:35.177202Z", 97 | "iopub.status.busy": "2020-10-05T09:02:35.176215Z", 98 | "iopub.status.idle": "2020-10-05T09:02:41.410549Z", 99 | "shell.execute_reply": "2020-10-05T09:02:41.409905Z" 100 | }, 101 | "papermill": { 102 | "duration": 6.283714, 103 | "end_time": "2020-10-05T09:02:41.410703", 104 | "exception": false, 105 | "start_time": "2020-10-05T09:02:35.126989", 106 | "status": "completed" 107 | }, 108 | "tags": [] 109 | }, 110 | "outputs": [], 111 | "source": [ 112 | "import os\n", 113 | "import cv2\n", 114 | "import pydicom\n", 115 | "import pandas as pd\n", 116 | "import numpy as np \n", 117 | "import tensorflow as tf \n", 118 | "import matplotlib.pyplot as plt \n", 119 | "import random\n", 120 | "from tqdm.notebook import tqdm \n", 121 | "from sklearn.model_selection import train_test_split, KFold\n", 122 | "from sklearn.metrics import mean_absolute_error\n", 123 | "from tensorflow_addons.optimizers import RectifiedAdam\n", 124 | "from 
tensorflow.keras import Model\n", 125 | "import tensorflow.keras.backend as K\n", 126 | "import tensorflow.keras.layers as L\n", 127 | "import tensorflow.keras.models as M\n", 128 | "from tensorflow.keras.optimizers import Nadam\n", 129 | "import seaborn as sns\n", 130 | "from PIL import Image\n", 131 | "\n", 132 | "def seed_everything(seed=2020):\n", 133 | " random.seed(seed)\n", 134 | " os.environ['PYTHONHASHSEED'] = str(seed)\n", 135 | " np.random.seed(seed)\n", 136 | " tf.random.set_seed(seed)\n", 137 | " \n", 138 | "seed_everything(42)" 139 | ] 140 | }, 141 | { 142 | "cell_type": "code", 143 | "execution_count": 3, 144 | "metadata": { 145 | "execution": { 146 | "iopub.execute_input": "2020-10-05T09:02:41.492318Z", 147 | "iopub.status.busy": "2020-10-05T09:02:41.491628Z", 148 | "iopub.status.idle": "2020-10-05T09:02:44.419043Z", 149 | "shell.execute_reply": "2020-10-05T09:02:44.418491Z" 150 | }, 151 | "papermill": { 152 | "duration": 2.970052, 153 | "end_time": "2020-10-05T09:02:44.419163", 154 | "exception": false, 155 | "start_time": "2020-10-05T09:02:41.449111", 156 | "status": "completed" 157 | }, 158 | "tags": [] 159 | }, 160 | "outputs": [], 161 | "source": [ 162 | "config = tf.compat.v1.ConfigProto()\n", 163 | "config.gpu_options.allow_growth = True\n", 164 | "session = tf.compat.v1.Session(config=config)" 165 | ] 166 | }, 167 | { 168 | "cell_type": "code", 169 | "execution_count": 4, 170 | "metadata": { 171 | "execution": { 172 | "iopub.execute_input": "2020-10-05T09:02:44.503733Z", 173 | "iopub.status.busy": "2020-10-05T09:02:44.502875Z", 174 | "iopub.status.idle": "2020-10-05T09:02:44.513556Z", 175 | "shell.execute_reply": "2020-10-05T09:02:44.514224Z" 176 | }, 177 | "papermill": { 178 | "duration": 0.056481, 179 | "end_time": "2020-10-05T09:02:44.514363", 180 | "exception": false, 181 | "start_time": "2020-10-05T09:02:44.457882", 182 | "status": "completed" 183 | }, 184 | "tags": [] 185 | }, 186 | "outputs": [], 187 | "source": [ 188 | "train = 
pd.read_csv('../input/osic-pulmonary-fibrosis-progression/train.csv') " 189 | ] 190 | }, 191 | { 192 | "cell_type": "markdown", 193 | "metadata": { 194 | "papermill": { 195 | "duration": 0.038625, 196 | "end_time": "2020-10-05T09:02:44.591792", 197 | "exception": false, 198 | "start_time": "2020-10-05T09:02:44.553167", 199 | "status": "completed" 200 | }, 201 | "tags": [] 202 | }, 203 | "source": [ 204 | "# Linear Decay (based on EfficientNets)" 205 | ] 206 | }, 207 | { 208 | "cell_type": "code", 209 | "execution_count": 5, 210 | "metadata": { 211 | "execution": { 212 | "iopub.execute_input": "2020-10-05T09:02:44.679512Z", 213 | "iopub.status.busy": "2020-10-05T09:02:44.677685Z", 214 | "iopub.status.idle": "2020-10-05T09:02:44.680178Z", 215 | "shell.execute_reply": "2020-10-05T09:02:44.680652Z" 216 | }, 217 | "papermill": { 218 | "duration": 0.050531, 219 | "end_time": "2020-10-05T09:02:44.680778", 220 | "exception": false, 221 | "start_time": "2020-10-05T09:02:44.630247", 222 | "status": "completed" 223 | }, 224 | "tags": [] 225 | }, 226 | "outputs": [], 227 | "source": [ 228 | "def get_tab(df):\n", 229 | " vector = [(df.Age.values[0] - 30) / 30] \n", 230 | " \n", 231 | " if df.Sex.values[0] == 'male':\n", 232 | " vector.append(0)\n", 233 | " else:\n", 234 | " vector.append(1)\n", 235 | " \n", 236 | " if df.SmokingStatus.values[0] == 'Never smoked':\n", 237 | " vector.extend([0,0])\n", 238 | " elif df.SmokingStatus.values[0] == 'Ex-smoker':\n", 239 | " vector.extend([1,1])\n", 240 | " elif df.SmokingStatus.values[0] == 'Currently smokes':\n", 241 | " vector.extend([0,1])\n", 242 | " else:\n", 243 | " vector.extend([1,0])\n", 244 | " return np.array(vector) " 245 | ] 246 | }, 247 | { 248 | "cell_type": "code", 249 | "execution_count": 6, 250 | "metadata": { 251 | "execution": { 252 | "iopub.execute_input": "2020-10-05T09:02:44.770445Z", 253 | "iopub.status.busy": "2020-10-05T09:02:44.769873Z", 254 | "iopub.status.idle": "2020-10-05T09:02:45.142054Z", 255 | 
"shell.execute_reply": "2020-10-05T09:02:45.141188Z" 256 | }, 257 | "papermill": { 258 | "duration": 0.422925, 259 | "end_time": "2020-10-05T09:02:45.142210", 260 | "exception": false, 261 | "start_time": "2020-10-05T09:02:44.719285", 262 | "status": "completed" 263 | }, 264 | "tags": [] 265 | }, 266 | "outputs": [ 267 | { 268 | "data": { 269 | "application/vnd.jupyter.widget-view+json": { 270 | "model_id": "f30423f9d7cf4780a01f51dde8ae36e3", 271 | "version_major": 2, 272 | "version_minor": 0 273 | }, 274 | "text/plain": [ 275 | "HBox(children=(FloatProgress(value=1.0, bar_style='info', max=1.0), HTML(value='')))" 276 | ] 277 | }, 278 | "metadata": {}, 279 | "output_type": "display_data" 280 | }, 281 | { 282 | "name": "stderr", 283 | "output_type": "stream", 284 | "text": [ 285 | "/opt/conda/lib/python3.7/site-packages/ipykernel_launcher.py:9: FutureWarning: `rcond` parameter will change to the default of machine precision times ``max(M, N)`` where M and N are the input matrix dimensions.\n", 286 | "To use the future default and silence this warning we advise to pass `rcond=None`, to keep using the old, explicitly pass `rcond=-1`.\n", 287 | " if __name__ == '__main__':\n" 288 | ] 289 | }, 290 | { 291 | "name": "stdout", 292 | "output_type": "stream", 293 | "text": [ 294 | "\n" 295 | ] 296 | } 297 | ], 298 | "source": [ 299 | "A = {} \n", 300 | "TAB = {} \n", 301 | "P = [] \n", 302 | "for i, p in tqdm(enumerate(train.Patient.unique())):\n", 303 | " sub = train.loc[train.Patient == p, :] \n", 304 | " fvc = sub.FVC.values\n", 305 | " weeks = sub.Weeks.values\n", 306 | " c = np.vstack([weeks, np.ones(len(weeks))]).T\n", 307 | " a, b = np.linalg.lstsq(c, fvc)[0]\n", 308 | " \n", 309 | " A[p] = a\n", 310 | " TAB[p] = get_tab(sub)\n", 311 | " P.append(p)" 312 | ] 313 | }, 314 | { 315 | "cell_type": "markdown", 316 | "metadata": { 317 | "papermill": { 318 | "duration": 0.041692, 319 | "end_time": "2020-10-05T09:02:45.238864", 320 | "exception": false, 321 | "start_time": 
"2020-10-05T09:02:45.197172", 322 | "status": "completed" 323 | }, 324 | "tags": [] 325 | }, 326 | "source": [ 327 | "## CNN for coeff prediction" 328 | ] 329 | }, 330 | { 331 | "cell_type": "code", 332 | "execution_count": 7, 333 | "metadata": { 334 | "execution": { 335 | "iopub.execute_input": "2020-10-05T09:02:45.325087Z", 336 | "iopub.status.busy": "2020-10-05T09:02:45.324310Z", 337 | "iopub.status.idle": "2020-10-05T09:02:45.328316Z", 338 | "shell.execute_reply": "2020-10-05T09:02:45.327833Z" 339 | }, 340 | "papermill": { 341 | "duration": 0.049175, 342 | "end_time": "2020-10-05T09:02:45.328416", 343 | "exception": false, 344 | "start_time": "2020-10-05T09:02:45.279241", 345 | "status": "completed" 346 | }, 347 | "tags": [] 348 | }, 349 | "outputs": [], 350 | "source": [ 351 | "def get_img(path):\n", 352 | " d = pydicom.dcmread(path)\n", 353 | " return cv2.resize(d.pixel_array / 2**11, (512, 512))" 354 | ] 355 | }, 356 | { 357 | "cell_type": "code", 358 | "execution_count": 8, 359 | "metadata": { 360 | "execution": { 361 | "iopub.execute_input": "2020-10-05T09:02:45.423416Z", 362 | "iopub.status.busy": "2020-10-05T09:02:45.422727Z", 363 | "iopub.status.idle": "2020-10-05T09:02:45.426079Z", 364 | "shell.execute_reply": "2020-10-05T09:02:45.426606Z" 365 | }, 366 | "papermill": { 367 | "duration": 0.058362, 368 | "end_time": "2020-10-05T09:02:45.426765", 369 | "exception": false, 370 | "start_time": "2020-10-05T09:02:45.368403", 371 | "status": "completed" 372 | }, 373 | "tags": [] 374 | }, 375 | "outputs": [], 376 | "source": [ 377 | "from tensorflow.keras.utils import Sequence\n", 378 | "\n", 379 | "class IGenerator(Sequence):\n", 380 | " BAD_ID = ['ID00011637202177653955184', 'ID00052637202186188008618']\n", 381 | " def __init__(self, keys, a, tab, batch_size=32):\n", 382 | " self.keys = [k for k in keys if k not in self.BAD_ID]\n", 383 | " self.a = a\n", 384 | " self.tab = tab\n", 385 | " self.batch_size = batch_size\n", 386 | " \n", 387 | " self.train_data = 
{}\n", 388 | " for p in train.Patient.values:\n", 389 | " self.train_data[p] = os.listdir(f'../input/osic-pulmonary-fibrosis-progression/train/{p}/')\n", 390 | " \n", 391 | " def __len__(self):\n", 392 | " return 1000\n", 393 | " \n", 394 | " def __getitem__(self, idx):\n", 395 | " x = []\n", 396 | " a, tab = [], [] \n", 397 | " keys = np.random.choice(self.keys, size = self.batch_size)\n", 398 | " for k in keys:\n", 399 | " try:\n", 400 | " i = np.random.choice(self.train_data[k], size=1)[0]\n", 401 | " img = get_img(f'../input/osic-pulmonary-fibrosis-progression/train/{k}/{i}')\n", 402 | " x.append(img)\n", 403 | " a.append(self.a[k])\n", 404 | " tab.append(self.tab[k])\n", 405 | " except:\n", 406 | " print(k, i)\n", 407 | " \n", 408 | " x,a,tab = np.array(x), np.array(a), np.array(tab)\n", 409 | " x = np.expand_dims(x, axis=-1)\n", 410 | " return [x, tab] , a" 411 | ] 412 | }, 413 | { 414 | "cell_type": "code", 415 | "execution_count": 9, 416 | "metadata": { 417 | "execution": { 418 | "iopub.execute_input": "2020-10-05T09:02:45.527243Z", 419 | "iopub.status.busy": "2020-10-05T09:02:45.521989Z", 420 | "iopub.status.idle": "2020-10-05T09:03:41.144478Z", 421 | "shell.execute_reply": "2020-10-05T09:03:41.145102Z" 422 | }, 423 | "papermill": { 424 | "duration": 55.671746, 425 | "end_time": "2020-10-05T09:03:41.145253", 426 | "exception": false, 427 | "start_time": "2020-10-05T09:02:45.473507", 428 | "status": "completed" 429 | }, 430 | "tags": [] 431 | }, 432 | "outputs": [ 433 | { 434 | "name": "stdout", 435 | "output_type": "stream", 436 | "text": [ 437 | "Number of models: 1\n" 438 | ] 439 | } 440 | ], 441 | "source": [ 442 | "from tensorflow.keras.layers import (\n", 443 | " Dense, Dropout, Activation, Flatten, Input, BatchNormalization, GlobalAveragePooling2D, Add, Conv2D, AveragePooling2D, \n", 444 | " LeakyReLU, Concatenate \n", 445 | ")\n", 446 | "import efficientnet.tfkeras as efn\n", 447 | "\n", 448 | "def get_efficientnet(model, shape):\n", 449 | " 
models_dict = { # map names to constructors so that only the requested backbone is built\n", 450 | " 'b0': efn.EfficientNetB0,\n", 451 | " 'b1': efn.EfficientNetB1,\n", 452 | " 'b2': efn.EfficientNetB2,\n", 453 | " 'b3': efn.EfficientNetB3,\n", 454 | " 'b4': efn.EfficientNetB4,\n", 455 | " 'b5': efn.EfficientNetB5,\n", 456 | " 'b6': efn.EfficientNetB6,\n", 457 | " 'b7': efn.EfficientNetB7\n", 458 | " }\n", 459 | " return models_dict[model](input_shape=shape, weights=None, include_top=False)\n", 460 | "\n", 461 | "def build_model(shape=(512, 512, 1), model_class=None):\n", 462 | " inp = Input(shape=shape)\n", 463 | " base = get_efficientnet(model_class, shape)\n", 464 | " x = base(inp)\n", 465 | " x = GlobalAveragePooling2D()(x)\n", 466 | " inp2 = Input(shape=(4,))\n", 467 | " x2 = tf.keras.layers.GaussianNoise(0.2)(inp2)\n", 468 | " x = Concatenate()([x, x2]) \n", 469 | " x = Dropout(0.5)(x) \n", 470 | " x = Dense(1)(x)\n", 471 | " model = Model([inp, inp2] , x)\n", 472 | " \n", 473 | " weights = [w for w in os.listdir('../input/osic-model-weights') if model_class in w][0]\n", 474 | "# model.load_weights('../input/osic-model-weights/' + weights)\n", 475 | " model.load_weights('../input/effnet-b5-30epochs-1/effnet_30.h5')\n", 476 | " return model\n", 477 | "\n", 478 | "model_classes = ['b5'] #['b0','b1','b2','b3','b4','b5','b6','b7']\n", 479 | "models = [build_model(shape=(512, 512, 1), model_class=m) for m in model_classes]\n", 480 | "print('Number of models: ' + str(len(models)))" 481 | ] 482 | }, 483 | { 484 | "cell_type": "code", 485 | "execution_count": 10, 486 | "metadata": { 487 | "execution": { 488 | "iopub.execute_input": "2020-10-05T09:03:41.232814Z", 489 | "iopub.status.busy": 
"2020-10-05T09:03:41.231923Z", 490 | "iopub.status.idle": "2020-10-05T09:03:41.234172Z", 491 | "shell.execute_reply": "2020-10-05T09:03:41.234637Z" 492 | }, 493 | "papermill": { 494 | "duration": 0.0489, 495 | "end_time": "2020-10-05T09:03:41.234770", 496 | "exception": false, 497 | "start_time": "2020-10-05T09:03:41.185870", 498 | "status": "completed" 499 | }, 500 | "tags": [] 501 | }, 502 | "outputs": [], 503 | "source": [ 504 | "from sklearn.model_selection import train_test_split \n", 505 | "\n", 506 | "tr_p, vl_p = train_test_split(P, \n", 507 | " shuffle=True, \n", 508 | " train_size= 1) " 509 | ] 510 | }, 511 | { 512 | "cell_type": "code", 513 | "execution_count": 11, 514 | "metadata": { 515 | "execution": { 516 | "iopub.execute_input": "2020-10-05T09:03:41.320375Z", 517 | "iopub.status.busy": "2020-10-05T09:03:41.319770Z", 518 | "iopub.status.idle": "2020-10-05T09:03:41.507834Z", 519 | "shell.execute_reply": "2020-10-05T09:03:41.508365Z" 520 | }, 521 | "papermill": { 522 | "duration": 0.233191, 523 | "end_time": "2020-10-05T09:03:41.508496", 524 | "exception": false, 525 | "start_time": "2020-10-05T09:03:41.275305", 526 | "status": "completed" 527 | }, 528 | "tags": [] 529 | }, 530 | "outputs": [ 531 | { 532 | "data": { 533 | "image/png": "\n", 534 | "text/plain": [ 535 | "
" 536 | ] 537 | }, 538 | "metadata": { 539 | "needs_background": "light" 540 | }, 541 | "output_type": "display_data" 542 | } 543 | ], 544 | "source": [ 545 | "sns.distplot(list(A.values()));" 546 | ] 547 | }, 548 | { 549 | "cell_type": "code", 550 | "execution_count": 12, 551 | "metadata": { 552 | "execution": { 553 | "iopub.execute_input": "2020-10-05T09:03:41.601132Z", 554 | "iopub.status.busy": "2020-10-05T09:03:41.600409Z", 555 | "iopub.status.idle": "2020-10-05T09:03:41.604055Z", 556 | "shell.execute_reply": "2020-10-05T09:03:41.603560Z" 557 | }, 558 | "papermill": { 559 | "duration": 0.050863, 560 | "end_time": "2020-10-05T09:03:41.604153", 561 | "exception": false, 562 | "start_time": "2020-10-05T09:03:41.553290", 563 | "status": "completed" 564 | }, 565 | "tags": [] 566 | }, 567 | "outputs": [], 568 | "source": [ 569 | "def score(fvc_true, fvc_pred, sigma):\n", 570 | " sigma_clip = np.maximum(sigma, 70) # changed from 70, trie 66.7 too\n", 571 | " delta = np.abs(fvc_true - fvc_pred)\n", 572 | " delta = np.minimum(delta, 1000)\n", 573 | " sq2 = np.sqrt(2)\n", 574 | " metric = (delta / sigma_clip)*sq2 + np.log(sigma_clip* sq2)\n", 575 | " return np.mean(metric)" 576 | ] 577 | }, 578 | { 579 | "cell_type": "code", 580 | "execution_count": 13, 581 | "metadata": { 582 | "execution": { 583 | "iopub.execute_input": "2020-10-05T09:03:41.707784Z", 584 | "iopub.status.busy": "2020-10-05T09:03:41.707072Z", 585 | "iopub.status.idle": "2020-10-05T09:04:25.837165Z", 586 | "shell.execute_reply": "2020-10-05T09:04:25.836620Z" 587 | }, 588 | "papermill": { 589 | "duration": 44.190109, 590 | "end_time": "2020-10-05T09:04:25.837283", 591 | "exception": false, 592 | "start_time": "2020-10-05T09:03:41.647174", 593 | "status": "completed" 594 | }, 595 | "tags": [] 596 | }, 597 | "outputs": [], 598 | "source": [ 599 | "subs = []\n", 600 | "for model in models:\n", 601 | "\n", 602 | " q = 0.5\n", 603 | "\n", 604 | " sub = 
pd.read_csv('../input/osic-pulmonary-fibrosis-progression/sample_submission.csv') \n", 605 | " test = pd.read_csv('../input/osic-pulmonary-fibrosis-progression/test.csv') \n", 606 | " A_test, B_test, P_test,W, FVC= {}, {}, {},{},{} \n", 607 | " STD, WEEK = {}, {} \n", 608 | " for p in test.Patient.unique():\n", 609 | " x = [] \n", 610 | " tab = [] \n", 611 | " ldir = os.listdir(f'../input/osic-pulmonary-fibrosis-progression/test/{p}/')\n", 612 | " for i in ldir:\n", 613 | " if int(i[:-4]) / len(ldir) < 1.1 and int(i[:-4]) / len(ldir) > -0.1:\n", 614 | " x.append(get_img(f'../input/osic-pulmonary-fibrosis-progression/test/{p}/{i}')) \n", 615 | " tab.append(get_tab(test.loc[test.Patient == p, :])) \n", 616 | " if len(x) <= 1:\n", 617 | " continue\n", 618 | " tab = np.array(tab) \n", 619 | "\n", 620 | " x = np.expand_dims(x, axis=-1) \n", 621 | " _a = model.predict([x, tab]) \n", 622 | " a = np.quantile(_a, q)\n", 623 | " A_test[p] = a\n", 624 | " B_test[p] = test.FVC.values[test.Patient == p] - a*test.Weeks.values[test.Patient == p]\n", 625 | " P_test[p] = test.Percent.values[test.Patient == p] \n", 626 | " WEEK[p] = test.Weeks.values[test.Patient == p]\n", 627 | "\n", 628 | " for k in sub.Patient_Week.values:\n", 629 | " p, w = k.split('_')\n", 630 | " w = int(w) \n", 631 | "\n", 632 | " fvc = A_test[p] * w + B_test[p]\n", 633 | " sub.loc[sub.Patient_Week == k, 'FVC'] = fvc\n", 634 | " sub.loc[sub.Patient_Week == k, 'Confidence'] = (\n", 635 | " P_test[p] - A_test[p] * abs(WEEK[p] - w) \n", 636 | " ) \n", 637 | "\n", 638 | " _sub = sub[[\"Patient_Week\",\"FVC\",\"Confidence\"]].copy()\n", 639 | " subs.append(_sub)" 640 | ] 641 | }, 642 | { 643 | "cell_type": "markdown", 644 | "metadata": { 645 | "papermill": { 646 | "duration": 0.041647, 647 | "end_time": "2020-10-05T09:04:25.921062", 648 | "exception": false, 649 | "start_time": "2020-10-05T09:04:25.879415", 650 | "status": "completed" 651 | }, 652 | "tags": [] 653 | }, 654 | "source": [ 655 | "## Averaging 
Predictions" 656 | ] 657 | }, 658 | { 659 | "cell_type": "code", 660 | "execution_count": 14, 661 | "metadata": { 662 | "execution": { 663 | "iopub.execute_input": "2020-10-05T09:04:26.012723Z", 664 | "iopub.status.busy": "2020-10-05T09:04:26.011917Z", 665 | "iopub.status.idle": "2020-10-05T09:04:26.032006Z", 666 | "shell.execute_reply": "2020-10-05T09:04:26.031484Z" 667 | }, 668 | "papermill": { 669 | "duration": 0.069391, 670 | "end_time": "2020-10-05T09:04:26.032114", 671 | "exception": false, 672 | "start_time": "2020-10-05T09:04:25.962723", 673 | "status": "completed" 674 | }, 675 | "tags": [] 676 | }, 677 | "outputs": [], 678 | "source": [ 679 | "N = len(subs)\n", 680 | "sub = subs[0].copy() # ref\n", 681 | "sub[\"FVC\"] = 0\n", 682 | "sub[\"Confidence\"] = 0\n", 683 | "for i in range(N):\n", 684 | " sub[\"FVC\"] += subs[i][\"FVC\"] * (1/N)\n", 685 | " sub[\"Confidence\"] += subs[i][\"Confidence\"] * (1/N)" 686 | ] 687 | }, 688 | { 689 | "cell_type": "code", 690 | "execution_count": 15, 691 | "metadata": { 692 | "execution": { 693 | "iopub.execute_input": "2020-10-05T09:04:26.123341Z", 694 | "iopub.status.busy": "2020-10-05T09:04:26.122702Z", 695 | "iopub.status.idle": "2020-10-05T09:04:26.132073Z", 696 | "shell.execute_reply": "2020-10-05T09:04:26.131603Z" 697 | }, 698 | "papermill": { 699 | "duration": 0.058087, 700 | "end_time": "2020-10-05T09:04:26.132165", 701 | "exception": false, 702 | "start_time": "2020-10-05T09:04:26.074078", 703 | "status": "completed" 704 | }, 705 | "tags": [] 706 | }, 707 | "outputs": [ 708 | { 709 | "data": { 710 | "text/html": [ 711 | "
" 768 | ], 769 | "text/plain": [ 770 | " Patient_Week FVC Confidence\n", 771 | "0 ID00419637202311204720264_-12 3089.912209 140.099064\n", 772 | "1 ID00421637202311550012437_-12 2843.369867 186.415158\n", 773 | "2 ID00422637202311677017371_-12 1999.601912 146.274405\n", 774 | "3 ID00423637202312137826377_-12 3405.996907 191.255810\n", 775 | "4 ID00426637202313170790466_-12 2967.838420 114.663388" 776 | ] 777 | }, 778 | "execution_count": 15, 779 | "metadata": {}, 780 | "output_type": "execute_result" 781 | } 782 | ], 783 | "source": [ 784 | "sub.head()" 785 | ] 786 | }, 787 | { 788 | "cell_type": "code", 789 | "execution_count": 16, 790 | "metadata": { 791 | "execution": { 792 | "iopub.execute_input": "2020-10-05T09:04:26.225603Z", 793 | "iopub.status.busy": "2020-10-05T09:04:26.224982Z", 794 | "iopub.status.idle": "2020-10-05T09:04:26.318096Z", 795 | "shell.execute_reply": "2020-10-05T09:04:26.317144Z" 796 | }, 797 | "papermill": { 798 | "duration": 0.141316, 799 | "end_time": "2020-10-05T09:04:26.318208", 800 | "exception": false, 801 | "start_time": "2020-10-05T09:04:26.176892", 802 | "status": "completed" 803 | }, 804 | "tags": [] 805 | }, 806 | "outputs": [], 807 | "source": [ 808 | "sub[[\"Patient_Week\",\"FVC\",\"Confidence\"]].to_csv(\"submission_img.csv\", index=False)" 809 | ] 810 | }, 811 | { 812 | "cell_type": "code", 813 | "execution_count": 17, 814 | "metadata": { 815 | "execution": { 816 | "iopub.execute_input": "2020-10-05T09:04:26.410304Z", 817 | "iopub.status.busy": "2020-10-05T09:04:26.409511Z", 818 | "iopub.status.idle": "2020-10-05T09:04:26.411851Z", 819 | "shell.execute_reply": "2020-10-05T09:04:26.412413Z" 820 | }, 821 | "papermill": { 822 | "duration": 0.05105, 823 | "end_time": "2020-10-05T09:04:26.412526", 824 | "exception": false, 825 | "start_time": "2020-10-05T09:04:26.361476", 826 | "status": "completed" 827 | }, 828 | "tags": [] 829 | }, 830 | "outputs": [], 831 | "source": [ 832 | "img_sub = 
sub[[\"Patient_Week\",\"FVC\",\"Confidence\"]].copy()" 833 | ] 834 | }, 835 | { 836 | "cell_type": "markdown", 837 | "metadata": { 838 | "papermill": { 839 | "duration": 0.042567, 840 | "end_time": "2020-10-05T09:04:26.497975", 841 | "exception": false, 842 | "start_time": "2020-10-05T09:04:26.455408", 843 | "status": "completed" 844 | }, 845 | "tags": [] 846 | }, 847 | "source": [ 848 | "# Osic-Multiple-Quantile-Regression" 849 | ] 850 | }, 851 | { 852 | "cell_type": "code", 853 | "execution_count": 18, 854 | "metadata": { 855 | "execution": { 856 | "iopub.execute_input": "2020-10-05T09:04:26.594894Z", 857 | "iopub.status.busy": "2020-10-05T09:04:26.594085Z", 858 | "iopub.status.idle": "2020-10-05T09:04:26.630770Z", 859 | "shell.execute_reply": "2020-10-05T09:04:26.631250Z" 860 | }, 861 | "papermill": { 862 | "duration": 0.090334, 863 | "end_time": "2020-10-05T09:04:26.631360", 864 | "exception": false, 865 | "start_time": "2020-10-05T09:04:26.541026", 866 | "status": "completed" 867 | }, 868 | "tags": [] 869 | }, 870 | "outputs": [ 871 | { 872 | "name": "stdout", 873 | "output_type": "stream", 874 | "text": [ 875 | "add infos\n" 876 | ] 877 | } 878 | ], 879 | "source": [ 880 | "ROOT = \"../input/osic-pulmonary-fibrosis-progression\"\n", 881 | "BATCH_SIZE=128\n", 882 | "\n", 883 | "tr = pd.read_csv(f\"{ROOT}/train.csv\")\n", 884 | "tr.drop_duplicates(keep=False, inplace=True, subset=['Patient','Weeks'])\n", 885 | "chunk = pd.read_csv(f\"{ROOT}/test.csv\")\n", 886 | "\n", 887 | "print(\"add infos\")\n", 888 | "sub = pd.read_csv(f\"{ROOT}/sample_submission.csv\")\n", 889 | "sub['Patient'] = sub['Patient_Week'].apply(lambda x:x.split('_')[0])\n", 890 | "sub['Weeks'] = sub['Patient_Week'].apply(lambda x: int(x.split('_')[-1]))\n", 891 | "sub = sub[['Patient','Weeks','Confidence','Patient_Week']]\n", 892 | "sub = sub.merge(chunk.drop('Weeks', axis=1), on=\"Patient\")" 893 | ] 894 | }, 895 | { 896 | "cell_type": "code", 897 | "execution_count": 19, 898 | "metadata": { 
899 | "execution": { 900 | "iopub.execute_input": "2020-10-05T09:04:26.732054Z", 901 | "iopub.status.busy": "2020-10-05T09:04:26.731249Z", 902 | "iopub.status.idle": "2020-10-05T09:04:26.734099Z", 903 | "shell.execute_reply": "2020-10-05T09:04:26.734589Z" 904 | }, 905 | "papermill": { 906 | "duration": 0.059864, 907 | "end_time": "2020-10-05T09:04:26.734703", 908 | "exception": false, 909 | "start_time": "2020-10-05T09:04:26.674839", 910 | "status": "completed" 911 | }, 912 | "tags": [] 913 | }, 914 | "outputs": [], 915 | "source": [ 916 | "tr['WHERE'] = 'train'\n", 917 | "chunk['WHERE'] = 'val'\n", 918 | "sub['WHERE'] = 'test'\n", 919 | "data = pd.concat([tr, chunk, sub]) # DataFrame.append was removed in pandas 2.0" 920 | ] 921 | }, 922 | { 923 | "cell_type": "code", 924 | "execution_count": 20, 925 | "metadata": { 926 | "execution": { 927 | "iopub.execute_input": "2020-10-05T09:04:26.828284Z", 928 | "iopub.status.busy": "2020-10-05T09:04:26.827430Z", 929 | "iopub.status.idle": "2020-10-05T09:04:26.833477Z", 930 | "shell.execute_reply": "2020-10-05T09:04:26.833923Z" 931 | }, 932 | "papermill": { 933 | "duration": 0.055205, 934 | "end_time": "2020-10-05T09:04:26.834037", 935 | "exception": false, 936 | "start_time": "2020-10-05T09:04:26.778832", 937 | "status": "completed" 938 | }, 939 | "tags": [] 940 | }, 941 | "outputs": [ 942 | { 943 | "name": "stdout", 944 | "output_type": "stream", 945 | "text": [ 946 | "(1535, 8) (5, 8) (730, 10) (2270, 10)\n", 947 | "176 5 5 176\n" 948 | ] 949 | } 950 | ], 951 | "source": [ 952 | "print(tr.shape, chunk.shape, sub.shape, data.shape)\n", 953 | "print(tr.Patient.nunique(), chunk.Patient.nunique(), sub.Patient.nunique(), \n", 954 | " data.Patient.nunique())\n", 955 | "#" 956 | ] 957 | }, 958 | { 959 | "cell_type": "code", 960 | "execution_count": 21, 961 | "metadata": { 962 | "execution": { 963 | "iopub.execute_input": "2020-10-05T09:04:26.929964Z", 964 | "iopub.status.busy": "2020-10-05T09:04:26.929130Z", 965 | "iopub.status.idle": "2020-10-05T09:04:26.940627Z", 966 | 
"shell.execute_reply": "2020-10-05T09:04:26.940119Z" 967 | }, 968 | "papermill": { 969 | "duration": 0.062825, 970 | "end_time": "2020-10-05T09:04:26.940721", 971 | "exception": false, 972 | "start_time": "2020-10-05T09:04:26.877896", 973 | "status": "completed" 974 | }, 975 | "tags": [] 976 | }, 977 | "outputs": [], 978 | "source": [ 979 | "data['min_week'] = data['Weeks']\n", 980 | "data.loc[data.WHERE=='test','min_week'] = np.nan\n", 981 | "data['min_week'] = data.groupby('Patient')['min_week'].transform('min')" 982 | ] 983 | }, 984 | { 985 | "cell_type": "code", 986 | "execution_count": 22, 987 | "metadata": { 988 | "execution": { 989 | "iopub.execute_input": "2020-10-05T09:04:27.044644Z", 990 | "iopub.status.busy": "2020-10-05T09:04:27.043876Z", 991 | "iopub.status.idle": "2020-10-05T09:04:27.046933Z", 992 | "shell.execute_reply": "2020-10-05T09:04:27.046416Z" 993 | }, 994 | "papermill": { 995 | "duration": 0.062012, 996 | "end_time": "2020-10-05T09:04:27.047028", 997 | "exception": false, 998 | "start_time": "2020-10-05T09:04:26.985016", 999 | "status": "completed" 1000 | }, 1001 | "tags": [] 1002 | }, 1003 | "outputs": [], 1004 | "source": [ 1005 | "base = data.loc[data.Weeks == data.min_week]\n", 1006 | "base = base[['Patient','FVC']].copy()\n", 1007 | "base.columns = ['Patient','min_FVC']\n", 1008 | "base['nb'] = 1\n", 1009 | "base['nb'] = base.groupby('Patient')['nb'].transform('cumsum')\n", 1010 | "base = base[base.nb==1]\n", 1011 | "base.drop('nb', axis=1, inplace=True)" 1012 | ] 1013 | }, 1014 | { 1015 | "cell_type": "code", 1016 | "execution_count": 23, 1017 | "metadata": { 1018 | "execution": { 1019 | "iopub.execute_input": "2020-10-05T09:04:27.143987Z", 1020 | "iopub.status.busy": "2020-10-05T09:04:27.143130Z", 1021 | "iopub.status.idle": "2020-10-05T09:04:27.152004Z", 1022 | "shell.execute_reply": "2020-10-05T09:04:27.151508Z" 1023 | }, 1024 | "papermill": { 1025 | "duration": 0.060986, 1026 | "end_time": "2020-10-05T09:04:27.152098", 1027 | 
"exception": false, 1028 | "start_time": "2020-10-05T09:04:27.091112", 1029 | "status": "completed" 1030 | }, 1031 | "tags": [] 1032 | }, 1033 | "outputs": [], 1034 | "source": [ 1035 | "data = data.merge(base, on='Patient', how='left')\n", 1036 | "data['base_week'] = data['Weeks'] - data['min_week']\n", 1037 | "del base" 1038 | ] 1039 | }, 1040 | { 1041 | "cell_type": "code", 1042 | "execution_count": 24, 1043 | "metadata": { 1044 | "execution": { 1045 | "iopub.execute_input": "2020-10-05T09:04:27.250246Z", 1046 | "iopub.status.busy": "2020-10-05T09:04:27.249367Z", 1047 | "iopub.status.idle": "2020-10-05T09:04:27.256648Z", 1048 | "shell.execute_reply": "2020-10-05T09:04:27.256166Z" 1049 | }, 1050 | "papermill": { 1051 | "duration": 0.059698, 1052 | "end_time": "2020-10-05T09:04:27.256743", 1053 | "exception": false, 1054 | "start_time": "2020-10-05T09:04:27.197045", 1055 | "status": "completed" 1056 | }, 1057 | "tags": [] 1058 | }, 1059 | "outputs": [], 1060 | "source": [ 1061 | "COLS = ['Sex','SmokingStatus'] #,'Age'\n", 1062 | "FE = []\n", 1063 | "for col in COLS:\n", 1064 | " for mod in data[col].unique():\n", 1065 | " FE.append(mod)\n", 1066 | " data[mod] = (data[col] == mod).astype(int)" 1067 | ] 1068 | }, 1069 | { 1070 | "cell_type": "code", 1071 | "execution_count": 25, 1072 | "metadata": { 1073 | "execution": { 1074 | "iopub.execute_input": "2020-10-05T09:04:27.358079Z", 1075 | "iopub.status.busy": "2020-10-05T09:04:27.357262Z", 1076 | "iopub.status.idle": "2020-10-05T09:04:27.360070Z", 1077 | "shell.execute_reply": "2020-10-05T09:04:27.359560Z" 1078 | }, 1079 | "papermill": { 1080 | "duration": 0.058536, 1081 | "end_time": "2020-10-05T09:04:27.360168", 1082 | "exception": false, 1083 | "start_time": "2020-10-05T09:04:27.301632", 1084 | "status": "completed" 1085 | }, 1086 | "tags": [] 1087 | }, 1088 | "outputs": [], 1089 | "source": [ 1090 | "#\n", 1091 | "data['age'] = (data['Age'] - data['Age'].min() ) / ( data['Age'].max() - data['Age'].min() )\n", 
1092 | "data['BASE'] = (data['min_FVC'] - data['min_FVC'].min() ) / ( data['min_FVC'].max() - data['min_FVC'].min() )\n", 1093 | "data['week'] = (data['base_week'] - data['base_week'].min() ) / ( data['base_week'].max() - data['base_week'].min() )\n", 1094 | "#data['percent'] = (data['Percent'] - data['Percent'].min() ) / ( data['Percent'].max() - data['Percent'].min() )\n", 1095 | "FE += ['age','week','BASE']" 1096 | ] 1097 | }, 1098 | { 1099 | "cell_type": "code", 1100 | "execution_count": 26, 1101 | "metadata": { 1102 | "execution": { 1103 | "iopub.execute_input": "2020-10-05T09:04:27.457101Z", 1104 | "iopub.status.busy": "2020-10-05T09:04:27.456211Z", 1105 | "iopub.status.idle": "2020-10-05T09:04:27.463453Z", 1106 | "shell.execute_reply": "2020-10-05T09:04:27.462971Z" 1107 | }, 1108 | "papermill": { 1109 | "duration": 0.058703, 1110 | "end_time": "2020-10-05T09:04:27.463553", 1111 | "exception": false, 1112 | "start_time": "2020-10-05T09:04:27.404850", 1113 | "status": "completed" 1114 | }, 1115 | "tags": [] 1116 | }, 1117 | "outputs": [], 1118 | "source": [ 1119 | "tr = data.loc[data.WHERE=='train']\n", 1120 | "chunk = data.loc[data.WHERE=='val']\n", 1121 | "sub = data.loc[data.WHERE=='test']\n", 1122 | "del data" 1123 | ] 1124 | }, 1125 | { 1126 | "cell_type": "code", 1127 | "execution_count": 27, 1128 | "metadata": { 1129 | "execution": { 1130 | "iopub.execute_input": "2020-10-05T09:04:27.558320Z", 1131 | "iopub.status.busy": "2020-10-05T09:04:27.557490Z", 1132 | "iopub.status.idle": "2020-10-05T09:04:27.561353Z", 1133 | "shell.execute_reply": "2020-10-05T09:04:27.560819Z" 1134 | }, 1135 | "papermill": { 1136 | "duration": 0.053164, 1137 | "end_time": "2020-10-05T09:04:27.561454", 1138 | "exception": false, 1139 | "start_time": "2020-10-05T09:04:27.508290", 1140 | "status": "completed" 1141 | }, 1142 | "tags": [] 1143 | }, 1144 | "outputs": [ 1145 | { 1146 | "data": { 1147 | "text/plain": [ 1148 | "((1535, 21), (5, 21), (730, 21))" 1149 | ] 1150 | }, 1151 | 
"execution_count": 27, 1152 | "metadata": {}, 1153 | "output_type": "execute_result" 1154 | } 1155 | ], 1156 | "source": [ 1157 | "tr.shape, chunk.shape, sub.shape" 1158 | ] 1159 | }, 1160 | { 1161 | "cell_type": "code", 1162 | "execution_count": 28, 1163 | "metadata": { 1164 | "execution": { 1165 | "iopub.execute_input": "2020-10-05T09:04:27.672021Z", 1166 | "iopub.status.busy": "2020-10-05T09:04:27.664694Z", 1167 | "iopub.status.idle": "2020-10-05T09:04:27.674846Z", 1168 | "shell.execute_reply": "2020-10-05T09:04:27.674335Z" 1169 | }, 1170 | "papermill": { 1171 | "duration": 0.06829, 1172 | "end_time": "2020-10-05T09:04:27.674937", 1173 | "exception": false, 1174 | "start_time": "2020-10-05T09:04:27.606647", 1175 | "status": "completed" 1176 | }, 1177 | "tags": [] 1178 | }, 1179 | "outputs": [], 1180 | "source": [ 1181 | "C1, C2 = tf.constant(70, dtype='float32'), tf.constant(1000, dtype=\"float32\")\n", 1182 | "\n", 1183 | "def score(y_true, y_pred):\n", 1184 | " tf.dtypes.cast(y_true, tf.float32)\n", 1185 | " tf.dtypes.cast(y_pred, tf.float32)\n", 1186 | " sigma = y_pred[:, 2] - y_pred[:, 0]\n", 1187 | " fvc_pred = y_pred[:, 1]\n", 1188 | " \n", 1189 | " #sigma_clip = sigma + C1\n", 1190 | " sigma_clip = tf.maximum(sigma, C1)\n", 1191 | " delta = tf.abs(y_true[:, 0] - fvc_pred)\n", 1192 | " delta = tf.minimum(delta, C2)\n", 1193 | " sq2 = tf.sqrt( tf.dtypes.cast(2, dtype=tf.float32) )\n", 1194 | " metric = (delta / sigma_clip)*sq2 + tf.math.log(sigma_clip* sq2)\n", 1195 | " return K.mean(metric)\n", 1196 | "\n", 1197 | "def qloss(y_true, y_pred):\n", 1198 | " # Pinball loss for multiple quantiles\n", 1199 | " qs = [0.2, 0.50, 0.8]\n", 1200 | " q = tf.constant(np.array([qs]), dtype=tf.float32)\n", 1201 | " e = y_true - y_pred\n", 1202 | " v = tf.maximum(q*e, (q-1)*e)\n", 1203 | " return K.mean(v)\n", 1204 | "\n", 1205 | "def mloss(_lambda):\n", 1206 | " def loss(y_true, y_pred):\n", 1207 | " return _lambda * qloss(y_true, y_pred) + (1 - _lambda)*score(y_true, 
y_pred)\n", 1208 | " return loss\n", 1209 | "\n", 1210 | "def make_model(nh):\n", 1211 | " z = L.Input((nh,), name=\"Patient\")\n", 1212 | " x = L.Dense(100, activation=\"relu\", name=\"d1\")(z)\n", 1213 | " x = L.Dense(100, activation=\"relu\", name=\"d2\")(x)\n", 1214 | " #x = L.Dense(100, activation=\"relu\", name=\"d3\")(x)\n", 1215 | " p1 = L.Dense(3, activation=\"linear\", name=\"p1\")(x)\n", 1216 | " p2 = L.Dense(3, activation=\"relu\", name=\"p2\")(x)\n", 1217 | " preds = L.Lambda(lambda x: x[0] + tf.cumsum(x[1], axis=1), \n", 1218 | " name=\"preds\")([p1, p2])\n", 1219 | " \n", 1220 | " model = M.Model(z, preds, name=\"CNN\")\n", 1221 | " #model.compile(loss=qloss, optimizer=\"adam\", metrics=[score])\n", 1222 | " model.compile(loss=mloss(0.8), optimizer=tf.keras.optimizers.Adam(lr=0.1, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.01, amsgrad=False), metrics=[score])\n", 1223 | " return model" 1224 | ] 1225 | }, 1226 | { 1227 | "cell_type": "code", 1228 | "execution_count": 29, 1229 | "metadata": { 1230 | "execution": { 1231 | "iopub.execute_input": "2020-10-05T09:04:27.777357Z", 1232 | "iopub.status.busy": "2020-10-05T09:04:27.776419Z", 1233 | "iopub.status.idle": "2020-10-05T09:04:27.778805Z", 1234 | "shell.execute_reply": "2020-10-05T09:04:27.779292Z" 1235 | }, 1236 | "papermill": { 1237 | "duration": 0.058762, 1238 | "end_time": "2020-10-05T09:04:27.779417", 1239 | "exception": false, 1240 | "start_time": "2020-10-05T09:04:27.720655", 1241 | "status": "completed" 1242 | }, 1243 | "tags": [] 1244 | }, 1245 | "outputs": [], 1246 | "source": [ 1247 | "y = tr['FVC'].values\n", 1248 | "z = tr[FE].values\n", 1249 | "ze = sub[FE].values\n", 1250 | "nh = z.shape[1]\n", 1251 | "pe = np.zeros((ze.shape[0], 3))\n", 1252 | "pred = np.zeros((z.shape[0], 3))" 1253 | ] 1254 | }, 1255 | { 1256 | "cell_type": "code", 1257 | "execution_count": 30, 1258 | "metadata": { 1259 | "execution": { 1260 | "iopub.execute_input": "2020-10-05T09:04:27.877350Z", 1261 | 
"iopub.status.busy": "2020-10-05T09:04:27.876491Z", 1262 | "iopub.status.idle": "2020-10-05T09:04:27.956928Z", 1263 | "shell.execute_reply": "2020-10-05T09:04:27.956106Z" 1264 | }, 1265 | "papermill": { 1266 | "duration": 0.131912, 1267 | "end_time": "2020-10-05T09:04:27.957045", 1268 | "exception": false, 1269 | "start_time": "2020-10-05T09:04:27.825133", 1270 | "status": "completed" 1271 | }, 1272 | "tags": [] 1273 | }, 1274 | "outputs": [ 1275 | { 1276 | "name": "stdout", 1277 | "output_type": "stream", 1278 | "text": [ 1279 | "Model: \"CNN\"\n", 1280 | "__________________________________________________________________________________________________\n", 1281 | "Layer (type) Output Shape Param # Connected to \n", 1282 | "==================================================================================================\n", 1283 | "Patient (InputLayer) [(None, 8)] 0 \n", 1284 | "__________________________________________________________________________________________________\n", 1285 | "d1 (Dense) (None, 100) 900 Patient[0][0] \n", 1286 | "__________________________________________________________________________________________________\n", 1287 | "d2 (Dense) (None, 100) 10100 d1[0][0] \n", 1288 | "__________________________________________________________________________________________________\n", 1289 | "p1 (Dense) (None, 3) 303 d2[0][0] \n", 1290 | "__________________________________________________________________________________________________\n", 1291 | "p2 (Dense) (None, 3) 303 d2[0][0] \n", 1292 | "__________________________________________________________________________________________________\n", 1293 | "preds (Lambda) (None, 3) 0 p1[0][0] \n", 1294 | " p2[0][0] \n", 1295 | "==================================================================================================\n", 1296 | "Total params: 11,606\n", 1297 | "Trainable params: 11,606\n", 1298 | "Non-trainable params: 0\n", 1299 | 
"__________________________________________________________________________________________________\n", 1300 | "None\n", 1301 | "11606\n" 1302 | ] 1303 | } 1304 | ], 1305 | "source": [ 1306 | "net = make_model(nh)\n", 1307 | "print(net.summary())\n", 1308 | "print(net.count_params())" 1309 | ] 1310 | }, 1311 | { 1312 | "cell_type": "code", 1313 | "execution_count": 31, 1314 | "metadata": { 1315 | "execution": { 1316 | "iopub.execute_input": "2020-10-05T09:04:28.055428Z", 1317 | "iopub.status.busy": "2020-10-05T09:04:28.054683Z", 1318 | "iopub.status.idle": "2020-10-05T09:04:28.057776Z", 1319 | "shell.execute_reply": "2020-10-05T09:04:28.057183Z" 1320 | }, 1321 | "papermill": { 1322 | "duration": 0.054006, 1323 | "end_time": "2020-10-05T09:04:28.057873", 1324 | "exception": false, 1325 | "start_time": "2020-10-05T09:04:28.003867", 1326 | "status": "completed" 1327 | }, 1328 | "tags": [] 1329 | }, 1330 | "outputs": [], 1331 | "source": [ 1332 | "NFOLD = 2 # originally 5\n", 1333 | "kf = KFold(n_splits=NFOLD)" 1334 | ] 1335 | }, 1336 | { 1337 | "cell_type": "code", 1338 | "execution_count": 32, 1339 | "metadata": { 1340 | "execution": { 1341 | "iopub.execute_input": "2020-10-05T09:04:28.160715Z", 1342 | "iopub.status.busy": "2020-10-05T09:04:28.159798Z", 1343 | "iopub.status.idle": "2020-10-05T09:05:25.385796Z", 1344 | "shell.execute_reply": "2020-10-05T09:05:25.386648Z" 1345 | }, 1346 | "papermill": { 1347 | "duration": 57.282521, 1348 | "end_time": "2020-10-05T09:05:25.386855", 1349 | "exception": false, 1350 | "start_time": "2020-10-05T09:04:28.104334", 1351 | "status": "completed" 1352 | }, 1353 | "tags": [] 1354 | }, 1355 | "outputs": [ 1356 | { 1357 | "name": "stdout", 1358 | "output_type": "stream", 1359 | "text": [ 1360 | "FOLD 1\n", 1361 | "train [45.55668258666992, 6.585787296295166]\n", 1362 | "val [57.895904541015625, 6.83064603805542]\n", 1363 | "predict val...\n", 1364 | "predict test...\n", 1365 | "FOLD 2\n", 1366 | "train [49.8366813659668, 
6.653332233428955]\n", 1367 | "val [50.33049392700195, 6.696572780609131]\n", 1368 | "predict val...\n", 1369 | "predict test...\n", 1370 | "CPU times: user 1min 1s, sys: 3.56 s, total: 1min 5s\n", 1371 | "Wall time: 57.2 s\n" 1372 | ] 1373 | } 1374 | ], 1375 | "source": [ 1376 | "%%time\n", 1377 | "cnt = 0\n", 1378 | "EPOCHS = 600\n", 1379 | "for tr_idx, val_idx in kf.split(z):\n", 1380 | " cnt += 1\n", 1381 | " print(f\"FOLD {cnt}\")\n", 1382 | " net = make_model(nh)\n", 1383 | " net.fit(z[tr_idx], y[tr_idx], batch_size=BATCH_SIZE, epochs=EPOCHS, \n", 1384 | " validation_data=(z[val_idx], y[val_idx]), verbose=0) #\n", 1385 | " print(\"train\", net.evaluate(z[tr_idx], y[tr_idx], verbose=0, batch_size=BATCH_SIZE))\n", 1386 | " print(\"val\", net.evaluate(z[val_idx], y[val_idx], verbose=0, batch_size=BATCH_SIZE))\n", 1387 | " print(\"predict val...\")\n", 1388 | " pred[val_idx] = net.predict(z[val_idx], batch_size=BATCH_SIZE, verbose=0)\n", 1389 | " print(\"predict test...\")\n", 1390 | " pe += net.predict(ze, batch_size=BATCH_SIZE, verbose=0) / NFOLD" 1391 | ] 1392 | }, 1393 | { 1394 | "cell_type": "code", 1395 | "execution_count": 33, 1396 | "metadata": { 1397 | "execution": { 1398 | "iopub.execute_input": "2020-10-05T09:05:25.490113Z", 1399 | "iopub.status.busy": "2020-10-05T09:05:25.489309Z", 1400 | "iopub.status.idle": "2020-10-05T09:05:25.492946Z", 1401 | "shell.execute_reply": "2020-10-05T09:05:25.493432Z" 1402 | }, 1403 | "papermill": { 1404 | "duration": 0.057826, 1405 | "end_time": "2020-10-05T09:05:25.493544", 1406 | "exception": false, 1407 | "start_time": "2020-10-05T09:05:25.435718", 1408 | "status": "completed" 1409 | }, 1410 | "tags": [] 1411 | }, 1412 | "outputs": [ 1413 | { 1414 | "name": "stdout", 1415 | "output_type": "stream", 1416 | "text": [ 1417 | "156.29871337095378 260.86731060413274\n" 1418 | ] 1419 | } 1420 | ], 1421 | "source": [ 1422 | "sigma_opt = mean_absolute_error(y, pred[:, 1])\n", 1423 | "unc = pred[:,2] - pred[:, 0]\n", 1424 | 
"sigma_mean = np.mean(unc)\n", 1425 | "print(sigma_opt, sigma_mean)" 1426 | ] 1427 | }, 1428 | { 1429 | "cell_type": "code", 1430 | "execution_count": 34, 1431 | "metadata": { 1432 | "execution": { 1433 | "iopub.execute_input": "2020-10-05T09:05:25.605603Z", 1434 | "iopub.status.busy": "2020-10-05T09:05:25.599658Z", 1435 | "iopub.status.idle": "2020-10-05T09:05:25.789798Z", 1436 | "shell.execute_reply": "2020-10-05T09:05:25.789284Z" 1437 | }, 1438 | "papermill": { 1439 | "duration": 0.24657, 1440 | "end_time": "2020-10-05T09:05:25.789903", 1441 | "exception": false, 1442 | "start_time": "2020-10-05T09:05:25.543333", 1443 | "status": "completed" 1444 | }, 1445 | "tags": [] 1446 | }, 1447 | "outputs": [ 1448 | { 1449 | "data": { 1450 | "image/png": "\n", 1451 | "text/plain": [ 1452 | "
" 1453 | ] 1454 | }, 1455 | "metadata": { 1456 | "needs_background": "light" 1457 | }, 1458 | "output_type": "display_data" 1459 | } 1460 | ], 1461 | "source": [ 1462 | "idxs = np.random.randint(0, y.shape[0], 100)\n", 1463 | "plt.plot(y[idxs], label=\"ground truth\")\n", 1464 | "plt.plot(pred[idxs, 0], label=\"q25\")\n", 1465 | "plt.plot(pred[idxs, 1], label=\"q50\")\n", 1466 | "plt.plot(pred[idxs, 2], label=\"q75\")\n", 1467 | "plt.legend(loc=\"best\")\n", 1468 | "plt.show()" 1469 | ] 1470 | }, 1471 | { 1472 | "cell_type": "code", 1473 | "execution_count": 35, 1474 | "metadata": { 1475 | "execution": { 1476 | "iopub.execute_input": "2020-10-05T09:05:25.897149Z", 1477 | "iopub.status.busy": "2020-10-05T09:05:25.895304Z", 1478 | "iopub.status.idle": "2020-10-05T09:05:25.900144Z", 1479 | "shell.execute_reply": "2020-10-05T09:05:25.899440Z" 1480 | }, 1481 | "papermill": { 1482 | "duration": 0.060561, 1483 | "end_time": "2020-10-05T09:05:25.900266", 1484 | "exception": false, 1485 | "start_time": "2020-10-05T09:05:25.839705", 1486 | "status": "completed" 1487 | }, 1488 | "tags": [] 1489 | }, 1490 | "outputs": [ 1491 | { 1492 | "name": "stdout", 1493 | "output_type": "stream", 1494 | "text": [ 1495 | "77.932861328125 260.86731060413274 453.388671875 1.0\n" 1496 | ] 1497 | } 1498 | ], 1499 | "source": [ 1500 | "print(unc.min(), unc.mean(), unc.max(), (unc>=0).mean())" 1501 | ] 1502 | }, 1503 | { 1504 | "cell_type": "code", 1505 | "execution_count": 36, 1506 | "metadata": { 1507 | "execution": { 1508 | "iopub.execute_input": "2020-10-05T09:05:26.014854Z", 1509 | "iopub.status.busy": "2020-10-05T09:05:26.012723Z", 1510 | "iopub.status.idle": "2020-10-05T09:05:26.177835Z", 1511 | "shell.execute_reply": "2020-10-05T09:05:26.177336Z" 1512 | }, 1513 | "papermill": { 1514 | "duration": 0.225937, 1515 | "end_time": "2020-10-05T09:05:26.177936", 1516 | "exception": false, 1517 | "start_time": "2020-10-05T09:05:25.951999", 1518 | "status": "completed" 1519 | }, 1520 | "tags": [] 
1521 | }, 1522 | "outputs": [ 1523 | { 1524 | "data": { 1525 | "image/png": "\n", 1526 | "text/plain": [ 1527 | "
" 1528 | ] 1529 | }, 1530 | "metadata": { 1531 | "needs_background": "light" 1532 | }, 1533 | "output_type": "display_data" 1534 | } 1535 | ], 1536 | "source": [ 1537 | "plt.hist(unc)\n", 1538 | "plt.title(\"uncertainty in prediction\")\n", 1539 | "plt.show()" 1540 | ] 1541 | }, 1542 | { 1543 | "cell_type": "code", 1544 | "execution_count": 37, 1545 | "metadata": { 1546 | "execution": { 1547 | "iopub.execute_input": "2020-10-05T09:05:26.303619Z", 1548 | "iopub.status.busy": "2020-10-05T09:05:26.291722Z", 1549 | "iopub.status.idle": "2020-10-05T09:05:26.316703Z", 1550 | "shell.execute_reply": "2020-10-05T09:05:26.317190Z" 1551 | }, 1552 | "papermill": { 1553 | "duration": 0.086982, 1554 | "end_time": "2020-10-05T09:05:26.317302", 1555 | "exception": false, 1556 | "start_time": "2020-10-05T09:05:26.230320", 1557 | "status": "completed" 1558 | }, 1559 | "tags": [] 1560 | }, 1561 | "outputs": [ 1562 | { 1563 | "data": { 1564 | "text/html": [ 1565 | "
" 1731 | ], 1732 | "text/plain": [ 1733 | " Patient Weeks FVC Percent Age Sex \\\n", 1734 | "1540 ID00419637202311204720264 -12 3020 70.186855 73 Male \n", 1735 | "1541 ID00419637202311204720264 -11 3020 70.186855 73 Male \n", 1736 | "1542 ID00419637202311204720264 -10 3020 70.186855 73 Male \n", 1737 | "1543 ID00419637202311204720264 -9 3020 70.186855 73 Male \n", 1738 | "1544 ID00419637202311204720264 -8 3020 70.186855 73 Male \n", 1739 | "\n", 1740 | " SmokingStatus WHERE Confidence Patient_Week ... \\\n", 1741 | "1540 Ex-smoker test 100.0 ID00419637202311204720264_-12 ... \n", 1742 | "1541 Ex-smoker test 100.0 ID00419637202311204720264_-11 ... \n", 1743 | "1542 Ex-smoker test 100.0 ID00419637202311204720264_-10 ... \n", 1744 | "1543 Ex-smoker test 100.0 ID00419637202311204720264_-9 ... \n", 1745 | "1544 Ex-smoker test 100.0 ID00419637202311204720264_-8 ... \n", 1746 | "\n", 1747 | " min_FVC base_week Male Female Ex-smoker Never smoked \\\n", 1748 | "1540 3020 -18.0 1 0 1 0 \n", 1749 | "1541 3020 -17.0 1 0 1 0 \n", 1750 | "1542 3020 -16.0 1 0 1 0 \n", 1751 | "1543 3020 -15.0 1 0 1 0 \n", 1752 | "1544 3020 -14.0 1 0 1 0 \n", 1753 | "\n", 1754 | " Currently smokes age BASE week \n", 1755 | "1540 0 0.615385 0.3724 0.067901 \n", 1756 | "1541 0 0.615385 0.3724 0.074074 \n", 1757 | "1542 0 0.615385 0.3724 0.080247 \n", 1758 | "1543 0 0.615385 0.3724 0.086420 \n", 1759 | "1544 0 0.615385 0.3724 0.092593 \n", 1760 | "\n", 1761 | "[5 rows x 21 columns]" 1762 | ] 1763 | }, 1764 | "execution_count": 37, 1765 | "metadata": {}, 1766 | "output_type": "execute_result" 1767 | } 1768 | ], 1769 | "source": [ 1770 | "sub.head()" 1771 | ] 1772 | }, 1773 | { 1774 | "cell_type": "code", 1775 | "execution_count": 38, 1776 | "metadata": { 1777 | "execution": { 1778 | "iopub.execute_input": "2020-10-05T09:05:26.434984Z", 1779 | "iopub.status.busy": "2020-10-05T09:05:26.433750Z", 1780 | "iopub.status.idle": "2020-10-05T09:05:26.449190Z", 1781 | "shell.execute_reply": 
"2020-10-05T09:05:26.448719Z" 1782 | }, 1783 | "papermill": { 1784 | "duration": 0.078273, 1785 | "end_time": "2020-10-05T09:05:26.449311", 1786 | "exception": false, 1787 | "start_time": "2020-10-05T09:05:26.371038", 1788 | "status": "completed" 1789 | }, 1790 | "tags": [] 1791 | }, 1792 | "outputs": [ 1793 | { 1794 | "data": { 1795 | "text/html": [ 1796 | "
" 1905 | ], 1906 | "text/plain": [ 1907 | " Patient_Week FVC Confidence FVC1 \\\n", 1908 | "1540 ID00419637202311204720264_-12 3020 100.0 3041.346558 \n", 1909 | "1541 ID00419637202311204720264_-11 3020 100.0 3037.395996 \n", 1910 | "1542 ID00419637202311204720264_-10 3020 100.0 3033.445312 \n", 1911 | "1543 ID00419637202311204720264_-9 3020 100.0 3029.494751 \n", 1912 | "1544 ID00419637202311204720264_-8 3020 100.0 3025.544189 \n", 1913 | "1545 ID00419637202311204720264_-7 3020 100.0 3021.593384 \n", 1914 | "1546 ID00419637202311204720264_-6 3020 100.0 3017.642822 \n", 1915 | "1547 ID00419637202311204720264_-5 3020 100.0 3013.692383 \n", 1916 | "1548 ID00419637202311204720264_-4 3020 100.0 3009.741699 \n", 1917 | "1549 ID00419637202311204720264_-3 3020 100.0 3005.791260 \n", 1918 | "\n", 1919 | " Confidence1 \n", 1920 | "1540 222.091309 \n", 1921 | "1541 224.152710 \n", 1922 | "1542 226.214111 \n", 1923 | "1543 228.275513 \n", 1924 | "1544 230.336670 \n", 1925 | "1545 232.398193 \n", 1926 | "1546 234.459473 \n", 1927 | "1547 236.520996 \n", 1928 | "1548 238.582153 \n", 1929 | "1549 240.643677 " 1930 | ] 1931 | }, 1932 | "execution_count": 38, 1933 | "metadata": {}, 1934 | "output_type": "execute_result" 1935 | } 1936 | ], 1937 | "source": [ 1938 | "# PREDICTION\n", 1939 | "sub['FVC1'] = 1.*pe[:, 1]\n", 1940 | "sub['Confidence1'] = pe[:, 2] - pe[:, 0]\n", 1941 | "subm = sub[['Patient_Week','FVC','Confidence','FVC1','Confidence1']].copy()\n", 1942 | "subm.loc[~subm.FVC1.isnull()].head(10)" 1943 | ] 1944 | }, 1945 | { 1946 | "cell_type": "code", 1947 | "execution_count": 39, 1948 | "metadata": { 1949 | "execution": { 1950 | "iopub.execute_input": "2020-10-05T09:05:26.568502Z", 1951 | "iopub.status.busy": "2020-10-05T09:05:26.567657Z", 1952 | "iopub.status.idle": "2020-10-05T09:05:26.574331Z", 1953 | "shell.execute_reply": "2020-10-05T09:05:26.573860Z" 1954 | }, 1955 | "papermill": { 1956 | "duration": 0.070024, 1957 | "end_time": "2020-10-05T09:05:26.574425", 1958 | 
"exception": false, 1959 | "start_time": "2020-10-05T09:05:26.504401", 1960 | "status": "completed" 1961 | }, 1962 | "tags": [] 1963 | }, 1964 | "outputs": [], 1965 | "source": [ 1966 | "subm.loc[~subm.FVC1.isnull(),'FVC'] = subm.loc[~subm.FVC1.isnull(),'FVC1']\n", 1967 | "sigma_mean = 60\n", 1968 | "if sigma_mean\n", 1998 | "\n", 2011 | "\n", 2012 | " \n", 2013 | " \n", 2014 | " \n", 2015 | " \n", 2016 | " \n", 2017 | " \n", 2018 | " \n", 2019 | " \n", 2020 | " \n", 2021 | " \n", 2022 | " \n", 2023 | " \n", 2024 | " \n", 2025 | " \n", 2026 | " \n", 2027 | " \n", 2028 | " \n", 2029 | " \n", 2030 | " \n", 2031 | " \n", 2032 | " \n", 2033 | " \n", 2034 | " \n", 2035 | " \n", 2036 | " \n", 2037 | " \n", 2038 | " \n", 2039 | " \n", 2040 | " \n", 2041 | " \n", 2042 | " \n", 2043 | " \n", 2044 | " \n", 2045 | " \n", 2046 | " \n", 2047 | " \n", 2048 | " \n", 2049 | " \n", 2050 | " \n", 2051 | " \n", 2052 | " \n", 2053 | " \n", 2054 | " \n", 2055 | " \n", 2056 | " \n", 2057 | " \n", 2058 | " \n", 2059 | " \n", 2060 | " \n", 2061 | " \n", 2062 | " \n", 2063 | " \n", 2064 | "
\n", 2065 | "" 2066 | ], 2067 | "text/plain": [ 2068 | " Patient_Week FVC Confidence FVC1 \\\n", 2069 | "1540 ID00419637202311204720264_-12 3041.346558 222.091309 3041.346558 \n", 2070 | "1541 ID00419637202311204720264_-11 3037.395996 224.152710 3037.395996 \n", 2071 | "1542 ID00419637202311204720264_-10 3033.445312 226.214111 3033.445312 \n", 2072 | "1543 ID00419637202311204720264_-9 3029.494751 228.275513 3029.494751 \n", 2073 | "1544 ID00419637202311204720264_-8 3025.544189 230.336670 3025.544189 \n", 2074 | "\n", 2075 | " Confidence1 \n", 2076 | "1540 222.091309 \n", 2077 | "1541 224.152710 \n", 2078 | "1542 226.214111 \n", 2079 | "1543 228.275513 \n", 2080 | "1544 230.336670 " 2081 | ] 2082 | }, 2083 | "execution_count": 40, 2084 | "metadata": {}, 2085 | "output_type": "execute_result" 2086 | } 2087 | ], 2088 | "source": [ 2089 | "subm.head()" 2090 | ] 2091 | }, 2092 | { 2093 | "cell_type": "code", 2094 | "execution_count": 41, 2095 | "metadata": { 2096 | "execution": { 2097 | "iopub.execute_input": "2020-10-05T09:05:26.834992Z", 2098 | "iopub.status.busy": "2020-10-05T09:05:26.834274Z", 2099 | "iopub.status.idle": "2020-10-05T09:05:26.857156Z", 2100 | "shell.execute_reply": "2020-10-05T09:05:26.857656Z" 2101 | }, 2102 | "papermill": { 2103 | "duration": 0.086668, 2104 | "end_time": "2020-10-05T09:05:26.857775", 2105 | "exception": false, 2106 | "start_time": "2020-10-05T09:05:26.771107", 2107 | "status": "completed" 2108 | }, 2109 | "tags": [] 2110 | }, 2111 | "outputs": [ 2112 | { 2113 | "data": { 2114 | "text/html": [ 2115 | "
" 2191 | ], 2192 | "text/plain": [ 2193 | " count mean std min 25% \\\n", 2194 | "FVC 730.0 2579.475721 434.097646 1671.213196 2423.714966 \n", 2195 | "Confidence 730.0 319.177701 57.769768 193.531860 278.825699 \n", 2196 | "FVC1 730.0 2579.475721 434.097646 1671.213196 2423.714966 \n", 2197 | "Confidence1 730.0 319.177701 57.769768 193.531860 278.825699 \n", 2198 | "\n", 2199 | " 50% 75% max \n", 2200 | "FVC 2649.175171 2871.450012 3359.120605 \n", 2201 | "Confidence 315.067993 371.083679 417.567139 \n", 2202 | "FVC1 2649.175171 2871.450012 3359.120605 \n", 2203 | "Confidence1 315.067993 371.083679 417.567139 " 2204 | ] 2205 | }, 2206 | "execution_count": 41, 2207 | "metadata": {}, 2208 | "output_type": "execute_result" 2209 | } 2210 | ], 2211 | "source": [ 2212 | "subm.describe().T" 2213 | ] 2214 | }, 2215 | { 2216 | "cell_type": "code", 2217 | "execution_count": 42, 2218 | "metadata": { 2219 | "execution": { 2220 | "iopub.execute_input": "2020-10-05T09:05:27.004978Z", 2221 | "iopub.status.busy": "2020-10-05T09:05:27.004268Z", 2222 | "iopub.status.idle": "2020-10-05T09:05:27.007670Z", 2223 | "shell.execute_reply": "2020-10-05T09:05:27.008110Z" 2224 | }, 2225 | "papermill": { 2226 | "duration": 0.093123, 2227 | "end_time": "2020-10-05T09:05:27.008227", 2228 | "exception": false, 2229 | "start_time": "2020-10-05T09:05:26.915104", 2230 | "status": "completed" 2231 | }, 2232 | "tags": [] 2233 | }, 2234 | "outputs": [], 2235 | "source": [ 2236 | "otest = pd.read_csv('../input/osic-pulmonary-fibrosis-progression/test.csv')\n", 2237 | "for i in range(len(otest)):\n", 2238 | " subm.loc[subm['Patient_Week']==otest.Patient[i]+'_'+str(otest.Weeks[i]), 'FVC'] = otest.FVC[i]\n", 2239 | " subm.loc[subm['Patient_Week']==otest.Patient[i]+'_'+str(otest.Weeks[i]), 'Confidence'] = 0.1" 2240 | ] 2241 | }, 2242 | { 2243 | "cell_type": "code", 2244 | "execution_count": 43, 2245 | "metadata": { 2246 | "execution": { 2247 | "iopub.execute_input": "2020-10-05T09:05:27.125110Z", 2248 | 
"iopub.status.busy": "2020-10-05T09:05:27.124289Z", 2249 | "iopub.status.idle": "2020-10-05T09:05:27.133221Z", 2250 | "shell.execute_reply": "2020-10-05T09:05:27.132704Z" 2251 | }, 2252 | "papermill": { 2253 | "duration": 0.069817, 2254 | "end_time": "2020-10-05T09:05:27.133353", 2255 | "exception": false, 2256 | "start_time": "2020-10-05T09:05:27.063536", 2257 | "status": "completed" 2258 | }, 2259 | "tags": [] 2260 | }, 2261 | "outputs": [], 2262 | "source": [ 2263 | "subm[[\"Patient_Week\",\"FVC\",\"Confidence\"]].to_csv(\"submission_regression.csv\", index=False)" 2264 | ] 2265 | }, 2266 | { 2267 | "cell_type": "code", 2268 | "execution_count": 44, 2269 | "metadata": { 2270 | "execution": { 2271 | "iopub.execute_input": "2020-10-05T09:05:27.250401Z", 2272 | "iopub.status.busy": "2020-10-05T09:05:27.249764Z", 2273 | "iopub.status.idle": "2020-10-05T09:05:27.253289Z", 2274 | "shell.execute_reply": "2020-10-05T09:05:27.252789Z" 2275 | }, 2276 | "papermill": { 2277 | "duration": 0.064003, 2278 | "end_time": "2020-10-05T09:05:27.253381", 2279 | "exception": false, 2280 | "start_time": "2020-10-05T09:05:27.189378", 2281 | "status": "completed" 2282 | }, 2283 | "tags": [] 2284 | }, 2285 | "outputs": [], 2286 | "source": [ 2287 | "reg_sub = subm[[\"Patient_Week\",\"FVC\",\"Confidence\"]].copy()" 2288 | ] 2289 | }, 2290 | { 2291 | "cell_type": "markdown", 2292 | "metadata": { 2293 | "papermill": { 2294 | "duration": 0.055012, 2295 | "end_time": "2020-10-05T09:05:27.363446", 2296 | "exception": false, 2297 | "start_time": "2020-10-05T09:05:27.308434", 2298 | "status": "completed" 2299 | }, 2300 | "tags": [] 2301 | }, 2302 | "source": [ 2303 | "# Ensemble (Simple Blend)" 2304 | ] 2305 | }, 2306 | { 2307 | "cell_type": "code", 2308 | "execution_count": 45, 2309 | "metadata": { 2310 | "execution": { 2311 | "iopub.execute_input": "2020-10-05T09:05:27.483201Z", 2312 | "iopub.status.busy": "2020-10-05T09:05:27.480256Z", 2313 | "iopub.status.idle": 
"2020-10-05T09:05:27.486668Z", 2314 | "shell.execute_reply": "2020-10-05T09:05:27.486160Z" 2315 | }, 2316 | "papermill": { 2317 | "duration": 0.068733, 2318 | "end_time": "2020-10-05T09:05:27.486769", 2319 | "exception": false, 2320 | "start_time": "2020-10-05T09:05:27.418036", 2321 | "status": "completed" 2322 | }, 2323 | "tags": [] 2324 | }, 2325 | "outputs": [], 2326 | "source": [ 2327 | "df1 = img_sub.sort_values(by=['Patient_Week'], ascending=True).reset_index(drop=True)\n", 2328 | "df2 = reg_sub.sort_values(by=['Patient_Week'], ascending=True).reset_index(drop=True)" 2329 | ] 2330 | }, 2331 | { 2332 | "cell_type": "code", 2333 | "execution_count": 46, 2334 | "metadata": { 2335 | "execution": { 2336 | "iopub.execute_input": "2020-10-05T09:05:27.615760Z", 2337 | "iopub.status.busy": "2020-10-05T09:05:27.614874Z", 2338 | "iopub.status.idle": "2020-10-05T09:05:27.618554Z", 2339 | "shell.execute_reply": "2020-10-05T09:05:27.619188Z" 2340 | }, 2341 | "papermill": { 2342 | "duration": 0.076792, 2343 | "end_time": "2020-10-05T09:05:27.619344", 2344 | "exception": false, 2345 | "start_time": "2020-10-05T09:05:27.542552", 2346 | "status": "completed" 2347 | }, 2348 | "tags": [] 2349 | }, 2350 | "outputs": [ 2351 | { 2352 | "data": { 2353 | "text/html": [ 2354 | "
" 2411 | ], 2412 | "text/plain": [ 2413 | " Patient_Week FVC Confidence\n", 2414 | "0 ID00419637202311204720264_-1 3020.074010 178.440218\n", 2415 | "1 ID00419637202311204720264_-10 3055.359806 183.966730\n", 2416 | "2 ID00419637202311204720264_-11 3059.280487 184.580764\n", 2417 | "3 ID00419637202311204720264_-12 3063.201101 185.194799\n", 2418 | "4 ID00419637202311204720264_-2 3023.994692 179.054252" 2419 | ] 2420 | }, 2421 | "execution_count": 46, 2422 | "metadata": {}, 2423 | "output_type": "execute_result" 2424 | } 2425 | ], 2426 | "source": [ 2427 | "df = df1[['Patient_Week']].copy()\n", 2428 | "df['FVC'] = (0.45*df1['FVC'] + 0.55*df2['FVC'])\n", 2429 | "df['Confidence'] = (0.45*df1['Confidence'] + 0.55*df2['Confidence'])\n", 2430 | "df.head()" 2431 | ] 2432 | }, 2433 | { 2434 | "cell_type": "code", 2435 | "execution_count": 47, 2436 | "metadata": { 2437 | "execution": { 2438 | "iopub.execute_input": "2020-10-05T09:05:27.740282Z", 2439 | "iopub.status.busy": "2020-10-05T09:05:27.739353Z", 2440 | "iopub.status.idle": "2020-10-05T09:05:27.748555Z", 2441 | "shell.execute_reply": "2020-10-05T09:05:27.749079Z" 2442 | }, 2443 | "papermill": { 2444 | "duration": 0.071238, 2445 | "end_time": "2020-10-05T09:05:27.749193", 2446 | "exception": false, 2447 | "start_time": "2020-10-05T09:05:27.677955", 2448 | "status": "completed" 2449 | }, 2450 | "tags": [] 2451 | }, 2452 | "outputs": [], 2453 | "source": [ 2454 | "df.to_csv('submission.csv', index=False)" 2455 | ] 2456 | } 2457 | ], 2458 | "metadata": { 2459 | "kernelspec": { 2460 | "display_name": "Python 3", 2461 | "language": "python", 2462 | "name": "python3" 2463 | }, 2464 | "language_info": { 2465 | "codemirror_mode": { 2466 | "name": "ipython", 2467 | "version": 3 2468 | }, 2469 | "file_extension": ".py", 2470 | "mimetype": "text/x-python", 2471 | "name": "python", 2472 | "nbconvert_exporter": "python", 2473 | "pygments_lexer": "ipython3", 2474 | "version": "3.7.6" 2475 | }, 2476 | "papermill": { 2477 | 
"duration": 196.958096, 2478 | "end_time": "2020-10-05T09:05:29.577830", 2479 | "environment_variables": {}, 2480 | "exception": null, 2481 | "input_path": "__notebook__.ipynb", 2482 | "output_path": "__notebook__.ipynb", 2483 | "parameters": {}, 2484 | "start_time": "2020-10-05T09:02:12.619734", 2485 | "version": "2.1.0" 2486 | }, 2487 | "widgets": { 2488 | "application/vnd.jupyter.widget-state+json": { 2489 | "state": { 2490 | "18a39542417944fe997f2604989c82fd": { 2491 | "model_module": "@jupyter-widgets/controls", 2492 | "model_module_version": "1.5.0", 2493 | "model_name": "DescriptionStyleModel", 2494 | "state": { 2495 | "_model_module": "@jupyter-widgets/controls", 2496 | "_model_module_version": "1.5.0", 2497 | "_model_name": "DescriptionStyleModel", 2498 | "_view_count": null, 2499 | "_view_module": "@jupyter-widgets/base", 2500 | "_view_module_version": "1.2.0", 2501 | "_view_name": "StyleView", 2502 | "description_width": "" 2503 | } 2504 | }, 2505 | "366e2485b939417c948814b0a3ff2139": { 2506 | "model_module": "@jupyter-widgets/controls", 2507 | "model_module_version": "1.5.0", 2508 | "model_name": "FloatProgressModel", 2509 | "state": { 2510 | "_dom_classes": [], 2511 | "_model_module": "@jupyter-widgets/controls", 2512 | "_model_module_version": "1.5.0", 2513 | "_model_name": "FloatProgressModel", 2514 | "_view_count": null, 2515 | "_view_module": "@jupyter-widgets/controls", 2516 | "_view_module_version": "1.5.0", 2517 | "_view_name": "ProgressView", 2518 | "bar_style": "success", 2519 | "description": "", 2520 | "description_tooltip": null, 2521 | "layout": "IPY_MODEL_63acf496e7b14fce855c7c4856d31be2", 2522 | "max": 1.0, 2523 | "min": 0.0, 2524 | "orientation": "horizontal", 2525 | "style": "IPY_MODEL_bedb7932c35f4112948fe4823bac7d25", 2526 | "value": 1.0 2527 | } 2528 | }, 2529 | "63acf496e7b14fce855c7c4856d31be2": { 2530 | "model_module": "@jupyter-widgets/base", 2531 | "model_module_version": "1.2.0", 2532 | "model_name": "LayoutModel", 2533 | 
"state": { 2534 | "_model_module": "@jupyter-widgets/base", 2535 | "_model_module_version": "1.2.0", 2536 | "_model_name": "LayoutModel", 2537 | "_view_count": null, 2538 | "_view_module": "@jupyter-widgets/base", 2539 | "_view_module_version": "1.2.0", 2540 | "_view_name": "LayoutView", 2541 | "align_content": null, 2542 | "align_items": null, 2543 | "align_self": null, 2544 | "border": null, 2545 | "bottom": null, 2546 | "display": null, 2547 | "flex": null, 2548 | "flex_flow": null, 2549 | "grid_area": null, 2550 | "grid_auto_columns": null, 2551 | "grid_auto_flow": null, 2552 | "grid_auto_rows": null, 2553 | "grid_column": null, 2554 | "grid_gap": null, 2555 | "grid_row": null, 2556 | "grid_template_areas": null, 2557 | "grid_template_columns": null, 2558 | "grid_template_rows": null, 2559 | "height": null, 2560 | "justify_content": null, 2561 | "justify_items": null, 2562 | "left": null, 2563 | "margin": null, 2564 | "max_height": null, 2565 | "max_width": null, 2566 | "min_height": null, 2567 | "min_width": null, 2568 | "object_fit": null, 2569 | "object_position": null, 2570 | "order": null, 2571 | "overflow": null, 2572 | "overflow_x": null, 2573 | "overflow_y": null, 2574 | "padding": null, 2575 | "right": null, 2576 | "top": null, 2577 | "visibility": null, 2578 | "width": null 2579 | } 2580 | }, 2581 | "94ffc2800d05426e858c1dc67fe9c342": { 2582 | "model_module": "@jupyter-widgets/controls", 2583 | "model_module_version": "1.5.0", 2584 | "model_name": "HTMLModel", 2585 | "state": { 2586 | "_dom_classes": [], 2587 | "_model_module": "@jupyter-widgets/controls", 2588 | "_model_module_version": "1.5.0", 2589 | "_model_name": "HTMLModel", 2590 | "_view_count": null, 2591 | "_view_module": "@jupyter-widgets/controls", 2592 | "_view_module_version": "1.5.0", 2593 | "_view_name": "HTMLView", 2594 | "description": "", 2595 | "description_tooltip": null, 2596 | "layout": "IPY_MODEL_fefedecca76c4931a85df5962b88e751", 2597 | "placeholder": "​", 2598 | "style": 
"IPY_MODEL_18a39542417944fe997f2604989c82fd", 2599 | "value": " 176/? [00:00<00:00, 202.96it/s]" 2600 | } 2601 | }, 2602 | "bedb7932c35f4112948fe4823bac7d25": { 2603 | "model_module": "@jupyter-widgets/controls", 2604 | "model_module_version": "1.5.0", 2605 | "model_name": "ProgressStyleModel", 2606 | "state": { 2607 | "_model_module": "@jupyter-widgets/controls", 2608 | "_model_module_version": "1.5.0", 2609 | "_model_name": "ProgressStyleModel", 2610 | "_view_count": null, 2611 | "_view_module": "@jupyter-widgets/base", 2612 | "_view_module_version": "1.2.0", 2613 | "_view_name": "StyleView", 2614 | "bar_color": null, 2615 | "description_width": "initial" 2616 | } 2617 | }, 2618 | "d8beae9411ec4c2588cb183dadacf987": { 2619 | "model_module": "@jupyter-widgets/base", 2620 | "model_module_version": "1.2.0", 2621 | "model_name": "LayoutModel", 2622 | "state": { 2623 | "_model_module": "@jupyter-widgets/base", 2624 | "_model_module_version": "1.2.0", 2625 | "_model_name": "LayoutModel", 2626 | "_view_count": null, 2627 | "_view_module": "@jupyter-widgets/base", 2628 | "_view_module_version": "1.2.0", 2629 | "_view_name": "LayoutView", 2630 | "align_content": null, 2631 | "align_items": null, 2632 | "align_self": null, 2633 | "border": null, 2634 | "bottom": null, 2635 | "display": null, 2636 | "flex": null, 2637 | "flex_flow": null, 2638 | "grid_area": null, 2639 | "grid_auto_columns": null, 2640 | "grid_auto_flow": null, 2641 | "grid_auto_rows": null, 2642 | "grid_column": null, 2643 | "grid_gap": null, 2644 | "grid_row": null, 2645 | "grid_template_areas": null, 2646 | "grid_template_columns": null, 2647 | "grid_template_rows": null, 2648 | "height": null, 2649 | "justify_content": null, 2650 | "justify_items": null, 2651 | "left": null, 2652 | "margin": null, 2653 | "max_height": null, 2654 | "max_width": null, 2655 | "min_height": null, 2656 | "min_width": null, 2657 | "object_fit": null, 2658 | "object_position": null, 2659 | "order": null, 2660 | "overflow": 
null, 2661 | "overflow_x": null, 2662 | "overflow_y": null, 2663 | "padding": null, 2664 | "right": null, 2665 | "top": null, 2666 | "visibility": null, 2667 | "width": null 2668 | } 2669 | }, 2670 | "f30423f9d7cf4780a01f51dde8ae36e3": { 2671 | "model_module": "@jupyter-widgets/controls", 2672 | "model_module_version": "1.5.0", 2673 | "model_name": "HBoxModel", 2674 | "state": { 2675 | "_dom_classes": [], 2676 | "_model_module": "@jupyter-widgets/controls", 2677 | "_model_module_version": "1.5.0", 2678 | "_model_name": "HBoxModel", 2679 | "_view_count": null, 2680 | "_view_module": "@jupyter-widgets/controls", 2681 | "_view_module_version": "1.5.0", 2682 | "_view_name": "HBoxView", 2683 | "box_style": "", 2684 | "children": [ 2685 | "IPY_MODEL_366e2485b939417c948814b0a3ff2139", 2686 | "IPY_MODEL_94ffc2800d05426e858c1dc67fe9c342" 2687 | ], 2688 | "layout": "IPY_MODEL_d8beae9411ec4c2588cb183dadacf987" 2689 | } 2690 | }, 2691 | "fefedecca76c4931a85df5962b88e751": { 2692 | "model_module": "@jupyter-widgets/base", 2693 | "model_module_version": "1.2.0", 2694 | "model_name": "LayoutModel", 2695 | "state": { 2696 | "_model_module": "@jupyter-widgets/base", 2697 | "_model_module_version": "1.2.0", 2698 | "_model_name": "LayoutModel", 2699 | "_view_count": null, 2700 | "_view_module": "@jupyter-widgets/base", 2701 | "_view_module_version": "1.2.0", 2702 | "_view_name": "LayoutView", 2703 | "align_content": null, 2704 | "align_items": null, 2705 | "align_self": null, 2706 | "border": null, 2707 | "bottom": null, 2708 | "display": null, 2709 | "flex": null, 2710 | "flex_flow": null, 2711 | "grid_area": null, 2712 | "grid_auto_columns": null, 2713 | "grid_auto_flow": null, 2714 | "grid_auto_rows": null, 2715 | "grid_column": null, 2716 | "grid_gap": null, 2717 | "grid_row": null, 2718 | "grid_template_areas": null, 2719 | "grid_template_columns": null, 2720 | "grid_template_rows": null, 2721 | "height": null, 2722 | "justify_content": null, 2723 | "justify_items": null, 2724 | 
"left": null, 2725 | "margin": null, 2726 | "max_height": null, 2727 | "max_width": null, 2728 | "min_height": null, 2729 | "min_width": null, 2730 | "object_fit": null, 2731 | "object_position": null, 2732 | "order": null, 2733 | "overflow": null, 2734 | "overflow_x": null, 2735 | "overflow_y": null, 2736 | "padding": null, 2737 | "right": null, 2738 | "top": null, 2739 | "visibility": null, 2740 | "width": null 2741 | } 2742 | } 2743 | }, 2744 | "version_major": 2, 2745 | "version_minor": 0 2746 | } 2747 | } 2748 | }, 2749 | "nbformat": 4, 2750 | "nbformat_minor": 4 2751 | } 2752 | --------------------------------------------------------------------------------
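The blend in the notebook's final cells (0.45 of the image-model submission plus 0.55 of the quantile-regression submission) depends on both frames being sorted by `Patient_Week` so that rows line up by position. Below is a minimal sketch of the same weighted average, aligned on the key instead. The function `blend_submissions` and its parameter names are hypothetical and do not appear in the notebook; only the column names and the 0.45/0.55 weights are taken from it.

```python
import pandas as pd


def blend_submissions(img_sub, reg_sub, w_img=0.45, w_reg=0.55, key="Patient_Week"):
    """Weighted average of two submissions, matched on `key` rather than row order.

    Hypothetical helper: assumes both frames carry the columns
    `Patient_Week`, `FVC` and `Confidence`, as in the notebook.
    """
    # Inner join on the key; validate raises if a key is duplicated on either side.
    merged = img_sub.merge(
        reg_sub, on=key, suffixes=("_img", "_reg"), validate="one_to_one"
    )
    # An inner join silently drops unmatched keys, so check nothing was lost.
    if len(merged) != len(img_sub) or len(merged) != len(reg_sub):
        raise ValueError("submissions do not cover the same Patient_Week keys")

    out = merged[[key]].copy()
    for col in ("FVC", "Confidence"):
        out[col] = w_img * merged[f"{col}_img"] + w_reg * merged[f"{col}_reg"]
    return out
```

Matching on the key means a missing or duplicated `Patient_Week` raises an error instead of silently averaging rows that belong to different patient-weeks, which a position-based blend would not catch.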