├── .gitignore ├── LICENSE ├── Makefile ├── Procfile ├── README.md ├── azure-pipelines.yml ├── data ├── hiv-protease-consensus.txt ├── hiv-protease-data-expanded.csv ├── models │ ├── ATV.pkl.gz │ ├── DRV.pkl.gz │ ├── FPV.pkl.gz │ ├── IDV.pkl.gz │ ├── LPV.pkl.gz │ ├── NFV.pkl.gz │ ├── SQV.pkl.gz │ └── TPV.pkl.gz └── scores.pkl.gz ├── environment.yml ├── hiv-resistance.ipynb ├── home.ipynb ├── index.html ├── iris.ipynb ├── model-training.ipynb ├── requirements.txt ├── test_utils.py └── utils.py /.gitignore: -------------------------------------------------------------------------------- 1 | # Custom 2 | dask-worker-space/* 3 | .vscode/* 4 | 5 | # Byte-compiled / optimized / DLL files 6 | __pycache__/ 7 | *.py[cod] 8 | *$py.class 9 | 10 | # C extensions 11 | *.so 12 | 13 | # Distribution / packaging 14 | .Python 15 | build/ 16 | develop-eggs/ 17 | dist/ 18 | downloads/ 19 | eggs/ 20 | .eggs/ 21 | lib/ 22 | lib64/ 23 | parts/ 24 | sdist/ 25 | var/ 26 | wheels/ 27 | *.egg-info/ 28 | .installed.cfg 29 | *.egg 30 | MANIFEST 31 | 32 | # PyInstaller 33 | # Usually these files are written by a python script from a template 34 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 
35 | *.manifest 36 | *.spec 37 | 38 | # Installer logs 39 | pip-log.txt 40 | pip-delete-this-directory.txt 41 | 42 | # Unit test / coverage reports 43 | htmlcov/ 44 | .tox/ 45 | .coverage 46 | .coverage.* 47 | .cache 48 | nosetests.xml 49 | coverage.xml 50 | *.cover 51 | .hypothesis/ 52 | .pytest_cache/ 53 | 54 | # Translations 55 | *.mo 56 | *.pot 57 | 58 | # Django stuff: 59 | *.log 60 | local_settings.py 61 | db.sqlite3 62 | 63 | # Flask stuff: 64 | instance/ 65 | .webassets-cache 66 | 67 | # Scrapy stuff: 68 | .scrapy 69 | 70 | # Sphinx documentation 71 | docs/_build/ 72 | 73 | # PyBuilder 74 | target/ 75 | 76 | # Jupyter Notebook 77 | .ipynb_checkpoints 78 | 79 | # pyenv 80 | .python-version 81 | 82 | # celery beat schedule file 83 | celerybeat-schedule 84 | 85 | # SageMath parsed files 86 | *.sage.py 87 | 88 | # Environments 89 | .env 90 | .venv 91 | env/ 92 | venv/ 93 | ENV/ 94 | env.bak/ 95 | venv.bak/ 96 | 97 | # Spyder project settings 98 | .spyderproject 99 | .spyproject 100 | 101 | # Rope project settings 102 | .ropeproject 103 | 104 | # mkdocs documentation 105 | /site 106 | 107 | # mypy 108 | .mypy_cache/ 109 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2019 Eric Ma 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 
14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | test: 2 | jupyter nbconvert --execute minimal-panel.ipynb 3 | rm minimal-panel.html 4 | -------------------------------------------------------------------------------- /Procfile: -------------------------------------------------------------------------------- 1 | web: panel serve --address="0.0.0.0" --port=$PORT hiv-resistance.ipynb home.ipynb iris.ipynb --allow-websocket-origin=minimal-panel-app.herokuapp.com --index=`pwd`/index.html 2 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # minimal-panel-app 2 | 3 | A pedagogical implementation of panel apps served up on a remote machine. 4 | 5 | See the full app [here](http://minimal-panel-app.herokuapp.com/home). 6 | 7 | ## why this project exists 8 | 9 | I spent a day figuring out how to make this happen at work, 10 | and decided to spend an evening consolidating my knowledge. 11 | 12 | ## "how to use" 13 | 14 | ``` 15 | git clone https://github.com/ericmjl/minimal-panel-app 16 | ``` 17 | 18 | ## anything else interesting? 
19 | 20 | ### iPad development 21 | 22 | The first version of the app was coded up entirely on an iPad, 23 | using a combination of [blink](http://blink.sh), 24 | [Juno](http://juno.sh), 25 | and `nano` on my home remote server 26 | (which is nothing more than a converted gaming tower). 27 | 28 | Web app development in Python is now doable 29 | and we can use modern tablets as a thin client! 30 | 31 | ### memory usage 32 | 33 | Deploying the HIV drug resistance model to Heroku was challenging 34 | because I had to watch out for memory and storage usage. 35 | There are 8 models to make predictions on, 36 | and loading all of them together causes memory overload 37 | on Heroku's free tier. 38 | 39 | I got around this by pickling the models individually, 40 | and only loading them when needed. 41 | I also minimized disk usage by using gzip 42 | when pickling the files. 43 | 44 | ### multi-app hosting 45 | 46 | There are multiple "apps" that are being hosted by a single Panel server here. 47 | Each "app" is basically one Jupyter notebook. 48 | In each notebook, I define a self-contained, hostable unit 49 | that an end-user can interact with. 50 | One of them is the homepage, 51 | written using Panel's tooling just to prove the point, 52 | but the others are actual user-facing interfaces 53 | that provide a way to interact with either data or a machine learning model. 
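The approach described under "memory usage" (one gzip-compressed pickle per drug, loaded only on request) can be sketched roughly as follows. This is a minimal illustration, not the code in `utils.py`: the function names `save_model` and `load_model`, the `model_dir` parameter, and the single-entry cache are all assumptions made for this example.

```python
import gzip
import pickle
from functools import lru_cache
from pathlib import Path


def save_model(model, drug: str, model_dir: Path) -> Path:
    """Pickle one model per drug, gzip-compressed to reduce disk usage."""
    path = Path(model_dir) / f"{drug}.pkl.gz"
    with gzip.open(path, "wb") as f:
        pickle.dump(model, f)
    return path


# Hypothetical design choice: cache only the most recently used model,
# so at most one model is held in memory by the cache at a time.
@lru_cache(maxsize=1)
def load_model(drug: str, model_dir: Path = Path("data/models")):
    """Load the model for a single drug only when it is requested."""
    with gzip.open(Path(model_dir) / f"{drug}.pkl.gz", "rb") as f:
        return pickle.load(f)
```

With something like this, a callback in the app could call `load_model("ATV")` when the user selects that drug, rather than loading all eight models at startup. Note that unpickling executes arbitrary code, so this pattern is only appropriate for model files you created yourself.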
54 | -------------------------------------------------------------------------------- /azure-pipelines.yml: -------------------------------------------------------------------------------- 1 | pr: 2 | - master 3 | 4 | jobs: 5 | - job: linux 6 | variables: 7 | activate.command: "source activate" 8 | strategy: 9 | matrix: 10 | py37: 11 | python.version: "3.7" 12 | 13 | pool: 14 | vmImage: ubuntu-16.04 15 | 16 | steps: 17 | - bash: echo "##vso[task.prependpath]$CONDA/bin" 18 | displayName: Add conda to PATH 19 | - script: | 20 | conda env create -f environment.yml 21 | displayName: Create environment 22 | - script: | 23 | source activate minimal-panel 24 | python -m ipykernel install --user --name minimal-panel 25 | make test 26 | displayName: Run tests 27 | -------------------------------------------------------------------------------- /data/hiv-protease-consensus.txt: -------------------------------------------------------------------------------- 1 | >protease 2 | PQITLWQRPLVTIKIGGQLKEALLDTGADDTVLEEMNLPGRWKPKMIGGIGGFIKVRQYDQILIEICGHKAIGTVLVGPTPVNIIGRNLLTQIGCTLNF 3 | -------------------------------------------------------------------------------- /data/models/ATV.pkl.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ericmjl/minimal-panel-app/40b95008fec7c95241b0296d8bd57cdcb1b1eb17/data/models/ATV.pkl.gz -------------------------------------------------------------------------------- /data/models/DRV.pkl.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ericmjl/minimal-panel-app/40b95008fec7c95241b0296d8bd57cdcb1b1eb17/data/models/DRV.pkl.gz -------------------------------------------------------------------------------- /data/models/FPV.pkl.gz: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/ericmjl/minimal-panel-app/40b95008fec7c95241b0296d8bd57cdcb1b1eb17/data/models/FPV.pkl.gz -------------------------------------------------------------------------------- /data/models/IDV.pkl.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ericmjl/minimal-panel-app/40b95008fec7c95241b0296d8bd57cdcb1b1eb17/data/models/IDV.pkl.gz -------------------------------------------------------------------------------- /data/models/LPV.pkl.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ericmjl/minimal-panel-app/40b95008fec7c95241b0296d8bd57cdcb1b1eb17/data/models/LPV.pkl.gz -------------------------------------------------------------------------------- /data/models/NFV.pkl.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ericmjl/minimal-panel-app/40b95008fec7c95241b0296d8bd57cdcb1b1eb17/data/models/NFV.pkl.gz -------------------------------------------------------------------------------- /data/models/SQV.pkl.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ericmjl/minimal-panel-app/40b95008fec7c95241b0296d8bd57cdcb1b1eb17/data/models/SQV.pkl.gz -------------------------------------------------------------------------------- /data/models/TPV.pkl.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ericmjl/minimal-panel-app/40b95008fec7c95241b0296d8bd57cdcb1b1eb17/data/models/TPV.pkl.gz -------------------------------------------------------------------------------- /data/scores.pkl.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ericmjl/minimal-panel-app/40b95008fec7c95241b0296d8bd57cdcb1b1eb17/data/scores.pkl.gz 
-------------------------------------------------------------------------------- /environment.yml: -------------------------------------------------------------------------------- 1 | name: minimal-panel 2 | channels: 3 | - conda-forge 4 | - defaults 5 | dependencies: 6 | - python=3.7 7 | - biopython=1.74 8 | - bokeh=1.2.0 9 | - dask=2.6.0 10 | - holoviews=1.12.3 11 | - hvplot 12 | - ipython 13 | - jupyter 14 | - jupyterlab 15 | - pandas=0.24.0 16 | - panel=0.6.0 17 | - pyjanitor=0.18.0 18 | - scikit-learn=0.21.2 19 | - pytest 20 | - hypothesis 21 | - black 22 | - pylint 23 | - pycodestyle 24 | -------------------------------------------------------------------------------- /index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 |
4 | 5 | 6 | 7 | 8 | -------------------------------------------------------------------------------- /model-training.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Introduction\n", 8 | "\n", 9 | "This notebook gives you a short introduction on how to use Dask to parallelize model training, particularly if you have multiple learning tasks for which you want to train individual models.\n", 10 | "\n", 11 | "For brevity, I will not be elaborating on the exact machine learning task here, but will focus on the idioms that we need in order to use Dask for this task." 12 | ] 13 | }, 14 | { 15 | "cell_type": "code", 16 | "execution_count": 1, 17 | "metadata": {}, 18 | "outputs": [], 19 | "source": [ 20 | "%load_ext autoreload\n", 21 | "%autoreload 2\n", 22 | "%matplotlib inline\n", 23 | "%config InlineBackend.figure_format = 'retina'\n", 24 | "\n", 25 | "from dask.distributed import LocalCluster, Client\n", 26 | "import numpy as np\n", 27 | "import pandas as pd\n", 28 | "import janitor" 29 | ] 30 | }, 31 | { 32 | "cell_type": "markdown", 33 | "metadata": {}, 34 | "source": [ 35 | "## Instantiate a Dask Cluster\n", 36 | "\n", 37 | "Here, we instantiate a Dask `cluster` (this is only a `LocalCluster`, but other cluster types can be created too, such as an `SGECluster` or `KubeCluster`). We then connect a `client` to the cluster." 38 | ] 39 | }, 40 | { 41 | "cell_type": "code", 42 | "execution_count": 2, 43 | "metadata": {}, 44 | "outputs": [ 45 | { 46 | "name": "stderr", 47 | "output_type": "stream", 48 | "text": [ 49 | "/home/ericmjl/anaconda/envs/minimal-panel/lib/python3.7/site-packages/distributed/dashboard/core.py:72: UserWarning: \n", 50 | "Port 8787 is already in use. 
\n", 51 | "Perhaps you already have a cluster running?\n", 52 | "Hosting the diagnostics dashboard on a random port instead.\n", 53 | " warnings.warn(\"\\n\" + msg)\n" 54 | ] 55 | } 56 | ], 57 | "source": [ 58 | "client = Client()" 59 | ] 60 | }, 61 | { 62 | "cell_type": "markdown", 63 | "metadata": {}, 64 | "source": [ 65 | "## Data Preprocessing\n", 66 | "\n", 67 | "We will now preprocess our data and get it into a shape for machine learning." 68 | ] 69 | }, 70 | { 71 | "cell_type": "code", 72 | "execution_count": 4, 73 | "metadata": {}, 74 | "outputs": [], 75 | "source": [ 76 | "from utils import molecular_weights, featurize_sequence_" 77 | ] 78 | }, 79 | { 80 | "cell_type": "code", 81 | "execution_count": 6, 82 | "metadata": {}, 83 | "outputs": [], 84 | "source": [ 85 | "drugs = ['ATV', 'DRV', 'FPV', 'IDV', 'LPV', 'NFV', 'SQV', 'TPV']\n", 86 | "\n", 87 | "data = (\n", 88 | " pd.read_csv(\"data/hiv-protease-data-expanded.csv\", index_col=0)\n", 89 | " .query(\"weight == 1.0\")\n", 90 | " .transform_column(\"sequence\", lambda x: len(x), \"seq_length\")\n", 91 | " .query(\"seq_length == 99\")\n", 92 | " .transform_column(\"sequence\", featurize_sequence_, \"features\")\n", 93 | " .transform_columns(drugs, np.log10)\n", 94 | ")\n", 95 | "\n", 96 | "features = pd.DataFrame(np.vstack(data['features'])).set_index(data.index)" 97 | ] 98 | }, 99 | { 100 | "cell_type": "code", 101 | "execution_count": 7, 102 | "metadata": {}, 103 | "outputs": [ 104 | { 105 | "data": { 106 | "text/html": [ 107 | "\n", 125 | " | ATV | \n", 126 | "DRV | \n", 127 | "FPV | \n", 128 | "IDV | \n", 129 | "LPV | \n", 130 | "NFV | \n", 131 | "SQV | \n", 132 | "SeqID | \n", 133 | "TPV | \n", 134 | "seqid | \n", 135 | "sequence | \n", 136 | "sequence_object | \n", 137 | "weight | \n", 138 | "seq_length | \n", 139 | "features | \n", 140 | "
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
6 | \n", 145 | "1.50515 | \n", 146 | "NaN | \n", 147 | "0.477121 | \n", 148 | "1.544068 | \n", 149 | "1.50515 | \n", 150 | "1.462398 | \n", 151 | "2.214844 | \n", 152 | "4426 | \n", 153 | "NaN | \n", 154 | "4426-0 | \n", 155 | "PQITLWQRPIVTIKIGGQLKEALLDTGADDTVLEEMNLPGKWKPKM... | \n", 156 | "ID: 4426-0\\nName: <unknown name>\\nDescription:... | \n", 157 | "1.0 | \n", 158 | "99 | \n", 159 | "[[115.131, 146.1451, 131.1736, 119.1197, 131.1... | \n", 160 | "
7 | \n", 163 | "NaN | \n", 164 | "NaN | \n", 165 | "0.176091 | \n", 166 | "0.000000 | \n", 167 | "NaN | \n", 168 | "0.342423 | \n", 169 | "0.041393 | \n", 170 | "4432 | \n", 171 | "NaN | \n", 172 | "4432-0 | \n", 173 | "PQITLWQRPLVTVKIGGQLKEALLDTGADDTVLEEMNLPGRWKPKM... | \n", 174 | "ID: 4432-0\\nName: <unknown name>\\nDescription:... | \n", 175 | "1.0 | \n", 176 | "99 | \n", 177 | "[[115.131, 146.1451, 131.1736, 119.1197, 131.1... | \n", 178 | "
14 | \n", 181 | "NaN | \n", 182 | "NaN | \n", 183 | "0.491362 | \n", 184 | "0.939519 | \n", 185 | "NaN | \n", 186 | "1.505150 | \n", 187 | "1.227887 | \n", 188 | "4664 | \n", 189 | "NaN | \n", 190 | "4664-0 | \n", 191 | "PQITLWQRPIVTIKVGGQLIEALLDTGADDTVLEEINLPGRWKPKM... | \n", 192 | "ID: 4664-0\\nName: <unknown name>\\nDescription:... | \n", 193 | "1.0 | \n", 194 | "99 | \n", 195 | "[[115.131, 146.1451, 131.1736, 119.1197, 131.1... | \n", 196 | "
\n", 258 | " | 0 | \n", 259 | "1 | \n", 260 | "2 | \n", 261 | "3 | \n", 262 | "4 | \n", 263 | "5 | \n", 264 | "6 | \n", 265 | "7 | \n", 266 | "8 | \n", 267 | "9 | \n", 268 | "... | \n", 269 | "89 | \n", 270 | "90 | \n", 271 | "91 | \n", 272 | "92 | \n", 273 | "93 | \n", 274 | "94 | \n", 275 | "95 | \n", 276 | "96 | \n", 277 | "97 | \n", 278 | "98 | \n", 279 | "
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
6 | \n", 284 | "115.131 | \n", 285 | "146.1451 | \n", 286 | "131.1736 | \n", 287 | "119.1197 | \n", 288 | "131.1736 | \n", 289 | "204.2262 | \n", 290 | "146.1451 | \n", 291 | "174.2017 | \n", 292 | "115.131 | \n", 293 | "131.1736 | \n", 294 | "... | \n", 295 | "131.1736 | \n", 296 | "119.1197 | \n", 297 | "146.1451 | \n", 298 | "131.1736 | \n", 299 | "75.0669 | \n", 300 | "121.159 | \n", 301 | "119.1197 | \n", 302 | "131.1736 | \n", 303 | "132.1184 | \n", 304 | "165.19 | \n", 305 | "
7 | \n", 308 | "115.131 | \n", 309 | "146.1451 | \n", 310 | "131.1736 | \n", 311 | "119.1197 | \n", 312 | "131.1736 | \n", 313 | "204.2262 | \n", 314 | "146.1451 | \n", 315 | "174.2017 | \n", 316 | "115.131 | \n", 317 | "131.1736 | \n", 318 | "... | \n", 319 | "131.1736 | \n", 320 | "119.1197 | \n", 321 | "146.1451 | \n", 322 | "131.1736 | \n", 323 | "75.0669 | \n", 324 | "121.159 | \n", 325 | "119.1197 | \n", 326 | "131.1736 | \n", 327 | "132.1184 | \n", 328 | "165.19 | \n", 329 | "
14 | \n", 332 | "115.131 | \n", 333 | "146.1451 | \n", 334 | "131.1736 | \n", 335 | "119.1197 | \n", 336 | "131.1736 | \n", 337 | "204.2262 | \n", 338 | "146.1451 | \n", 339 | "174.2017 | \n", 340 | "115.131 | \n", 341 | "131.1736 | \n", 342 | "... | \n", 343 | "149.2124 | \n", 344 | "119.1197 | \n", 345 | "146.1451 | \n", 346 | "131.1736 | \n", 347 | "75.0669 | \n", 348 | "121.159 | \n", 349 | "119.1197 | \n", 350 | "131.1736 | \n", 351 | "132.1184 | \n", 352 | "165.19 | \n", 353 | "
3 rows × 99 columns
\n", 357 | "