├── .gitpod.yml
├── settings
    └── proxychains.conf
├── .github
    └── workflows
    │   └── create_pull_request.yml
├── Dockerfile
├── LICENSE
└── readme.md


/.gitpod.yml:
--------------------------------------------------------------------------------
image: dclong/gitpod
tasks:
  - command: . /scripts/gitpod.sh

--------------------------------------------------------------------------------
/settings/proxychains.conf:
--------------------------------------------------------------------------------
strict_chain
quiet_mode
tcp_read_time_out 15000
tcp_connect_time_out 8000
localnet 127.0.0.1/255.0.0.0

[ProxyList]
socks5 127.0.0.1 1080

--------------------------------------------------------------------------------
/.github/workflows/create_pull_request.yml:
--------------------------------------------------------------------------------
name: Create Pull Request
on:
  push:
    branches:
      - dev
jobs:
  create_pull_request:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
        with:
          ref: main
      - name: Reset main branch with dev changes
        run: |
          git fetch origin dev:dev
          git reset --hard dev
      - name: Create pull request from dev to main
        uses: peter-evans/create-pull-request@v3
        with:
          token: ${{ secrets.GITHUBACTIONS }}
          title: Merge dev into main
          branch: dev
          author: dclong
          assignees: dclong


--------------------------------------------------------------------------------
/Dockerfile:
--------------------------------------------------------------------------------
# NAME: dclong/jupyterhub-ds
FROM dclong/jupyterhub-more
# GIT: https://github.com/legendu-net/docker-jupyterhub-more.git

RUN apt-get update -y \
    && apt-get install -y --no-install-recommends \
        cron wamerican \
        proxychains wget git-lfs \
        highlight \
    && /scripts/sys/purge_cache.sh

#RUN pip3 install --upgrade --ignore-installed entrypoints
RUN pip3 install --break-system-packages \
        loguru pysnooper \
        numpy scipy polars pandas 'pyarrow>=0.14.0' \
        scikit-learn lightgbm graphviz \
        matplotlib bokeh holoviews[recommended] hvplot pyviz_comms \
        tabulate \
        'JPype1>=0.7.0' sqlparse \
        requests[socks] lxml notifiers \
        aiutil[jupyter] \
    && /scripts/sys/purge_cache.sh

#COPY scripts/ /scripts/
# proxychains configuration
COPY settings/proxychains.conf /etc/proxychains.conf

# expose an extra port just in case you need to start another service inside the docker
EXPOSE 5006

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) [2019-present] [Chuanlong Du]

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
--------------------------------------------------------------------------------
/readme.md:
--------------------------------------------------------------------------------
# dclong/jupyterhub-ds [@DockerHub](https://hub.docker.com/r/dclong/jupyterhub-ds/) | [@GitHub](https://github.com/dclong/docker-jupyterhub-ds)

JupyterHub for Data Science.
**This is the recommended Docker image to use
if you want to do data-science-related work in JupyterLab/Jupyter Notebook.
For deep learning on GPUs,
please use [dclong/jupyterhub-pytorch](https://hub.docker.com/r/dclong/jupyterhub-pytorch/).**
Note: Python packages in this version are managed using pip instead of conda.

## [Recommended Docker Images](http://www.legendu.net/en/blog/my-docker-images/#recommended-docker-images)

## Prerequisite
You need to [install Docker](http://www.legendu.net/en/blog/docker-installation/) before you use this Docker image.

## Usage in Linux/Unix

Please refer to the section
[Usage](http://www.legendu.net/en/blog/my-docker-images/#usage)
of the post [My Docker Images](http://www.legendu.net/en/blog/my-docker-images/)
for detailed instructions on how to use the Docker image.

The following command starts a container
and mounts the current working directory and the parent directory of your home directory
(typically `/home`) on the host machine
to `/workdir` and `/home_host` in the container respectively.
```
docker run -d --init \
    --privileged \
    --cap-add SYS_ADMIN \
    --platform linux/amd64 \
    --hostname jupyterhub-ds \
    --log-opt max-size=50m \
    -p 8000:8000 \
    -p 5006:5006 \
    -e DOCKER_USER=$(id -un) \
    -e DOCKER_USER_ID=$(id -u) \
    -e DOCKER_PASSWORD=$(id -un) \
    -e DOCKER_GROUP_ID=$(id -g) \
    -e DOCKER_ADMIN_USER=$(id -un) \
    -v "$(pwd)":/workdir \
    -v "$(dirname $HOME)":/home_host \
    dclong/jupyterhub-ds /scripts/sys/init.sh
```
To run the testing version of dclong/jupyterhub-ds, use the image with the `next` tag.
```
docker run -d --init \
    --privileged \
    --cap-add SYS_ADMIN \
    --platform linux/amd64 \
    --hostname jupyterhub-ds \
    --log-opt max-size=50m \
    -p 8000:8000 \
    -p 5006:5006 \
    -e DOCKER_USER=$(id -un) \
    -e DOCKER_USER_ID=$(id -u) \
    -e DOCKER_PASSWORD=$(id -un) \
    -e DOCKER_GROUP_ID=$(id -g) \
    -e DOCKER_ADMIN_USER=$(id -un) \
    -v "$(pwd)":/workdir \
    -v "$(dirname $HOME)":/home_host \
    dclong/jupyterhub-ds:next /scripts/sys/init.sh
```
The following command (*only works on Linux*) does the same as the first one,
except that it limits memory to 80% of the host's total
and CPUs to one fewer than the host has.
```
docker run -d --init \
    --privileged \
    --cap-add SYS_ADMIN \
    --platform linux/amd64 \
    --hostname jupyterhub-ds \
    --log-opt max-size=50m \
    --memory=$(($(head -n 1 /proc/meminfo | awk '{print $2}') * 4 / 5))k \
    --cpus=$(($(nproc) - 1)) \
    -p 8000:8000 \
    -p 5006:5006 \
    -e DOCKER_USER=$(id -un) \
    -e DOCKER_USER_ID=$(id -u) \
    -e DOCKER_PASSWORD=$(id -un) \
    -e DOCKER_GROUP_ID=$(id -g) \
    -e DOCKER_ADMIN_USER=$(id -un) \
    -v "$(pwd)":/workdir \
    -v "$(dirname $HOME)":/home_host \
    dclong/jupyterhub-ds /scripts/sys/init.sh
```
To run the testing version of dclong/jupyterhub-ds with the same limits, use the image with the `next` tag.
```
docker run -d --init \
    --privileged \
    --cap-add SYS_ADMIN \
    --platform linux/amd64 \
    --hostname jupyterhub-ds \
    --log-opt max-size=50m \
    --memory=$(($(head -n 1 /proc/meminfo | awk '{print $2}') * 4 / 5))k \
    --cpus=$(($(nproc) - 1)) \
    -p 8000:8000 \
    -p 5006:5006 \
    -e DOCKER_USER=$(id -un) \
    -e DOCKER_USER_ID=$(id -u) \
    -e DOCKER_PASSWORD=$(id -un) \
    -e DOCKER_GROUP_ID=$(id -g) \
    -e DOCKER_ADMIN_USER=$(id -un) \
    -v "$(pwd)":/workdir \
    -v "$(dirname $HOME)":/home_host \
    dclong/jupyterhub-ds:next /scripts/sys/init.sh
```
### Launch JupyterLab Instead of JupyterHub

You can also launch a JupyterLab service instead of JupyterHub with this Docker image
by passing `/scripts/sys/launch_jlab.sh` to the init script.
The command below publishes port 8888 instead of ports 8000 and 5006.
```
docker run -d --init \
    --privileged \
    --cap-add SYS_ADMIN \
    --platform linux/amd64 \
    --hostname jupyterlab \
    --log-opt max-size=50m \
    --memory=$(($(head -n 1 /proc/meminfo | awk '{print $2}') * 4 / 5))k \
    --cpus=$(($(nproc) - 1)) \
    -p 8888:8888 \
    -e DOCKER_USER=$(id -un) \
    -e DOCKER_USER_ID=$(id -u) \
    -e DOCKER_PASSWORD=$(id -un) \
    -e DOCKER_GROUP_ID=$(id -g) \
    -e DOCKER_ADMIN_USER=$(id -un) \
    -v "$(pwd)":/workdir \
    -v "$(dirname $HOME)":/home_host \
    dclong/jupyterhub-ds /scripts/sys/init.sh /scripts/sys/launch_jlab.sh
```

## [Use the JupyterHub Server](http://www.legendu.net/en/blog/my-docker-images/#use-the-jupyterhub-server)

## [Add a New User to the JupyterHub Server](http://www.legendu.net/en/blog/my-docker-images/#add-a-new-user-to-the-jupyterhub-server)

## [Use Spark in JupyterLab Notebook](http://www.legendu.net/en/blog/my-docker-images/#use-spark-in-jupyterlab-notebook)

## [Log Information](http://www.legendu.net/en/blog/my-docker-images/#docker-container-logs)

## [Detailed Information](http://www.legendu.net/en/blog/my-docker-images/#list-of-images-and-detailed-information)

## [Known Issues](http://www.legendu.net/en/blog/my-docker-images/#known-issues)

## [About the Author](http://www.legendu.net/pages/about)

--------------------------------------------------------------------------------
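The Linux-only `docker run` commands in the readme compute `--memory` and `--cpus` with inline shell arithmetic. The sketch below spells out the same calculation step by step. The helper names `mem_limit_kb` and `cpu_limit` are hypothetical and not part of the image. The floor of one CPU is a choice made for this sketch, because `$(($(nproc) - 1))` evaluates to 0 on a single-core host.

```shell
#!/usr/bin/env bash
# Sketch only: mem_limit_kb and cpu_limit are hypothetical helpers,
# not scripts shipped with dclong/jupyterhub-ds.

# 80% of the given total memory (in kB), formatted for `--memory` (suffix k).
mem_limit_kb() {
    local total_kb=$1
    echo "$(( total_kb * 4 / 5 ))k"
}

# One fewer than the given CPU count, but never less than 1
# (the clamp is an assumption of this sketch, not in the readme).
cpu_limit() {
    local n=$1
    if [ "$n" -gt 1 ]; then
        echo $(( n - 1 ))
    else
        echo 1
    fi
}

# On Linux, read MemTotal (in kB) from /proc/meminfo and the CPU count from nproc,
# then print the two flags as they would be passed to `docker run`.
if [ -r /proc/meminfo ] && command -v nproc >/dev/null 2>&1; then
    total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    echo "--memory=$(mem_limit_kb "$total_kb") --cpus=$(cpu_limit "$(nproc)")"
fi
```

Running the script on a Linux host prints the two flags, which can be compared against what the inline expressions in the readme commands produce.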