├── CODE_OF_CONDUCT.md ├── CONTRIBUTING.md ├── LICENSE ├── README.md ├── build_env.sh ├── environment.yml ├── example.ipynb └── start_env.sh /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | ## Code of Conduct 2 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct). 3 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact 4 | opensource-codeofconduct@amazon.com with any additional questions or comments. 5 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing Guidelines 2 | 3 | Thank you for your interest in contributing to our project. Whether it's a bug report, new feature, correction, or additional 4 | documentation, we greatly value feedback and contributions from our community. 5 | 6 | Please read through this document before submitting any issues or pull requests to ensure we have all the necessary 7 | information to effectively respond to your bug report or contribution. 8 | 9 | 10 | ## Reporting Bugs/Feature Requests 11 | 12 | We welcome you to use the GitHub issue tracker to report bugs or suggest features. 13 | 14 | When filing an issue, please check existing open, or recently closed, issues to make sure somebody else hasn't already 15 | reported the issue. Please try to include as much information as you can. Details like these are incredibly useful: 16 | 17 | * A reproducible test case or series of steps 18 | * The version of our code being used 19 | * Any modifications you've made relevant to the bug 20 | * Anything unusual about your environment or deployment 21 | 22 | 23 | ## Contributing via Pull Requests 24 | Contributions via pull requests are much appreciated. Before sending us a pull request, please ensure that: 25 | 26 | 1. 
You are working against the latest source on the *main* branch. 27 | 2. You check existing open, and recently merged, pull requests to make sure someone else hasn't addressed the problem already. 28 | 3. You open an issue to discuss any significant work - we would hate for your time to be wasted. 29 | 30 | To send us a pull request, please: 31 | 32 | 1. Fork the repository. 33 | 2. Modify the source; please focus on the specific change you are contributing. If you also reformat all the code, it will be hard for us to focus on your change. 34 | 3. Ensure local tests pass. 35 | 4. Commit to your fork using clear commit messages. 36 | 5. Send us a pull request, answering any default questions in the pull request interface. 37 | 6. Pay attention to any automated CI failures reported in the pull request, and stay involved in the conversation. 38 | 39 | GitHub provides additional documentation on [forking a repository](https://help.github.com/articles/fork-a-repo/) and 40 | [creating a pull request](https://help.github.com/articles/creating-a-pull-request/). 41 | 42 | 43 | ## Finding contributions to work on 44 | Looking at the existing issues is a great way to find something to contribute to. As our projects, by default, use the default GitHub issue labels (enhancement/bug/duplicate/help wanted/invalid/question/wontfix), looking at any 'help wanted' issues is a great place to start. 45 | 46 | 47 | ## Code of Conduct 48 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct). 49 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact 50 | opensource-codeofconduct@amazon.com with any additional questions or comments. 51 | 52 | 53 | ## Security issue notifications 54 | If you discover a potential security issue in this project we ask that you notify AWS/Amazon Security via our [vulnerability reporting page](http://aws.amazon.com/security/vulnerability-reporting/). 
Please do **not** create a public GitHub issue. 55 | 56 | 57 | ## Licensing 58 | 59 | See the [LICENSE](LICENSE) file for our project's licensing. We will ask you to confirm the licensing of your contribution. 60 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. 2 | 3 | Permission is hereby granted, free of charge, to any person obtaining a copy of 4 | this software and associated documentation files (the "Software"), to deal in 5 | the Software without restriction, including without limitation the rights to 6 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of 7 | the Software, and to permit persons to whom the Software is furnished to do so. 8 | 9 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 10 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS 11 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR 12 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER 13 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 14 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 15 | 16 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # aws-sagemaker-custom-jupyter-kernel 2 | 3 | ## Introduction 4 | There are several built-in Jupyter kernels ready to use when working on an AWS SageMaker Jupyter Notebook Instance. 5 | They are easy to use but generally not convenient to customize: 6 | 7 | - If you want to install new packages into the built-in Jupyter kernels, there is no guarantee that the new packages are compatible with the existing ones. 
8 | - Even if you can install new packages into the built-in Jupyter kernels, you will lose the modified/custom kernels when the Notebook Instance restarts. 9 | 10 | This repo provides a couple of easy-to-use template scripts to help you set up a custom Jupyter kernel on an AWS SageMaker Jupyter Notebook Instance. 11 | 12 | 13 | ## Build/Start Custom Jupyter Kernel 14 | 15 | ### When you first create the Jupyter Notebook Instance 16 | 1. Edit `environment.yml` to specify your custom environment if you want to build a different Python kernel from the example in this repo. 17 | 2. Run the command below to build the custom Python kernel named `conda_my-custom-jupyter-kernel`. 18 | 19 | ./build_env.sh 20 | 21 | ### (Optional) Every time you restart the Jupyter Notebook Instance, run the command below to add the custom Python kernel again. 22 | 23 | ./start_env.sh 24 | 25 | ### To use the custom kernel, create a new Jupyter notebook and select `conda_my-custom-jupyter-kernel` as the Python kernel. 
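The kernel name used in the steps above comes from the first line of `environment.yml`. The sketch below illustrates that derivation in isolation. It is not part of the repo's scripts: the scratch directory and the `KERNEL_NAME` variable are hypothetical, and the `conda_` prefix is assumed from the kernelspec recorded in `example.ipynb`. `build_env.sh` uses the bash substitution `${line/name:\ /}`; the POSIX prefix removal shown here gives the same result for this input.

```shell
#!/bin/sh
# Sketch only: shows how the env name is read from environment.yml.
set -e

# Hypothetical scratch directory, so the real environment.yml is not touched.
tmp_dir=$(mktemp -d)
printf 'name: my-custom-jupyter-kernel\nchannels:\n  - conda-forge\n' > "$tmp_dir/environment.yml"

# Read the first line, then strip the leading "name: " prefix.
line=$(head -n 1 "$tmp_dir/environment.yml")
ENV_NAME="${line#name: }"

# Assumption: the notebook lists the environment with a "conda_" prefix,
# as in the kernelspec stored in example.ipynb.
KERNEL_NAME="conda_${ENV_NAME}"
echo "$KERNEL_NAME"

# Remove the scratch directory again.
rm -rf "$tmp_dir"
```

Because only the first line is read, the `name:` entry has to stay at the top of `environment.yml` for the scripts to pick up the right environment.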
26 | 27 | -------------------------------------------------------------------------------- /build_env.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash -ex 2 | 3 | WORKING_DIR=./.myenv 4 | # Read the env name from the first line of environment.yml 5 | line=$(head -n 1 environment.yml) 6 | ENV_NAME="${line/name:\ /}" 7 | 8 | mkdir -p "${WORKING_DIR}" 9 | PWD=$(pwd) 10 | 11 | # fix an issue for displaying plotly 12 | # jupyter labextension install jupyterlab-plotly 13 | 14 | # Download the Miniconda installer to get a separate python and pip 15 | wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O "$WORKING_DIR/miniconda.sh" 16 | 17 | # Install Miniconda into the working directory 18 | bash "$WORKING_DIR/miniconda.sh" -b -u -p "$WORKING_DIR/miniconda" 19 | 20 | # Activate the Miniconda base environment so that conda is available 21 | source "$WORKING_DIR/miniconda/bin/activate" 22 | 23 | # Set cuda variable if GPU is needed for some packages 24 | # export CUDA_VISIBLE_DEVICES=0 25 | # Create the environment with the pinned dependencies from environment.yml 26 | conda env create -f environment.yml 27 | 28 | conda activate "$ENV_NAME" 29 | 30 | # Install ipykernel so the environment can be used as a Jupyter kernel 31 | pip install ipykernel 32 | 33 | # Cleanup 34 | conda deactivate 35 | source "${WORKING_DIR}/miniconda/bin/deactivate" 36 | rm -rf "${WORKING_DIR}/miniconda.sh" 37 | 38 | # Add the new envs directory to conda's envs_dirs 39 | conda config --add envs_dirs "$PWD/$WORKING_DIR/miniconda/envs" 40 | 41 | # Activate the kernel by listing the envs 42 | conda env list 43 | 44 | # Optional 45 | #sudo initctl restart jupyter-server --no-wait 46 | -------------------------------------------------------------------------------- /environment.yml: -------------------------------------------------------------------------------- 1 | name: my-custom-jupyter-kernel 2 | channels: 3 | - conda-forge 4 | dependencies: 5 | - python=3.6 # or 2.7 6 | - vowpalwabbit=8.8.1 7 | - pip: 8 | - pandas==1.1.5 9 | -------------------------------------------------------------------------------- /example.ipynb: 
-------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "from vowpalwabbit import pyvw" 10 | ] 11 | }, 12 | { 13 | "cell_type": "code", 14 | "execution_count": 2, 15 | "metadata": {}, 16 | "outputs": [], 17 | "source": [ 18 | "import pandas as pd" 19 | ] 20 | }, 21 | { 22 | "cell_type": "code", 23 | "execution_count": 3, 24 | "metadata": {}, 25 | "outputs": [ 26 | { 27 | "ename": "ModuleNotFoundError", 28 | "evalue": "No module named 'sklearn'", 29 | "output_type": "error", 30 | "traceback": [ 31 | "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", 32 | "\u001b[0;31mModuleNotFoundError\u001b[0m Traceback (most recent call last)", 33 | "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0;32mimport\u001b[0m \u001b[0msklearn\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", 34 | "\u001b[0;31mModuleNotFoundError\u001b[0m: No module named 'sklearn'" 35 | ] 36 | } 37 | ], 38 | "source": [ 39 | "import sklearn" 40 | ] 41 | }, 42 | { 43 | "cell_type": "code", 44 | "execution_count": 4, 45 | "metadata": {}, 46 | "outputs": [ 47 | { 48 | "name": "stdout", 49 | "output_type": "stream", 50 | "text": [ 51 | "Collecting scikit-learn==0.23.2\n", 52 | " Using cached scikit_learn-0.23.2-cp36-cp36m-manylinux1_x86_64.whl (6.8 MB)\n", 53 | "Requirement already satisfied: numpy>=1.13.3 in ./.myenv/miniconda/envs/my-custom-jupyter-kernel/lib/python3.6/site-packages (from scikit-learn==0.23.2) (1.19.4)\n", 54 | "Collecting joblib>=0.11\n", 55 | " Downloading joblib-1.0.0-py3-none-any.whl (302 kB)\n", 56 | "\u001b[K |████████████████████████████████| 302 kB 24.2 MB/s eta 0:00:01\n", 57 | "\u001b[?25hCollecting scipy>=0.19.1\n", 58 | " Using cached scipy-1.5.4-cp36-cp36m-manylinux1_x86_64.whl (25.9 MB)\n", 59 | 
"Requirement already satisfied: numpy>=1.13.3 in ./.myenv/miniconda/envs/my-custom-jupyter-kernel/lib/python3.6/site-packages (from scikit-learn==0.23.2) (1.19.4)\n", 60 | "Collecting threadpoolctl>=2.0.0\n", 61 | " Using cached threadpoolctl-2.1.0-py3-none-any.whl (12 kB)\n", 62 | "Installing collected packages: threadpoolctl, scipy, joblib, scikit-learn\n", 63 | "Successfully installed joblib-1.0.0 scikit-learn-0.23.2 scipy-1.5.4 threadpoolctl-2.1.0\n" 64 | ] 65 | } 66 | ], 67 | "source": [ 68 | "!pip install scikit-learn==0.23.2" 69 | ] 70 | }, 71 | { 72 | "cell_type": "code", 73 | "execution_count": 5, 74 | "metadata": {}, 75 | "outputs": [], 76 | "source": [ 77 | "import sklearn" 78 | ] 79 | } 80 | ], 81 | "metadata": { 82 | "kernelspec": { 83 | "display_name": "conda_my-custom-jupyter-kernel", 84 | "language": "python", 85 | "name": "conda_my-custom-jupyter-kernel" 86 | }, 87 | "language_info": { 88 | "codemirror_mode": { 89 | "name": "ipython", 90 | "version": 3 91 | }, 92 | "file_extension": ".py", 93 | "mimetype": "text/x-python", 94 | "name": "python", 95 | "nbconvert_exporter": "python", 96 | "pygments_lexer": "ipython3", 97 | "version": "3.6.12" 98 | } 99 | }, 100 | "nbformat": 4, 101 | "nbformat_minor": 4 102 | } 103 | -------------------------------------------------------------------------------- /start_env.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash -ex 2 | 3 | WORKING_DIR=./.myenv 4 | # get the env name 5 | line=$(head -n 1 environment.yml) 6 | ENV_NAME="${line/name:\ /}" 7 | PWD=$(pwd) 8 | 9 | # fix an issue for displaying plotly 10 | # jupyter labextension install jupyterlab-plotly 11 | 12 | source "$WORKING_DIR/miniconda/bin/activate" 13 | conda activate $ENV_NAME 14 | 15 | # Cleanup 16 | conda deactivate 17 | source "${WORKING_DIR}/miniconda/bin/deactivate" 18 | 19 | # Add the following env dir to envs_dirs 20 | conda config --add envs_dirs "$PWD/$WORKING_DIR/miniconda/envs" 21 | 22 | # 
Activate the kernel by listing the envs 23 | conda env list 24 | 25 | # Optional 26 | #sudo initctl restart jupyter-server --no-wait 27 | --------------------------------------------------------------------------------