├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── LICENSE
├── README.md
├── build_env.sh
├── environment.yml
├── example.ipynb
└── start_env.sh


/CODE_OF_CONDUCT.md:
--------------------------------------------------------------------------------
1 | ## Code of Conduct
2 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct).
3 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact
4 | opensource-codeofconduct@amazon.com with any additional questions or comments.
5 | 


--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
 1 | # Contributing Guidelines
 2 | 
 3 | Thank you for your interest in contributing to our project. Whether it's a bug report, new feature, correction, or additional
 4 | documentation, we greatly value feedback and contributions from our community.
 5 | 
 6 | Please read through this document before submitting any issues or pull requests to ensure we have all the necessary
 7 | information to effectively respond to your bug report or contribution.
 8 | 
 9 | 
10 | ## Reporting Bugs/Feature Requests
11 | 
12 | We welcome you to use the GitHub issue tracker to report bugs or suggest features.
13 | 
14 | When filing an issue, please check existing open, or recently closed, issues to make sure somebody else hasn't already
15 | reported the issue. Please try to include as much information as you can. Details like these are incredibly useful:
16 | 
17 | * A reproducible test case or series of steps
18 | * The version of our code being used
19 | * Any modifications you've made relevant to the bug
20 | * Anything unusual about your environment or deployment
21 | 
22 | 
23 | ## Contributing via Pull Requests
24 | Contributions via pull requests are much appreciated. Before sending us a pull request, please ensure that:
25 | 
26 | 1. You are working against the latest source on the *main* branch.
27 | 2. You check existing open, and recently merged, pull requests to make sure someone else hasn't addressed the problem already.
28 | 3. You open an issue to discuss any significant work - we would hate for your time to be wasted.
29 | 
30 | To send us a pull request, please:
31 | 
32 | 1. Fork the repository.
33 | 2. Modify the source; please focus on the specific change you are contributing. If you also reformat all the code, it will be hard for us to focus on your change.
34 | 3. Ensure local tests pass.
35 | 4. Commit to your fork using clear commit messages.
36 | 5. Send us a pull request, answering any default questions in the pull request interface.
37 | 6. Pay attention to any automated CI failures reported in the pull request, and stay involved in the conversation.
38 | 
39 | GitHub provides additional documentation on [forking a repository](https://help.github.com/articles/fork-a-repo/) and
40 | [creating a pull request](https://help.github.com/articles/creating-a-pull-request/).
41 | 
42 | 
43 | ## Finding contributions to work on
44 | Looking at the existing issues is a great way to find something to contribute to. As our projects, by default, use the default GitHub issue labels (enhancement/bug/duplicate/help wanted/invalid/question/wontfix), looking at any 'help wanted' issues is a great place to start.
45 | 
46 | 
47 | ## Code of Conduct
48 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct).
49 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact
50 | opensource-codeofconduct@amazon.com with any additional questions or comments.
51 | 
52 | 
53 | ## Security issue notifications
54 | If you discover a potential security issue in this project we ask that you notify AWS/Amazon Security via our [vulnerability reporting page](http://aws.amazon.com/security/vulnerability-reporting/). Please do **not** create a public GitHub issue.
55 | 
56 | 
57 | ## Licensing
58 | 
59 | See the [LICENSE](LICENSE) file for our project's licensing. We will ask you to confirm the licensing of your contribution.
60 | 


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
 2 | 
 3 | Permission is hereby granted, free of charge, to any person obtaining a copy of
 4 | this software and associated documentation files (the "Software"), to deal in
 5 | the Software without restriction, including without limitation the rights to
 6 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
 7 | the Software, and to permit persons to whom the Software is furnished to do so.
 8 | 
 9 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
10 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
11 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
12 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
13 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
14 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
15 | 
16 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # aws-sagemaker-custom-jupyter-kernel
 2 | 
 3 | ## Introduction
 4 | There are several built-in Jupyter kernels ready to use on an AWS SageMaker Jupyter Notebook Instance.
 5 | They are easy to use but generally not convenient to customize:
 6 | 
 7 | - If you install new packages into a built-in Jupyter kernel, there is no guarantee that they are compatible with the packages already installed.
 8 | - Even if the installation succeeds, the modified kernel is lost when the Notebook Instance restarts.
 9 | 
10 | This repo provides a couple of easy-to-use template scripts to help you set up a custom Jupyter kernel on an AWS SageMaker Jupyter Notebook Instance.
11 | 
12 | 
13 | ## Build/Start Custom Jupyter Kernel
14 | 
15 | ### When you first create the Jupyter Notebook instance
16 | 1. Edit `environment.yml` to specify your custom environment if you want to build a Python kernel different from the example in this repo.
17 | 2. Run the command below to build the custom Python kernel named `conda_my-custom-jupyter-kernel`.
18 |         
19 |     ./build_env.sh
20 |     
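Both scripts take the environment name from the first line of `environment.yml`, so the `name:` entry needs to stay on line one when you edit the file. The sketch below mirrors that lookup in Python. It is only an illustration: `read_env_name` is a hypothetical helper that is not part of this repo, it is stricter than the shell substitution in the scripts (it rejects a first line that does not start with `name: `), and the `conda_` prefix is taken from the kernel name recorded in `example.ipynb`.

```python
import tempfile
from pathlib import Path


def read_env_name(path="environment.yml"):
    """Return the env name from the first line, as build_env.sh expects it."""
    with open(str(path)) as fh:
        first_line = fh.readline().rstrip("\n")
    prefix = "name: "
    if not first_line.startswith(prefix):
        raise ValueError("expected 'name: <env>' on the first line of %s" % path)
    return first_line[len(prefix):].strip()


# Demo on a throwaway file that copies the header of this repo's environment.yml.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "environment.yml"
    demo.write_text("name: my-custom-jupyter-kernel\nchannels:\n  - conda-forge\n")
    env_name = read_env_name(demo)

print(env_name)             # environment name passed to `conda activate`
print("conda_" + env_name)  # kernel name as it appears in example.ipynb
```
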
21 | ### (Optional) Each time you restart the Jupyter Notebook instance, run the command below to add the custom Python kernel again.
22 | 
23 |     ./start_env.sh
24 | 
25 | ### To use the custom kernel, create a new Jupyter notebook and select `conda_my-custom-jupyter-kernel` as the Python kernel.
26 | 
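Once the notebook is open, one way to confirm that it is really running on the custom environment is to inspect the interpreter path from a cell. This is a minimal sketch, assuming the default environment name from `environment.yml`. `running_in_env` is a hypothetical helper, not something this repo provides.

```python
import sys
from pathlib import Path


def running_in_env(env_name, prefix=None):
    """True if the interpreter prefix is a conda env directory named env_name."""
    env_dir = Path(prefix if prefix is not None else sys.prefix)
    # A named conda env lives at <conda root>/envs/<env_name>.
    return env_dir.name == env_name and env_dir.parent.name == "envs"


print(sys.executable)
print(running_in_env("my-custom-jupyter-kernel"))
```

If the second line prints `False`, the notebook is most likely attached to one of the built-in kernels instead of the custom one.
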
27 | 


--------------------------------------------------------------------------------
/build_env.sh:
--------------------------------------------------------------------------------
 1 | #!/bin/bash -ex
 2 | 
 3 | WORKING_DIR=./.myenv
 4 | # get the env name
 5 | line=$(head -n 1 environment.yml)
 6 | ENV_NAME="${line/name:\ /}"
 7 | 
 8 | mkdir -p "${WORKING_DIR}"
 9 | PWD=$(pwd)
10 | 
11 | # fix an issue for displaying plotly
12 | # jupyter labextension install jupyterlab-plotly
13 | 
14 | # Install Miniconda to get a separate python and pip
15 | wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O "$WORKING_DIR/miniconda.sh"
16 | 
17 | # Install Miniconda into the working directory
18 | bash "$WORKING_DIR/miniconda.sh" -b -u -p "$WORKING_DIR/miniconda"
19 | 
20 | # Activate Miniconda so the pinned dependencies can be installed
21 | source "$WORKING_DIR/miniconda/bin/activate"
22 | 
23 | # Set cuda variable if GPU is needed for some packages 
24 | # export CUDA_VISIBLE_DEVICES=0
25 | 
26 | conda env create -f environment.yml
27 | 
28 | conda activate "$ENV_NAME"
29 | 
30 | # add this as a kernel
31 | pip install ipykernel
32 | 
33 | # Cleanup
34 | conda deactivate
35 | source "${WORKING_DIR}/miniconda/bin/deactivate"
36 | rm -rf "${WORKING_DIR}/miniconda.sh"
37 | 
38 | # Add the following env dir to envs_dirs
39 | conda config --add envs_dirs "$PWD/$WORKING_DIR/miniconda/envs"
40 | 
41 | # Activate the kernel by listing the envs
42 | conda env list
43 | 
44 | # Optional
45 | #sudo initctl restart jupyter-server --no-wait
46 | 


--------------------------------------------------------------------------------
/environment.yml:
--------------------------------------------------------------------------------
1 | name: my-custom-jupyter-kernel
2 | channels:
3 |   - conda-forge
4 | dependencies:
5 |   - python=3.6   # or 2.7
6 |   - vowpalwabbit=8.8.1
7 |   - pip:
8 |     - pandas==1.1.5
9 | 


--------------------------------------------------------------------------------
/example.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "code",
  5 |    "execution_count": 1,
  6 |    "metadata": {},
  7 |    "outputs": [],
  8 |    "source": [
  9 |     "from vowpalwabbit import pyvw"
 10 |    ]
 11 |   },
 12 |   {
 13 |    "cell_type": "code",
 14 |    "execution_count": 2,
 15 |    "metadata": {},
 16 |    "outputs": [],
 17 |    "source": [
 18 |     "import pandas as pd"
 19 |    ]
 20 |   },
 21 |   {
 22 |    "cell_type": "code",
 23 |    "execution_count": 3,
 24 |    "metadata": {},
 25 |    "outputs": [
 26 |     {
 27 |      "ename": "ModuleNotFoundError",
 28 |      "evalue": "No module named 'sklearn'",
 29 |      "output_type": "error",
 30 |      "traceback": [
 31 |       "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
 32 |       "\u001b[0;31mModuleNotFoundError\u001b[0m                       Traceback (most recent call last)",
 33 |       "\u001b[0;32m<ipython-input-3-b7c74cbf5af0>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0;32mimport\u001b[0m \u001b[0msklearn\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
 34 |       "\u001b[0;31mModuleNotFoundError\u001b[0m: No module named 'sklearn'"
 35 |      ]
 36 |     }
 37 |    ],
 38 |    "source": [
 39 |     "import sklearn"
 40 |    ]
 41 |   },
 42 |   {
 43 |    "cell_type": "code",
 44 |    "execution_count": 4,
 45 |    "metadata": {},
 46 |    "outputs": [
 47 |     {
 48 |      "name": "stdout",
 49 |      "output_type": "stream",
 50 |      "text": [
 51 |       "Collecting scikit-learn==0.23.2\n",
 52 |       "  Using cached scikit_learn-0.23.2-cp36-cp36m-manylinux1_x86_64.whl (6.8 MB)\n",
 53 |       "Requirement already satisfied: numpy>=1.13.3 in ./.myenv/miniconda/envs/my-custom-jupyter-kernel/lib/python3.6/site-packages (from scikit-learn==0.23.2) (1.19.4)\n",
 54 |       "Collecting joblib>=0.11\n",
 55 |       "  Downloading joblib-1.0.0-py3-none-any.whl (302 kB)\n",
 56 |       "\u001b[K     |████████████████████████████████| 302 kB 24.2 MB/s eta 0:00:01\n",
 57 |       "\u001b[?25hCollecting scipy>=0.19.1\n",
 58 |       "  Using cached scipy-1.5.4-cp36-cp36m-manylinux1_x86_64.whl (25.9 MB)\n",
 59 |       "Requirement already satisfied: numpy>=1.13.3 in ./.myenv/miniconda/envs/my-custom-jupyter-kernel/lib/python3.6/site-packages (from scikit-learn==0.23.2) (1.19.4)\n",
 60 |       "Collecting threadpoolctl>=2.0.0\n",
 61 |       "  Using cached threadpoolctl-2.1.0-py3-none-any.whl (12 kB)\n",
 62 |       "Installing collected packages: threadpoolctl, scipy, joblib, scikit-learn\n",
 63 |       "Successfully installed joblib-1.0.0 scikit-learn-0.23.2 scipy-1.5.4 threadpoolctl-2.1.0\n"
 64 |      ]
 65 |     }
 66 |    ],
 67 |    "source": [
 68 |     "!pip install scikit-learn==0.23.2"
 69 |    ]
 70 |   },
 71 |   {
 72 |    "cell_type": "code",
 73 |    "execution_count": 5,
 74 |    "metadata": {},
 75 |    "outputs": [],
 76 |    "source": [
 77 |     "import sklearn"
 78 |    ]
 79 |   }
 80 |  ],
 81 |  "metadata": {
 82 |   "kernelspec": {
 83 |    "display_name": "conda_my-custom-jupyter-kernel",
 84 |    "language": "python",
 85 |    "name": "conda_my-custom-jupyter-kernel"
 86 |   },
 87 |   "language_info": {
 88 |    "codemirror_mode": {
 89 |     "name": "ipython",
 90 |     "version": 3
 91 |    },
 92 |    "file_extension": ".py",
 93 |    "mimetype": "text/x-python",
 94 |    "name": "python",
 95 |    "nbconvert_exporter": "python",
 96 |    "pygments_lexer": "ipython3",
 97 |    "version": "3.6.12"
 98 |   }
 99 |  },
100 |  "nbformat": 4,
101 |  "nbformat_minor": 4
102 | }
103 | 


--------------------------------------------------------------------------------
/start_env.sh:
--------------------------------------------------------------------------------
 1 | #!/bin/bash -ex
 2 | 
 3 | WORKING_DIR=./.myenv
 4 | # get the env name
 5 | line=$(head -n 1 environment.yml)
 6 | ENV_NAME="${line/name:\ /}"
 7 | PWD=$(pwd)
 8 | 
 9 | # fix an issue for displaying plotly
10 | # jupyter labextension install jupyterlab-plotly
11 | 
12 | source "$WORKING_DIR/miniconda/bin/activate"
13 | conda activate "$ENV_NAME"
14 | 
15 | # Cleanup
16 | conda deactivate
17 | source "${WORKING_DIR}/miniconda/bin/deactivate"
18 | 
19 | # Add the following env dir to envs_dirs
20 | conda config --add envs_dirs "$PWD/$WORKING_DIR/miniconda/envs"
21 | 
22 | # Activate the kernel by listing the envs
23 | conda env list
24 | 
25 | # Optional
26 | #sudo initctl restart jupyter-server --no-wait
27 | 


--------------------------------------------------------------------------------