├── .gitignore
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── DEVELOPMENT.md
├── LICENSE
├── README.md
├── code-editor
│   └── auto-stop-idle
│       ├── README.md
│       ├── on-start.sh
│       └── python-package
│           ├── .gitignore
│           ├── setup.py
│           └── src
│               └── sagemaker_code_editor_auto_shut_down
│                   ├── __init__.py
│                   ├── auto_stop_idle.py
│                   └── version.py
├── common-scripts
│   └── ebs-s3-backup-restore
│       ├── README.md
│       └── on-start.sh
└── jupyterlab
    └── auto-stop-idle
        ├── README.md
        ├── on-start.sh
        └── python-package
            ├── .gitignore
            ├── setup.py
            └── src
                └── sagemaker_studio_jlab_auto_stop_idle
                    ├── __init__.py
                    ├── auto_stop_idle.py
                    └── version.py
/.gitignore:
--------------------------------------------------------------------------------
1 | # OS and IDE files:
2 | .DS_Store
3 | .idea
4 | .ipynb_checkpoints/
5 | .vscode/
6 |
7 | # Python build artifacts:
8 | __pycache__/
9 |
10 | # gzip tarballs
11 | *.tar.gz
--------------------------------------------------------------------------------
/CODE_OF_CONDUCT.md:
--------------------------------------------------------------------------------
1 | ## Code of Conduct
2 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct).
3 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact
4 | opensource-codeofconduct@amazon.com with any additional questions or comments.
5 |
--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
1 | # Contributing Guidelines
2 |
3 | Thank you for your interest in contributing to our project. Whether it's a bug report, new feature, correction, or additional
4 | documentation, we greatly value feedback and contributions from our community.
5 |
6 | Please read through this document before submitting any issues or pull requests to ensure we have all the necessary
7 | information to effectively respond to your bug report or contribution.
8 |
9 |
10 | ## Reporting Bugs/Feature Requests
11 |
12 | We welcome you to use the GitHub issue tracker to report bugs or suggest features.
13 |
14 | When filing an issue, please check existing open, or recently closed, issues to make sure somebody else hasn't already
15 | reported the issue. Please try to include as much information as you can. Details like these are incredibly useful:
16 |
17 | * A reproducible test case or series of steps
18 | * The version of our code being used
19 | * Any modifications you've made relevant to the bug
20 | * Anything unusual about your environment or deployment
21 |
22 |
23 | ## Contributing via Pull Requests
24 | Contributions via pull requests are much appreciated. Before sending us a pull request, please ensure that:
25 |
26 | 1. You are working against the latest source on the *main* branch.
27 | 2. You check existing open, and recently merged, pull requests to make sure someone else hasn't addressed the problem already.
28 | 3. You open an issue to discuss any significant work - we would hate for your time to be wasted.
29 |
30 | To send us a pull request, please:
31 |
32 | 1. Fork the repository.
33 | 2. Modify the source; please focus on the specific change you are contributing. If you also reformat all the code, it will be hard for us to focus on your change.
34 | 3. Ensure local tests pass.
35 | 4. Commit to your fork using clear commit messages.
36 | 5. Send us a pull request, answering any default questions in the pull request interface.
37 | 6. Pay attention to any automated CI failures reported in the pull request, and stay involved in the conversation.
38 |
39 | GitHub provides additional document on [forking a repository](https://help.github.com/articles/fork-a-repo/) and
40 | [creating a pull request](https://help.github.com/articles/creating-a-pull-request/).
41 |
42 |
43 | ## Finding contributions to work on
44 | Looking at the existing issues is a great way to find something to contribute on. As our projects, by default, use the default GitHub issue labels (enhancement/bug/duplicate/help wanted/invalid/question/wontfix), looking at any 'help wanted' issues is a great place to start.
45 |
46 |
47 | ## Code of Conduct
48 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct).
49 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact
50 | opensource-codeofconduct@amazon.com with any additional questions or comments.
51 |
52 |
53 | ## Security issue notifications
54 | If you discover a potential security issue in this project we ask that you notify AWS/Amazon Security via our [vulnerability reporting page](http://aws.amazon.com/security/vulnerability-reporting/). Please do **not** create a public github issue.
55 |
56 |
57 | ## Licensing
58 |
59 | See the [LICENSE](LICENSE) file for our project's licensing. We will ask you to confirm the licensing of your contribution.
60 |
--------------------------------------------------------------------------------
/DEVELOPMENT.md:
--------------------------------------------------------------------------------
1 | ## Best practices for developing Lifecycle Configuration scripts for SageMaker Studio applications
2 |
3 | ### SageMaker JupyterLab
4 |
5 | 1. You can test JupyterLab scripts in the JupyterLab **Terminal**. If the scripts run without issues in a terminal, you can safely assume they will run as LCC scripts as well.
6 |
7 | 2. Always add the `set -eux` command to the beginning of your script. It makes the script exit on errors and on unset variables, and prints each executed command line by line, so the commands are visible in the logs as well. This helps you troubleshoot your scripts faster.
8 |
9 | 3. The script will be running as `sagemaker-user`. Use `sudo` to run commands as `root`.
10 |
11 | 4. If you are installing JupyterLab or Jupyter Server extensions, ensure they're compatible with the Studio JupyterLab version.
12 |
13 | 5. Persistent EBS storage is mounted at `/home/sagemaker-user`; leverage it to avoid re-installing libraries or packages at each restart (a minimal example script follows below).
14 |
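15 | Below is a minimal sketch of an LCC script that applies these practices. The package name (`pandas`) and the use of `pip install --user` are illustrative assumptions, not part of this repository:
16 | 
17 | ```
18 | #!/bin/bash
19 | set -eux
20 | 
21 | # The script runs as sagemaker-user; use sudo for privileged commands.
22 | sudo apt-get update -y
23 | 
24 | # Install Python packages under /home/sagemaker-user (persistent EBS storage),
25 | # so they survive application restarts.
26 | pip install --user pandas
27 | ```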
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT No Attribution
2 |
3 | Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy of
6 | this software and associated documentation files (the "Software"), to deal in
7 | the Software without restriction, including without limitation the rights to
8 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
9 | the Software, and to permit persons to whom the Software is furnished to do so.
10 |
11 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
12 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
13 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
14 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
15 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
16 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
17 |
18 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # SageMaker Studio Lifecycle Configuration examples
2 |
3 | ## Overview
4 | A collection of sample scripts for customizing SageMaker Studio applications using lifecycle configurations.
5 |
6 | Lifecycle Configurations (LCCs) provide a mechanism to customize SageMaker Studio applications via shell scripts that are executed at application bootstrap. For further information on how to use lifecycle configurations with SageMaker Studio applications, please refer to the AWS documentation:
7 |
8 | - [Using Lifecycle Configurations with JupyterLab](https://docs.aws.amazon.com/sagemaker/latest/dg/jl-lcc.html)
9 | - [Using Lifecycle Configurations with Code Editor](https://docs.aws.amazon.com/sagemaker/latest/dg/code-editor-use-lifecycle-configurations.html)
10 |
11 | > **Warning**
12 | > The sample scripts in this repository are designed to work with SageMaker Studio JupyterLab and Code Editor applications. If you are using SageMaker Studio Classic, please refer to https://github.com/aws-samples/sagemaker-studio-lifecycle-config-examples
13 |
14 | ## Sample Scripts
15 |
16 | ### [SageMaker JupyterLab](https://docs.aws.amazon.com/sagemaker/latest/dg/studio-updated-jl.html)
17 | - [auto-stop-idle](jupyterlab/auto-stop-idle/) - Automatically shuts down JupyterLab applications that have been idle for a configurable time.
18 |
19 | ### [SageMaker Code Editor](https://docs.aws.amazon.com/sagemaker/latest/dg/code-editor.html)
20 | - [auto-stop-idle](code-editor/auto-stop-idle/) - Automatically shuts down Code Editor applications that have been idle for a configurable time.
21 |
22 | ### Common scripts
23 | These scripts work with both SageMaker JupyterLab and SageMaker Code Editor apps. Note that if you want a script to be available across both apps, you will need to set it as an LCC script for each app.
24 | - [ebs-s3-backup-restore](common-scripts/ebs-s3-backup-restore) - This script backs up the content of a user space's EBS volume (the user's home directory under `/home/sagemaker-user`) to an S3 bucket specified in the script, optionally on a schedule. If the user profile is tagged with a `SM_EBS_RESTORE_TIMESTAMP` tag, the script also restores the backed-up files into the user's home directory, in addition to performing backups.
25 |
26 | ## Developing LCCs for SageMaker Studio applications
27 | For best practices, please check [DEVELOPMENT](DEVELOPMENT.md).
28 |
29 | ## License
30 | This project is licensed under the [MIT-0 License](LICENSE).
31 |
32 | ## Authors
33 | [Giuseppe A. Porcelli](https://www.linkedin.com/in/giuporcelli/) - Principal, ML Specialist Solutions Architect - Amazon SageMaker
34 |
35 | Spencer Ng - Software Development Engineer - Amazon SageMaker
36 |
37 | Durga Sury - Senior ML Specialist Solutions Architect - Amazon SageMaker
--------------------------------------------------------------------------------
/code-editor/auto-stop-idle/README.md:
--------------------------------------------------------------------------------
1 | # SageMaker Code Editor Auto-Stop for Idle Instances
2 |
3 | The `auto_stop_idle.py` Python script, coupled with the `on-start.sh` shell script, is designed to automatically shut down idle SageMaker Code Editor applications after a configurable time of inactivity. This solution is intended to help manage costs by ensuring that resources are not left running when not in use.
4 |
5 | ## Installation for SageMaker Studio User Profiles
6 |
7 | ### Prerequisites
8 |
9 | - AWS CLI configured with appropriate permissions
10 | - Access to the SageMaker Studio domain where the user profiles are located
11 |
12 | ### Installation for all user profiles in a SageMaker Studio domain
13 |
14 | From a terminal appropriately configured with AWS CLI, run the following commands (replace fields as needed):
15 |
16 | ```
17 | ASI_VERSION=0.3.1
18 |
19 | curl -LO https://github.com/aws-samples/sagemaker-studio-apps-lifecycle-config-examples/releases/download/v$ASI_VERSION/code-editor-lccs-$ASI_VERSION.tar.gz
20 | tar -xvzf code-editor-lccs-$ASI_VERSION.tar.gz
21 |
22 | cd auto-stop-idle
23 |
24 | REGION=
25 | DOMAIN_ID=
26 | ACCOUNT_ID=
27 | LCC_NAME=code-editor-auto-stop-idle
28 | LCC_CONTENT=`openssl base64 -A -in on-start.sh`
29 |
30 | aws sagemaker create-studio-lifecycle-config \
31 | --studio-lifecycle-config-name $LCC_NAME \
32 | --studio-lifecycle-config-content $LCC_CONTENT \
33 | --studio-lifecycle-config-app-type CodeEditor \
34 | --query 'StudioLifecycleConfigArn'
35 |
36 | aws sagemaker update-domain \
37 | --region "$REGION" \
38 | --domain-id "$DOMAIN_ID" \
39 | --default-user-settings \
40 | '{
41 | "CodeEditorAppSettings": {
42 | "DefaultResourceSpec": {
43 | "LifecycleConfigArn": "arn:aws:sagemaker:'"$REGION"':'"$ACCOUNT_ID"':studio-lifecycle-config/'"$LCC_NAME"'",
44 | "InstanceType": "ml.t3.medium"
45 | },
46 | "LifecycleConfigArns": [
47 | "arn:aws:sagemaker:'"$REGION"':'"$ACCOUNT_ID"':studio-lifecycle-config/'"$LCC_NAME"'"
48 | ]
49 | }
50 | }'
51 |
52 | ```
53 |
54 | After the domain update succeeds, navigate to Code Editor and select the LCC when starting your Code Editor application. You can verify the installation from a terminal as shown at the end of this document.
55 |
56 | Note: Currently this script does not support installation in internet-free VPC environments.
57 |
58 | ### Definition of idleness
59 |
60 | The current implementation considers the application idle based on the following criterion:
61 | 
62 | 1. No file changes have been made in the Code Editor application for a time period greater than `IDLE_TIME_IN_SECONDS`. File changes include adding, deleting, and/or updating files.
63 |    * Note: The implementation does not currently support terminal activity detection.
64 |
65 | ### Configurations
66 |
67 | The `on-start.sh` script can be customized by modifying:
68 |
69 | * `IDLE_TIME_IN_SECONDS` the time in seconds that the application can remain idle before being shut down. Default: `3600` seconds
70 | * `ASI_VERSION` the version of the Auto Shut Down solution. Please note that Code Editor support starts at `v0.3.0`.
71 |
72 | ### Acknowledgement
73 |
74 | A special acknowledgement to Lavaraja Padala for his foundational work on Lifecycle Configuration (LCC) implementation. We're grateful for his contribution to the community!
75 |
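76 | ### Verifying the installation
77 | 
78 | After the application starts with the LCC attached, you can check that the idle-check job is in place from a Code Editor terminal. This is a quick sanity check based on how `on-start.sh` installs the solution (a root crontab entry running every 2 minutes, with output appended to `/var/log/apps/app_container.log`):
79 | 
80 | ```
81 | # The idle check is scheduled in root's crontab.
82 | sudo crontab -l
83 | 
84 | # The Python script appends its output to the application log (also delivered to CloudWatch Logs).
85 | tail -n 50 /var/log/apps/app_container.log
86 | ```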
--------------------------------------------------------------------------------
/code-editor/auto-stop-idle/on-start.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | # Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
3 | # SPDX-License-Identifier: MIT-0
4 | 
5 | set -eux
6 | ASI_VERSION=0.3.1
7 |
8 | # User variables [update as needed]
9 | IDLE_TIME_IN_SECONDS=3600 # in seconds, change this to desired idleness time before app shuts down
10 |
11 | # System variables [do not change if not needed]
12 | CONDA_HOME=/opt/conda/bin
13 | LOG_FILE=/var/log/apps/app_container.log # Writing to app_container.log delivers logs to CW logs.
14 | SOLUTION_DIR=/var/tmp/auto-stop-idle # Do not use /home/sagemaker-user
15 | PYTHON_PACKAGE=sagemaker_code_editor_auto_shut_down-$ASI_VERSION.tar.gz
16 | PYTHON_SCRIPT_PATH=$SOLUTION_DIR/sagemaker_code_editor_auto_shut_down/auto_stop_idle.py
17 |
18 | # Installing cron
19 | sudo apt-get update -y
20 |
21 | # Issue - https://github.com/aws-samples/sagemaker-studio-apps-lifecycle-config-examples/issues/12
22 | # SM Distribution image 1.6 is not starting cron service by default https://github.com/aws/sagemaker-distribution/issues/354
23 |
24 | # Check if cron needs to be installed. The "|| true" prevents "set -eux" from exiting the script when dpkg-query returns a non-zero code.
25 | status="$(dpkg-query -W --showformat='${db:Status-Status}' "cron" 2>&1)" || true
26 | if [ ! "$status" = installed ]; then
27 | # Fixing invoke-rc.d: policy-rc.d denied execution of restart.
28 | sudo /bin/bash -c "echo '#!/bin/sh
29 | exit 0' > /usr/sbin/policy-rc.d"
30 |
31 | # Installing cron.
32 | echo "Installing cron..."
33 | sudo apt-get install -y cron
34 | else
35 | echo "Package cron is already installed."
36 | # Start/restart the service.
37 | sudo service cron restart
38 | fi
39 |
40 | # Creating solution directory.
41 | sudo mkdir -p $SOLUTION_DIR
42 |
43 | # Downloading autostop idle Python package.
44 | echo "Downloading autostop idle Python package..."
45 | curl -LO --output-dir /var/tmp/ https://github.com/aws-samples/sagemaker-studio-apps-lifecycle-config-examples/releases/download/v$ASI_VERSION/$PYTHON_PACKAGE
46 | sudo $CONDA_HOME/pip install -U -t $SOLUTION_DIR /var/tmp/$PYTHON_PACKAGE
47 |
48 | # Touch file to ensure idleness timer is reset to 0
49 | echo "Touching file to reset idleness timer"
50 | touch /opt/amazon/sagemaker/sagemaker-code-editor-server-data/data/User/History/startup_timestamp
51 |
52 | # Setting container credential URI variable to /etc/environment to make it available to cron
53 | sudo /bin/bash -c "echo 'AWS_CONTAINER_CREDENTIALS_RELATIVE_URI=$AWS_CONTAINER_CREDENTIALS_RELATIVE_URI' >> /etc/environment"
54 |
55 | # Add script to crontab for root.
56 | echo "Adding autostop idle Python script to crontab..."
57 | echo "*/2 * * * * /bin/bash -ic '$CONDA_HOME/python $PYTHON_SCRIPT_PATH --time $IDLE_TIME_IN_SECONDS --region $AWS_DEFAULT_REGION >> $LOG_FILE'" | sudo crontab -
--------------------------------------------------------------------------------
/code-editor/auto-stop-idle/python-package/.gitignore:
--------------------------------------------------------------------------------
1 | # Byte-compiled / optimized / DLL files
2 | __pycache__/
3 | *.py[cod]
4 | *$py.class
5 |
6 | # C extensions
7 | *.so
8 |
9 | # Distribution / packaging
10 | .Python
11 | build/
12 | develop-eggs/
13 | dist/
14 | downloads/
15 | eggs/
16 | .eggs/
17 | lib/
18 | lib64/
19 | parts/
20 | sdist/
21 | var/
22 | wheels/
23 | share/python-wheels/
24 | *.egg-info/
25 | .installed.cfg
26 | *.egg
27 | MANIFEST
--------------------------------------------------------------------------------
/code-editor/auto-stop-idle/python-package/setup.py:
--------------------------------------------------------------------------------
1 | from __future__ import absolute_import
2 |
3 | from glob import glob
4 | import os
5 | from os.path import basename
6 | from os.path import splitext
7 |
8 | from setuptools import find_packages, setup
9 | from distutils.util import convert_path
10 |
11 | main_ns = {}
12 | ver_path = convert_path('src/sagemaker_code_editor_auto_shut_down/version.py')
13 | with open(ver_path) as ver_file:
14 | exec(ver_file.read(), main_ns)
15 |
16 | setup(
17 | name='sagemaker_code_editor_auto_shut_down',
18 | version=main_ns['__version__'],
19 | description='Auto Stop idle Code Editor Apps.',
20 |
21 | packages=find_packages(where='src', exclude=('test',)),
22 | package_dir={'': 'src'},
23 | py_modules=[splitext(basename(path))[0] for path in glob('src/*.py')],
24 |
25 | author='Amazon Web Services',
26 | url='https://github.com/aws-samples/sagemaker-studio-apps-lifecycle-config-examples',
27 | license='MIT-0',
28 |
29 | classifiers=[
30 | "Development Status :: 5 - Production/Stable",
31 | "Intended Audience :: Developers",
32 | "Natural Language :: English",
33 | "License :: OSI Approved :: MIT-0",
34 | "Programming Language :: Python",
35 | 'Programming Language :: Python :: 3.9',
36 | 'Programming Language :: Python :: 3.10'
37 | ],
38 |
39 | install_requires=[],
40 | extras_require={
41 | }
42 | )
--------------------------------------------------------------------------------
/code-editor/auto-stop-idle/python-package/src/sagemaker_code_editor_auto_shut_down/__init__.py:
--------------------------------------------------------------------------------
1 | from __future__ import absolute_import
2 | from sagemaker_code_editor_auto_shut_down.version import __version__
--------------------------------------------------------------------------------
/code-editor/auto-stop-idle/python-package/src/sagemaker_code_editor_auto_shut_down/auto_stop_idle.py:
--------------------------------------------------------------------------------
1 | from datetime import datetime
2 | import os
3 | import time
4 | import boto3
5 | import json
6 | import sys
7 | import argparse
8 |
9 | DATE_FORMAT = "%Y-%m-%dT%H:%M:%S.%fz"
10 |
11 | def log_message(message):
12 | """
13 | Logs a message.
14 | """
15 | print(f"{datetime.now().strftime(DATE_FORMAT)} - {message}")
16 |
17 | def check_user_activity(workspace_dir, idle_threshold):
18 | # Get the timestamp of the most recently modified file or folder
19 | recent_item = max(
20 | (os.path.join(root, file) for root, _, files in os.walk(workspace_dir) for file in files),
21 | key=lambda x: os.lstat(x).st_mtime,
22 | default=None
23 | )
24 |
25 | # Get the current time
26 | current_time = time.time()
27 |
28 | # Calculate the time difference
29 | time_diff = current_time - os.stat(recent_item).st_mtime if recent_item else float('inf')
30 |     log_message(f"[auto-stop-idle] - Time difference between current time and the time files were last changed: {time_diff} seconds.")
31 |
32 | # Check if the user is idle based on the idle time threshold
33 | if time_diff > idle_threshold:
34 | return "idle"
35 | else:
36 | return "active"
37 |
38 | # Create an argument parser
39 | parser = argparse.ArgumentParser(description='Check user activity and terminate SageMaker Studio app if idle.')
40 | parser.add_argument('--time', type=int, help='Idle time threshold in seconds')
41 | parser.add_argument('--region', type=str, help='AWS region')
42 |
43 | # Parse the command-line arguments
44 | args = parser.parse_args()
45 |
46 | # Check if idle_threshold is provided
47 | if args.time is None:
48 | parser.print_help()
49 | sys.exit(1)
50 |
51 | if args.region is None:
52 | parser.print_help()
53 | sys.exit(1)
54 |
55 | # Monitor workspace_dirs for changes to implement auto-shutdown, as these paths track updates to both unsaved and saved editor content, covering all user activity scenarios.
56 | workspace_dirs = ["/opt/amazon/sagemaker/sagemaker-code-editor-server-data/data/User/History", "/opt/amazon/sagemaker/sagemaker-code-editor-server-data/data/User/Backups/empty-window/untitled"]
57 | idle_threshold = args.time # this is in seconds. for ex: 1800 seconds for 30 minutes
58 | aws_region = args.region # get the region.
59 |
60 | # Track the activity status for each directory
61 | activity_status = [check_user_activity(directory, idle_threshold) for directory in workspace_dirs]
62 |
63 | # Terminate the SageMaker Studio app if all directories are idle and no activity is observed.
64 | if all(status == "idle" for status in activity_status):
65 | # Load the resource metadata from the file
66 | with open('/opt/ml/metadata/resource-metadata.json') as f:
67 | resource_metadata = json.load(f)
68 |
69 | # Extract the required details for deleting the app
70 | domain_id = resource_metadata['DomainId']
71 | space_name = resource_metadata['SpaceName']
72 | app_name = resource_metadata['ResourceName']
73 | app_type = resource_metadata['AppType']
74 | resource_arn = resource_metadata["ResourceArn"]
75 |
76 | # Use boto3 api call to delete the app.
77 | sm_client = boto3.client('sagemaker',region_name=aws_region)
78 | response = sm_client.delete_app(
79 | DomainId=domain_id,
80 | AppType=app_type,
81 | AppName=app_name,
82 | SpaceName=space_name
83 | )
84 | log_message(f"[auto-stop-idle] - Deleting app {app_type}-{app_name}. Domain ID: {domain_id}. Space name: {space_name}. Resource ARN: {resource_arn}.")
85 | log_message("[auto-stop-idle] - SageMaker Code Editor app terminated due to being idle for given duration.")
86 | else:
87 | log_message("[auto-stop-idle] - SageMaker Code Editor app is not idle. Passing check.")
88 |
--------------------------------------------------------------------------------
/code-editor/auto-stop-idle/python-package/src/sagemaker_code_editor_auto_shut_down/version.py:
--------------------------------------------------------------------------------
1 | __version__ = "0.3.1"
--------------------------------------------------------------------------------
/common-scripts/ebs-s3-backup-restore/README.md:
--------------------------------------------------------------------------------
1 | # SageMaker Studio EBS Backup and Recovery
2 |
3 | SageMaker Studio uses Elastic Block Storage (EBS) for persistent storage of users' files. See the blog [Boost productivity on Amazon SageMaker Studio: Introducing JupyterLab Spaces and generative AI tools](https://aws.amazon.com/blogs/machine-learning/boost-productivity-on-amazon-sagemaker-studio-introducing-jupyterlab-spaces-and-generative-ai-tools/) for a detailed look at Studio architecture.
4 |
5 | Since the EBS volume is managed by SageMaker, customers want a mechanism to back up and restore files from users' spaces in the event of a disaster, or in other scenarios such as recreating a space or user profile.
6 |
7 | When set as a Lifecycle Configuration, the `on-start.sh` shell script backs up the user's files in the space home directory (`/home/sagemaker-user`) to an S3 location. The S3 bucket and prefix are specified by the administrator in the script, and the script saves the files under `s3://<bucket>/<prefix>/<user-profile-name>/<space-name>/<timestamp>/`. The admin can also choose to run the S3 sync at regular intervals; the default is 12 hours. We recommend an interval of at least 6 hours, so that notebook performance is not affected by the background sync.
8 |
9 | When the administrator needs to restore the files, the user profile simply needs to be tagged with the backup timestamp (see the example at the end of this document). If there is a timestamp tag on the user profile, the script will restore the files from that timestamp, in addition to backing up files to S3.
10 | *Note: Admins should remove the timestamp tag from the user profile after the LCC has run. Otherwise, the script will continue to restore from S3.*
11 |
12 | ## Installation for SageMaker Studio User Profiles
13 |
14 | ### Prerequisites
15 |
16 | - AWS CLI configured with appropriate permissions
17 | - Access to the SageMaker Studio domain where the user profiles are located
18 |
19 | ### Installation for all user profiles in a SageMaker Studio domain
20 |
21 | From a terminal appropriately configured with AWS CLI, run the following commands (replace fields as needed):
22 |
23 | ```
24 | REGION=
25 | DOMAIN_ID=
26 | ACCOUNT_ID=
27 | LCC_NAME=ebs-s3-backup-restore
28 | LCC_CONTENT=`openssl base64 -A -in on-start.sh`
29 |
30 | # replace CodeEditor with JupyterLab if setting this LCC for JupyterLab apps
31 | aws sagemaker create-studio-lifecycle-config \
32 | --studio-lifecycle-config-name $LCC_NAME \
33 | --studio-lifecycle-config-content $LCC_CONTENT \
34 | --studio-lifecycle-config-app-type CodeEditor \
35 | --query 'StudioLifecycleConfigArn'
36 |
37 | aws sagemaker update-domain \
38 | --region "$REGION" \
39 | --domain-id "$DOMAIN_ID" \
40 | --default-user-settings \
41 | '{
42 | "CodeEditorAppSettings": {
43 | "DefaultResourceSpec": {
44 | "LifecycleConfigArn": "arn:aws:sagemaker:'"$REGION"':'"$ACCOUNT_ID"':studio-lifecycle-config/'"$LCC_NAME"'",
45 | "InstanceType": "ml.t3.medium"
46 | },
47 | "LifecycleConfigArns": [
48 | "arn:aws:sagemaker:'"$REGION"':'"$ACCOUNT_ID"':studio-lifecycle-config/'"$LCC_NAME"'"
49 | ]
50 | }
51 | }'
52 |
53 | ```
54 |
55 | After the domain update succeeds, navigate to your space and select the LCC when starting your default application.
56 |
57 |
58 | ### Configurations
59 |
60 | The `on-start.sh` script can be customized by modifying:
61 |
62 | * `SM_BCK_BUCKET` / `SM_BCK_PREFIX` - the S3 bucket and prefix where backups are stored. Update these before creating the LCC.
63 | * `ENABLE_SCHEDULED_SYNC` - set to `1` to enable scheduled syncs to S3. Default value is `1` (enabled).
64 | * `SYNC_INTERVAL_IN_HOURS` - if scheduled sync is enabled, the time interval in hours for syncing files to S3. Default value is `12`.
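65 | 
66 | ### Triggering a restore
67 | 
68 | To restore a specific backup, tag the user profile with the `SM_EBS_RESTORE_TIMESTAMP` tag and restart the space's application. The tag value must match a backup timestamp created by the script (format `YYYY-MM-DD-HH-MM-SS`, produced by `date +%F-%H-%M-%S`). Below is a minimal sketch with the AWS CLI; the ARN and the timestamp value are illustrative placeholders:
69 | 
70 | ```
71 | USER_PROFILE_ARN=arn:aws:sagemaker:<region>:<account-id>:user-profile/<domain-id>/<user-profile-name>
72 | 
73 | # Tag the user profile with the backup timestamp to restore (illustrative value).
74 | aws sagemaker add-tags \
75 |     --resource-arn "$USER_PROFILE_ARN" \
76 |     --tags Key=SM_EBS_RESTORE_TIMESTAMP,Value=2024-01-01-00-00-00
77 | 
78 | # After the LCC has run and the files are restored, remove the tag to stop further restores.
79 | aws sagemaker delete-tags \
80 |     --resource-arn "$USER_PROFILE_ARN" \
81 |     --tag-keys SM_EBS_RESTORE_TIMESTAMP
82 | ```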
--------------------------------------------------------------------------------
/common-scripts/ebs-s3-backup-restore/on-start.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | # Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
3 | # SPDX-License-Identifier: MIT-0
4 | 
5 | set -eux
6 |
7 | # User variables [update as needed]
8 | export SM_BCK_BUCKET=studio-backup-bucket
9 | export SM_BCK_PREFIX=studio-backups
10 | ENABLE_SCHEDULED_SYNC=1 # If set to 1, the user home directory will be synced to Amazon S3 every SYNC_INTERVAL_IN_HOURS
11 | SYNC_INTERVAL_IN_HOURS=12 # Determines how frequently the user home directory is synced to Amazon S3
12 |
13 | # System variables [do not change if not needed]
14 | export SM_BCK_HOME=$HOME
15 | LOG_FILE=/var/log/apps/app_container.log # Writing to app_container.log delivers logs to CW logs
16 |
17 | if [ $ENABLE_SCHEDULED_SYNC -eq 1 ]
18 | then
19 | echo "[EBS backup LCC] - Scheduled sync is enabled. Installing cron."
20 |
21 | # Installing cron
22 | sudo apt-get update -y
23 | sudo sh -c 'printf "#!/bin/sh\nexit 0" > /usr/sbin/policy-rc.d'
24 | sudo apt-get install -y cron
25 | fi
26 |
27 | # Installing jq
28 | sudo apt-get install -y jq
29 |
30 | export SM_BCK_SPACE_NAME=$(cat /opt/ml/metadata/resource-metadata.json | jq -r '.SpaceName')
31 | export SM_BCK_DOMAIN_ID=$(cat /opt/ml/metadata/resource-metadata.json | jq -r '.DomainId')
32 | export SM_BCK_USER_PROFILE_NAME=$(aws sagemaker describe-space --domain-id=$SM_BCK_DOMAIN_ID --space-name=$SM_BCK_SPACE_NAME | jq -r '.OwnershipSettings.OwnerUserProfileName')
33 | USER_PROFILE_ARN=$(aws sagemaker describe-user-profile --domain-id $SM_BCK_DOMAIN_ID --user-profile-name $SM_BCK_USER_PROFILE_NAME | jq -r '.UserProfileArn')
34 | RESTORE_TIMESTAMP=$(aws sagemaker list-tags --resource-arn $USER_PROFILE_ARN| jq -r '.Tags[] | select(.Key=="SM_EBS_RESTORE_TIMESTAMP").Value')
35 |
36 | # Creating backup script (if needed)
37 | if ! [ -f $HOME/.backup/backup.sh ]; then
38 | echo "[EBS backup LCC] - Creating backup script."
39 | mkdir -p $HOME/.backup
40 |
41 | cat << "EOF" > $HOME/.backup/backup.tp
42 | #!/bin/bash
43 |
44 | BACKUP_TIMESTAMP=`date +%F-%H-%M-%S`
45 | SNAPSHOT=${SM_BCK_USER_PROFILE_NAME}/${SM_BCK_SPACE_NAME}/${BACKUP_TIMESTAMP}
46 | echo "[EBS backup LCC] - Backing up $SM_BCK_HOME to s3://${SM_BCK_BUCKET}/${SM_BCK_PREFIX}/${SNAPSHOT}/"
47 |
48 | # sync to S3 and skip files if they have been restored to avoid redundant copies and exclude hidden files
49 | aws s3 sync --exclude "*/lost+found/*" --exclude "restored-files/*" --exclude ".*/*" $SM_BCK_HOME s3://${SM_BCK_BUCKET}/${SM_BCK_PREFIX}/${SNAPSHOT}/
50 |
51 | exitcode=$?
52 | echo "[EBS backup LCC] - S3 sync result (backup): "
53 | echo $exitcode
54 |
55 | if [ $exitcode -eq 0 ] || [ $exitcode -eq 2 ]
56 | then
57 | echo "[EBS backup LCC] - Created s3://${SM_BCK_BUCKET}/${SM_BCK_PREFIX}/${SNAPSHOT}/" >> $SM_BCK_HOME/.backup/${BACKUP_TIMESTAMP}_BACKUP_COMPLETE
58 | CURRENT_TIMESTAMP=`date +%F-%H-%M-%S`
59 | echo "[EBS backup LCC] - Backup completed at $CURRENT_TIMESTAMP" >> $SM_BCK_HOME/.backup/${BACKUP_TIMESTAMP}_BACKUP_COMPLETE
60 | aws s3 cp $SM_BCK_HOME/.backup/${BACKUP_TIMESTAMP}_BACKUP_COMPLETE s3://${SM_BCK_BUCKET}/${SM_BCK_PREFIX}/${SNAPSHOT}/${BACKUP_TIMESTAMP}_BACKUP_COMPLETE
61 | fi
62 | EOF
63 | envsubst "$(printf '${%s} ' ${!SM_BCK_*})" < $HOME/.backup/backup.tp > $HOME/.backup/backup.sh
64 |
65 | chmod +x $HOME/.backup/backup.sh
66 | fi
67 |
68 | # Creating restore script (if needed)
69 | if ! [ -f $HOME/.restore/restore.sh ]; then
70 | echo "[EBS backup LCC] - Creating restore script."
71 | mkdir -p $HOME/.restore
72 |
73 | cat << "EOF" > $HOME/.restore/restore.tp
74 | #!/bin/bash
75 |
76 | RESTORE_TIMESTAMP_ARG=$1
77 | SNAPSHOT=${SM_BCK_USER_PROFILE_NAME}/${SM_BCK_SPACE_NAME}/${RESTORE_TIMESTAMP_ARG}
78 |
79 | # check if SNAPSHOT exists, if not, proceed without sync
80 | echo "[EBS backup LCC] - Checking if s3://${SM_BCK_BUCKET}/${SM_BCK_PREFIX}/${SNAPSHOT} exists..."
81 | aws s3 ls s3://${SM_BCK_BUCKET}/${SM_BCK_PREFIX}/${SNAPSHOT} || { echo "[EBS backup LCC] - Snapshot s3://${SM_BCK_BUCKET}/${SM_BCK_PREFIX}/${SNAPSHOT} does not exist. Proceeding without the sync."; exit 0; }
82 |
83 | # files are backed up to 'restored-files' to avoid overwriting
84 | echo "[EBS backup LCC] - Syncing s3://${SM_BCK_BUCKET}/${SM_BCK_PREFIX}/${SNAPSHOT} to $SM_BCK_HOME/restored-files"
85 | aws s3 sync s3://${SM_BCK_BUCKET}/${SM_BCK_PREFIX}/${SNAPSHOT} $SM_BCK_HOME/restored-files/${RESTORE_TIMESTAMP_ARG}
86 |
87 | exitcode=$?
88 | echo "[EBS backup LCC] - S3 sync result (restore): "
89 | echo $exitcode
90 | if [ $exitcode -eq 0 ] || [ $exitcode -eq 2 ]
91 | then
92 | CURRENT_TIMESTAMP=`date +%F-%H-%M-%S`
93 | echo "[EBS backup LCC] - Restore completed at $CURRENT_TIMESTAMP" >> $SM_BCK_HOME/.restore/${RESTORE_TIMESTAMP_ARG}_SYNC_COMPLETE
94 | fi
95 |
96 | EOF
97 | envsubst "$(printf '${%s} ' ${!SM_BCK_*})" < $HOME/.restore/restore.tp > $HOME/.restore/restore.sh
98 |
99 | chmod +x $HOME/.restore/restore.sh
100 | fi
101 |
102 | # Run backup (at least once at bootstrap)
103 | echo "[EBS backup LCC] - Executing backup at bootstrap."
104 | nohup $HOME/.backup/backup.sh >> $LOG_FILE 2>&1 &
105 |
106 | # Check if scheduled backup needs to be enabled.
107 | if [ $ENABLE_SCHEDULED_SYNC -eq 1 ]
108 | then
109 | echo "[EBS backup LCC] - Adding backup script to crontab..."
110 | sudo mkdir -p /var/tmp
111 | sudo rm -f /var/tmp/ebs_backup.sh
112 | cp $HOME/.backup/backup.sh /var/tmp/ebs_backup.sh
113 | sudo chown root:root /var/tmp/ebs_backup.sh
114 | sudo chmod +x /var/tmp/ebs_backup.sh
115 | echo "0 */$SYNC_INTERVAL_IN_HOURS * * * /bin/bash -ic '/var/tmp/ebs_backup.sh >> $LOG_FILE'" | sudo crontab -
116 | fi
117 |
118 | # Check if restore timestamp is set.
119 | if ! [ -z "$RESTORE_TIMESTAMP" ]
120 | then
121 | echo "[EBS backup LCC] - User profile tagged with restore timestamp: ${RESTORE_TIMESTAMP}. Restoring files..."
122 | # nohup to bypass the LCC timeout at start
123 | nohup $HOME/.restore/restore.sh $RESTORE_TIMESTAMP >> $LOG_FILE 2>&1 &
124 | fi
125 |
--------------------------------------------------------------------------------
/jupyterlab/auto-stop-idle/README.md:
--------------------------------------------------------------------------------
1 | # SageMaker Studio JupyterLab auto-stop idle notebooks
2 | The `on-start.sh` script, designed to run as a [SageMaker Studio lifecycle configuration](https://docs.aws.amazon.com/sagemaker/latest/dg/jl-lcc.html), automatically shuts down idle JupyterLab applications after a configurable time of inactivity.
3 |
4 | ## Installation for all user profiles in a SageMaker Studio domain
5 |
6 | From a terminal appropriately configured with AWS CLI, run the following commands:
7 |
8 | ASI_VERSION=0.3.1
9 | curl -LO https://github.com/aws-samples/sagemaker-studio-apps-lifecycle-config-examples/releases/download/v$ASI_VERSION/jupyterlab-lccs-$ASI_VERSION.tar.gz
10 | tar -xvzf jupyterlab-lccs-$ASI_VERSION.tar.gz
11 |
12 | cd auto-stop-idle
13 |
14 | REGION=
15 | DOMAIN_ID=
16 | ACCOUNT_ID=
17 | LCC_NAME=auto-stop-idle
18 | LCC_CONTENT=`openssl base64 -A -in on-start.sh`
19 |
20 | aws sagemaker create-studio-lifecycle-config \
21 | --studio-lifecycle-config-name $LCC_NAME \
22 | --studio-lifecycle-config-content $LCC_CONTENT \
23 | --studio-lifecycle-config-app-type JupyterLab \
24 | --query 'StudioLifecycleConfigArn'
25 |
26 | aws sagemaker update-domain \
27 | --region $REGION \
28 | --domain-id $DOMAIN_ID \
29 | --default-user-settings \
30 | "{
31 | \"JupyterLabAppSettings\": {
32 | \"DefaultResourceSpec\": {
33 | \"LifecycleConfigArn\": \"arn:aws:sagemaker:$REGION:$ACCOUNT_ID:studio-lifecycle-config/$LCC_NAME\",
34 | \"InstanceType\": \"ml.t3.medium\"
35 | },
36 | \"LifecycleConfigArns\": [
37 | \"arn:aws:sagemaker:$REGION:$ACCOUNT_ID:studio-lifecycle-config/$LCC_NAME\"
38 | ]
39 | }
40 | }"
41 |
42 | Make sure to replace `REGION`, `DOMAIN_ID`, and `ACCOUNT_ID` in the previous commands with the AWS region, the Studio domain ID, and the AWS account ID you are using, respectively.
43 |
44 | ## Definition of idleness
45 | The implementation considers a JupyterLab application as idle when:
46 | 1. The running Jupyter kernels and terminals have been idle for more than `IDLE_TIME_IN_SECONDS` (see Configuration section), based on their execution state and last activity date
47 | 2. There are no running kernels or terminals, but the last activity date recorded for the last running kernel or terminal plus `IDLE_TIME_IN_SECONDS` is earlier than the current date.
48 |
49 | **Note**: if the JupyterLab application is started and no kernels or terminals are executed, idleness is computed based on the recorded last activity date. As a consequence, if users work with JupyterLab for more than `IDLE_TIME_IN_SECONDS` without running any Jupyter kernel or terminal, the application will be considered idle and shut down.
50 |
51 | ## Configuration
52 | The `on-start.sh` script can be customized by modifying the following variables:
53 |
54 | - `IDLE_TIME_IN_SECONDS` the time in seconds for which JupyterLab has to be idle before shutting down the application. **Default**: `3600`
55 | - `IGNORE_CONNECTIONS` whether active Jupyter Notebook sessions on idle kernels should be ignored. **Default**: `True`
56 | - `SKIP_TERMINALS` whether to skip any idleness check on Jupyter terminals. **Default**: `False`
57 |
58 | In addition, the following advanced configuration is available (do not change unless explicitly required by your setup):
59 |
60 | - `ASI_VERSION` the version of the Auto Stop Idle (ASI) solution.
61 | - `JL_HOSTNAME` the host name for the JupyterLab application. **Default**: `0.0.0.0`
62 | - `JL_PORT` JupyterLab port. **Default**: `8888`
63 | - `JL_BASE_URL` JupyterLab base URL. **Default**: `/jupyterlab/default/`
64 | - `CONDA_HOME` Conda home directory. **Default**: `/opt/conda/bin`
65 | - `LOG_FILE` Path to the file where logs are written; defaults to the location of the Studio app logs, that are automatically delivered to Amazon CloudWatch. **Default**: `/var/log/apps/app_container.log`
66 | - `SOLUTION_DIR` The directory where the solution will be installed. **Default**: `/var/tmp/auto-stop-idle`
67 | - `STATE_FILE` Path to a file used to save the state for the Python script (given its execution is stateless). The location of this file has to be transient, i.e. not persisted across restarts of the Studio JupyterLab app; as a consequence, do not use EBS-backed directories like `/home/sagemaker-user/`. **Default**: `/var/tmp/auto-stop-idle/auto_stop_idle.st`
68 | - `PYTHON_PACKAGE` The name (with version) of the auto stop idle Python package. **Default**: `sagemaker_studio_jlab_auto_stop_idle-$ASI_VERSION.tar.gz`
69 |
70 |
71 | ## Architecture considerations
72 | - The `on-start.sh` lifecycle configuration script adds a `cron` job for `root` using `crontab`, configured to run every `2` minutes. The job runs the Python script at `PYTHON_SCRIPT_PATH`, which checks for idleness. If the JupyterLab application is detected as idle, the Python script deletes the application by invoking the Amazon SageMaker `DeleteApp` API.
73 | - This solution requires:
74 | 1. internet access to download `PYTHON_PACKAGE`.
75 | 2. access to the Amazon SageMaker `DeleteApp` API. From the authorization perspective, the execution role associated with the Studio domain or user profile must have an attached IAM policy allowing the `sagemaker:DeleteApp` action (see the example policy at the end of this document).
76 | - The Studio JupyterLab application runs as `sagemaker-user`, which has `sudo` privileges; as a consequence, users could potentially remove the cron task and stop the idleness checks. To prevent this behavior, you can modify the configuration in `/etc/sudoers` to remove sudo privileges from `sagemaker-user`.
77 |
78 | ### Installing in internet-free VPCs
79 | To install the auto-stop-idle solution in an internet-free VPC configuration, you can use Amazon S3 and S3 VPC endpoints to download the auto stop idle Python package. In addition, you will need to configure SageMaker API VPC endpoints for the `DeleteApp` operation.
80 |
81 | The following instructions describe how to modify the lifecycle configuration to support internet-free VPC configurations:
82 |
83 | 1. Download and extract the auto-stop-idle solution tarball:
84 |
85 | ```
86 | ASI_VERSION=0.3.1
87 | curl -LO https://github.com/aws-samples/sagemaker-studio-apps-lifecycle-config-examples/releases/download/v$ASI_VERSION/jupyterlab-lccs-$ASI_VERSION.tar.gz
88 | tar -xvzf jupyterlab-lccs-$ASI_VERSION.tar.gz
89 | ```
90 |
91 | 2. Download and copy the auto stop idle Python package to an Amazon S3 location of your choice. The execution role associated with the Studio domain or user profiles must have IAM policies that allow read access to that S3 location.
92 |
93 | ```
94 | cd auto-stop-idle
95 |
96 | PYTHON_PACKAGE=sagemaker_studio_jlab_auto_stop_idle-$ASI_VERSION.tar.gz
97 | curl -LO https://github.com/aws-samples/sagemaker-studio-apps-lifecycle-config-examples/releases/download/v$ASI_VERSION/$PYTHON_PACKAGE
98 | aws s3 cp $PYTHON_PACKAGE s3://<bucket>/<prefix>/
99 | ```
100 |
101 | 3. Edit the `on-start.sh` file and replace line 56 with:
102 |
103 | ```
104 | sudo aws s3 cp s3://<bucket>/<prefix>/$PYTHON_PACKAGE /var/tmp/
105 | ```
106 |
107 | 4. Create the LCC and attach to the Studio domain:
108 |
109 | ```
110 | REGION=
111 | DOMAIN_ID=
112 | ACCOUNT_ID=
113 | LCC_NAME=auto-stop-idle
114 | LCC_CONTENT=`openssl base64 -A -in on-start.sh`
115 |
116 | aws sagemaker create-studio-lifecycle-config \
117 | --studio-lifecycle-config-name $LCC_NAME \
118 | --studio-lifecycle-config-content $LCC_CONTENT \
119 | --studio-lifecycle-config-app-type JupyterLab \
120 | --query 'StudioLifecycleConfigArn'
121 |
122 | aws sagemaker update-domain \
123 | --region $REGION \
124 | --domain-id $DOMAIN_ID \
125 | --default-user-settings \
126 | "{
127 | \"JupyterLabAppSettings\": {
128 | \"DefaultResourceSpec\": {
129 | \"LifecycleConfigArn\": \"arn:aws:sagemaker:$REGION:$ACCOUNT_ID:studio-lifecycle-config/$LCC_NAME\",
130 | \"InstanceType\": \"ml.t3.medium\"
131 | },
132 | \"LifecycleConfigArns\": [
133 | \"arn:aws:sagemaker:$REGION:$ACCOUNT_ID:studio-lifecycle-config/$LCC_NAME\"
134 | ]
135 | }
136 | }"
137 | ```
138 |
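139 | ### Example IAM policy for DeleteApp
140 | 
141 | As noted in the architecture considerations above, the execution role associated with the Studio domain or user profile must be allowed to call `sagemaker:DeleteApp`. Below is a minimal illustrative sketch using the AWS CLI to attach an inline policy to the execution role; the role name and resource ARN are placeholders, and you may want to scope the resource more tightly for your setup:
142 | 
143 | ```
144 | # Attach an inline policy allowing DeleteApp to the Studio execution role (role name is a placeholder).
145 | aws iam put-role-policy \
146 |     --role-name <studio-execution-role-name> \
147 |     --policy-name allow-sagemaker-delete-app \
148 |     --policy-document '{
149 |         "Version": "2012-10-17",
150 |         "Statement": [{
151 |             "Effect": "Allow",
152 |             "Action": "sagemaker:DeleteApp",
153 |             "Resource": "arn:aws:sagemaker:<region>:<account-id>:app/<domain-id>/*"
154 |         }]
155 |     }'
156 | ```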
--------------------------------------------------------------------------------
/jupyterlab/auto-stop-idle/on-start.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | # Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
3 | # SPDX-License-Identifier: MIT-0
4 | 
5 | set -eux
6 | ASI_VERSION=0.3.1
7 |
8 | # OVERVIEW
9 | # This script stops a SageMaker Studio JupyterLab app, once it's idle for more than X seconds, based on IDLE_TIME_IN_SECONDS configuration.
10 | # Note that this script will fail if either of the following conditions is not met:
11 | # 1. The JupyterLab app has internet connectivity to fetch the autostop idle Python package
12 | # 2. The Studio domain or user profile execution role has permission to call sagemaker:DeleteApp to delete the JupyterLab app
13 |
14 | # User variables [update as needed]
15 | IDLE_TIME_IN_SECONDS=3600 # The max time (in seconds) the JupyterLab app can stay idle before being terminated.
16 |
17 | # User variables - advanced [update only if needed]
18 | IGNORE_CONNECTIONS=True # Set to False if you want to consider idle JL sessions with active connections as not idle.
19 | SKIP_TERMINALS=False # Set to True if you want to skip any idleness check on Jupyter terminals.
20 |
21 | # System variables [do not change if not needed]
22 | JL_HOSTNAME=0.0.0.0
23 | JL_PORT=8888
24 | JL_BASE_URL=/jupyterlab/default/
25 | CONDA_HOME=/opt/conda/bin
26 | LOG_FILE=/var/log/apps/app_container.log # Writing to app_container.log delivers logs to CW logs.
27 | SOLUTION_DIR=/var/tmp/auto-stop-idle # Do not use /home/sagemaker-user
28 | STATE_FILE=$SOLUTION_DIR/auto_stop_idle.st
29 | PYTHON_PACKAGE=sagemaker_studio_jlab_auto_stop_idle-$ASI_VERSION.tar.gz
30 | PYTHON_SCRIPT_PATH=$SOLUTION_DIR/sagemaker_studio_jlab_auto_stop_idle/auto_stop_idle.py
31 |
32 | # Issue - https://github.com/aws-samples/sagemaker-studio-apps-lifecycle-config-examples/issues/12
33 | # SM Distribution image 1.6 is not starting cron service by default https://github.com/aws/sagemaker-distribution/issues/354
34 |
35 | # Check if cron needs to be installed. The "|| true" prevents "set -eux" from exiting the script when dpkg-query returns a non-zero code.
36 | status="$(dpkg-query -W --showformat='${db:Status-Status}' "cron" 2>&1)" || true
37 | if [ ! "$status" = installed ]; then
38 | # Fixing invoke-rc.d: policy-rc.d denied execution of restart.
39 | sudo /bin/bash -c "echo '#!/bin/sh
40 | exit 0' > /usr/sbin/policy-rc.d"
41 |
42 | # Installing cron.
43 | echo "Installing cron..."
44 | sudo apt-get install -y cron
45 | else
46 | echo "Package cron is already installed."
47 | # start/restart the service.
48 | sudo service cron restart
49 | fi
50 |
51 | # Creating solution directory.
52 | sudo mkdir -p $SOLUTION_DIR
53 |
54 | # Downloading autostop idle Python package.
55 | echo "Downloading autostop idle Python package..."
56 | curl -LO --output-dir /var/tmp/ https://github.com/aws-samples/sagemaker-studio-apps-lifecycle-config-examples/releases/download/v$ASI_VERSION/$PYTHON_PACKAGE
57 | sudo $CONDA_HOME/pip install -U -t $SOLUTION_DIR /var/tmp/$PYTHON_PACKAGE
58 |
59 | # Setting container credential URI variable to /etc/environment to make it available to cron
60 | sudo /bin/bash -c "echo 'AWS_CONTAINER_CREDENTIALS_RELATIVE_URI=$AWS_CONTAINER_CREDENTIALS_RELATIVE_URI' >> /etc/environment"
61 |
62 | # Add script to crontab for root.
63 | echo "Adding autostop idle Python script to crontab..."
64 | echo "*/2 * * * * /bin/bash -ic '$CONDA_HOME/python $PYTHON_SCRIPT_PATH --idle-time $IDLE_TIME_IN_SECONDS --hostname $JL_HOSTNAME \
65 | --port $JL_PORT --base-url $JL_BASE_URL --ignore-connections $IGNORE_CONNECTIONS \
66 | --skip-terminals $SKIP_TERMINALS --state-file-path $STATE_FILE >> $LOG_FILE'" | sudo crontab -
67 |
--------------------------------------------------------------------------------
/jupyterlab/auto-stop-idle/python-package/.gitignore:
--------------------------------------------------------------------------------
1 | # Byte-compiled / optimized / DLL files
2 | __pycache__/
3 | *.py[cod]
4 | *$py.class
5 |
6 | # C extensions
7 | *.so
8 |
9 | # Distribution / packaging
10 | .Python
11 | build/
12 | develop-eggs/
13 | dist/
14 | downloads/
15 | eggs/
16 | .eggs/
17 | lib/
18 | lib64/
19 | parts/
20 | sdist/
21 | var/
22 | wheels/
23 | share/python-wheels/
24 | *.egg-info/
25 | .installed.cfg
26 | *.egg
27 | MANIFEST
28 |
--------------------------------------------------------------------------------
/jupyterlab/auto-stop-idle/python-package/setup.py:
--------------------------------------------------------------------------------
1 | from __future__ import absolute_import
2 |
3 | from glob import glob
4 | import os
5 | from os.path import basename
6 | from os.path import splitext
7 |
8 | from setuptools import find_packages, setup
9 | from distutils.util import convert_path
10 |
11 | main_ns = {}
12 | ver_path = convert_path('src/sagemaker_studio_jlab_auto_stop_idle/version.py')
13 | with open(ver_path) as ver_file:
14 | exec(ver_file.read(), main_ns)
15 |
16 | setup(
17 | name='sagemaker_studio_jlab_auto_stop_idle',
18 | version=main_ns['__version__'],
19 | description='Auto-stops idle SageMaker Studio JupyterLab applications.',
20 |
21 | packages=find_packages(where='src', exclude=('test',)),
22 | package_dir={'': 'src'},
23 | py_modules=[splitext(basename(path))[0] for path in glob('src/*.py')],
24 |
25 | author='Amazon Web Services',
26 | url='https://github.com/aws-samples/sagemaker-studio-apps-lifecycle-config-examples/tree/main/jupyterlab/scripts/auto-stop-idle',
27 | license='MIT-0',
28 |
29 | classifiers=[
30 | "Development Status :: 5 - Production/Stable",
31 | "Intended Audience :: Developers",
32 | "Natural Language :: English",
33 | "License :: OSI Approved :: MIT-0",
34 | "Programming Language :: Python",
35 | 'Programming Language :: Python :: 3.9',
36 | 'Programming Language :: Python :: 3.10'
37 | ],
38 |
39 | install_requires=[],
40 | extras_require={
41 | },
42 | )
--------------------------------------------------------------------------------
/jupyterlab/auto-stop-idle/python-package/src/sagemaker_studio_jlab_auto_stop_idle/__init__.py:
--------------------------------------------------------------------------------
1 | from __future__ import absolute_import
2 | from sagemaker_studio_jlab_auto_stop_idle.version import __version__
--------------------------------------------------------------------------------
/jupyterlab/auto-stop-idle/python-package/src/sagemaker_studio_jlab_auto_stop_idle/auto_stop_idle.py:
--------------------------------------------------------------------------------
1 | # Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
2 | # SPDX-License-Identifier: MIT-0
3 |
4 | from datetime import datetime
5 | import os
6 | import getopt, sys
7 | import json
8 | import boto3
9 | import botocore
10 | import requests
11 |
12 | from urllib.parse import urljoin
13 |
14 | DATE_FORMAT = "%Y-%m-%dT%H:%M:%S.%fz"
15 |
16 | def log_message(message):
17 | """
18 | Logs a message.
19 | """
20 | print(f"{datetime.now().strftime(DATE_FORMAT)} - {message}")
21 |
22 | def get_json_metadata():
23 |     """
24 |     Gets the metadata of the current instance, which includes the Studio domain identifier and app information.
25 |     """
26 | metadata_path = '/opt/ml/metadata/resource-metadata.json'
27 | with open(metadata_path, 'r') as metadata:
28 | json_metadata = json.load(metadata)
29 | return json_metadata
30 |
31 | def delete_app():
32 | """
33 | Deletes the JupyterLab app.
34 | """
35 |     metadata = get_json_metadata()
36 | domain_id = metadata["DomainId"]
37 | app_type = metadata["AppType"]
38 | app_name = metadata["ResourceName"]
39 | space_name = metadata["SpaceName"]
40 | resource_arn = metadata["ResourceArn"]
41 | aws_region = resource_arn.split(":")[3]
42 |
43 | log_message(f"[auto-stop-idle] - Deleting app {app_type}-{app_name}. Domain ID: {domain_id}. Space name: {space_name}. Resource ARN: {resource_arn}.")
44 |
45 | try:
46 | client = boto3.client('sagemaker', region_name=aws_region)
47 | client.delete_app(
48 | DomainId=domain_id,
49 | AppType=app_type,
50 | AppName=app_name,
51 | SpaceName=space_name
52 | )
53 | except botocore.exceptions.ClientError as client_error:
54 | error_code = client_error.response['Error']['Code']
55 | if error_code == 'AccessDeniedException' or error_code == 'NotAuthorized':
56 | log_message(f"[auto-stop-idle] - The current execution role does not allow executing the DeleteApp() operation. Please check IAM policy configurations. Exception: {client_error}")
57 | else:
58 |             log_message(f"[auto-stop-idle] - An error occurred while deleting app. Exception: {client_error}")
59 |     except Exception as e:
60 |         log_message(f"[auto-stop-idle] - An error occurred while deleting app. Exception: {e}")
61 |
62 | def create_state_file(state_file_path):
63 | """
64 | Creates a file to store the state for the auto-stop-idle script, only if it does not exist.
65 | Stores the current date in the state file.
66 | """
67 | if not os.path.exists(state_file_path):
68 | with open(state_file_path, 'w') as f:
69 | current_date_as_string = datetime.now().strftime(DATE_FORMAT)
70 | f.write(current_date_as_string)
71 |
72 | def get_state_file_contents(state_file_path):
73 | """
74 | Gets the contents of the state file, consisting of the computed last modified date.
75 | """
76 | with open(state_file_path) as f:
77 | contents = f.readline()
78 | return contents
79 |
80 | def update_state_file(sessions, terminals, state_file_path):
81 | """
82 | Updates the state file with the max last_activity date found in sessions or terminals
83 | """
84 | max_last_activity = datetime.min
85 | if sessions is not None and len(sessions) > 0:
86 | for notebook_session in sessions:
87 | notebook_kernel = notebook_session["kernel"]
88 | last_activity = notebook_kernel["last_activity"]
89 | last_activity_date = datetime.strptime(last_activity, DATE_FORMAT)
90 | if last_activity_date > max_last_activity:
91 | max_last_activity = last_activity_date
92 | if terminals is not None and len(terminals) > 0:
93 | for terminal in terminals:
94 | last_activity = terminal["last_activity"]
95 | last_activity_date = datetime.strptime(last_activity, DATE_FORMAT)
96 | if last_activity_date > max_last_activity:
97 | max_last_activity = last_activity_date
98 |
99 | if max_last_activity > datetime.min:
100 | with open(state_file_path, 'w') as f:
101 | date_as_string = max_last_activity.strftime(DATE_FORMAT)
102 | log_message(f"[auto-stop-idle] - Updating state with last activity date {date_as_string}.")
103 | f.write(date_as_string)
104 |
105 | def is_idle(idle_time, last_activity):
106 | """
107 | Compares the last_activity date with the current date, to check if idle_time has elapsed.
108 | """
109 | last_activity_date = datetime.strptime(last_activity, DATE_FORMAT)
110 | return ((datetime.now() - last_activity_date).total_seconds() > idle_time)
111 |
112 | def get_terminals(app_url, base_url):
113 | """
114 | Gets the running terminals.
115 | """
116 | api_url = urljoin(urljoin(app_url, base_url), "api/terminals")
117 | response = requests.get(api_url, verify=False)
118 | return response.json()
119 |
120 | def get_sessions(app_url, base_url):
121 | """
122 | Gets the running notebook sessions.
123 | """
124 | api_url = urljoin(urljoin(app_url, base_url), "api/sessions")
125 | response = requests.get(api_url, verify=False)
126 | return response.json()
127 |
128 | def check_idle(app_url, base_url, idle_time, ignore_connections, skip_terminals, state_file_path):
129 | """
130 | Checks if all terminals or notebook sessions are idle.
131 | """
132 | idle = True
133 |
134 | # Create state file.
135 | create_state_file(state_file_path)
136 |
137 | terminals = get_terminals(app_url, base_url)
138 | sessions = get_sessions(app_url, base_url)
139 |
140 | terminal_count = len(terminals) if terminals is not None else 0
141 | session_count = len(sessions) if sessions is not None else 0
142 |
143 | # Check sessions.
144 | if session_count > 0:
145 | for notebook_session in sessions:
146 | session_name = notebook_session["name"]
147 | session_id = notebook_session["id"]
148 |
149 | notebook_kernel = notebook_session["kernel"]
150 | kernel_name = notebook_kernel["name"]
151 | kernel_id = notebook_kernel["id"]
152 |
153 | if notebook_kernel["execution_state"] == "idle":
154 | connections = int(notebook_kernel["connections"])
155 | last_activity = notebook_kernel["last_activity"]
156 |
157 | if ignore_connections or connections <= 0:
158 | idle = is_idle(idle_time, last_activity)
159 | if not idle:
160 | reason = f"kernel not idle based on last activity. Last activity: {last_activity}"
161 | else:
162 | reason = "kernel has active connections"
163 | idle = False
164 | else:
165 | reason = "kernel execution state is not idle"
166 | idle = False
167 |
168 | if not idle:
169 | log_message(f"[auto-stop-idle] - Notebook session {session_name} (ID: {session_id}), with kernel {kernel_name} (ID: {kernel_id}) is not idle. Reason: {reason}.")
170 | break
171 |
172 | # Check terminals.
173 | if idle and terminal_count > 0 and not skip_terminals:
174 | for terminal in terminals:
175 | terminal_name = terminal["name"]
176 | last_activity = terminal["last_activity"]
177 |
178 | idle = is_idle(idle_time, last_activity)
179 | if not idle:
180 | reason = f"terminal not idle based on last activity. Last activity: {last_activity}"
181 |
182 | if not idle:
183 | log_message(f"[auto-stop-idle] - Terminal {terminal_name} is not idle. Reason: {reason}.")
184 | break
185 |
186 | # Check last activity date from state.
187 | if idle and session_count <= 0 and (terminal_count <= 0 or skip_terminals):
188 | state_file_contents = get_state_file_contents(state_file_path)
189 | idle = is_idle(idle_time, state_file_contents)
190 | if not idle:
191 | log_message(f"[auto-stop-idle] - App not idle based on last activity state. Last activity: {state_file_contents}")
192 |
193 | # Update state file.
194 | update_state_file(sessions, terminals, state_file_path)
195 |
196 | return idle
197 |
198 | if __name__ == '__main__':
199 |
200 | # Usage info
201 | usage_info = """Usage:
202 |     This script checks if Studio JupyterLab has been idle for X seconds. If it has, the script stops it:
203 |     python auto_stop_idle.py --idle-time <seconds> [--port <port>] [--hostname <hostname>]
204 |     [--base-url <base_url>] [--ignore-connections <True|False>] [--skip-terminals <True|False>] [--state-file-path <path>]
205 | Type "python auto_stop_idle.py -h" for the available options.
206 | """
207 | # Help info
208 | help_info = """ -t, --idle-time
209 | idle time in seconds
210 | -p, --port
211 | jupyter port
212 | -k, --hostname
213 | jupyter hostname
214 | -u, --base-url
215 | jupyter base URL
216 | -c --ignore-connections
217 | ignoring users connected to idle notebook sessions
218 | -s --skip-terminals
219 | skip checks on terminals
220 | -a --state-file-path
221 | path to a file where to save the state
222 | -h, --help
223 | help information
224 | """
225 |
226 | # Setting default values.
227 | idle_time = 3600
228 | hostname = "0.0.0.0"
229 | base_url = "/jupyterlab/default/"
230 | port = 8888
231 | ignore_connections = True
232 | skip_terminals = False
233 | state_file_path = "/var/tmp/auto-stop-idle/auto_stop_idle.st"
234 |
235 | # Read in command-line parameters
236 | try:
237 | opts, args = getopt.getopt(sys.argv[1:], "ht:p:k:u:c:s:a:", ["help","idle-time=","port=","hostname=","base-url=","ignore-connections=", "skip-terminals=", "state-file-path="])
238 | if len(opts) == 0:
239 | raise getopt.GetoptError("No input parameters!")
240 | for opt, arg in opts:
241 | if opt in ("-h", "--help"):
242 | print(help_info)
243 | exit(0)
244 | elif opt in ("-t", "--idle-time"):
245 | idle_time = int(arg)
246 | elif opt in ("-p", "--port"):
247 | port = str(arg)
248 | elif opt in ("-k", "--hostname"):
249 | hostname = str(arg)
250 | elif opt in ("-u", "--base-url"):
251 | base_url = str(arg)
252 | elif opt in ("-c", "--ignore-connections"):
253 | ignore_connections = False if arg == "False" else True
254 | elif opt in ("-s", "--skip-terminals"):
255 | skip_terminals = True if arg == "True" else False
256 | elif opt in ("-a", "--state-file-path"):
257 | state_file_path = str(arg)
258 | except getopt.GetoptError:
259 | print(usage_info)
260 | exit(1)
261 |
262 | try:
263 | if not idle_time:
264 |             log_message("[auto-stop-idle] - Missing '-t' or '--idle-time'")
265 | exit(2)
266 | else:
267 | app_url = f"http://{hostname}:{port}"
268 | idle = check_idle(app_url, base_url, idle_time, ignore_connections, skip_terminals, state_file_path)
269 |
270 | if idle:
271 | log_message("[auto-stop-idle] - Detected JupyterLab idle state. Stopping notebook.")
272 | delete_app()
273 | else:
274 | log_message("[auto-stop-idle] - JupyterLab is not idle. Passing check.")
275 | exit(0)
276 | except Exception as e:
277 |         log_message(f"[auto-stop-idle] - An error occurred while checking idle state. Exception: {e}")
278 |
--------------------------------------------------------------------------------
/jupyterlab/auto-stop-idle/python-package/src/sagemaker_studio_jlab_auto_stop_idle/version.py:
--------------------------------------------------------------------------------
1 | __version__ = "0.3.1"
--------------------------------------------------------------------------------